The nf-core pipeline template is a standardized framework designed to streamline the development of Nextflow-based bioinformatics pipelines.
The nf-core tooling greatly simplifies creating a pipeline from the template. It generates a pipeline with a set framework that you can modify to suit your own purposes.
Here, you will use the nf-core template to kickstart your pipeline development using the latest version of Nextflow and the nf-core tooling.
The nf-core pipelines create command makes a new pipeline using the nf-core base template with a pipeline name, description, and author. It is the first and most important step for creating a pipeline that will integrate with the wider Nextflow ecosystem.
nf-core pipelines create
Running this command will open a Text User Interface (TUI) for pipeline creation.
Template features can be flexibly included or excluded at the time of creation. You can still use the CLI by providing all values as parameters.
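As a rough sketch of the non-interactive route, the snippet below assembles a command line from the same details the TUI asks for. The flag names (`--name`, `--description`, `--author`, `--organisation`) are assumptions based on recent nf-core/tools releases, so check `nf-core pipelines create --help` for your installed version. The helper `build_create_command` is hypothetical, and it only prints the command rather than running it.

```python
import shlex


def build_create_command(name, description, author, organisation):
    """Assemble (but do not run) an `nf-core pipelines create` call.

    The flag names are assumptions; confirm them with
    `nf-core pipelines create --help` before relying on them.
    """
    return [
        "nf-core", "pipelines", "create",
        "--name", name,
        "--description", description,
        "--author", author,
        "--organisation", organisation,
    ]


command = build_create_command(
    name="myfirstpipeline",
    description="My first pipeline",
    author="< YOUR NAME >",
    organisation="myorg",
)

# shlex.join quotes arguments containing spaces so the line can be pasted into a shell.
print(shlex.join(command))
```

Building the argument list first keeps the values separate from the shell quoting, which is handy when the description or author name contains spaces.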
Exercise
Follow these steps to create your first pipeline using the nf-core pipelines create TUI:
Run the nf-core pipelines create command
Select Let's go! on the welcome screen
Select Custom on the Choose pipeline type screen
Enter your pipeline details, replacing < YOUR NAME > with your own name, then select Next
GitHub organisation: myorg
Workflow name: myfirstpipeline
A short description of your pipeline: My first pipeline
Name of the main author / authors: < YOUR NAME >
Select Continue on the Template features screen
Select Finish on the Final details screen
Wait for the pipeline to be created, then select Continue
Select Finish without creating a repo on the Create GitHub repository screen
Select Close on the HowTo create a GitHub repository page
If run successfully, you will see a new folder in your current directory named myorg-myfirstpipeline.
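One way to confirm the result is to check the new folder for a few files the template is expected to contain. The sketch below is illustrative only: the list of files is a small, non-exhaustive sample that may differ between nf-core/tools versions, and `missing_template_files` is a hypothetical helper.

```python
from pathlib import Path

# A short, illustrative list of files the template is expected to create.
# It is not exhaustive and may differ between nf-core/tools versions.
EXPECTED_FILES = [
    "main.nf",
    "nextflow.config",
    "nextflow_schema.json",
    "workflows/myfirstpipeline.nf",
]


def missing_template_files(pipeline_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent from pipeline_dir."""
    root = Path(pipeline_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Run from the directory in which the pipeline was created.
    missing = missing_template_files("myorg-myfirstpipeline")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files found")
```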
The nf-core pipeline template has a main.nf script that calls myfirstpipeline.nf from the workflows folder. The myfirstpipeline.nf file inside the workflows folder is the central pipeline file that is used to bring everything else together.
Instead of having one large monolithic pipeline script, the pipeline is broken up into smaller script components, namely modules and subworkflows:
Modules: Wrappers around a single process
Subworkflows: Two or more modules that are packaged together as a mini workflow
Within your pipeline repository, modules and subworkflows are stored within local and nf-core folders. The nf-core folder is for components that have come from the nf-core GitHub repository while the local folder is for components that have been developed independently:
Modules from nf-core follow a similar structure and contain a small number of additional files that are used for testing with nf-test and for documenting the module.
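To see which components a pipeline contains and where they came from, you could walk the local and nf-core folders. The sketch below assumes each component is a directory holding a `main.nf` file somewhere below `modules/<origin>/` or `subworkflows/<origin>/`. Local components kept as single `.nf` files would not be picked up. `list_components` is a hypothetical helper, not part of the nf-core tooling.

```python
from pathlib import Path


def list_components(pipeline_dir):
    """Group component directories by kind (modules/subworkflows) and origin.

    Assumes each component is a directory containing a main.nf file
    somewhere below <kind>/<origin>/.
    """
    root = Path(pipeline_dir)
    found = {}
    for kind in ("modules", "subworkflows"):
        for origin in ("local", "nf-core"):
            base = root / kind / origin
            if not base.is_dir():
                # Folder absent: record an empty list rather than failing.
                found[(kind, origin)] = []
                continue
            found[(kind, origin)] = sorted(
                script.parent.relative_to(base).as_posix()
                for script in base.rglob("main.nf")
            )
    return found


for (kind, origin), names in list_components("myorg-myfirstpipeline").items():
    print(f"{kind}/{origin}: {names}")
```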
Note
Some nf-core modules are also split into command-specific directories:
The nf-core pipeline template uses Nextflow's flexible customization options and has a series of configuration files throughout the template.
In the template, the nextflow.config file is a central configuration file and is used to set default values for parameters and other configuration options. The majority of these configuration options are applied by default while others (e.g., software dependency profiles) are included as optional profiles.
There are several configuration files that are stored in the conf folder and are added to the configuration by default or optionally as profiles:
base.config: A 'blank slate' config file, appropriate for general use on most high performance compute environments.
igenomes.config: Defines reference genomes using iGenome paths.
igenomes_ignored.config: An empty genomes dictionary to use when iGenomes is ignored.
modules.config: Additional module directives and arguments.
test.config: A profile to run the pipeline with minimal test data.
test_full.config: A profile to run the pipeline with a full-sized test dataset.
The nextflow_schema.json file stores parameter-related information, including type, description, and help text, in a machine-readable format. The schema is used for various purposes, including automated parameter validation, help text generation, and interactive parameter form rendering in user interfaces.
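To make the idea concrete, the sketch below shows how a schema can drive both validation and help text. The schema fragment is hand-written and heavily simplified: the real nextflow_schema.json groups parameters into sections and carries more metadata, the descriptions are illustrative, and `max_reads` is a hypothetical parameter. The `validate` and `help_text` functions are stand-ins, not the validation code that nf-core pipelines actually use.

```python
import json

# Simplified, hand-written fragment in the spirit of nextflow_schema.json.
# `max_reads` is hypothetical and the descriptions are illustrative.
SCHEMA = json.loads("""
{
  "type": "object",
  "required": ["input", "outdir"],
  "properties": {
    "input":     {"type": "string",  "description": "Path to the samplesheet."},
    "outdir":    {"type": "string",  "description": "Directory for results."},
    "max_reads": {"type": "integer", "description": "Hypothetical read limit."}
  }
}
""")

# Map JSON Schema type names to the Python types that satisfy them.
JSON_TO_PYTHON = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate(params, schema=SCHEMA):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"missing required parameter: --{name}")
    for name, value in params.items():
        spec = schema["properties"].get(name)
        if spec is None:
            errors.append(f"unknown parameter: --{name}")
            continue
        expected = JSON_TO_PYTHON[spec["type"]]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        wrong_type = not isinstance(value, expected) or (
            isinstance(value, bool) and spec["type"] != "boolean"
        )
        if wrong_type:
            errors.append(f"--{name} should be of type {spec['type']}")
    return errors


def help_text(schema=SCHEMA):
    """Render one help line per parameter from the schema metadata."""
    return "\n".join(
        f"--{name:<10} [{spec['type']}] {spec['description']}"
        for name, spec in schema["properties"].items()
    )


print(help_text())
print(validate({"input": "samplesheet.csv", "max_reads": "ten"}))
```

Because the type, description, and required status live in one machine-readable place, the same file can feed validation, help output, and form rendering without duplicating the parameter definitions.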
Automated workflows are an important part of the nf-core pipeline template.
By default, the template comes with several automated tests that use GitHub Actions, each of which is configured in the .github/workflows folder:
branch.yml: Sets the branch protection for the nf-core repository
ci.yml: Runs small pipeline tests with small test datasets
clean-up.yml: Automated testing for stale and closed GitHub issues and PRs in the nf-core repo
download_pipeline.yml: Test a pipeline download with nf-core pipelines download.
fix-linting.yml: Fix linting by adding a comment to a PR
linting_comment.yml: Triggered after the linting action and posts an automated comment to the PR, even if the PR is coming from a fork
linting.yml: Triggered on pushes and PRs to the repository and runs nf-core pipelines lint and markdown lint tests to ensure that the code meets the nf-core guidelines
release-announcements.yml: Automatic release toot and tweet announcements for nf-core pipeline releases
Many of these tests are only configured for the nf-core repo. However, they can be modified for your repository or ignored if they are superfluous to your requirements.