Part 8: Hello nf-core
nf-core is a community effort to develop and maintain a curated set of analysis pipelines built using Nextflow.
nf-core provides a standardized set of best practices, guidelines, and templates for building and sharing scientific pipelines. These pipelines are designed to be modular, scalable, and portable, allowing researchers to easily adapt and execute them using their own data and compute resources.
One of the key benefits of nf-core is that it promotes open development, testing, and peer review, ensuring that the pipelines are robust, well-documented, and validated against real-world datasets. This helps to increase the reliability and reproducibility of scientific analyses and ultimately enables researchers to accelerate their scientific discoveries.
nf-core is published in Nature Biotechnology: Nat Biotechnol 38, 276–278 (2020). An updated preprint is available at bioRxiv.
nf-core pipelines and other components
The nf-core collection currently offers over 100 pipelines in various stages of development, 72 subworkflows and over 1300 modules that you can use to build your own pipelines.
Each released pipeline has a dedicated page that includes 6 documentation sections:
- Introduction: An introduction and overview of the pipeline
- Usage: Descriptions of how to execute the pipeline
- Parameters: Grouped pipeline parameters with descriptions
- Output: Descriptions and examples of the expected output files
- Results: Example output files generated from the full test dataset
- Releases & Statistics: Pipeline version history and statistics
You should read the pipeline documentation carefully to understand what a given pipeline does and how it can be configured before attempting to run it.
Pulling an nf-core pipeline
One really cool aspect of how Nextflow manages pipelines is that you can pull a pipeline from a GitHub repository without cloning the repository. This is really convenient if you just want to run a pipeline without modifying the code.
So if you want to try out an nf-core pipeline with minimal effort, you can start by pulling it using the nextflow pull command.
Tip
You can run this from anywhere, but if you feel like being consistent with previous exercises, you can create a hello-nf-core directory under hello-nextflow. If you were working through Part 7 (Hello nf-test) before this, you may need to go up one level first.
Whenever you're ready, pull the nf-core/demo pipeline by running nextflow pull nf-core/demo.
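A minimal sketch of that pull step, assuming the pipeline is nf-core/demo (the name that appears in the console output just below) and that Nextflow is already installed. The PATH check and the PIPELINE variable are illustrative additions, not part of the tutorial's own command:

```shell
# Pipeline name assumed from the "Checking nf-core/demo ..." output
PIPELINE="nf-core/demo"

if command -v nextflow >/dev/null 2>&1; then
    # Fetch the pipeline into Nextflow's local cache (no git clone needed)
    nextflow pull "$PIPELINE"
else
    echo "nextflow is not on your PATH; install it before pulling $PIPELINE" >&2
fi
```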
Nextflow will pull the pipeline's default GitHub branch.
For nf-core pipelines with a stable release, that will be the master branch.
You can select a specific branch with -r; we'll cover that later.
Checking nf-core/demo ...
downloaded from https://github.com/nf-core/demo.git - revision: 04060b4644 [master]
To be clear, you can do this with any Nextflow pipeline that is appropriately set up in GitHub, not just nf-core pipelines. However, nf-core is the largest open curated collection of Nextflow pipelines.
Tip
One detail that sometimes trips people up is that the pipelines you pull this way are stored in a hidden assets folder under your home directory, $HOME/.nextflow/assets by default.
So you don't actually see them listed in your working directory.
However, you can view a list of your cached pipelines using the nextflow list command.
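Both ways of checking the cache can be sketched as follows. This assumes the default Nextflow layout, where the assets folder sits under $HOME/.nextflow unless the NXF_HOME or NXF_ASSETS environment variables point elsewhere; ASSETS_DIR is just an illustrative variable name:

```shell
# Default cache location; NXF_ASSETS or NXF_HOME override it if set
ASSETS_DIR="${NXF_ASSETS:-${NXF_HOME:-$HOME/.nextflow}/assets}"

if [ -d "$ASSETS_DIR" ]; then
    # Pulled pipelines are stored as <organisation>/<pipeline> subdirectories
    ls "$ASSETS_DIR"
else
    echo "No cached pipelines yet at $ASSETS_DIR"
fi

# Ask Nextflow itself for the list of cached pipelines
if command -v nextflow >/dev/null 2>&1; then
    nextflow list
fi
```

After pulling nf-core/demo, you would expect it to appear in both listings.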
Now that we've got the pipeline pulled, we can try running it!
Trying out an nf-core pipeline with the test profile
Conveniently, every nf-core pipeline comes with a test profile.
This is a minimal set of configuration settings for the pipeline to run using a small test dataset that is hosted on the nf-core/test-datasets repository. It's a great way to try out a pipeline at small scale.
The test profile for nf-core/demo already specifies the input parameter (a samplesheet URL from nf-core/test-datasets, which you can see under Input/output options in the console output further down), so you don't have to provide any input yourself.
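To look at the full profile yourself, one option is to print it from the cached copy of the pipeline. This sketch assumes the pipeline has already been pulled into the default assets folder and that the profile lives in conf/test.config, which is the usual location in the nf-core template; check the repository if the file is not there:

```shell
# Default cache location; NXF_ASSETS or NXF_HOME override it if set
ASSETS_DIR="${NXF_ASSETS:-${NXF_HOME:-$HOME/.nextflow}/assets}"

# Assumed location of the test profile inside the pulled pipeline
TEST_CONFIG="$ASSETS_DIR/nf-core/demo/conf/test.config"

if [ -f "$TEST_CONFIG" ]; then
    cat "$TEST_CONFIG"
else
    echo "Could not find $TEST_CONFIG; pull the pipeline first" >&2
fi
```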
However, the outdir parameter is not included in the test profile, so you have to add it to the execution command using the --outdir flag.
Here, we're also going to specify -profile docker, which by nf-core convention enables the use of Docker.
Let's try it!
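Putting those pieces together, the run command would look roughly like this. The profile list (docker,test) and the output directory (results) are taken from the console output shown further down; the variable names and the check for nextflow and docker are illustrative additions:

```shell
PIPELINE="nf-core/demo"
PROFILES="docker,test"   # Docker for software, test for the bundled inputs
OUTDIR="results"         # not set by the test profile, so we supply it

if command -v nextflow >/dev/null 2>&1 && command -v docker >/dev/null 2>&1; then
    # Single dash for Nextflow options (-profile), double dash for pipeline parameters (--outdir)
    nextflow run "$PIPELINE" -profile "$PROFILES" --outdir "$OUTDIR"
else
    echo "This run needs both nextflow and docker on your PATH" >&2
fi
```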
Changing Nextflow version
Depending on the Nextflow version you have installed, this command might fail due to a version mismatch.
If that happens, you can temporarily run the pipeline with a different version than you have installed by adding NXF_VER=<version> to the start of your command, for example NXF_VER=24.09.2-edge to match the version shown in the console output below.
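As a sketch, assuming you want the 24.09.2-edge release that appears in the console output below (substitute whichever version the error message asks for; PINNED_VERSION is just an illustrative variable name):

```shell
# Version to use for this one invocation only (assumed from the console output)
PINNED_VERSION="24.09.2-edge"

if command -v nextflow >/dev/null 2>&1 && command -v docker >/dev/null 2>&1; then
    # Prefixing the command sets NXF_VER for this run without changing your installation
    NXF_VER="$PINNED_VERSION" nextflow run nf-core/demo -profile docker,test --outdir results
else
    echo "This run needs both nextflow and docker on your PATH" >&2
fi
```

Because the variable is set only as a prefix to the command, later runs go back to your installed version.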
Here's the console output from the pipeline:
N E X T F L O W ~ version 24.09.2-edge
Launching `https://github.com/nf-core/demo` [naughty_bell] DSL2 - revision: 04060b4644 [master]
------------------------------------------------------
,--./,-.
___ __ __ __ ___ /,-._.--~'
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
nf-core/demo 1.0.1
------------------------------------------------------
Input/output options
input : https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/samplesheet/samplesheet_test_illumina_amplicon.csv
outdir : results
Institutional config options
config_profile_name : Test profile
config_profile_description: Minimal test dataset to check pipeline function
Core Nextflow options
revision : master
runName : naughty_bell
containerEngine : docker
launchDir : /workspace/gitpod/hello-nextflow
workDir : /workspace/gitpod/hello-nextflow/work
projectDir : /home/gitpod/.nextflow/assets/nf-core/demo
userName : gitpod
profile : docker,test
configFiles :
!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------* The pipeline
https://doi.org/10.5281/zenodo.12192442
* The nf-core framework
https://doi.org/10.1038/s41587-020-0439-x
* Software dependencies
https://github.com/nf-core/demo/blob/master/CITATIONS.md
executor > local (7)
[0a/e694d8] NFCORE_DEMO:DEMO:FASTQC (SAMPLE3_SE) [100%] 3 of 3 ✔
[85/4198c1] NFCORE_DEMO:DEMO:SEQTK_TRIM (SAMPLE1_PE) [100%] 3 of 3 ✔
[d8/fe153e] NFCORE_DEMO:DEMO:MULTIQC [100%] 1 of 1 ✔
-[nf-core/demo] Pipeline completed successfully-
Completed at: 28-Oct-2024 03:24:58
Duration : 1m 13s
CPU hours : (a few seconds)
Succeeded : 7
Isn't that neat?
You can also explore the results directory produced by the pipeline.
results
├── fastqc
│   ├── SAMPLE1_PE
│   ├── SAMPLE2_PE
│   └── SAMPLE3_SE
├── fq
│   ├── SAMPLE1_PE
│   ├── SAMPLE2_PE
│   └── SAMPLE3_SE
├── multiqc
│   ├── multiqc_data
│   ├── multiqc_plots
│   └── multiqc_report.html
└── pipeline_info
    ├── execution_report_2024-10-28_03-23-44.html
    ├── execution_timeline_2024-10-28_03-23-44.html
    ├── execution_trace_2024-10-28_03-14-32.txt
    ├── execution_trace_2024-10-28_03-19-33.txt
    ├── execution_trace_2024-10-28_03-20-57.txt
    ├── execution_trace_2024-10-28_03-22-39.txt
    ├── execution_trace_2024-10-28_03-23-44.txt
    ├── nf_core_pipeline_software_mqc_versions.yml
    ├── params_2024-10-28_03-23-49.json
    └── pipeline_dag_2024-10-28_03-23-44.html
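If you want a comparable two-level listing from your own run, one option that does not depend on the tree utility is find. This assumes you passed --outdir results and are in the directory you launched the pipeline from; RESULTS_DIR is an illustrative variable name:

```shell
RESULTS_DIR="results"

if [ -d "$RESULTS_DIR" ]; then
    # Show the output directory and its contents down to two levels, in a stable order
    find "$RESULTS_DIR" -maxdepth 2 | sort
else
    echo "No $RESULTS_DIR directory here; run the pipeline first" >&2
fi
```

Your timestamps in pipeline_info will differ from the ones shown above.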
If you're curious about what that all means, check out the nf-core/demo pipeline documentation page!
And that's all you need to know for now. Congratulations! You have now run your first nf-core pipeline.
Takeaway
You have a general idea of what nf-core offers and you know how to run an nf-core pipeline using its built-in test profile.
What's next?
Celebrate and take another break! Next, we'll show you how to take advantage of Seqera Platform to launch and monitor your workflows more conveniently and efficiently on any compute infrastructure.