The definition of module libraries simplifies the writing of complex data analysis pipelines and makes re-use of processes much easier.
Using the `hello.nf` example from earlier, we will convert the pipeline’s processes into modules, then call them within the workflow scope in a variety of ways.
Nextflow DSL2 allows for the definition of stand-alone module scripts that can be included and shared across multiple workflows. Each module can contain its own `process` or `workflow` definition.
9.1.1 Importing modules
Components defined in the module script can be imported into other Nextflow scripts using the `include` statement. This allows you to store these components in one or more separate files so that they can be re-used in multiple workflows.
Using the `hello.nf` example, we can achieve this by:
- Creating a file called `modules.nf` in the top-level directory.
- Copying and pasting the two process definitions for `SPLITLETTERS` and `CONVERTTOUPPER` into `modules.nf`.
- Removing the `process` definitions in the `hello.nf` script.
- Importing the processes from `modules.nf` within the `hello.nf` script anywhere above the `workflow` definition.
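The last step can be sketched as follows, assuming `modules.nf` sits in the same directory as `hello.nf`:

```nextflow
// Import each process from the module script.
// The './' path is resolved relative to the including script (assumed: hello.nf).
include { SPLITLETTERS   } from './modules.nf'
include { CONVERTTOUPPER } from './modules.nf'
```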
In general, you would use relative paths to define the location of the module scripts using the `./` prefix.
Create a `modules.nf` file with the previously defined processes from `hello.nf`. Then remove these processes from `hello.nf` and add the `include` definitions described above.
The `hello.nf` script should look similar to this:
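A minimal sketch of the modularized `hello.nf`, assuming the default greeting is `Hello world!` and that the workflow body matches the earlier example:

```nextflow
#!/usr/bin/env nextflow

// Default greeting (value assumed for illustration)
params.greeting = 'Hello world!'

// The process definitions now live in the module script
include { SPLITLETTERS   } from './modules.nf'
include { CONVERTTOUPPER } from './modules.nf'

workflow {
    greeting_ch = Channel.of(params.greeting)
    letters_ch  = SPLITLETTERS(greeting_ch)
    results_ch  = CONVERTTOUPPER(letters_ch.flatten())
    results_ch.view { it }
}
```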
You should have the following in the file `modules.nf`:
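A sketch of what the module script might contain. The process bodies are assumptions based on the behaviour described in this section (splitting the greeting into chunks, then upper-casing each chunk), not a verbatim copy of the original processes:

```nextflow
// modules.nf (sketch)

// Split the input string into files of 6 bytes each (chunk size assumed)
process SPLITLETTERS {
    input:
    val x

    output:
    path 'chunk_*'

    script:
    """
    printf '$x' | split -b 6 - chunk_
    """
}

// Print the content of each chunk file in upper case
process CONVERTTOUPPER {
    input:
    path y

    output:
    stdout

    script:
    """
    cat $y | tr '[a-z]' '[A-Z]'
    """
}
```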
We now have modularized processes, which makes the code reusable.
9.1.2 Multiple imports
If a Nextflow module script contains multiple `process` definitions, they can also be imported using a single `include` statement that lists the component names inside the braces, separated by semicolons.
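For instance, the two processes used in this section could be imported in one statement, assuming both are defined in `./modules.nf`:

```nextflow
// One include statement, two components separated by a semicolon
include { SPLITLETTERS; CONVERTTOUPPER } from './modules.nf'
```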
9.1.3 Module aliases
When including a module component it is possible to specify a name alias using the `as` declaration. This allows the inclusion and the invocation of the same component multiple times using different names:
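A minimal sketch of aliasing, using the alias names that appear in the run log in this section. The greeting value and the channel variable names are assumptions:

```nextflow
#!/usr/bin/env nextflow

params.greeting = 'Hello world!'   // assumed default

// Each process is included twice under a different alias
include { SPLITLETTERS as SPLITLETTERS_one } from './modules.nf'
include { SPLITLETTERS as SPLITLETTERS_two } from './modules.nf'

include { CONVERTTOUPPER as CONVERTTOUPPER_one } from './modules.nf'
include { CONVERTTOUPPER as CONVERTTOUPPER_two } from './modules.nf'

workflow {
    greeting_ch = Channel.of(params.greeting)

    // First invocation chain
    letters_ch1 = SPLITLETTERS_one(greeting_ch)
    results_ch1 = CONVERTTOUPPER_one(letters_ch1.flatten())
    results_ch1.view { it }

    // Second invocation chain, same processes under different names
    letters_ch2 = SPLITLETTERS_two(greeting_ch)
    results_ch2 = CONVERTTOUPPER_two(letters_ch2.flatten())
    results_ch2.view { it }
}
```

Without the aliases, calling the same process twice in one workflow would not be possible, since each process name can be invoked only once per workflow.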
Save the previous snippet as `hello.2.nf`, and try to guess what will be shown on the screen.
The `hello.2.nf` output should look something like this:
N E X T F L O W  ~  version 22.04.3
Launching `hello.2.nf` [goofy_goldstine] DSL2 - revision: 449cf82eaf
executor >  local (6)
[e1/5e6523] process > SPLITLETTERS_one (1)   [100%] 1 of 1 ✔
[14/b77deb] process > CONVERTTOUPPER_one (1) [100%] 2 of 2 ✔
[c0/115bd6] process > SPLITLETTERS_two (1)   [100%] 1 of 1 ✔
[09/f9072d] process > CONVERTTOUPPER_two (2) [100%] 2 of 2 ✔
WORLD!
HELLO
WORLD!
HELLO
You can store each process in separate files within separate sub-folders or combined in one big file (both are valid). You can find examples of this on public repos such as the Seqera RNA-Seq tutorial or within nf-core pipelines, such as nf-core/rnaseq.
9.2 Output definition
Nextflow allows the use of alternative output definitions within workflows to simplify your code.
In the previous basic example (`hello.nf`), we defined the channel names to specify the input to the next process:
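A sketch of that workflow with explicitly named channels, assuming the processes are included from `./modules.nf` and that `params.greeting` is set earlier in the script:

```nextflow
workflow {
    // Every intermediate result is bound to a named channel
    greeting_ch = Channel.of(params.greeting)
    letters_ch  = SPLITLETTERS(greeting_ch)
    results_ch  = CONVERTTOUPPER(letters_ch.flatten())
    results_ch.view { it }
}
```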
We have moved the `greeting_ch` into the workflow scope for this exercise.
We can also explicitly pass the output of one process to another using the `.out` attribute, removing the channel definitions completely:
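The same workflow can be sketched with `.out` in place of the intermediate channel names (same assumptions as before: processes included from `./modules.nf`, `params.greeting` defined):

```nextflow
workflow {
    greeting_ch = Channel.of(params.greeting)
    SPLITLETTERS(greeting_ch)
    // The output of SPLITLETTERS is read from its .out attribute
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())
    CONVERTTOUPPER.out.view()
}
```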
If a process defines two or more output channels, each channel can be accessed by indexing the `.out` attribute, e.g., `.out[0]`, `.out[1]`, etc. In our example we only have the first output, `.out[0]`.
Alternatively, the process `output` definition allows the use of the `emit` statement to define a named identifier that can be used to reference the channel in the external scope.
For example, try adding the `emit` statement on the `CONVERTTOUPPER` process in your `modules.nf` file:
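A sketch of the modified process. The identifier `upper` is an assumed name chosen for illustration, and the process body is the same assumed body used elsewhere in this section:

```nextflow
process CONVERTTOUPPER {
    input:
    path y

    output:
    // 'upper' (assumed name) becomes the identifier for this output channel
    stdout emit: upper

    script:
    """
    cat $y | tr '[a-z]' '[A-Z]'
    """
}
```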
Then change the workflow scope in `hello.nf` to call this specific named output (notice the output name added after `.out`):
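Assuming the output was named `upper` (a hypothetical identifier), the workflow might then read:

```nextflow
workflow {
    greeting_ch = Channel.of(params.greeting)
    SPLITLETTERS(greeting_ch)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())
    // Access the output by its emit name instead of by index
    CONVERTTOUPPER.out.upper.view { it }
}
```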
9.2.1 Using piped outputs
Another way to deal with outputs in the workflow scope is to use pipes (`|`).
Try changing the workflow script to the snippet below:
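A minimal sketch of the piped form, assuming the same two processes and `params.greeting`:

```nextflow
workflow {
    // Each stage receives the output channel of the stage to its left
    Channel.of(params.greeting) | SPLITLETTERS | flatten | CONVERTTOUPPER | view
}
```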
Here we use a pipe, which passes the output as a channel to the next process without the need to apply `.out` to the process name.
9.3 Workflow definition
The `workflow` scope allows the definition of components that define the invocation of one or more processes or operators:
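A sketch of a named workflow invoked from the unnamed entry workflow. It assumes `CONVERTTOUPPER` declares a named output called `upper` (a hypothetical identifier):

```nextflow
#!/usr/bin/env nextflow

params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS   } from './modules.nf'
include { CONVERTTOUPPER } from './modules.nf'

// A named workflow component
workflow my_pipeline {
    greeting_ch = Channel.of(params.greeting)
    SPLITLETTERS(greeting_ch)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())
    CONVERTTOUPPER.out.upper.view { it }
}

// The unnamed workflow is the entry point and invokes the named one
workflow {
    my_pipeline()
}
```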
For example, the snippet above defines a `workflow` named `my_pipeline` that can be invoked from another `workflow` definition.
Make sure that your `modules.nf` file is the one containing the `emit` on the `CONVERTTOUPPER` process.
A workflow component can access any variable or parameter defined in the outer scope. In the running example, we can also access `params.greeting` directly within the `workflow` definition.
9.3.1 Workflow inputs
A `workflow` component can declare one or more input channels using the `take` statement. For example:
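A sketch of `my_pipeline` with a declared input. The input name `greeting` and the output name `upper` are assumptions:

```nextflow
workflow my_pipeline {
    take:
    greeting            // input channel supplied by the caller

    main:
    SPLITLETTERS(greeting)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())
    CONVERTTOUPPER.out.upper.view { it }
}
```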
When the `take` statement is used, the `workflow` definition needs to be declared within the `main` block.
The input for the `workflow` can then be specified as an argument:
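For example, assuming `my_pipeline` declares a single input in its `take` block, the entry workflow could pass a channel to it like this:

```nextflow
workflow {
    // The channel is bound to the input declared under take:
    my_pipeline(Channel.of(params.greeting))
}
```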
9.3.2 Workflow outputs
A `workflow` can declare one or more output channels using the `emit` statement. For example:
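A sketch of a workflow that emits one channel, again assuming the hypothetical `upper` output on `CONVERTTOUPPER`:

```nextflow
workflow my_pipeline {
    take:
    greeting

    main:
    SPLITLETTERS(greeting)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())

    emit:
    CONVERTTOUPPER.out.upper   // exposed to the caller as my_pipeline.out
}

workflow {
    my_pipeline(Channel.of(params.greeting))
    my_pipeline.out.view()
}
```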
As a result, we can use the `my_pipeline.out` notation to access the outputs of `my_pipeline` in the invoking `workflow`.
We can also declare named outputs within the `emit` block.
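A sketch of a named workflow output. The name `my_data` is a hypothetical identifier, as is the `upper` output on the process:

```nextflow
workflow my_pipeline {
    take:
    greeting

    main:
    SPLITLETTERS(greeting)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())

    emit:
    my_data = CONVERTTOUPPER.out.upper   // named output (name assumed)
}

workflow {
    my_pipeline(Channel.of(params.greeting))
    my_pipeline.out.my_data.view()
}
```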
The result of the above snippet can then be accessed using `my_pipeline.out.<name>`, where `<name>` is the identifier assigned in the `emit` block.
9.3.3 Calling named workflows
Within a `main.nf` script (called `hello.nf` in our example) we can also have multiple workflows, in which case we may want to call a specific workflow when running the code. For this we use the entrypoint option `-entry <workflow_name>`.
The following snippet has two named workflows:
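A sketch with two named workflows. The names `my_pipeline_one` and `my_pipeline_two` are hypothetical, and the `upper` output is assumed as before:

```nextflow
#!/usr/bin/env nextflow

params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS as SPLITLETTERS_one } from './modules.nf'
include { SPLITLETTERS as SPLITLETTERS_two } from './modules.nf'

include { CONVERTTOUPPER as CONVERTTOUPPER_one } from './modules.nf'
include { CONVERTTOUPPER as CONVERTTOUPPER_two } from './modules.nf'

workflow my_pipeline_one {
    letters_ch1 = SPLITLETTERS_one(Channel.of(params.greeting))
    results_ch1 = CONVERTTOUPPER_one(letters_ch1.flatten())
    results_ch1.view { it }
}

workflow my_pipeline_two {
    letters_ch2 = SPLITLETTERS_two(Channel.of(params.greeting))
    results_ch2 = CONVERTTOUPPER_two(letters_ch2.flatten())
    results_ch2.view { it }
}

// No unnamed workflow is defined here, so the one to run is selected
// on the command line, e.g.:
//   nextflow run hello.nf -entry my_pipeline_one
```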
You can choose which workflow to run by using the `-entry` flag, for example `nextflow run hello.nf -entry <workflow_name>`.
9.3.4 Parameter scopes
A module script can define one or more parameters or custom functions using the same syntax as with any other Nextflow script. Consider a minimal pair of scripts: a module script (`modules.nf`) and a main script (`hello.nf`).
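A sketch of the two scripts. The function name `SAYHELLO` and the parameter names `foo` and `bar` are assumptions, chosen so that the values match the outputs quoted in this section. Module script:

```nextflow
// modules.nf (sketch): module-level defaults and a custom function
params.foo = 'Hello'
params.bar = 'world!'

def SAYHELLO() {
    println "${params.foo} ${params.bar}"
}
```

Main script:

```nextflow
#!/usr/bin/env nextflow

// Defined before the include, so the module inherits these values
params.foo = 'Hola'
params.bar = 'mundo!'

include { SAYHELLO } from './modules.nf'

workflow {
    SAYHELLO()
}
```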
Running `hello.nf` should print `Hola mundo!` instead of `Hello world!`, because the module inherits the parameters defined in the including script before the `include` statement, and these take precedence over the values assigned in the module script.
To avoid being ignored, pipeline parameters should be defined at the beginning of the script before any `include` declarations.
The `addParams` option can be used to extend the module parameters without affecting the external scope. A main script that passes a value this way should print the value given to `addParams` in place of the inherited one.
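A sketch of a main script using `addParams`, assuming a module script `modules.nf` that defines a function `SAYHELLO` printing `params.foo` and `params.bar` (hypothetical names). The override value `Olá` is likewise chosen for illustration:

```nextflow
#!/usr/bin/env nextflow

params.foo = 'Hola'
params.bar = 'mundo!'

// Override 'foo' for this module only; params.foo in this script is unchanged
include { SAYHELLO } from './modules.nf' addParams(foo: 'Olá')

workflow {
    SAYHELLO()
}
```

Under these assumptions the module sees `foo` as `Olá` and `bar` as `mundo!`, so the script should print `Olá mundo!`.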
9.4 DSL2 migration notes
To view a summary of the changes introduced when Nextflow migrated from DSL1 to DSL2 please refer to the DSL2 migration notes in the official Nextflow documentation.