9. Modularization
The definition of module libraries simplifies the writing of complex data analysis workflows and makes re-use of processes much easier.
Using the hello.nf example from earlier, you can convert the workflow's processes into modules, then call them within the workflow scope.
9.1 Modules
Nextflow DSL2 allows for the definition of stand-alone module scripts that can be included and shared across multiple workflows. Each module can contain its own process or workflow definition.
9.1.1 Importing modules
Components defined in the module script can be imported into other Nextflow scripts using the include statement. This allows you to store these components in one or more files so that they can be re-used in multiple workflows.
Using the hello.nf example, you can achieve this by:
- Creating a file called modules.nf in the top-level directory.
- Copying and pasting the two process definitions for SPLITLETTERS and CONVERTTOUPPER into modules.nf.
- Removing the process definitions in the hello.nf script.
- Importing the processes from modules.nf within the hello.nf script anywhere above the workflow definition:
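A minimal sketch of the import step for hello.nf, assuming modules.nf sits in the same directory as hello.nf and already contains both process definitions:

```nextflow
// hello.nf (excerpt): place these lines anywhere above the workflow definition.
// Assumption: modules.nf lives next to hello.nf, hence the ./ prefix.
include { SPLITLETTERS   } from './modules.nf'
include { CONVERTTOUPPER } from './modules.nf'
```

Each include statement names one component inside the braces and the module file it comes from.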
Note
In general, you would use relative paths to define the location of the module scripts using the ./ prefix.
Exercise
Create a modules.nf file with the SPLITLETTERS and CONVERTTOUPPER processes from hello.nf. Then remove these processes from hello.nf and include them in the workflow using the include definitions shown above.
Solution
The hello.nf script should look similar to this:
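A possible version of the script, sketched under the assumption that the greeting parameter defaults to 'Hello world!' (consistent with the HELLO / WORLD! output shown later in this section) and that the channel is created inside the workflow block:

```nextflow
// hello.nf: process definitions removed, modules imported instead.
// Assumption: the default greeting is 'Hello world!'.
params.greeting = 'Hello world!'

include { SPLITLETTERS   } from './modules.nf'
include { CONVERTTOUPPER } from './modules.nf'

workflow {
    greeting_ch = Channel.of(params.greeting)

    // Split the greeting into chunk files, then upper-case each chunk
    letters_ch = SPLITLETTERS(greeting_ch)
    results_ch = CONVERTTOUPPER(letters_ch.flatten())

    results_ch.view()
}
```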
Your ./modules.nf file should look similar to this:
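A sketch of the module file follows. The process bodies here are an assumption reconstructed from what the process names and the sample output suggest (split the greeting into 6-byte chunks, then upper-case each chunk); use the bodies from your own hello.nf if they differ.

```nextflow
// modules.nf: holds the two processes moved out of hello.nf.
// Assumption: the script bodies below mirror the earlier hello.nf example.

process SPLITLETTERS {
    input:
    val x

    output:
    path 'chunk_*'

    script:
    """
    printf '$x' | split -b 6 - chunk_
    """
}

process CONVERTTOUPPER {
    input:
    path y

    output:
    stdout

    script:
    """
    cat $y | tr '[a-z]' '[A-Z]'
    """
}
```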
9.1.2 Multiple imports
If a Nextflow module script contains multiple process definitions, they can also be imported using a single include statement, as shown in the example below:
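For hello.nf, a single-statement import could look like the sketch below, assuming both processes are defined in ./modules.nf:

```nextflow
// hello.nf (excerpt): one include statement, components separated by semicolons.
include { SPLITLETTERS; CONVERTTOUPPER } from './modules.nf'
```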
9.1.3 Module aliases
When including a module component, it is possible to specify a name alias using the as declaration. This allows the inclusion and the invocation of the same component multiple times using different names:
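A sketch of how aliasing might look in hello.nf, assuming the same ./modules.nf file and a default greeting of 'Hello world!'; the alias names match those in the run log in this section:

```nextflow
// hello.nf: each process is imported twice under different aliases.
// Assumption: params.greeting defaults to 'Hello world!'.
params.greeting = 'Hello world!'

include { SPLITLETTERS as SPLITLETTERS_one } from './modules.nf'
include { SPLITLETTERS as SPLITLETTERS_two } from './modules.nf'

include { CONVERTTOUPPER as CONVERTTOUPPER_one } from './modules.nf'
include { CONVERTTOUPPER as CONVERTTOUPPER_two } from './modules.nf'

workflow {
    greeting_ch = Channel.of(params.greeting)

    // First pass, using the *_one aliases
    letters_ch1 = SPLITLETTERS_one(greeting_ch)
    results_ch1 = CONVERTTOUPPER_one(letters_ch1.flatten())
    results_ch1.view()

    // Second pass, using the *_two aliases
    letters_ch2 = SPLITLETTERS_two(greeting_ch)
    results_ch2 = CONVERTTOUPPER_two(letters_ch2.flatten())
    results_ch2.view()
}
```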
Note how the SPLITLETTERS and CONVERTTOUPPER processes are imported twice, each time with a different alias, and how these aliases are used to invoke the processes:
N E X T F L O W ~ version 23.10.1
Launching `hello.nf` [crazy_shirley] DSL2 - revision: 99f6b6e40e
executor > local (6)
[2b/ec0395] process > SPLITLETTERS_one (1) [100%] 1 of 1 ✔
[d7/be3b77] process > CONVERTTOUPPER_one (1) [100%] 2 of 2 ✔
[04/9ffc05] process > SPLITLETTERS_two (1) [100%] 1 of 1 ✔
[d9/91b029] process > CONVERTTOUPPER_two (2) [100%] 2 of 2 ✔
WORLD!
HELLO
HELLO
WORLD!
Tip
You can store each process in separate files within separate sub-folders or combined in one big file (both are valid). You can find examples of this on public repos such as the Seqera RNA-Seq tutorial or within nf-core workflows, such as nf-core/rnaseq.
9.1.4 Output definition
Nextflow allows the use of alternative output definitions within workflows to simplify your code.
In the previous example (hello.nf), you defined the channel names to specify the input to the next process:
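As a reminder, the named-channel style can be sketched as follows, assuming the include lines from the earlier steps and a default greeting of 'Hello world!':

```nextflow
// hello.nf: every process output is captured in an explicitly named channel.
params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS; CONVERTTOUPPER } from './modules.nf'

workflow {
    greeting_ch = Channel.of(params.greeting)
    letters_ch  = SPLITLETTERS(greeting_ch)
    results_ch  = CONVERTTOUPPER(letters_ch.flatten())
    results_ch.view()
}
```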
You can also pass the output of one process directly to the next using the .out attribute of the process, removing the intermediate channel definitions completely:
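A sketch of the same workflow written with the .out attribute, under the same assumptions (processes imported from ./modules.nf, default greeting 'Hello world!'):

```nextflow
// hello.nf: no intermediate channel names; outputs are read via <PROCESS>.out
params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS; CONVERTTOUPPER } from './modules.nf'

workflow {
    greeting_ch = Channel.of(params.greeting)
    SPLITLETTERS(greeting_ch)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())
    CONVERTTOUPPER.out.view()
}
```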
If a process defines two or more output channels, each channel can be accessed by indexing the .out attribute, e.g., .out[0], .out[1], etc. In the example below, the first output (index [0]) is shown:
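A sketch of indexed access, assuming the same module file as before. CONVERTTOUPPER declares a single output here, so [0] is the only valid index in this particular example:

```nextflow
// hello.nf: access a process output by position.
params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS; CONVERTTOUPPER } from './modules.nf'

workflow {
    greeting_ch = Channel.of(params.greeting)
    SPLITLETTERS(greeting_ch)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())

    // [0] selects the first declared output channel of CONVERTTOUPPER
    CONVERTTOUPPER.out[0].view()
}
```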
Alternatively, the process output definition allows the use of the emit statement to define a named identifier that can be used to reference the channel in the external scope.
In the example below, an emit statement has been added to the CONVERTTOUPPER process and is then used in the workflow definition:
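A sketch of the named-output variant follows. The identifier upper is a name chosen for this illustration, and the script body is assumed to match the CONVERTTOUPPER process from the earlier hello.nf example:

```nextflow
// modules.nf (excerpt): the stdout output is given the name "upper".
// Assumptions: "upper" is an illustrative name; the script body mirrors
// the earlier example.
process CONVERTTOUPPER {
    input:
    path y

    output:
    stdout emit: upper

    script:
    """
    cat $y | tr '[a-z]' '[A-Z]'
    """
}
```

The workflow in hello.nf can then refer to the output by name instead of by index:

```nextflow
// hello.nf (excerpt)
workflow {
    greeting_ch = Channel.of(params.greeting)
    SPLITLETTERS(greeting_ch)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())
    CONVERTTOUPPER.out.upper.view()
}
```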
9.1.5 Using piped outputs
Another way to deal with outputs in the workflow scope is to use pipes (|).
Exercise
Try changing the workflow script to the snippet below:
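One way the piped workflow might be written, assuming the processes are imported from ./modules.nf and the default greeting is 'Hello world!':

```nextflow
// hello.nf: processes and operators chained with the pipe operator.
params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS; CONVERTTOUPPER } from './modules.nf'

workflow {
    // Each stage receives the output channel of the stage to its left
    Channel.of(params.greeting) | SPLITLETTERS | flatten | CONVERTTOUPPER | view
}
```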
Here, a pipe passes the output as a channel to the next process without the need to apply .out to the process name.
Summary
In this step you have learned:
- How to import modules
- How to import multiple modules
- How to use module aliases
- How to use alternative output definitions
- How to use piped outputs
9.2 Workflow definition
The workflow scope allows the definition of components that define the invocation of one or more processes or operators:
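A sketch of a named workflow and its invocation, assuming the CONVERTTOUPPER process in ./modules.nf declares a named output called upper (an illustrative name) and the default greeting is 'Hello world!':

```nextflow
// hello.nf: a named workflow invoked from the unnamed entry workflow.
params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS; CONVERTTOUPPER } from './modules.nf'

workflow my_workflow {
    greeting_ch = Channel.of(params.greeting)
    SPLITLETTERS(greeting_ch)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())
    // Assumes "stdout emit: upper" in the CONVERTTOUPPER output block
    CONVERTTOUPPER.out.upper.view()
}

workflow {
    my_workflow()
}
```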
For example, the snippet above defines a workflow named my_workflow that is invoked via another workflow definition.
Note
Make sure that your modules.nf file is the one containing the emit on the CONVERTTOUPPER process.
Warning
A workflow component can access any variable or parameter defined in the outer scope. In the running example, you can also access params.greeting directly within the workflow definition.
9.2.1 Workflow inputs
A workflow component can declare one or more input channels using the take statement. For example:
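A sketch of a workflow that takes one input channel. The input name greeting and the named output upper are illustrative assumptions:

```nextflow
// hello.nf (excerpt): my_workflow receives its input through take:
workflow my_workflow {
    take:
    greeting        // input channel supplied by the caller

    main:
    SPLITLETTERS(greeting)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())
    // Assumes "stdout emit: upper" in the CONVERTTOUPPER output block
    CONVERTTOUPPER.out.upper.view()
}
```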
Note
When the take statement is used, the body of the workflow needs to be declared within the main block.
The input for the workflow can then be specified as an argument:
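A sketch of the call site, assuming my_workflow declares a single take: input and that params.greeting is defined at the top of the script:

```nextflow
// hello.nf (excerpt): the channel is passed to the named workflow as an argument.
workflow {
    my_workflow(Channel.of(params.greeting))
}
```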
9.2.2 Workflow outputs
A workflow can declare one or more output channels using the emit statement. For example:
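A sketch of a workflow with one unnamed output, under the same assumptions as before (processes imported from ./modules.nf, a named upper output on CONVERTTOUPPER, default greeting 'Hello world!'):

```nextflow
// hello.nf: my_workflow emits a channel that the caller reads via .out
params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS; CONVERTTOUPPER } from './modules.nf'

workflow my_workflow {
    take:
    greeting

    main:
    SPLITLETTERS(greeting)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())

    emit:
    CONVERTTOUPPER.out.upper   // assumes "stdout emit: upper" in the module
}

workflow {
    my_workflow(Channel.of(params.greeting))
    my_workflow.out.view()
}
```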
As a result, you can use the my_workflow.out notation to access the outputs of my_workflow in the invoking workflow.
You can also declare named outputs within the emit block.
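A sketch of a named workflow output, using the name my_data and assuming the same module file and upper output as in the earlier steps:

```nextflow
// hello.nf: the emitted channel is given the name my_data.
params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS; CONVERTTOUPPER } from './modules.nf'

workflow my_workflow {
    take:
    greeting

    main:
    SPLITLETTERS(greeting)
    CONVERTTOUPPER(SPLITLETTERS.out.flatten())

    emit:
    my_data = CONVERTTOUPPER.out.upper   // assumes "stdout emit: upper"
}

workflow {
    my_workflow(Channel.of(params.greeting))
    my_workflow.out.my_data.view()
}
```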
The result of the above snippet can then be accessed using my_workflow.out.my_data.
9.2.3 Calling named workflows
Within a main.nf script (called hello.nf in our example) you can also have multiple workflows, in which case you may want to call a specific workflow when running the code. For this, you can use the entrypoint option -entry <workflow_name>.
The following snippet has two named workflows (my_workflow_one and my_workflow_two):
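A sketch of such a script, reusing the aliases from the module alias step and assuming the same ./modules.nf file and default greeting:

```nextflow
// hello.nf: two named workflows plus an unnamed default entry workflow.
params.greeting = 'Hello world!'   // assumed default

include { SPLITLETTERS as SPLITLETTERS_one; SPLITLETTERS as SPLITLETTERS_two } from './modules.nf'
include { CONVERTTOUPPER as CONVERTTOUPPER_one; CONVERTTOUPPER as CONVERTTOUPPER_two } from './modules.nf'

workflow my_workflow_one {
    letters_ch1 = SPLITLETTERS_one(Channel.of(params.greeting))
    results_ch1 = CONVERTTOUPPER_one(letters_ch1.flatten())
    results_ch1.view()
}

workflow my_workflow_two {
    letters_ch2 = SPLITLETTERS_two(Channel.of(params.greeting))
    results_ch2 = CONVERTTOUPPER_two(letters_ch2.flatten())
    results_ch2.view()
}

// Runs when no -entry option is given
workflow {
    my_workflow_one()
    my_workflow_two()
}
```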
You can choose which workflow to run by using the -entry flag, for example: nextflow run hello.nf -entry my_workflow_one
Summary
In this step you have learned:
- How to define workflow inputs
- How to define workflow outputs
- How to use named workflows
9.3 DSL2 migration notes
To view a summary of the changes introduced when Nextflow migrated from DSL1 to DSL2 please refer to the DSL2 migration notes in the official Nextflow documentation.