10. Nextflow configuration
A key Nextflow feature is the ability to decouple the workflow implementation from the configuration settings required by the underlying execution platform.
This enables portable deployment without the need to modify the application code.
10.1 Configuration file
When a pipeline script is launched, Nextflow looks for a file named nextflow.config in the current directory and in the script base directory (if it is not the same as the current directory). Finally, it checks for the file:
When more than one of the above files exists, they are merged, so that the settings in the first override the same settings that may appear in the second, and so on.
The default config file search mechanism can be extended by providing an extra configuration file with the command line option -c <config file>.
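The merge behaviour described above can be sketched with two hypothetical files; the settings and values are illustrative assumptions, not taken from a real pipeline. Suppose the nextflow.config in the current directory, which is found first, contains:

```groovy
// nextflow.config in the current directory (hypothetical content)
process.cpus = 2
```

and the nextflow.config in the script base directory, which is found second, contains:

```groovy
// nextflow.config in the script base directory (hypothetical content)
process.cpus = 1          // overridden by the value in the first file
process.memory = '4 GB'   // kept, because the first file does not set it
```

Under these assumptions the merged configuration would use 2 cpus and 4 GB of memory: the first file wins where both define a setting, and the second only fills in what the first leaves unset.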
10.1.1 Config syntax
A Nextflow configuration file is a simple text file containing a set of properties defined using the syntax:
Please note that string values need to be wrapped in quotation characters, while numbers and boolean values (true, false) do not. Also note that values are typed: for example, 1 is different from '1', since the first is interpreted as the number one, while the latter is interpreted as a string value.
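As a minimal sketch of the typing rules above, each property is written as a name, an equals sign, and a value. The property names below are hypothetical and exist only to show the value types.

```groovy
// General form: name = value
sampleName  = 'liver'   // string: must be quoted
maxRetries  = 3         // number: not quoted
verbose     = true      // boolean: not quoted
countAsText = '3'       // a string, not the number 3
```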
10.1.2 Config variables
Configuration properties can be used as variables in the configuration file itself.
In the configuration file it’s possible to access any variable defined in the host environment.
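A short sketch of both cases, assuming the Groovy string interpolation forms $name and ${expression} inside double-quoted strings. The property names and the extra folder are hypothetical.

```groovy
propertyOne = 'world'
anotherProp = "Hello $propertyOne"       // refers to a property defined above
customPath  = "$PATH:/my/app/folder"     // refers to the PATH variable of the host environment
homeBin     = "${HOME}/bin"              // the ${...} form, here with the host HOME variable
```

Note that interpolation only happens in double-quoted strings; a single-quoted string would keep the dollar sign literally.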
10.1.3 Config comments
Configuration files use the same conventions for comments used in the Nextflow script:
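For illustration, the two comment styles shared by Nextflow scripts and configuration files look like this:

```groovy
// comment a single line

/*
   a comment spanning
   multiple lines
 */
```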
10.1.4 Config scopes
Configuration settings can be organized in different scopes by dot prefixing the property names with a scope identifier or grouping the properties in the same scope using the curly brackets notation. This is shown in the following example:
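A sketch of the two notations, using hypothetical scope names alpha and beta with made-up values:

```groovy
// dot notation: each property name is prefixed with the scope
alpha.x = 1
alpha.y = 'string value'

// curly brackets notation: properties grouped under one scope
beta {
    p = 2
    q = 'another string'
}
```

Both forms are equivalent ways of placing a property inside a scope; the bracket form tends to read better when a scope holds several settings.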
10.1.5 Config params
The params scope allows the definition of workflow parameters that override the values defined in the main workflow script.
This is useful to consolidate one or more execution parameters in a separate file.
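The two snippets for this exercise could look like the following sketch. The parameter names foo and bar match the text; the values are illustrative assumptions. First, the configuration file:

```groovy
// config file
params.foo = 'Bonjour'
params.bar = 'le monde!'
```

Second, the workflow script, which sets its own defaults for the same parameters:

```groovy
// workflow script
params.foo = 'Hello'
params.bar = 'world!'

// print both params
println "$params.foo $params.bar"
```

With these values, the config file settings should take precedence over the script defaults, and a value passed on the command line (for example --foo Hola) should in turn take precedence over the config file.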
Save the first snippet as nextflow.config and the second one as params.nf, then run the script.
Execute it again, specifying the foo parameter on the command line.
Compare the results of the two executions.
10.1.6 Config env
The env scope allows the definition of one or more variables that will be exported into the environment where the workflow tasks will be executed.
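A minimal sketch of such a configuration, with hypothetical variable names and values:

```groovy
env.ALPHA = 'some value'
env.BETA  = "$HOME/some/path"   // the host HOME variable is resolved when the config is read
```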
Save the above snippet as a file named my-env.config, then save the snippet below as a workflow script.
Finally, execute the following command:
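A workflow script for this exercise could be sketched as follows. The process name FOO and the variable names ALPHA and BETA are assumptions chosen for illustration; the task simply lists the matching variables from its environment.

```groovy
process FOO {
    debug true   // print the task's standard output to the terminal

    script:
    '''
    env | grep -E 'ALPHA|BETA'
    '''
}

workflow {
    FOO()
}
```

Assuming the script is saved under the hypothetical name snippet.nf, it could be launched with nextflow run snippet.nf -c my-env.config, so that the extra configuration file is merged into the run.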
10.1.7 Config process
Process directives allow the specification of settings for the task execution, such as the container and other resources, in the pipeline script.
This is useful when prototyping a small workflow script.
However, it’s always a good practice to decouple the workflow execution logic from the process configuration settings, i.e. it’s strongly suggested to define the process settings in the workflow configuration file instead of the workflow script.
The process configuration scope allows the setting of any process directives in the Nextflow configuration file. For example:
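A sketch of such a process scope, with illustrative values and a container image name chosen as an assumption:

```groovy
process {
    cpus = 10
    memory = 8.GB
    container = 'biocontainers/bamtools:v2.4.0_cv3'   // illustrative image name
}
```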
The above config snippet defines directives, such as container, for all processes in your workflow script.
The process selector can be used to apply the configuration to a specific process or group of processes (discussed later).
Memory and time duration units can be specified either using a string-based notation, in which the digit(s) and the unit can be separated by a blank, or using the numeric notation, in which the digit(s) and the unit are separated by a dot character and are not enclosed in quote characters.
| String syntax | Numeric syntax | Value |
| | - | 1 hour and 25 seconds |
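The two notations can be compared side by side in a small sketch; the values are arbitrary and the commented lines show the alternative spelling of the same setting.

```groovy
process {
    memory = '8 GB'      // string notation: digits and unit separated by a blank, quoted
    // memory = 8.GB     // numeric notation: digits and unit separated by a dot, unquoted

    time = '1 hour'      // string notation
    // time = 1.hour     // numeric notation
}
```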
The syntax for setting process directives in the configuration file requires = (i.e. the assignment operator), whereas it should not be used when setting the process directives within the workflow script.
This is especially important when you want to define a config setting with a dynamic expression written as a closure. For example, in a workflow script:
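A sketch of the script form, assuming a hypothetical process named FOO whose memory request scales with its cpus. Note the absence of the assignment operator.

```groovy
process FOO {
    cpus 2
    memory { 4.GB * task.cpus }   // closure, evaluated for each task; no "=" here

    script:
    """
    echo "running with ${task.cpus} cpus"
    """
}

workflow {
    FOO()
}
```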
And the equivalent in the configuration file, if you choose to set it there:
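In the configuration file, the same dynamic memory setting could be sketched as follows. The assignment operator is required here, and this version applies to every process rather than to a single one.

```groovy
process {
    memory = { 4.GB * task.cpus }   // closure assigned with "="
}
```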
In the configuration file, directives that require more than one value, e.g. pod, need to be expressed as a map object.
Finally, directives that can be repeated in the process definition need to be defined as a list object in the configuration file. For example:
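A sketch of both cases using the pod directive, with hypothetical environment variable names and values. A single multi-value setting is written as a map:

```groovy
process {
    pod = [env: 'FOO', value: '123']
}
```

A directive that would be repeated in the process definition becomes a list of maps:

```groovy
process {
    pod = [
        [env: 'FOO', value: '123'],
        [env: 'BAR', value: '456']
    ]
}
```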
10.1.8 Config Docker execution
The container image to be used for the process execution can be specified in the configuration file.
The use of unique "SHA256" Docker image IDs guarantees that the image content does not change over time, for example:
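A minimal sketch of a Docker setup in the configuration file. The image name is an illustrative assumption, and the digest in the commented line is a placeholder, not a real value.

```groovy
process.container = 'nextflow/rnaseq-nf'   // illustrative image name
docker.enabled = true

// Pinning the image by its SHA256 digest instead of a tag:
// process.container = 'nextflow/rnaseq-nf@sha256:<digest>'
```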
10.1.9 Config Singularity execution
To run a workflow execution with Singularity, a container image file path is required in the Nextflow config file using the container directive:
The container image file must be an absolute path: it must start with a /.
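A sketch of such a configuration, assuming a hypothetical image file location:

```groovy
process.container = '/some/singularity/image.sif'   // hypothetical absolute path
singularity.enabled = true
```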
The following protocols are supported:
library:// downloads the container image from the Singularity Library service.
shub:// downloads the container image from the Singularity Hub.
docker:// downloads the container image from the Docker Hub and converts it to the Singularity format.
docker-daemon:// pulls the container image from a local Docker installation and converts it to a Singularity image file.
shub:// is no longer available as a builder service, though existing images from before 19th April 2021 will still work.
By specifying a plain Docker container image name, Nextflow implicitly downloads and converts it to a Singularity image when the Singularity execution is enabled.
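For instance, a configuration along these lines combines a plain Docker image name with the Singularity engine; the image name is an illustrative assumption.

```groovy
process.container = 'nextflow/rnaseq-nf'   // plain Docker image name (illustrative)
singularity.enabled = true
```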
The above configuration instructs Nextflow to use the Singularity engine to run your script processes. The container is pulled from the Docker registry and cached in the current directory to be used for further runs.
Alternatively, if you have a Singularity image file, its absolute path location can be specified as the container name either using the -with-singularity option or the process.container setting in the config file.
Try to run the script after changing the nextflow.config file to the one above that uses Singularity.
Nextflow will pull the container image automatically; this will take a few seconds, depending on the network connection speed.
10.1.10 Config Conda execution
A Conda environment can also be specified in the configuration file.
You can specify either the path of an existing Conda environment directory or the path of a Conda environment YAML file.
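A sketch of both options, using hypothetical paths. The conda.enabled line is an assumption about recent Nextflow releases, which expect Conda support to be switched on explicitly; older releases may not need it.

```groovy
// Option 1: path of an existing Conda environment directory
process.conda = '/path/to/an/existing/env'

// Option 2: path of a Conda environment YAML file
// process.conda = '/path/to/environment.yml'

conda.enabled = true   // assumption: required on recent Nextflow releases
```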