5. Demystifying configuration options

Each nf-core pipeline comes with a set of “sensible defaults”. While the defaults are a great place to start, you will almost certainly want to modify these to fit your own purposes and system requirements.

You do not need to edit the pipeline code to configure nf-core pipelines.

When a pipeline is launched, Nextflow will look for configuration files in several locations. As each source can contain conflicting settings, the sources are ranked to decide which settings to apply.

The configuration sources are listed below in order of priority, from highest to lowest:

  1. Parameters specified on the command line (--parameter)
  2. Parameters that are provided using the -params-file option
  3. Config files that are provided using the -c option
  4. The config file named nextflow.config in the current directory
  5. The config file named nextflow.config in the pipeline project directory
  6. The config file $HOME/.nextflow/config
  7. Values defined within the pipeline script itself (e.g., main.nf)

Some of these files are already included in the nf-core pipeline repository (e.g., the nextflow.config file in the pipeline project directory), some are automatically identified on your local system (e.g., the nextflow.config in the launch directory), and others are only included if they are specified using run options (e.g., -params-file and -c).

Understanding how and when these files are interpreted by Nextflow is critical for the accurate configuration of a pipeline's execution.

5.1 Parameters

Parameters are pipeline-specific settings that can be used to customise the execution of a pipeline.

At the highest level, parameters can be customised using the command line. Any parameter can be configured on the command line by prefixing the parameter name with a double dash (--):

--<parameter>

When to use -- and -

Nextflow options are prefixed with a single dash (-) and pipeline parameters are prefixed with a double dash (--).
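As a small illustration of this rule, the sketch below sorts a few flags by their prefix. The classify_flag helper is hypothetical and exists only for this example; -r, -profile, and -resume are Nextflow run options, while --input and --outdir are pipeline parameters.

```shell
# Classify a flag by its prefix: '--' marks a pipeline parameter,
# a single '-' marks a Nextflow option.
classify_flag() {
    case "$1" in
        --*) echo "pipeline parameter" ;;
        -*)  echo "Nextflow option" ;;
        *)   echo "not a flag" ;;
    esac
}

# Print each flag next to its category.
for flag in -r -profile -resume --input --outdir; do
    printf '%-10s %s\n' "$flag" "$(classify_flag "$flag")"
done
```

The double-dash pattern is tested first because a double-dash flag also begins with a single dash.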

Depending on the parameter type, you may be required to add additional information after your parameter flag. For example, for a string parameter, you would add the string after the parameter flag:

nextflow run nf-core/demo -r dev --<parameter> string

Every nf-core pipeline has a full list of parameters on the nf-core website. When viewing these parameters, you will also be shown a description and the type of the parameter. Some parameters will have additional text to help you understand how a parameter should be used.

Parameters and their descriptions can also be viewed in the command line using the run command with the --help parameter.

Exercise

View the parameters for the nf-core/demo pipeline using the command line:

nextflow run nf-core/demo -r dev --help

5.2 Default configuration files

All parameters will have a default setting that is defined using the nextflow.config file in the pipeline project directory. By default, most parameters are set to null or false and are only activated by a profile or configuration file.

There are also several includeConfig statements in the nextflow.config file that are used to include additional .config files from the conf/ folder. Each additional .config file contains categorized configuration information for your pipeline execution, some of which can be optionally included:

  • base.config
    • Included by the pipeline by default.
    • Generous resource allocations using labels.
    • Does not specify any method for software management and expects software to be available (or specified elsewhere).
  • igenomes.config
    • Included by the pipeline by default.
    • Default configuration to access reference files stored on AWS iGenomes.
  • modules.config
    • Included by the pipeline by default.
    • Module-specific configuration options (both mandatory and optional).
  • test.config
    • Only included if specified as a profile.
    • A configuration profile to test the pipeline with a small test dataset.
  • test_full.config
    • Only included if specified as a profile.
    • A configuration profile to test the pipeline with a full-size test dataset.

Notably, some configuration files contain the definition of profiles.

Profiles used by nf-core pipelines can be broadly categorised into two groups:

  • Software management profiles
    • Profiles for the management of software using software management tools, e.g., docker, singularity, and conda.
  • Test profiles
    • Profiles to execute the pipeline with a standardized set of test data and parameters, e.g., test and test_full.

nf-core pipelines are required to define software containers and environments that can be activated using profiles. Although it is possible to run the pipelines with software installed by other methods (e.g., environment modules or manual installation), using Docker or Singularity is more convenient and reproducible.
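To make the idea of a profile more concrete, here is a minimal sketch of how profiles can be defined in a configuration file using the profiles scope. The file name myprofiles.config and the profile names my_singularity and my_docker are hypothetical and are not the profiles shipped with nf-core pipelines.

```shell
# Write a config file that defines two hypothetical profiles.
cat > myprofiles.config <<'EOF'
profiles {
    // Hypothetical profile that turns on Singularity
    my_singularity {
        singularity.enabled    = true
        singularity.autoMounts = true
    }
    // Hypothetical profile that turns on Docker instead
    my_docker {
        docker.enabled = true
    }
}
EOF

# Show the file that was written.
cat myprofiles.config
```

Settings inside a profile are only applied when that profile is selected at run time with the -profile option, for example -profile my_singularity.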

5.3 Shared configuration files

An includeConfig statement in the nextflow.config file is also used to include custom institutional profiles that have been submitted to the nf-core config repository. At run time, nf-core pipelines will fetch these configuration profiles from the nf-core config repository and make them available.

For shared resources such as an HPC cluster, you may consider developing a shared institutional profile.

You can follow this tutorial for more help setting up an institutional profile.

5.4 Custom parameter and configuration files

Nextflow will also look for files that are external to the pipeline project directory. These files include:

  • The config file $HOME/.nextflow/config
  • A config file named nextflow.config in your current directory
  • Custom files specified using the command line
    • A parameter file that is provided using the -params-file option
    • A config file that is provided using the -c option

You don't need to use all of these files to execute your pipeline.

Parameter files

Parameter files are .json files that can contain an unlimited number of parameters:

my-params.json
{
    "<parameter1_name>": 1,
    "<parameter2_name>": "<string>",
    "<parameter3_name>": true
}

You can override default parameters by creating a custom .json file and passing it as a command-line argument using the -params-file option:

nextflow run nf-core/demo -r dev -profile singularity -params-file <path/to/params.json>

Exercise

Add the input and outdir parameters to a params file.

Set input to the complete path to your sample sheet and set outdir to results_customparams.

Start by creating mycustomparams.json and adding your parameters using the format described above:

code mycustomparams.json

Then, add your input and output parameters.

mycustomparams.json
{
"input": "/workspace/gitpod/nf-customize/samplesheet.csv",
"outdir": "results_customparams"
}

Finally, include the custom mycustomparams.json file in your execution command with the -params-file option:

nextflow run nf-core/demo -r dev -profile singularity -params-file mycustomparams.json
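The -params-file option also accepts YAML. As a sketch, the same two parameters could be written to a YAML file as shown below. The file name mycustomparams.yaml is an assumption made for this example.

```shell
# Write the input and outdir parameters as YAML instead of JSON.
cat > mycustomparams.yaml <<'EOF'
input: "/workspace/gitpod/nf-customize/samplesheet.csv"
outdir: "results_customparams"
EOF

# Show the file that was written.
cat mycustomparams.yaml
```

This file would be passed in the same way as the JSON file, using -params-file mycustomparams.yaml.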

Configuration files

Configuration files are .config files that can contain various pipeline properties. Custom configuration files are passed on the command line using the -c option:

nextflow run nf-core/demo -r dev -profile singularity -params-file mycustomparams.json -c <path/to/custom.config>

Custom configuration files use the same format as the configuration file included in the pipeline directory.

Configuration properties are organised into scopes by dot prefixing the property names with a scope identifier or grouping the properties in the same scope using the curly brackets notation. For example:

custom.config
alpha.x  = 1
alpha.y  = 'string value'

Is equivalent to:

custom.config
alpha {
    x = 1
    y = 'string value'
}

Scopes allow you to quickly configure settings required to deploy a pipeline on different infrastructure using different software management.

For example, the executor scope can be used to provide settings for the deployment of a pipeline on an HPC cluster. Similarly, the singularity scope controls how Singularity containers are executed by Nextflow.

A common scenario is for users to write a custom configuration file specific to running a pipeline on their infrastructure.
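A minimal sketch of such an infrastructure-specific file is shown below. It assumes a cluster that uses the SLURM scheduler. The file name mycluster.config, the queue name short, and the queue size are placeholders that would need to be adapted to your own system.

```shell
# Write a hypothetical cluster configuration file.
cat > mycluster.config <<'EOF'
process {
    executor = 'slurm'   // submit each task as a SLURM job (assumed scheduler)
    queue    = 'short'   // hypothetical queue name
}

executor {
    queueSize = 20       // placeholder cap on jobs queued at the same time
}
EOF

# Show the file that was written.
cat mycluster.config
```

A file like this only describes where and how tasks are run, so it can be reused across pipelines with the -c option.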

Warning

Do not use -c to specify parameters as this will result in errors. Custom config files specified with -c must only be used for tuning process resource specifications, other infrastructural tweaks (such as output directories), or module arguments (args).

Multiple scopes can be included in the same .config file using a mix of dot prefixes and curly brackets.

mycustom.config
executor.name = "sge"

singularity {
    enabled    = true
    autoMounts = true
}

A full list of scopes is described in detail in the Nextflow configuration documentation.

Exercise

Create a custom configuration file and enable singularity and singularity auto mounts using the singularity scope.

Start by creating mycustomconfig.config:

code mycustomconfig.config

Next, add your configuration to the singularity scope:

mycustomconfig.config
singularity {
    enabled    = true
    autoMounts = true
}

Finally, include the mycustomconfig.config file in your execution command with the -c option:

nextflow run nf-core/demo -r dev -params-file mycustomparams.json -c mycustomconfig.config

Multiple config files

Multiple custom .config files can be included at execution by separating them with a comma (,).
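As a sketch, the two scopes used earlier could be split across two files and then joined with a comma. The file names containers.config and executor.config are hypothetical.

```shell
# First file: container settings only.
cat > containers.config <<'EOF'
singularity {
    enabled    = true
    autoMounts = true
}
EOF

# Second file: executor settings only.
cat > executor.config <<'EOF'
executor.name = "sge"
EOF

# Join the file names with a comma to form the value passed to -c.
config_list="containers.config,executor.config"
echo "-c ${config_list}"
```

The printed value would be appended to the run command, for example after -params-file mycustomparams.json.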

The process scope allows you to configure pipeline processes and is used extensively to define resources and additional arguments for modules.

By default, process resources are allocated in the conf/base.config file using the withLabel selector:

conf/base.config
process {
    withLabel: BIG_JOB {
        cpus = 16
        memory = 64.GB
    }
}

Similarly, the withName selector enables the configuration of a process by name. By default, module parameters are defined in the conf/modules.config file:

conf/modules.config
process {
    withName: MYPROCESS {
        cpus = 4
        memory = 8.GB
    }
}
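Both selectors can also be used in a custom configuration file to override these defaults. The sketch below reuses the placeholder names BIG_JOB and MYPROCESS from the examples above. The file name myresources.config and the resource values are assumptions. When both selectors match the same process, Nextflow gives the withName setting priority over the withLabel setting.

```shell
# Write a custom config that overrides resources by label and by name.
cat > myresources.config <<'EOF'
process {
    // Applies to every process that carries the BIG_JOB label
    withLabel: 'BIG_JOB' {
        cpus   = 8
        memory = 32.GB
    }
    // Applies to the process named MYPROCESS only
    withName: 'MYPROCESS' {
        cpus   = 2
        memory = 4.GB
    }
}
EOF

# Show the file that was written.
cat myresources.config
```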

While some tool arguments are included as part of a module, most are defined under the ext.args entry in the conf/modules.config file in the pipeline code. Keeping them outside the module makes modules sharable across pipelines.

Importantly, having these arguments outside of the module also allows them to be customized at runtime.

For example, if you wanted to add arguments to the MULTIQC process in the nf-core/demo pipeline, you could use the process scope and the withName selector:

mycustomconfig.config
process {
    withName: "MULTIQC" {
        ext.args = { "<your custom parameter>" }
    }
}

If a process is used multiple times in the same pipeline, an extended execution path of the module may be required to make it more specific:

custom.config
process {
    withName: "NFCORE_DEMO:DEMO:MULTIQC" {
        ext.args = "<your custom parameter>"
    }
}

The extended execution path is built from the names of the pipeline, subworkflows, and module used to execute the process, separated by colons.
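To see how such a name breaks down, the sketch below splits the example path into the workflow part and the process name using shell parameter expansion. The variable names are chosen for this illustration only.

```shell
# Fully qualified name as used in the withName selector above.
full_name="NFCORE_DEMO:DEMO:MULTIQC"

# Everything after the last colon is the process (module) name.
process_name="${full_name##*:}"

# Everything before the last colon is the enclosing workflow path.
workflow_path="${full_name%:*}"

echo "process:  ${process_name}"
echo "workflow: ${workflow_path}"
```

Here the process name is MULTIQC and the enclosing workflow path is NFCORE_DEMO:DEMO.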

Exercise

Modify your existing mycustomconfig.config by adding a process scope with the withName selector. Change the publishDir path so that a multiqc folder is created directly inside the directory you launch the pipeline from.

Start by opening mycustomconfig.config that contains your singularity scope:

code mycustomconfig.config

Next, using a process scope and using the withName selector for MULTIQC, change the publishDir to "multiqc".

mycustomconfig.config
process {
    withName: 'MULTIQC' {
        publishDir = [
            path: { "multiqc" }
        ]
    }
}

Finally, execute your run command again:

nextflow run nf-core/demo -r dev -profile test,singularity -params-file mycustomparams.json -c mycustomconfig.config

View the multiqc folder inside your working directory:

ls

5.5 Mixing configuration files

It is important to consider how the different configuration options interact during each execution and how you can apply these to minimize mistakes and extra configuration.

Exercise

Execute the nf-core/demo pipeline with your mycustomparams.json file, your mycustomconfig.config file, and a command line flag --outdir finalexecution:

nextflow run nf-core/demo -r dev -params-file mycustomparams.json -c mycustomconfig.config --outdir finalexecution

Note how you now have a new output directory named finalexecution despite the directory being named results_customparams in your custom parameters file.
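The outcome follows from the priority order described at the start of this section. The toy model below is not how Nextflow resolves settings internally. It only mimics the rule that the highest-priority source with a value wins. The variable names are hypothetical, and the empty pipeline default is an assumption.

```shell
# Toy model only: pick the first non-empty value, highest priority first.
cli_outdir="finalexecution"                # from --outdir on the command line
params_file_outdir="results_customparams"  # from mycustomparams.json
pipeline_default_outdir=""                 # assumed: the pipeline sets no default

# Fall back from the command line, to the params file, to the default.
effective_outdir="${cli_outdir:-${params_file_outdir:-$pipeline_default_outdir}}"
echo "outdir = ${effective_outdir}"
```

Setting cli_outdir to an empty string in this model makes the value from the parameters file win instead, which mirrors what happens when --outdir is left off the command line.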