Please read this through completely before starting on our cluster systems!!

Disclaimer: We are neither nextflow users, nor developers. We can help with the integration of nextflow into the LRZ HPC/Slurm workflows. The rest we must map back to the nextflow user forum(s), or to the nextflow developers through you - who is interested in using this tool!

What is Nextflow?

Nextflow is a tool for complex workflow control, and task farming - meant as simplification and standardization for users.

Starting with such high ideals, the reality is probably more differentiated.
Nextflow allows interfacing to diverse resource manager and schedulers. It interfaces conda and container environments. The workflow tree description happens in a domain specific language (DSL) in Groovy.
This way, it is really versatile and flexible. But it is also somewhat obscuring the underlying complexity related to the handling of schedulers, and environments/containers, or just the interaction of the tasks with the hardware on an HPC compute node.
All this also matters when one wants to be successful.

Installation

As shown below, we provide some nextflow modules. Please check with module av nextflow.

Should you need some different version, installation of nextflow is rather simple. Follow this link, select the desired version, download nextflow-<version>-dist or nextflow-<version>-all. Make it executable, chmod u+x nextflow-<version>-dist. Ready.
Try ./nextflow-<version>-dist -v. Should java be missing, just load the openjdk module version 11 or higher. For convenience, one can create an alias like alias nextflow='<path to>/nextflow-<version>-dist' in you ~/.bashrc or ~/.profile.

Interfacing to flux, conda, charliecloud, apptainer/singularity, ... might require you to load additional modules.

Conceptual Use on LRZ Clusters

Basic interactive Usage

We don't want nextflow production runs on Login Nodes!!

The simple background here is that nextflow easily overwhelms a system when it is wrongly configured.
Either, the login node is spilled with local tasks, which disturbs other users. Login nodes are shared nodes!
Or, even if nextflow is correctly configured, it produces many often very small tasks spilling the Slurm server, which is not really contemporary HPC. Or, Slurm servers are regularly polled for status information on a short-period basis. This causes problems on the system side.

Therefore, we offer interactive nodes, where nextflow workflows can be tested. If you are running (for tests, often sufficient) just on a single node, please use the following recipe:

login-node> salloc -N 1
compute-node> module load nextflow
# possibly load other modules here, or setup your work environment
compute-node> nextflow run test.nf

Go sure to use the 'local' executor here (is the default, usually)!

For testing on more than one node, a scheduler is needed. Indeed there exists Slurm-support by nextflow. Unfortunately, only via sbatch as script submitting tool.
The compute nodes at LRZ are NO Slurm submit hosts for security reasons. sbatch does not work on them! Please don't use slurn executor!

On a single node, please conveniently use local executor. On several nodes, please use the flux executor, and use the flux-framework as shown below. (Older versions of nextflow don't support it directly. Please use our modules, therefore!)

login-node> module load nextflow                                  # java and flux are (possibly) loaded as prerequisites
login-node> srun -N 2 -M inter -p cm4_inter --pty flux start      # srun is needed for multi-node operation
compute-node> flux resource info
2 Nodes, 224 Cores, 0 GPUs
compute-node> nextflow run test.nf

 N E X T F L O W   ~  version 24.04.2

Launching `my_script.nf` [astonishing_hypatia] DSL2 - revision: 3b790bbc15

executor >  flux (9)
...

compute-node> <Ctrl+D>

Please use nextflow versions later than 24.10.0! Older versions were (maybe) not modified for correctly working with flux as described above!

In order to set the executor to flux, there are different ways. In a nextflow process, one can set executor 'flux'. Or, you can create a nextflown.config (for project) or a ~/.nextflow/config (globally). Please consult nextflow's documentation on that!

As a complete example, we also show a file test.nf, which contains as executor 'flux'! Furthermore, we decided to use 8 CPUs for the foo process, which is given then to the program via ${task.cpus}, such that it knows how many threads to create.

test.nf (example)

process foo {
  cpus 8
  executor 'flux'

  input:
  val x

  output:
  path 'x.txt'

  """
  /lrz/sys/tools/placement_test_2021/bin/placement-test.omp_only -t ${task.cpus} -d 10 &> x.txt
  """
}

workflow {
  channel.from( 1..100 ) | foo | view { "Result: ${it}" }
}

As common, the results of the nextflow run are located inside the work directory.

Usage in non-interactive Batch Scripts

Having prepared a nextflow file as above the integration into a Slurm job script is then usually simple. To reuse the example from above:

Slurm script with flux and nextflow

#!/bin/bash
#SBATCH -o log.%x.%j.%N.out
#SBATCH -D . 
#SBATCH -J nf_test
#SBATCH --get-user-env 
#SBATCH -M cm4                                       # better start with inter        : if works, go to cm4_tiny
#SBATCH -p cm4_std                                   # better start with cm4_inter    : if works, go to cm4_tiny
#SBATCH --qos=cm4_std                                # adapt, too
#SBATCH --nodes=2                                    # please check! maybe 1 node is already sufficient!
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=112
#SBATCH --hint=nomultithread
#SBATCH --mail-type=none 
#SBATCH --export=NONE 
#SBATCH --time=00:01:00                              # for tests

module load slurm_setup

module load nextflow                                 # java and flux are loaded as prerequisites if necessary 

module load intel                                    # required by placement-test; setup your environment as you need it!
export KMP_AFFINITY=granularity=thread,compact,1,0   # required by placement-test - to get nice thread placement 

cat > my_script.nf << EOT
process foo {
  cpus 8
  executor 'flux'

  input:
  val x

  output:
  path 'x.txt'

  """
  /lrz/sys/tools/placement_test_2021/bin/placement-test.omp_only -t \${task.cpus} -d 10 &> x.txt
  """
}

workflow {
  channel.from( 1..100 ) | foo | view { "Result: \${it}" }
}
EOT

cat > workflow.sh << EOT
nextflow run -resume my_script.nf
flux jobs -a
EOT
chmod u+x workflow.sh

srun --export=all --mpi=none flux start ./workflow.sh

That script should be fully functional, and can be used as a starting point for testing own workflows.

The first section is just the normal Slurm header. Using flux, only --nodes=... and --ntasks-per-node=1 are to be specified. Next, mandatory and optional modules are loaded in order to set the environment.

In the last line (srun ... ), flux is started with some workflow script (that's prescribed by flux-framework). Within this script, we use here now nextflow in order to create nextflow tasks via flux submit (under the hood).

The actual nextflow file can, of course, already be prepared before job submission. We use a HERE document only for a self-contained representation (please consult any bash manual about that).

Remarks:

-resume is only necessary, when the job is resubmitted/requeued. As with all HPC jobs, also a nextflow job may fail (node failure), or simply run into the wallclock time. No reason for panic! On resubmitting the job, nextflow can start from were it stopped before (approximately).
Warning! Some tasks may be excessive in resource requirements ... like runtime, or memory. By this, essentially all parallelism might be prevented, or the whole Slurm job even fail!
Therefore: Test and Know your tasks well before production!! Probably not everything can be handled on our systems!
flux has no mean to express memory requirements, yet (June'24) in a task. One can however achieve a similar thing by specifying cpus in a process' directive section. Knowing that on CoolMUC-2 there are 2 GB per CPU, one can specify several CPUs to get a total amount of necessary memory. If the task's program then does not parallelize on that many threads, start it with less threads (or one, if serial). It is in the user's responsibility to do a correct resource management! Flux only supports the scheduling and distribution of processes to the hardware. But the nextflow user must get the parallelism of the tasks right on its own.

Troubleshooting and FAQs

Q: How do I see that all the simultaneous tasks are really running on different CPUs? (What if I see strong performance drops?)
A: That's definitely not easy, and one usually has to rely on that flux does a good job here. However, the placement-test mentioned several times shows you its own cpu-occupation, and also runtime time-stamps. That's not meant as a permanent monitor, however! But only for tests! Just replace your task's executable by the placement test, let it run, and check it's output. Some day, flux is maybe capable to provide a similar functionality as Slurm's sacct.
Q: Many of my tasks fail! What should I do?
A: When things go wrong, one needs to debug. Look for errors in the task's output! Let the task run independently! Instrument the task with monitors (e.g. wrap it in \time -v ... in order to see excessive memory uses, etc.)
Q: Conda/Container Usage?
A: Conda environments and containers (charliecloud or singularity are supported on LRZ HPC clusters only, currently) are convenient means to install prebuild software. Be aware that these generally don't exploit the cluster hardware performance (e.g. vectorization)! So, are maybe not really performing, i.e. efficient!
nextflow wraps the installation and running of conda environments and containers, which can be a challenge to debug on its own should go something wrong.
Some executable may require MPI parallelism, where the conda installer/container builder installs maybe MPI implementations that are not immediately easy to use on our cluster. Supported is Intel MPI and OpenMPI (but not any version!).
Our general recommendation currently is to avoid conda environments/containers altogether if possible! Better install the software natively. If help is needed, please ask us via the Service Desk!
Still, we are working here also on improvements.
Q: Is there a different version available?
A: We surely cannot install each version. If you really require a specific version, please install it yourself - e.g. via conda, or compile directly. The latter procedure is not that hard:
Installation Procedure Quelle erweitern
```
module load openjdk/11					# or later, if needed
git clone https://github.com/nextflow-io/nextflow.git
cd nextflow
git tag									# check, which version you want
git checkout <version tag>
make compile
make pack
```
Inside build/releases, you then have the executable nextflow-<version>-all or nextflow-<version>-dist. You can set a link to it (ln -s nextflow-<version>-dist nextflow) for convenience, and add this path to PATH environment variable.
Be aware that older versions do not provide the necessary support for flux!

Documentation

Nextflow Documentation

Final Plea

If this documentation requires changes in your opinion, because it is for instance unclear, ambiguous, plain wrong, or in any other way not working or acceptable, please help us to improve it! We are grateful for any remarks, wishes or just comments from your side, and we depend here on your active collaboration.