Deprecated content: Example parallel job scripts on the Linux-Cluster
Under Construction
The Linux Cluster documentation is work in progress and will be updated incrementally!
The content of this page will be moved and this page will be deleted soon!
Introductory remarks
The job scripts for the SLURM partitions are provided as templates, which you can adapt to your own settings. In particular, please take the following points into account:
Some entries are placeholders that you must replace with your own, user-specific settings. In particular, path specifications must be adapted: wherever the following examples contain a name with three periods, substitute the appropriate directory.
For recommendations on how to do large-scale I/O, please refer to the description of the file systems available on the cluster. It is recommended to keep executables within your HOME file system, in particular for parallel jobs. The example jobs reflect this and assume that files are opened with relative path names from within the executed program.
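As an illustration of this convention, the command part of a job script might contain a fragment like the following sketch (the executable path and input file name are hypothetical placeholders):

# Executable kept in the HOME file system (hypothetical path):
$HOME/bin/my_program.exe
# The program is assumed to open its input via a relative path,
# e.g. ./input.dat in the working directory set via the -D ./ directive.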
Because you usually have to work with the environment modules package in your batch script, the module command must be available inside the job; if it is not, source the file /etc/profile.d/modules.sh at the beginning of the script.
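A minimal sketch of this, assuming the standard location of the modules initialization file (the additionally loaded application module is just an illustrative placeholder):

#!/bin/bash
# ... SBATCH directives ...
# Make the module command available inside the batch job:
source /etc/profile.d/modules.sh
module load slurm_setup
# Load any further modules your program needs (placeholder name):
# module load my_application_module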
Shared Memory jobs
This job type uses a single shared memory node of the designated SLURM partition. Parallelization can be achieved either via (POSIX) thread programming or directive-based OpenMP programming.
In the following, example scripts for starting an OpenMP program are provided. Please note that these scripts are usually not useful for MPI applications; scripts for such programs are given in subsequent sections.
On the CoolMUC-4 cluster
#!/bin/bash
#SBATCH -J job_name
#SBATCH -o ./%x.%j.%N.out
#SBATCH -D ./
#SBATCH --get-user-env
#SBATCH --clusters=cm4
#SBATCH --partition=cm4_tiny
#SBATCH --nodes=1
#SBATCH --cpus-per-task=112   # using hyperthreading, 224 is the maximum reasonable value for CoolMUC-4
#SBATCH --export=NONE
#SBATCH --time=08:00:00
module load slurm_setup
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_openmp_program.exe
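The script can be submitted and monitored with the standard SLURM commands, for example as sketched below (the file name openmp_job.sh is just a placeholder):

# Submit the job script:
sbatch openmp_job.sh
# Check the state of your jobs on the cm4 cluster:
squeue --clusters=cm4 -u $USER
# Cancel a job if necessary (replace <jobid> with the actual job ID):
scancel --clusters=cm4 <jobid>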
MPI jobs
For MPI documentation please consult the MPI page on the LRZ web server. On current cluster systems, Intel MPI is used as the default environment.
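If you want to verify which MPI environment is active in your interactive session before submitting, something along the following lines may help (a sketch assuming an Intel MPI module is loaded; exact module names can differ, so check module avail on the system):

# List the currently loaded modules and look for the MPI entry:
module list 2>&1 | grep -i mpi
# Intel MPI provides mpiexec; print its version to confirm the toolchain:
mpiexec -V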
MPI jobs may either use MPI only for parallelization ("MPP-style") or combine MPI with OpenMP ("hybrid").
On the CoolMUC-4 cluster
CoolMUC-4 MPI-only job

#!/bin/bash
#SBATCH -J job_name
#SBATCH -o ./%x.%j.%N.out
#SBATCH -D ./
#SBATCH --get-user-env
#SBATCH --clusters=cm4
#SBATCH --partition=cm4_std
#SBATCH --qos=cm4_std
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=112
#SBATCH --export=NONE
#SBATCH --time=08:00:00
module load slurm_setup
mpiexec -n $SLURM_NTASKS ./my_mpi_program.exe

The example will start 448 MPI tasks distributed over 4 nodes.

CoolMUC-4 hybrid MPI+OpenMP job

#!/bin/bash
#SBATCH -J job_name
#SBATCH -o ./%x.%j.%N.out
#SBATCH -D ./
#SBATCH --get-user-env
#SBATCH --clusters=cm4
#SBATCH --partition=cm4_std
#SBATCH --qos=cm4_std
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=112
#SBATCH --export=NONE
#SBATCH --time=08:00:00
module load slurm_setup
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpiexec -n $SLURM_NTASKS ./my_hybrid_program.exe

CoolMUC-4 MPI-only TINY job on a single node

#!/bin/bash
#SBATCH -J job_name
#SBATCH -o ./%x.%j.%N.out
#SBATCH -D ./
#SBATCH --get-user-env
#SBATCH --clusters=cm4
#SBATCH --partition=cm4_tiny
#SBATCH --qos=cm4_tiny
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=112
#SBATCH --export=NONE
#SBATCH --time=08:00:00
module load slurm_setup
mpiexec -n $SLURM_NTASKS ./my_mpi_program.exe
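For the hybrid case it can be useful to check how MPI tasks and OpenMP threads are actually placed on the nodes. A possible sketch, assuming Intel MPI as stated above (I_MPI_DEBUG is an Intel MPI setting, OMP_DISPLAY_AFFINITY a standard variable of recent OpenMP runtimes; neither is LRZ-specific):

# Print MPI rank pinning information (Intel MPI):
export I_MPI_DEBUG=4
# Print OpenMP thread affinity at runtime (OpenMP 5.0 runtimes):
export OMP_DISPLAY_AFFINITY=true
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpiexec -n $SLURM_NTASKS ./my_hybrid_program.exe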
Notes
- A setup like the one for the hybrid job can also serve to provide more memory per MPI task without using OpenMP (e.g., by setting OMP_NUM_THREADS=1); see the sketch after this list. Note that this will leave cores unused!
- Single-node jobs must use the cm4_tiny partition; larger jobs (up to 4 nodes) must use cm4_std.
- Hybrid MPI+OpenMP jobs can also run on a single cm4_tiny node!
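The following sketch illustrates the memory-per-task variant mentioned above: only a few MPI tasks per node are started and OpenMP is effectively disabled, so each task can use a correspondingly larger share of the node's memory (the choice of 14 tasks with 8 cores each per node is purely illustrative):

#!/bin/bash
#SBATCH -J job_name
#SBATCH -o ./%x.%j.%N.out
#SBATCH -D ./
#SBATCH --get-user-env
#SBATCH --clusters=cm4
#SBATCH --partition=cm4_std
#SBATCH --qos=cm4_std
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=14
#SBATCH --cpus-per-task=8
#SBATCH --export=NONE
#SBATCH --time=08:00:00
module load slurm_setup
# No OpenMP parallelism: the extra cores per task remain idle,
# but each MPI task can use a larger share of the node's memory.
export OMP_NUM_THREADS=1
mpiexec -n $SLURM_NTASKS ./my_mpi_program.exe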
General comments
- For some software packages, it is also possible to use SLURM's own srun command; however, this does not work well in all situations for programs compiled against Intel MPI.
- It is also possible to use the --ntasks keyword in combination with --cpus-per-task to configure parallel jobs; this specification replaces the --nodes/--ntasks-per-node combination used in the scripts above (see the sketch below).
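A sketch of this alternative resource specification, sized like the 4-node MPI-only job above (448 tasks with one core each); SLURM then decides the node distribution itself:

#!/bin/bash
#SBATCH -J job_name
#SBATCH -o ./%x.%j.%N.out
#SBATCH -D ./
#SBATCH --get-user-env
#SBATCH --clusters=cm4
#SBATCH --partition=cm4_std
#SBATCH --qos=cm4_std
#SBATCH --ntasks=448
#SBATCH --cpus-per-task=1
#SBATCH --export=NONE
#SBATCH --time=08:00:00
module load slurm_setup
mpiexec -n $SLURM_NTASKS ./my_mpi_program.exe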
Special job configurations
Job Farming (starting multiple serial jobs on a shared memory system)
Please use this with care! If the serial jobs are imbalanced with respect to run time, this usage pattern can waste CPU resources. At LRZ's discretion, unbalanced jobs may be removed forcibly. The example job script below illustrates how to start multiple serial runs within a single shared-memory SLURM job. Note that the subdirectories subdir_1, ..., subdir_112 must exist and contain the needed input data.
Multi-Serial Example using a single node

#!/bin/bash
#SBATCH -J job_name
#SBATCH -o ./%x.%j.%N.out
#SBATCH -D ./
#SBATCH --get-user-env
#SBATCH --clusters=cm4
#SBATCH --partition=cm4_tiny
#SBATCH --qos=cm4_tiny
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=112
#SBATCH --export=NONE
#SBATCH --time=08:00:00
module load slurm_setup
MYPROG=path_to_my_exe/my_serial_program.exe
# Start as many background serial jobs as there are cores available on the node
for ((i=1; i<=$SLURM_NTASKS; i++)); do
  cd subdir_${i}
  $MYPROG &
  cd ..
done
wait   # for completion of background tasks
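If the per-run subdirectories do not exist yet, they can be prepared before submission, for example along the following lines (input.dat is a hypothetical input file name):

# Create subdir_1 ... subdir_112 and place the input data in each of them:
for i in $(seq 1 112); do
  mkdir -p subdir_${i}
  cp input.dat subdir_${i}/
done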
For more complex setups, please read the detailed job farming document (it is in the SuperMUC-NG section, but for the most part it applies to the Cluster environment as well) or General Considerations to Job-/Task-Farming.