Resource limits for parallel jobs on the Linux Cluster

This subdocument describes the constraints under which parallel jobs execute on the cluster systems: maximum run times, maximum memory, and other SLURM-imposed parameters.

Resource limits for interactive jobs

Notes:

  • Please do not use resources in this partition to run regular production jobs! This partition is meant for testing!
  • A given user account cannot run more than one job at a time.

Partition | Core counts and remarks | Run time limit (hours) | Memory limit (GByte)
interactive nodes on CooLMUC-2 | Maximum number of nodes in a job: 4 | 2 (default is 15 minutes) | 56 per node
interactive nodes on CooLMUC-3 | Maximum number of nodes in a job: 3 | 2 (default is 15 minutes) | ~90 DDR per node, plus 16 HBM per node

Resource limits for batch jobs

The following is an overview of the resource limits imposed for the various classes of jobs; these comprise run time limits, limits on the core counts of parallel jobs, and memory limits. Please consult the SLURM specifications subdocument for a more detailed explanation of the parallel environments, in particular of how to correctly specify memory requirements. With respect to run time limits, it is recommended to always specify a target run time via the --time switch; in particular for smaller jobs, this may allow the scheduler to perform backfilling.

  • The designation "shared memory" for parallel jobs assumes that a number of cores assigned by SLURM will be used by threads; typically a command like export OMP_NUM_THREADS=<number> should be issued to achieve this.
  • The designation "distributed memory" for parallel jobs assumes that MPI is used to start one single-threaded MPI task per core assigned by SLURM. In principle it is also possible to run hybrid MPI + threaded programs, in which case the number of cores assigned by the system will be equal to the product (# of MPI tasks) * (# of threads), rounded up if necessary.

CooLMUC-2: 28-way Haswell-EP nodes with Infiniband FDR14 interconnect and 2 hardware threads per physical core (see also the example job scripts)

Job Type | SLURM Cluster | SLURM Partition | Node range | Run time limit (hours) | Memory limit (GByte)
Small distributed memory parallel (MPI) job | --clusters=cm2_tiny | --partition=cm2_tiny | 1-4 | 72 | 56 per node
Standard distributed memory parallel (MPI) job | --clusters=cm2 | --partition=cm2_std | 3-24 | 72 | 56 per node
Large distributed memory parallel (MPI) job | --clusters=cm2 | --partition=cm2_large | 25-64 | 48 | 56 per node
Shared memory parallel job | --clusters=cm2_tiny | --partition=cm2_tiny | 1 | 72 | 56

CooLMUC-3: 64-way Knights Landing 7210F nodes with Intel Omni-Path 100 interconnect and 4 hardware threads per physical core (see also the example job scripts)

Job Type | SLURM Cluster | SLURM Partition | Node range | Run time limit (hours) | Memory limit (GByte)
Distributed memory parallel job | --clusters=mpp3 | --partition=mpp3_batch (optional) | 1-32 | 48 | ~90 DDR per node, plus 16 HBM per node

Teramem: HP DL580 shared memory system (see also the example job scripts)

Job Type | SLURM Cluster | SLURM Partition | Node range | Run time limit (hours) | Memory limit (GByte)
Shared memory thread-parallel job | --clusters=inter | --partition=teramem_inter; also specify the number of cores needed by the executable(s) to be started | 1 (up to 64 logical cores) | 48 (default 8) | ~60 per physical core (each physical core has 2 hyperthreads)
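As a further illustration, a minimal sketch of a batch script for a standard distributed memory job in the cm2_std partition might look as follows. The module setup and the executable name are assumptions/placeholders; please consult the example job scripts referenced above for authoritative templates.

    #!/bin/bash
    #SBATCH --job-name=mpitest               # keep job names short (see the policies below)
    #SBATCH --clusters=cm2
    #SBATCH --partition=cm2_std
    #SBATCH --nodes=4                        # cm2_std accepts 3-24 nodes
    #SBATCH --ntasks-per-node=28             # one MPI task per physical core
    #SBATCH --time=10:00:00                  # well below the 72 h limit; helps backfilling

    module load slurm_setup                  # assumption: standard LRZ environment setup
    mpiexec -n $SLURM_NTASKS ./my_mpi_prog   # placeholder executable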

If a job appears not to use its resources properly, it will be terminated at the discretion of LRZ staff or the automated surveillance system.

Resource limits on housed systems

The clusters and partitions listed in this section are only available for institutes that have a housing contract with LRZ.

Job Type | Architecture | Core counts and remarks | Run time limit (hours) | Memory limit (GByte)
Distributed memory parallel (MPI) jobs | 28-way Haswell-EP nodes with Infiniband FDR14 interconnect | Please specify the cluster --clusters=tum_chem and one of the partitions --partition=[tum_chem_batch, tum_chem_test]. Jobs with up to 392 cores are possible (56 in the test queue). Dedicated to TUM Chemistry. | 384 (test queue: 12) | 2 per task (in MPP mode, using 1 physical core per task)
Distributed memory parallel (MPI) jobs | 28-way Haswell-EP nodes with Infiniband FDR14 interconnect | Please specify the cluster --clusters=hm_mech. Jobs with up to 336 cores are possible (double that number if hyperthreading is exploited). Dedicated to Hochschule München Mechatronics. | 336 | 18 per task (in MPP mode, using 1 physical core per task)
Serial or shared memory jobs | 28-way Haswell-EP nodes with Ethernet interconnect | Please specify the cluster --clusters=tum_geodesy. Dedicated to TUM Geodesy. | 240 | 2 per task / 60 per node
Shared memory parallel job | Intel- or AMD-based shared memory systems | Please specify the cluster --clusters=myri as well as one of the partitions --partition=myri_[p,u]. Dedicated to TUM Mathematics. | 144 | 3.9 per core

Details on Policies

Policies for interactive jobs

Limitations

  • Parallel programs should not be started directly from a login shell. Please always use the salloc command to initialize a time-limited interactive parallel environment. Note that the shell initialized by the salloc command will still run on the login node, but executables started with srun (or mpiexec) will be started on the interactive partition that was assigned; a minimal sketch is given below.
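The node count, time limit, and partition placeholder in the following sketch are illustrative and must be replaced by values valid for the cluster you work on.

    # Request an interactive allocation of 2 nodes for 30 minutes
    # (replace <interactive_partition> with the partition assigned for interactive use)
    salloc --nodes=2 --time=00:30:00 --partition=<interactive_partition>

    # The shell still runs on the login node; start executables on the allocated nodes:
    srun ./my_parallel_prog        # placeholder executable; mpiexec works analogously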

Policies for queued batch jobs

General restrictions

  • The job name should not exceed 10 characters. If no job name is specified, the script name is used as the job name, so please do not use excessively long script names.
  • Do not use the xargs command to generate command line arguments at submission time. Instead, generate any necessary arguments inside your script.

Scheduling

  • For parallel jobs, it is recommended to explicitly specify the run time limit. This may shorten the waiting time, since the job might then be run in backfill mode (in other words, it can use resources that are free while the scheduler tries to fit another large job into the system). Your specification gives the scheduler the information required to organize this; see the example below.
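For example, a suitable run time limit can also be supplied (or overridden) at submission time; the script name below is a placeholder.

    # Request 6 hours instead of the 72 h partition maximum,
    # which makes the job a candidate for backfilling
    sbatch --time=06:00:00 ./myjob.sh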

Jobs in Hold

Jobs in user hold will be removed at the LRZ administrators' discretion if older than 8 weeks.

Job Submissions

  • Submission of large numbers of jobs (>100, including array jobs) with very short run times (< 1 min) is considered a misuse of resources. It causes both a waste of computational resources and, if mail notifications are used, disruption of the notification system. Users who submit such jobs will be banned from further use of the batch system. Please bundle the individual tasks into one much bigger job instead; a sketch is given after the table below.
  • There are maximum numbers of jobs that can be submitted by a user. These limits are different for each cluster and may change over time, depending on the cluster load.


Cluster | Limit on job submission | Limit on running jobs
inter | 2 | 1
rvs | 1 | 1
mpp3 | unlimited | 50
serial | 250 | 100
cm2_tiny | 50 | 10
cm2 | 50 | 4 / 2 *

* 4 jobs on cm2_std, 2 jobs on cm2_large
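A minimal sketch of such a bundled job is shown below; the cluster/partition choice, the file pattern, and the executable are placeholders.

    #!/bin/bash
    #SBATCH --clusters=cm2_tiny
    #SBATCH --partition=cm2_tiny
    #SBATCH --nodes=1
    #SBATCH --time=02:00:00

    # Run the many short tasks sequentially inside a single batch job
    # instead of submitting one sub-minute job per input file
    for input in case_*.dat; do
        ./my_short_task "$input"
    done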

Memory use

  • Jobs exceeding the physical memory available on the selected node(s) will be removed, either by SLURM itself, by the OOM ("out of memory") killer of the operating system kernel, or at LRZ's discretion, since such usage typically has a negative impact on system stability. An example of an explicit memory request is given below.
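Where it is unclear whether a job fits, the memory requirement can be stated explicitly at submission time so that SLURM takes it into account when placing the job; the values below are purely illustrative.

    # Request 50 GByte per node, staying below the 56 GByte physical memory
    # of a CooLMUC-2 node (script name is a placeholder)
    sbatch --mem=50G ./myjob.sh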

Limits on queued jobs

  • In order to prevent monopolization of the clusters by a single user, a limit of 50 queued jobs is imposed on both CooLMUC-2 and CooLMUC-3. These limits may change over time, depending on the cluster load.

Software licenses

  • Many commercial software packages have been licensed for use on the cluster; most of these require so-called floating licenses, of which only a limited number are typically available. Since it is not possible to check whether a license is available before a batch job starts, LRZ cannot guarantee that a batch job requesting such a license will run successfully.