Large Memory Teramem
| Hardware | |
| --- | --- |
| Model | Lenovo ThinkSystem SR850 V2 |
| Processors | 4 x Intel Xeon Platinum 8360HL |
| Number of nodes | 1 |
| Cores per node | 96 |
| Hyperthreads per core | 2 |
| Core nominal frequency | 3.0 GHz (range 0.8-4.2 GHz) |
| Effective per-core memory | 64 GB |
| Memory per node | 6,144 GB DDR4 |

| Software (OS and development environment) | |
| --- | --- |
| Operating system | SLES15 SP4 Linux |
| MPI | Intel MPI (alternatively OpenMPI) |
| Compilers | Intel oneAPI |
| Performance libraries | MKL, TBB, IPP |
| Tools for performance and correctness analysis | Intel Cluster Tools |
Teramem System for Applications with Extreme Memory Requirements
The node teramem2 is a single node with 6 TByte of main memory. It is part of the regular Linux Cluster infrastructure at LRZ, which means that users can access their $HOME and $PROJECT directories just as on every other node in the cluster. However, its mode of operation differs slightly from the remaining cluster nodes, which can only be used in batch mode. Since teramem2 is currently the only system at LRZ that can satisfy memory requirements beyond 1 TByte within a single node, users can choose between using the system in batch or interactive mode, depending on their specific needs. Both options are described below.
Interactive SLURM shell
An interactive SLURM shell can be generated to execute tasks on the multi-terabyte Teramem system (node teramem2). The following procedure can be used on one of the login nodes of CoolMUC-2 (note that loading salloc_conf/teramem will unload any previously loaded non-system modules):
module load salloc_conf/teramem
salloc --cpus-per-task=32 --mem=2000000 -M inter
module load intel-mpi
srun ./my_shared_memory_program.exe
The above commands execute the binary my_shared_memory_program.exe using 32 threads and up to 2 TByte of memory (the units of the --mem argument are MByte). Additional tuning and resource settings (e.g. OpenMP environment variables) can be performed explicitly before executing the srun command; a sketch is given below. Please note that the Teramem system can also be used by script-driven batch jobs (see the examples document linked below, as well as the batch script example further down).
Setting up the environment
Apart from the baseline LRZ environment, no modules for compilers or libraries are loaded by default. If you need a default Intel-based development environment, please issue the command
module load intel-oneapi-compilers intel-mkl intel-mpi
in your interactive shell or shell script.
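As an illustration, a shared-memory program could then be built with the oneAPI compilers; the source file name and the chosen flags below are placeholders, not settings prescribed by LRZ:

# Hypothetical example: compile an OpenMP program and link against MKL
icx -qopenmp -qmkl -O2 -o my_shared_memory_program.exe my_shared_memory_program.c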
Batch SLURM script
Shared memory job on teramem2, using 32 logical cores. Note that this system is targeted not at best performance, but at high memory usage. The SLURM script below assumes that it is executed in the desired working directory, hence the usage of relative pathnames.
#!/bin/bash
#SBATCH -o ./myjob.%j.%N.out
#SBATCH -D ./
#SBATCH -J My_Jobname
#SBATCH --get-user-env
#SBATCH --clusters=inter
#SBATCH --partition=teramem_inter
#SBATCH --mem=2600000mb
#SBATCH --cpus-per-task=32
#SBATCH --mail-type=end
#SBATCH --mail-user=xyz@xyz.de
#SBATCH --export=NONE
#SBATCH --time=18:00:00
#--------------------------------------
module load slurm_setup
# Load other intended modules here....
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./myprog.exe
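A script like the above can then be submitted and monitored from a login node; the script file name used here is a placeholder:

# Hypothetical file name; adapt to your own job script
sbatch ./teramem_job.sh
# Check the job status on the inter cluster
squeue -M inter -u $USER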