Large Memory Teramem

Hardware

Model: Lenovo ThinkSystem SR850 V2
Processors: 4 x Intel Xeon Platinum 8360HL
Number of nodes: 1
Cores per node: 96
Hyperthreads per core: 2
Core nominal frequency: 3.0 GHz (range 0.8-4.2 GHz)
Effective per-core memory: 64 GB
Memory per node: 6,144 GB DDR4

Software (OS and development environment)

Operating system: SLES15 SP4 Linux
MPI: Intel MPI, alternatively OpenMPI
Compilers: Intel oneAPI
Performance libraries: MKL, TBB, IPP
Tools for performance and correctness analysis: Intel Cluster Tools

Teramem System for Applications with Extreme Memory Requirements

The node teramem2 is a single node with 6 TB of main memory. It is part of the regular Linux Cluster infrastructure at LRZ, so users can access their $HOME and $PROJECT directories just as on every other node in the cluster. However, its mode of operation differs slightly from the remaining cluster nodes, which can only be used in batch mode. Since teramem2 is currently the only system at LRZ that can satisfy memory requirements beyond 1 TB in a single node, users can choose between batch and interactive mode depending on their specific needs. Both options are described below.

Interactive SLURM shell

An interactive SLURM shell can be generated to execute tasks on the multi-terabyte teramem2 system. The following procedure can be used on one of the login nodes of CoolMUC-2 (note that loading salloc_conf/teramem will unload any previously loaded non-system modules):

module load salloc_conf/teramem
salloc --cpus-per-task=32 --mem=2000000 -M inter
module load intel-mpi
srun ./my_shared_memory_program.exe

The above commands execute the binary "my_shared_memory_program.exe" using 32 threads and up to 2 TB of memory (the --mem unit is MB). Additional tuning and resource settings (e.g. OpenMP environment variables) can be performed explicitly before executing the srun command, as sketched below. Please note that teramem2 can also be used by script-driven jobs (see the batch script example below).
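As an illustration, such tuning inside the interactive allocation might look as follows; the variable values are only examples, not recommendations, and should be adapted to your application:

export OMP_NUM_THREADS=32     # match the --cpus-per-task value from the salloc call
export OMP_PLACES=cores       # pin each OpenMP thread to a physical core
export OMP_PROC_BIND=close    # keep threads close together in the NUMA topology
srun ./my_shared_memory_program.exe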

Setting up the environment

Apart from the baseline LRZ environment, no modules for compilers or libraries are loaded. If you need a default Intel-based development environment, please issue the command

module load intel-oneapi-compilers intel-mkl intel-mpi

in your interactive shell or shell script.
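With these modules loaded, a shared-memory program can then be built with the Intel oneAPI compilers. The following is a minimal sketch; the source file name is a placeholder, and further options (e.g. -qmkl for linking against MKL) may be added as needed:

icx -qopenmp -O2 -o my_shared_memory_program.exe my_shared_memory_program.c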

Batch SLURM script

Shared memory job on teramem2

(This example uses 32 logical cores. Note that this system is targeted not at best performance, but at high memory usage. The SLURM script assumes it is executed in the desired working directory, so relative pathnames can be used.)

#!/bin/bash
#SBATCH -o ./myjob.%j.%N.out
#SBATCH -D ./
#SBATCH -J My_Jobname
#SBATCH --get-user-env
#SBATCH --clusters=inter
#SBATCH --partition=teramem_inter
# Memory request in MB (here about 2.6 TB of the 6 TB available on teramem2)
#SBATCH --mem=2600000mb
#SBATCH --cpus-per-task=32
#SBATCH --mail-type=end
# Replace by your own e-mail address
#SBATCH --mail-user=xyz@xyz.de
#SBATCH --export=NONE
#SBATCH --time=18:00:00
#--------------------------------------
module load slurm_setup
# Load other intended modules here....
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./myprog.exe
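Assuming the script above is saved as, for example, teramem_job.sh (a hypothetical file name), it can be submitted and monitored from a login node with the standard SLURM commands:

sbatch ./teramem_job.sh
squeue -M inter -u $USER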