PRACE Course: HPC Code Optimisation Workshop 2021


Learning Goals

Through a sequence of simple, guided examples of code modernisation, attendees will develop an awareness of the features of multi- and many-core architectures that are crucial for writing modern, portable and efficient applications.

The workshop is a PRACE training event organised by LRZ in cooperation with Intel and NHR@FAU.

Preliminary Agenda


1st day morning

Intro (Volker Weinberg)
Intro to LRZ HPC Systems and Software Stack (Gerald Mathias)
Intel oneAPI Introduction (Edmund Preiss)

1st day afternoon

Intel Compiler & Vectorization (Igor Vorobtsov / Alina Shadrina)

2nd day morning

Roofline Model (Jonathan Coles)
Intel Advisor (Dmitry Tarakanov)

2nd day afternoon

VTune (Michael Steyer)
APS (Dmitry Tarakanov)
IMPI focus on tuning (Michael Steyer)

3rd day morning

LIKWID (Carla Guillen / Thomas Gruber)
HPC report (Carla Guillen)

3rd day afternoon

Performance Optimization of CPMD (Gerald Mathias)
Wrap-up (Volker Weinberg)


Jonathan Coles, Gerald Mathias, Carla Guillen (LRZ)

Thomas Gruber (NHR@FAU)

Edmund Preiss, Alina Shadrina, Michael Steyer, Dmitry Tarakanov, Igor Vorobtsov (Intel)


  • Momme Allalen (LRZ)

Slides and Exercises

Day 1

Day 2



Day 3

Recommended Access Tools

Login under Windows:

  • Start Xming, then PuTTY.
  • Enter the host name into the PuTTY host field and click Open.
  • Accept & save host key [only first time]
  • Enter user name and password (provided by LRZ staff) into the opened console.

Login under Mac:

  • Install X11 support for macOS (XQuartz)
  • Open Terminal
  • ssh -Y -l username
  • Use user name and password (provided by LRZ staff)

Login under Linux:

  • Open xterm
  • ssh -Y -l username
  • Use user name and password (provided by LRZ staff)

How to use the CoolMUC-2 System

Login Nodes:


The reservation is only valid during the workshop. For general usage on our Linux Cluster, remove the "--reservation=hcow1w21" option.

  • Submit a job:
    sbatch --reservation=hcow1w21
  • List own jobs:
    squeue -M cm2_tiny
  • Cancel jobs:
    scancel -M cm2_tiny jobid
  • Show reservations:
    sinfo -M cm2_tiny --reservation
  • Interactive Access:
    module load salloc_conf/cm2_tiny
    salloc --partition=cm2_tiny --time=00:30:00 --reservation=hcow1w21
    srun --reservation=hcow1w21 ./myprog.exe
    or: srun --reservation=hcow1w21 --pty bash
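
The note about dropping the reservation outside the workshop can be sketched as a small wrapper. This is only an illustration: the function name sbatch_cmd, the variable RESERVATION and the script name job.sh are assumptions, not part of the LRZ setup. The function prints the command line instead of submitting, so it can be tried anywhere.

```shell
#!/bin/bash
# Hypothetical helper (not an LRZ tool): print the sbatch command line for a
# job script, adding the workshop reservation only when RESERVATION is set.
# RESERVATION and job.sh are assumed names used for illustration.
sbatch_cmd() {
    local script="$1"
    if [ -n "${RESERVATION:-}" ]; then
        echo "sbatch --reservation=${RESERVATION} ${script}"
    else
        echo "sbatch ${script}"
    fi
}

RESERVATION=hcow1w21
sbatch_cmd job.sh    # during the workshop

RESERVATION=
sbatch_cmd job.sh    # general usage on the Linux Cluster
```

On the cluster, the printed line could be executed directly, for example with eval "$(sbatch_cmd job.sh)".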

Resource limits:

Example OpenMP Batch File

#!/bin/bash
#SBATCH -o /dss/dsshome1/0B/a2c06ae/test.%j.%N.out
#SBATCH -D /dss/dsshome1/0B/a2c06ae
#SBATCH -J test
#SBATCH --clusters=cm2_tiny
#SBATCH --partition=cm2_tiny
#SBATCH --nodes=1-1
#SBATCH --cpus-per-task=28
#SBATCH --get-user-env
#SBATCH --reservation=hcow1w21
#SBATCH --time=02:00:00
module load slurm_setup
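
The batch file above ends after loading slurm_setup and does not show how the program is started. A minimal sketch of lines that might follow is given here, assuming the thread count should match --cpus-per-task and that the executable is called myprog.exe as in the interactive example. The function name set_omp_threads and the fallback to one thread are assumptions.

```shell
#!/bin/bash
# Sketch of lines that could follow "module load slurm_setup".
# Slurm exports SLURM_CPUS_PER_TASK when --cpus-per-task is requested;
# the fallback of 1 thread outside a job is an assumption.
set_omp_threads() {
    export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
}

set_omp_threads
echo "OpenMP threads: ${OMP_NUM_THREADS}"

# myprog.exe is a placeholder program name; run it only if it exists.
if [ -x ./myprog.exe ]; then
    ./myprog.exe
fi
```

Deriving OMP_NUM_THREADS from SLURM_CPUS_PER_TASK keeps the thread count in step with the #SBATCH request, so only one number needs changing.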

Intel Software Stack

The Intel software stack is automatically loaded at login. The Intel compilers are called icc (for C), icpc (for C++) and ifort (for Fortran). They behave similarly to the GNU compiler suite (the option -help shows an option summary). For reasonable optimisation including SIMD vectorisation, use the options -O3 -xavx. Using -O2 instead of -O3 sometimes gives better results, since at -O3 the compiler will sometimes try to be overly smart and undo many of your hand-coded optimisations.

By default, OpenMP directives in your code are ignored. Use the -qopenmp option to activate OpenMP.
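
As an illustration of the options named above, the following sketch assembles an icc compile line with and without -qopenmp. The helper name compile_cmd and the file names myprog.c and myprog.exe are placeholders chosen for this example. The function only prints the command, so it does not require the compiler to be installed.

```shell
#!/bin/bash
# Hypothetical helper: print an icc compile line using the flags from the text.
# Arguments: source file, output binary, and "yes" to enable OpenMP.
compile_cmd() {
    local src="$1" out="$2" openmp="${3:-no}"
    local flags="-O3 -xavx"
    if [ "$openmp" = "yes" ]; then
        flags="${flags} -qopenmp"
    fi
    echo "icc ${flags} -o ${out} ${src}"
}

compile_cmd myprog.c myprog.exe        # OpenMP directives ignored
compile_cmd myprog.c myprog.exe yes    # OpenMP enabled
```

For C++ or Fortran sources, icpc or ifort would take the place of icc with the same options.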

Use mpiexec -n #tasks to run MPI programs. The compiler wrappers' names follow the usual mpicc, mpifort, mpiCC pattern.
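
The mpiexec call described above can be sketched as follows, with #tasks replaced by a concrete number. The helper name mpi_run_cmd and the program name mympi.exe are assumptions made for illustration. The function prints the launch line rather than running it.

```shell
#!/bin/bash
# Hypothetical helper: print the mpiexec line for a given number of MPI tasks.
# Rejects task counts that are empty, zero, or not made of digits only.
mpi_run_cmd() {
    local ntasks="$1" prog="$2"
    case "$ntasks" in
        ''|*[!0-9]*|0)
            echo "error: task count must be a positive integer" >&2
            return 1
            ;;
    esac
    echo "mpiexec -n ${ntasks} ${prog}"
}

mpi_run_cmd 4 ./mympi.exe    # launch line for four MPI tasks
```

The executable itself would first be built with one of the compiler wrappers, for example mpicc for a C source.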

Intel oneAPI

The most recent version of the Intel software stack, "Intel oneAPI", can be loaded with:

Intel oneAPI software stack
uid@cm2login1:~> module load intel-oneapi
intel-oneapi-mpi: using intel wrappers for mpicc, mpif77, etc

Loading intel-oneapi/2021.4
  Unloading conflict: intel-mpi/2019-intel intel/19.0.5 intel-mkl/2019
  Loading requirement: intel-oneapi-compilers/2021.4.0 intel-oneapi-mkl/2021 
                       intel-oneapi-mpi/2021-intel intel-oneapi-itac/2021.4.0
uid@cm2login1:~> module list
Currently Loaded Modulefiles:
 1) admin/1.0   2) tempdir/1.0   3) lrz/1.0   4) spack/21.1.1   5) intel-oneapi-compilers/2021.4.0   
 6) intel-oneapi-mkl/2021   7) intel-oneapi-mpi/2021-intel   8) intel-oneapi-itac/2021.4.0   
 9) intel-oneapi/2021.4  
uid@cm2login1:~> module av intel-oneapi
-------------- /lrz/sys/spack/.oneapi_rebuild/modules/x86_64/linux-sles15-x86_64 ---------------
intel-oneapi-advisor/2021.4.0    intel-oneapi-ipp/2021.4.0    intel-oneapi-mkl/2021.3.0    
intel-oneapi-ccl/2021.4.0        intel-oneapi-ippcp/2021.4.0  intel-oneapi-mkl/2021.4.0    
intel-oneapi-clck/2021.4.0       intel-oneapi-itac/2021.4.0   intel-oneapi-mpi/2021-gcc    
intel-oneapi-compilers/2021.4.0  intel-oneapi-mkl/2021        intel-oneapi-mpi/2021-intel  
intel-oneapi-dal/2021.4.0        intel-oneapi-mkl/2021-gcc8   intel-oneapi-tbb/2021.4.0    
intel-oneapi-dnn/2021.4.0        intel-oneapi-mkl/2021-seq    intel-oneapi-vpl/2021.6.0    
intel-oneapi-dpcpp-ct/2021.4.0   intel-oneapi-mkl/2021.1.1    intel-oneapi-vtune/2021.7.1  
intel-oneapi-inspector/2021.4.0  intel-oneapi-mkl/2021.2.0    

Upon loading the main intel-oneapi module, the default modules intel, intel-mpi, and intel-mkl are unloaded and replaced by the intel-oneapi-* variants. Further intel-oneapi-xxx modules are available via the module command.

PRACE Survey

Please fill out the PRACE online survey under

This helps us and PRACE to

  • increase the quality of the courses,
  • design the future training programme at LRZ and in Europe according to your needs and wishes,
  • get future funding for training events.