Intel Performance Libraries (MKL, TBB, IPP, DAAL)

This document gives a short introduction to the usage of highly optimized scientific library routines on the Linux-based HPC systems at LRZ. These performance libraries are provided by the spack software stack.

Version, Platforms, and Licensing

LRZ has licensed the performance libraries from Intel for use on the LRZ HPC systems (Linux Cluster and National Supercomputing System). These libraries comprise

  • the Math Kernel Library (MKL), containing well-optimized implementations of the BLAS and LAPACK interfaces, sparse solvers, support for interval arithmetic, FFT routines, and other functionality.

  • ScaLAPACK and distributed FFT implementations for various MPI flavors.

  • the Threading Building Blocks (TBB), which enable C++ programmers to integrate shared-memory parallelism into their code with relatively little effort. In particular, this package supports scalable threaded containers (note that the containers of the C++ Standard Template Library, STL, are typically not thread-safe).

  • the Integrated Performance Primitives (IPP), containing highly optimized primitive operations used for digital filtering, audio and image processing, etc.

  • the Data Analytics Library (DAAL)

The presently installed versions are listed in the following table (default versions are in bold).

Product          Module name            Versions available
MKL              intel-mkl              2018, 2019, 2020
TBB              intel-tbb              2020
IPP              intel-ipp              2020
DAAL             intel-daal             2020
Parallel Studio  intel-parallel-studio  2018, 2019, 2020

For details and an up-to-date list, use the following command on one of the HPC systems:

> module av intel-mkl intel-tbb intel-ipp intel-daal intel-parallel-studio
------------------------------------------ /lrz/sys/spack/XXX/XXX/modules/x86_64/linux-sles15-x86_64 ------------------------------------------
intel-mkl/2018       intel-mkl/2018-seq    intel-mkl/2019       intel-mkl/2019-seq    intel-mkl/2020       intel-mkl/2020-seq    
intel-mkl/2018-gcc8  intel-mkl/2018.4.274  intel-mkl/2019-gcc8  intel-mkl/2019.5.281  intel-mkl/2020-gcc8  intel-mkl/2020.1.217  

intel-tbb/2020.3  

intel-ipp/2020.2.254  

intel-daal/2020.2.254

intel-parallel-studio/2018  intel-parallel-studio/2020            intel-parallel-studio/cluster.2019.5
intel-parallel-studio/2019  intel-parallel-studio/cluster.2018.4  intel-parallel-studio/cluster.2020.2

Alias names point to a specific release and build version, e.g. intel-mkl/2019 -> intel-mkl/2019.5.281 (check with the module alias command). Suffixes indicate support for specific features (see the table below).

feature                             default (no suffix)    suffixes            variants
compiler selection                  intel                  -gcc, -gcc8, -gcc9  GNU compiler 7.x.x, 8.x.x, 9.x.x
parallel/serial linkage             OpenMP parallel        -seq                serial linkage
4/8 byte integers                   lp64 (4 byte integer)  -i8                 ilp64 (8 byte integer)
cluster libraries                   not supported          -cluster            BLACS, ScaLAPACK, CDFT, CPARDISO support
MPI libraries (only with -cluster)  Intel MPI              -openmpi            OpenMPI
fftw interface                      not supported          -fftw               fftw3 interface
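
The suffix scheme can be illustrated with a small shell function that decodes a module name into the features listed in the table. This is only a sketch: describe_mkl_module is a hypothetical helper, not something provided by the module system, and it assumes that suffixes are separated by "-" and follow the version, as in the alias names shown on this page.

```shell
# Hypothetical helper: decode the feature suffixes of an intel-mkl module
# name according to the suffix table. Unknown suffixes are reported on stderr.
describe_mkl_module() {
    variant=${1#*/}                        # drop the "intel-mkl/" prefix
    case "$variant" in
        *-*) rest=${variant#*-} ;;         # drop the version, keep the suffixes
        *)   rest= ;;                      # no suffix: all defaults apply
    esac
    # Defaults (no suffix), as listed in the table
    compiler=intel linkage=parallel integers=lp64 cluster=no mpi=intel fftw=no
    while [ -n "$rest" ]; do
        s=${rest%%-*}                      # first remaining suffix
        case "$rest" in
            *-*) rest=${rest#*-} ;;
            *)   rest= ;;
        esac
        case "$s" in
            gcc|gcc8|gcc9) compiler=$s ;;
            seq)           linkage=serial ;;
            i8)            integers=ilp64 ;;
            cluster)       cluster=yes ;;
            openmpi)       mpi=openmpi ;;
            fftw)          fftw=yes ;;
            *)             echo "unknown suffix: $s" >&2 ;;
        esac
    done
    out="compiler=$compiler linkage=$linkage integers=$integers cluster=$cluster fftw=$fftw"
    # The MPI flavor is only meaningful for the cluster libraries
    [ "$cluster" = yes ] && out="$out mpi=$mpi"
    echo "$out"
}
```

For example, describe_mkl_module intel-mkl/2020-gcc8-seq-i8 reports the GNU 8 build with serial linkage and 8 byte integers.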

If you need a combination other than the defaults provided by the environment modules, you can easily create your own. To do so, provide a ~/.modulerc file in your home directory containing the appropriate alias name, e.g.

> cat ~/.modulerc    
#%Module
# user specific configurations
#            new alias name               target module
module-alias intel-mkl/2020-gcc8-seq-i8   intel-mkl/2020.1.217

> module av intel-mkl
----------------- global/user modulerc -----------------------
intel-mkl/2020-gcc8-seq-i8
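
If you maintain several such aliases, the file can also be written by a small script. The following is a minimal sketch, assuming a POSIX shell; add_module_alias is a hypothetical helper name, not part of the module system. It creates the file with the #%Module header if it is missing and appends the alias only if the identical line is not already present.

```shell
# Hypothetical helper: add a module-alias line to a modulerc file
# unless exactly that line is already there.
add_module_alias() {
    rcfile=$1 alias_name=$2 target=$3
    # Create the file with the "#%Module" header if it does not exist yet
    [ -f "$rcfile" ] || printf '#%%Module\n' > "$rcfile"
    line="module-alias $alias_name $target"
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF -- "$line" "$rcfile" || printf '%s\n' "$line" >> "$rcfile"
}

# Usage sketch:
#   add_module_alias ~/.modulerc intel-mkl/2020-gcc8-seq-i8 intel-mkl/2020.1.217
```

Calling the function twice with the same arguments leaves the file unchanged after the first call.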

Usage

Before using the MKL, load the environment module intel-mkl:

 module load intel-mkl

Before using the TBB, load the module intel-tbb:

 module load intel-tbb

Before using the IPP, load the module intel-ipp:

 module load intel-ipp

Before using the DAAL, load the module intel-daal:

 module load intel-daal

MKL Usage

Linking with Intel MKL

When you are using an Intel compiler, it is usually sufficient to specify the option -mkl[=<parallel|sequential|cluster>] in the compile and link command (see the compiler documentation). For special cases and for linking with other compilers, you can use the environment variables provided by the intel-mkl module. If you need optimized BLAS, LAPACK, or other routines provided by MKL, specify the library location when linking your executable:

Requirement      Linking prescription
static linkage   ifort -parallel -o myprog.exe myprog.o mysub1.o ... $MKL_LIB
dynamic linkage  ifort -parallel -o myprog.exe myprog.o mysub1.o ... $MKL_SHLIB


The intel-mkl environment module provides environment variables that can be used for handling the compilation and linkage process:

variable       supports                       example
MKL_INC        C header directory include     $CC -c -o foo.o ... $MKL_INC foo.c
MKL_LIB        static linkage (C)             $CC -o bar.exe foo.o ... $MKL_LIB
MKL_SHLIB      dynamic linkage (C)            $CC -o bar.exe foo.o ... $MKL_SHLIB
MKL_LIB_F      static linkage (F77)           $F77 -o bar.exe foo.o ... $MKL_LIB_F
MKL_SHLIB_F    dynamic linkage (F77)          $F77 -o bar.exe foo.o ... $MKL_SHLIB_F
MKL_INC_F90    Fortran90/95 header directory  $F90 -c -o foo.o ... $MKL_INC_F90 foo.f90
MKL_LIB_F95    Fortran95 static linkage       $F90 -o bar.exe foo.o ... $MKL_LIB_F95
MKL_SHLIB_F95  Fortran95 dynamic linkage      $F90 -o bar.exe foo.o ... $MKL_SHLIB_F95

Here CC stands for gcc, icc, or mpicc; F77 and F90 stand for gfortran, ifort, mpif77, or mpif90, as appropriate.
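
In a build script it can be convenient to select the matching variable from the linkage type and the language. The following is a minimal sketch under that assumption; mkl_link_var is a hypothetical helper name, and the variable names it returns are the ones listed in the table.

```shell
# Hypothetical helper: print the NAME of the intel-mkl link variable for a
# given linkage (static|dynamic) and language (c|f77|f95).
mkl_link_var() {
    case "$1" in
        static)  base=MKL_LIB ;;
        dynamic) base=MKL_SHLIB ;;
        *) echo "linkage must be static or dynamic" >&2; return 1 ;;
    esac
    case "$2" in
        c)   suffix= ;;
        f77) suffix=_F ;;
        f95) suffix=_F95 ;;
        *) echo "language must be c, f77 or f95" >&2; return 1 ;;
    esac
    printf '%s%s\n' "$base" "$suffix"
}

# Usage sketch (assumes the intel-mkl module is loaded and CC is set):
#   eval "flags=\$$(mkl_link_var dynamic c)"   # flags takes the value of $MKL_SHLIB
#   $CC -o bar.exe foo.o $flags
```

Returning the variable name rather than its value keeps the helper independent of whether the module is loaded at the time it is called.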

If you need an even more specific link line for the Intel MKL, e.g. to support the Intel TBB threading model or to use the PGI compiler, we suggest using the Intel MKL Link Line Advisor or the command-line tool mkl_link_tool (see mkl_link_tool -h for further details).

Multi-Threading in MKL

If linked with parallel support, Intel MKL can make use of shared-memory parallelism; by default, only a single thread is used. Depending on how your application is threaded, choose one of the following settings:

  • If you wish to use OpenMP in your own program, but the MKL calls should run single-threaded, set

    export OMP_NUM_THREADS=8   # example to set up for execution with 8 threads
    export MKL_SERIAL=yes
    

    If you use the Intel compilers, setting MKL_SERIAL will not be necessary since in this case, the MKL will automatically detect whether it is called from within a parallel region. If a serial MKL version is loaded (see above), this will also enforce single-threaded execution.

  • If MKL should run multi-threaded, set

    export OMP_NUM_THREADS=8    # example to set up for execution with 8 threads
    unset MKL_SERIAL
    
  • If your application uses its own (non-OpenMP) threading, it is recommended that MKL calls run single-threaded:

    export MKL_SERIAL=yes  
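
The three cases above can be collected in one shell function for use in job scripts. This is a sketch only: set_mkl_threading and its mode names (openmp-app, threaded-mkl, own-threads) are hypothetical, and the function must be defined or sourced in the same shell that starts your program so that the exported variables take effect.

```shell
# Hypothetical helper: apply one of the three threading setups described above.
#   $1 = mode, $2 = number of OpenMP threads (defaults to 1)
set_mkl_threading() {
    mode=$1 nthreads=${2:-1}
    case "$mode" in
        openmp-app)     # OpenMP in your own code, MKL calls single-threaded
            export OMP_NUM_THREADS="$nthreads"
            export MKL_SERIAL=yes ;;
        threaded-mkl)   # MKL itself runs multi-threaded
            export OMP_NUM_THREADS="$nthreads"
            unset MKL_SERIAL ;;
        own-threads)    # application uses its own (non-OpenMP) threading
            export MKL_SERIAL=yes ;;
        *)
            echo "usage: set_mkl_threading openmp-app|threaded-mkl|own-threads [nthreads]" >&2
            return 1 ;;
    esac
}

# Usage sketch:
#   set_mkl_threading threaded-mkl 8
#   ./myprog.exe
```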
    

Fortran 90 modules and C interfacing

Some MKL functionality is encapsulated in Fortran 90 modules (for example, the DFTI API). To use it, you need to reference the module in your Fortran source code. For DFTI, this is a line of the form

use mkl_dfti 

When compiling your code, you then also need to add the include path for the module information file:

ifort -c -o foo.o ... $MKL_INC foo.f90

The analogous procedure applies for C interfaces, again illustrated for the DFTI example: Specify

#include <mkl_dfti.h> 

and compile with

icc -c -o cfoo.o ... $MKL_INC cfoo.c 

Please check the MKL documentation as well as the directory $MKL_BASE/include for available modules and include files.

TBB Usage

Compiling and Linking with the TBB

This package can only be used for C++ code; for compilation a command of the form

icpc -c -o cfoo.o ... $INTEL_TBB_INC cfoo.cpp 

is required. Linkage is only possible against shared libraries:

icpc -o myprog.exe ... main.o cfoo.o ... $INTEL_TBB_SHLIB

The following environment variables are provided for debugging and the scalable memory allocator:

TBB_SHLIB               TBB library for top performance
TBB_SHLIB_MALLOC        optimized scalable memory allocator library
TBB_SHLIB_DEBUG         debug version of TBB library; build source with -DTBB_DO_ASSERT=1
TBB_SHLIB_MALLOC_DEBUG  debug version of scalable memory allocator library; build source with -DTBB_DO_ASSERT=1

IPP Usage

Compiling and Linking with the IPP

The intel-ipp environment module provides environment variables in a similar manner as the intel-mkl module. Note that a Fortran interface is not available; hence you need to write a C interop (or !DEC$ directive) based interface block yourself if you need to call IPP routines from Fortran. For C, the compilation command is

icc -c -o cfoo.o ... $IPP_INC cfoo.c 

This presupposes that you have inserted appropriate #include entries into your source.

Requirement      Linking prescription
static linkage   icc -o myprog.exe myprog.o mysub1.o ... $INTEL_IPP_LIB
dynamic linkage  icc -o myprog.exe myprog.o mysub1.o ... $INTEL_IPP_SHLIB

Troubleshooting and Feedback

If you find problems with any of the libraries please get in touch with the Service Desk. Here are a few remarks on how to solve certain known problems:

Undefined symbol: _MKL_SERV_lsame (or similar):

This may happen if LRZ changes the default library version to a newer release, and the dynamically linked binary cannot cope. Binary compatibility is apparently not always fully supported. Here are your options:

  1. Re-link your executable with the present default MKL version
  2. Run module switch intel-mkl intel-mkl/x.y, which presupposes that you know the version x.y you originally used
  3. Use the static library in the first place

Documentation

Links to documentation on the Intel website

TBB examples and documentation

When the intel-tbb module is loaded, some example codes are available under $TBB_BASE/examples. PDF and HTML documentation is available in the folder $TBB_DOC. Since the TBB is also available as open-source software, a lot of information is available at the threadingbuildingblocks website (redirects to github).