HDF5

HDF5 (Hierarchical Data Format version 5) is a general-purpose library and file format for storing scientific data. HDF5 stores two primary kinds of objects: datasets and groups. A dataset is essentially a multidimensional array of data elements, and a group is a structure for organizing objects within an HDF5 file. Using these two basic objects, one can create and store almost any kind of scientific data structure, such as images, arrays of vectors, and structured or unstructured grids, and they can be mixed and matched within a file as needed.
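Groups organize datasets much in the same way directories organize files. As an illustration (the file, group, and dataset names are made up), listing such a file with the h5ls utility that ships with HDF5 (see Utilities below) could look like this:

h5ls -r example.h5
/                        Group
/results                 Group
/results/temperature     Dataset {100, 200}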

Installation and Use of HDF5 on LRZ platforms

Linux based HPC Systems

As of April 2022, the new software stack 22.2.1 is available on CoolMUC-2 and SuperMUC-NG. We provide at least one minor version each of the HDF5 1.8 and 1.10 release series; please be careful, as these versions differ in their file formats and APIs.

You can check the available hdf5 modules yourself via

module avail hdf5


On spack stack 22.2.1 we provide the following modules:

Serial HDF5:

hdf5/1.8.22-gcc11
hdf5/1.8.22-intel21
hdf5/1.10.7-gcc11
hdf5/1.10.7-intel19

HDF5 MPI parallel (with Intel-MPI):

hdf5/1.8.22-gcc11-impi
hdf5/1.8.22-intel21-impi
hdf5/1.10.7-gcc11-impi
hdf5/1.10.7-intel21-impi

The suffixes "-gcc11" and "-intel21" indicate the compiler used to build the package; the corresponding compiler module should be loaded when using these HDF5 modules. The suffix "-impi" marks the MPI-parallel version built with the Intel MPI standard module.

All packages are built with C, C++ and Fortran support. To make use of HDF5, please load the appropriate environment module.

For the MPI-parallel version with the Intel compiler, use, e.g.,

module load hdf5/1.10.7-intel21-impi

Then, compile your code with

[mpicc|mpicxx|mpif90] -c $HDF5_INC foo.[c|cc|f90]

and link it with

[mpicc|mpicxx|mpif90] -o myprog foo.o <further objects> [$HDF5_F90_SHLIB|$HDF5_CPP_SHLIB] $HDF5_SHLIB


For a serial version (with the Intel compiler), use, e.g.,

module load hdf5/1.10.7-intel21

Then, compile your code with

[icc|icpc|ifort] -c $HDF5_INC foo.[c|cc|f90]

and link it with

[icc|icpc|ifort] -o myprog.exe foo.o <further objects> [$HDF5_F90_SHLIB|$HDF5_CPP_SHLIB] $HDF5_SHLIB

The language support library $HDF5_F90_SHLIB or $HDF5_CPP_SHLIB is only required if Fortran or C++, respectively, is used for compiling and linking your application.
For static linking, use the $HDF5_..._LIB variables instead of $HDF5_..._SHLIB; however, this is not recommended.
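As an illustration (file and program names are placeholders), a serial Fortran code would be compiled and dynamically linked like this, with the Fortran support library listed before the core HDF5 library:

ifort -c $HDF5_INC foo.f90
ifort -o myprog.exe foo.o $HDF5_F90_SHLIB $HDF5_SHLIB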

Utilities

Loading an HDF5 module typically also makes a set of command-line utilities available, e.g. h5copy, h5debug, h5dump. It may be advisable to run these utilities with a serial (as opposed to MPI-parallel) HDF5 version, since a linked-in MPI library may not work properly in purely interactive usage.
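For example (the file name is a placeholder), the structure and metadata of an HDF5 file can be inspected without printing the raw data:

# list all groups and datasets recursively
h5ls -r mydata.h5
# dump only the header information (no dataset contents)
h5dump -H mydata.h5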

h5utils

h5utils (GitHub) is a set of utilities for the visualisation and conversion of scientific data in the HDF5 format. Besides providing a simple tool for batch visualisation as PNG images, h5utils also includes programs to convert HDF5 datasets into the formats required by other free visualisation software (e.g. plain text, Vis5d, and VTK).

h5utils is not part of the HDF5 module, nor is it available directly in the LRZ-provided software stack. The recommended procedure to install this software on SuperMUC-NG, CoolMUC-2 and other LRZ-managed clusters is via user_spack:

module load user_spack

# Install
spack info h5utils
spack install h5utils

# Load to search path
spack load h5utils
# Unload
spack unload h5utils
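Once loaded, the tools can be invoked directly. For example (file and dataset names are placeholders; consult the h5utils documentation for the full option list):

# render the yz-plane slice at x-index 0 of dataset "Ez" as a PNG image
h5topng -x 0 fields.h5:Ez
# convert a dataset to plain text
h5totxt fields.h5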

Documentation

Please refer to the HDF5 Web Site for documentation of the interface.



H5py (Pythonic Interface to HDF5)

There are several options to install h5py on LRZ systems. One option is using "pip" or "Conda" (see https://doku.lrz.de/display/PUBLIC/Python+for+HPC for details). The other option (and probably the preferable one) is the installation in your $HOME folder via "user_spack", the LRZ adaptation of the Spack package management tool. The installation procedure is similar on all systems.
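For the pip route, a minimal sketch could look as follows (note that this builds h5py against the HDF5 bundled with the wheel rather than against an LRZ HDF5 module):

module load python/3.8.8-extended
python -m pip install --user h5py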


To create a module for h5py you need to specify the hdf5 module you want to work with. Let us assume you want to use the module hdf5/1.10.7-gcc8-impi on CoolMUC-2, which is built with GCC. For all other hdf5 modules the installation is analogous. To build h5py against this hdf5 version, we need the hash of the Spack installation, which can be obtained using "module show":

cm2login3:~> module show hdf5/1.10.7-gcc8-impi | grep BASE

setenv        HDF5_BASE /dss/dsshome1/lrz/sys/spack/release/21.1.1/opt/haswell/hdf5/1.10.7-gcc-2iitq6x

The hash consists of the last seven characters: 2iitq6x. Please note: the installation hashes differ between systems, so using the hash from above for an installation on, e.g., SuperMUC-NG will fail.
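One way to extract the hash directly (a sketch, assuming the installation path ends in -<hash> as in the output above) is:

module show hdf5/1.10.7-gcc8-impi | grep BASE | awk -F- '{print $NF}'   # prints 2iitq6x here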

The installation (which you only need to do once, provided it succeeds) is then done via the following steps:

Installation

Prerequisites

First, we need to create our own Spack repository in our home directory under $HOME/spack/repos and copy the modified package.py there.

module unload intel-mkl intel-mpi intel gcc # unload all compiler modules as they are not needed at this point
module switch spack/21.1.1 # unnecessary as soon as the module user_spack/release/22.2.1 is available
module load user_spack

Installation

Now we need the compiler used to build the hdf5 module and the hash of the installation (see above). The general installation command looks like this:

spack install py-h5py%COMPILER ^hdf5/HASH_OF_INSTALLATION

where COMPILER stands for the compiler of the hdf5 module, which can be gcc or intel (note: no version numbers are needed here), and HASH_OF_INSTALLATION is the installation hash (see above). For our example this would be

spack install py-h5py%gcc ^hdf5/2iitq6x

Module Creation

If the steps above were successful, we need to create the module for the h5py installation:

spack module tcl refresh -y

The module is then generated in the directory $HOME/spack/modules/x86_avx2/linux-sles15-haswell/ .

Note: The subfolder x86_avx2 in the path $HOME/spack/modules/x86_avx2/linux-sles15-haswell/ to the module differs on other systems. On e.g. SuperMUC-NG the path would be $HOME/spack/modules/x86_avx512/linux-sles15-skylake_avx512/ .

Using the Module

To use the h5py module, you need to make the module available to the module system and also load the corresponding hdf5 module. The following four lines are the ones you need to put in your SLURM script (a complete job script sketch follows below):

module use -p $HOME/spack/modules/x86_avx2/linux-sles15-haswell/
module load python/3.8.8-extended
module load hdf5/1.10.7-gcc8-impi
module load py-h5py
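A minimal SLURM job script sketch (job name, partition, and resource requests are placeholders and must be adapted to your cluster) could look like this:

#!/bin/bash
#SBATCH -J h5py_test          # placeholder job name
#SBATCH --partition=cm2_tiny  # placeholder partition, adapt to your system
#SBATCH --nodes=1
#SBATCH --time=00:10:00

module use -p $HOME/spack/modules/x86_avx2/linux-sles15-haswell/
module load python/3.8.8-extended
module load hdf5/1.10.7-gcc8-impi
module load py-h5py

# quick sanity check: import h5py and print its build configuration
python -c "import h5py; print(h5py.version.info)"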

If you encounter any problems, please contact our Servicedesk.