Darshan is a scalable HPC I/O characterization tool designed to capture an accurate picture of application I/O behavior with minimal overhead. This includes properties such as patterns of access within files, the number of I/O operations, the size of operations, etc.
Darshan on LRZ platforms
Darshan can trace dynamically linked MPI executables on SuperMUC Phase 1 and 2 and on the Linux Clusters. It supports applications that use IBM MPI or Intel MPI.
Enabling Darshan to trace MPI applications
To make use of Darshan, please load the appropriate module. Currently, the default version is 3.1.4, but the latest version, 3.1.6, can be used as a test module.
Set the variable FORTRAN_PROG to "true" if your program is a Fortran program, and to "false" otherwise.
Preload the appropriate Darshan library (via LD_PRELOAD).
On SuperMUC Phase 1 and 2, connect the Darshan job identifier to the LoadLeveler job identifier: set the environment variable DARSHAN_JOBID to the name of the environment variable that contains the LoadLeveler job ID.
The last two steps are not needed on the Linux Clusters, because Darshan recognizes the SLURM job identifier automatically. A combined sketch of these setup steps is shown below.
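In a job script, the steps above might look as follows; the module name, the library path, and the LoadLeveler variable LOADL_STEP_ID are assumptions that may differ on your system:

    module load darshan                                 # module name may differ
    export FORTRAN_PROG=true                            # "true" for Fortran programs, "false" otherwise
    export LD_PRELOAD=$DARSHAN_BASE/lib/libdarshan.so   # preload the Darshan library; path is an assumption
    export DARSHAN_JOBID=LOADL_STEP_ID                  # name of the variable holding the LoadLeveler job ID (SuperMUC only)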
Darshan is configured so that the log path can be selected via the environment variable LOGPATH_DARSHAN_LRZ. We recommend using $SCRATCH for the Darshan logs. When the script "darshan-logpath.sh" is used, logs are written to the $SCRATCH/.darshan folder.
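A minimal sketch of both options; it assumes that "darshan-logpath.sh" is sourced so that it can export the variable into the job environment:

    # Option 1: use the LRZ helper script (logs go to $SCRATCH/.darshan)
    . darshan-logpath.sh
    # Option 2: select the log directory manually
    mkdir -p $SCRATCH/.darshan
    export LOGPATH_DARSHAN_LRZ=$SCRATCH/.darshan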
Examples of job command files for SuperMUC
The following example is for a Fortran program compiled with IBM MPI.
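A minimal sketch of such a job command file; the class name, resource limits, library path, and program name are placeholders, not LRZ-confirmed values:

    #!/bin/bash
    #@ job_type = parallel
    #@ class = test
    #@ node = 1
    #@ total_tasks = 16
    #@ wall_clock_limit = 00:30:00
    #@ output = darshan_$(jobid).out
    #@ error = darshan_$(jobid).err
    #@ queue
    . /etc/profile
    module load darshan                                 # module name may differ
    export FORTRAN_PROG=true                            # Fortran program
    export LD_PRELOAD=$DARSHAN_BASE/lib/libdarshan.so   # preload the Darshan library; path is an assumption
    export DARSHAN_JOBID=LOADL_STEP_ID                  # LoadLeveler job-id variable
    export LOGPATH_DARSHAN_LRZ=$SCRATCH/.darshan        # Darshan log directory (see above)
    poe ./myprog                                        # IBM MPI launcher; program name is a placeholder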
Extracting the I/O characterization
If the program finishes correctly, a log file is written to the selected log directory, for example:
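Assuming the log path was set to $SCRATCH/.darshan as above, the log can be located with:

    find $SCRATCH/.darshan -name "*.darshan"

The file name follows Darshan's usual <username>_<program>_id<jobid>_... convention.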
If you need the details of the I/O counters and the I/O performance in a text file, you can use the following command.
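For example, to write all counters to a text file (the log file name is a placeholder):

    darshan-parser --all mylog.darshan > darshan_report.txt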
The darshan-parser utility provides the full information on I/O performance and I/O operations.
The option --perf provides output related to performance metrics: I/O timing and aggregate bandwidth. Shared-file metrics are reported when all processes of the parallel application perform I/O on the same file. Unique-file metrics are reported when each MPI process accesses its own file, or when only a subset of the MPI processes accesses a file.
The aggregate bandwidth agg_perf_by_slowest is the most accurate for both shared and unique files. For the I/O timing, the most accurate values are generally slowest_rank_io_time for unique files and time_by_slowest for shared files.
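These values can be picked out of the --perf output, for example (the log file name is a placeholder):

    darshan-parser --perf mylog.darshan | grep -E "agg_perf_by_slowest|slowest_rank_io_time|time_by_slowest"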
Furthermore, the timing of metadata, read, and write operations can be obtained through the counters POSIX_F_META_TIME, POSIX_F_READ_TIME, and POSIX_F_WRITE_TIME at the POSIX level, and through the counters MPIIO_F_META_TIME, MPIIO_F_READ_TIME, and MPIIO_F_WRITE_TIME at the MPI-IO level.
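A sketch of how to extract these timing counters from the parser output (the log file name is a placeholder; counter names as in Darshan 3.x):

    darshan-parser mylog.darshan | grep -E "POSIX_F_(META|READ|WRITE)_TIME|MPIIO_F_(META|READ|WRITE)_TIME"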
Please refer to the Darshan web site for more information about the meaning of the I/O counters, other Darshan utilities, and static tracing.