VTune Playbook with templates for Intel tools usage
====================================================

The Playbook contains command lines starting with $
Please change $PRG, $ARG into the path, name and parameters of your program!

Version 19.4.2026
Please send feedback/questions to heinrich.bockhorst@intel.com

0. Environment
--------------
Load the environment by using the module:
LRZ:
$ module switch stack stack/24.6.0; module load intel-toolkit/2025.3.0

$ source /opt/intel/oneapi/setvars.sh

Check for important executables:
$ which vtune

Check the version (recent: 2025.7):
$ vtune -version

1. Compile for VTune
========================
No extra compilation is necessary, but the "-g" flag helps with displaying
function names and source code.

2. Open GUI
============
$ vtune-gui &

All configurations and collections can be done with the GUI. This might be
more convenient. On clusters it is sometimes necessary to use the command
line interface. When starting with the GUI, the command line can be
generated there.

3. Command Line Interface - check the MPI instructions in 3.b if necessary
=========================
A project can be defined and run from the GUI. See the workflow in the PDF
presentation. For more complex codes on clusters the command line interface
can be used.

$ vtune -help
shows basic help menu with hints for more detailed information

$ vtune -help collect
shows analysis types and short description

$ vtune -help collect hotspots
details about hotspots knobs

3.a Hotspots
===============
This is probably the best collection for testing!
$ vtune -c hotspots -r HOT -- $PRG $ARG

  -c   : analysis type
  -r   : result directory
  $PRG : your program
  $ARG : program parameters

Using HW drivers:
$ vtune -c hotspots -knob sampling-mode=hw -r HOTHW -- $PRG $ARG

3.b HPC-Performance with OpenMP
-------------------------------
$ vtune -c hpc-performance -r HPC -- $PRG $ARG

Further knobs (options):
  -knob sampling-interval=0.1         (higher sampling frequency, larger output)
  -knob enable-stack-collection=true  (collects stack information, good for unknown programs)
  -knob collect-affinity=true         reports affinity settings on command line:
                                      vtune -report affinity -r

Usage with Intel MPI Programs
================================
$ export I_MPI_GTOOL="vtune -c hpc-performance -r HPC:0"
for analysis on rank #0. Run the MPI program as usual:
$ mpirun -n

Analysis on all ranks with :all
I_MPI_GTOOL can be used for all types of analysis and also for Advisor!

3.c Memory Access
=================
$ vtune -c memory-access -r MA -- $PRG $ARG

More detailed information on allocation etc. of arrays:
$ vtune --collect memory-access -knob analyze-mem-objects=true -r ME -- $PRG $ARG

3.d GPU Analysis -- only Intel GPUs
===================================
OpenMP offloading + DPC++ analysis

Offload analysis (times host API):
$ vtune -collect gpu-offload -r GPU_OFF -- $PRG $ARG

GPU hotspots for details on kernels and timeline:
$ vtune -collect gpu-hotspots -r GPU_HOT -- $PRG $ARG

Detailed instrumentation (high overhead):
$ vtune -collect gpu-hotspots -knob characterization-mode=instruction-count -r GPU_HOT_INST -- $PRG $ARG

Source line with basic block timing (medium overhead):
$ vtune -collect gpu-hotspots -knob profiling-mode=source-analysis -r GPU_HOT_SRC -- $PRG $ARG

Source line with memory instruction timing:
$ vtune -collect gpu-hotspots -knob profiling-mode=source-analysis -knob source-analysis=mem-latency -r GPU_HOT_MEM -- $PRG $ARG

4. View results
===============
Open the GUI
$ vtune-gui &
and navigate to the results directory, or do the analysis on the command line:
$ vtune -report summary -r
The argument of -r is the result directory generated by the collection,
e.g. by hpc-performance.

For more options see
$ vtune -help report

See the VTune User Guide:
https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top.html
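
The collect and report steps can be combined in a small wrapper script. The
following is only a sketch, not part of VTune: the function names run and
profile and the DRY_RUN switch are made up for this example, and ./a.out is
an assumed default for $PRG. With DRY_RUN=1 (the default here) the script
only prints the command lines, so they can be checked before anything is
collected.

```shell
#!/bin/sh
# Sketch: chain a VTune collection and its summary report.
# PRG, ARG and DRY_RUN are placeholders of this example, not VTune settings.

PRG="${PRG:-./a.out}"      # your program (assumed default)
ARG="${ARG:-}"             # program parameters
DRY_RUN="${DRY_RUN:-1}"    # 1: only print the commands, 0: execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# profile <analysis type> <result directory>
profile() {
    analysis="$1"
    result="$2"
    # $ARG is intentionally unquoted so that it splits into separate parameters
    run vtune -c "$analysis" -r "$result" -- "$PRG" $ARG
    run vtune -report summary -r "$result"
}

profile hotspots HOT
```

With the defaults the script prints
  vtune -c hotspots -r HOT -- ./a.out
  vtune -report summary -r HOT
Set DRY_RUN=0 to actually run the collection. Other analysis types from
section 3 can be passed the same way, e.g. "profile hpc-performance HPC".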