Labs for VTune and Advisor
--------------------------

1. Run your code with APS and check the MPI usage.

   - What is the percentage of time spent in MPI?
   - Which MPI function takes the most time? Is it a collective?
   - What is the MPI imbalance shown?
   - Does your application also use OpenMP (OMP)? If so, what is the OMP imbalance?
   - How much serial code is shown?
   - Do you know where OMP bottlenecks are expected? If this applies, use VTune later for details.

2. Run the program with VTune. For an MPI program, use VTune on a single rank, e.g. rank 0.

   - Run the Hotspots analysis; if possible, use the HW drivers (hardware event-based sampling).
   - Check the GUI for the "traditional" profiling views.
   - Check the Flame Graph output.

3. Run the HPC Performance Characterization analysis. Start with a single rank only.

   - Does your application use OMP? If so, check the OMP details.

4. Generate an Advisor roofline chart.

   - Do you see a loop that is a good candidate for optimization?
   - Did you already notice this loop in the VTune experiments?
   - Check whether the roofline analysis in the Advisor GUI shows more information than roofline.html.

Alternative
-----------

- Explore some Gromacs APS output generated on a GNR processor with 86 cores (86c).
- Play with the Poisson miniapp: generate APS, VTune, and roofline results.
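
Lab steps 1 to 4 can be sketched as command lines. The script below is a minimal dry-run sketch, not a reference recipe: `APP`, `NP`, and every result directory name (`aps_lab`, `vtune_hs`, `vtune_hpc`, `advi_lab`) are hypothetical placeholders, and it assumes at least two MPI ranks and a launcher that accepts the colon-separated MPMD syntax. The flags are written as recalled from the Intel oneAPI tools and should be checked against `aps --help`, `vtune -help collect`, and `advisor --help collect` on your installation.

```shell
#!/bin/sh
# Dry-run sketch of lab steps 1-4.
# ASSUMPTIONS: APP, NP and all result directory names are hypothetical;
# NP >= 2; mpirun accepts MPMD syntax ("cmd1 : cmd2").

APP=${APP:-./my_app}   # hypothetical application binary
NP=${NP:-4}            # hypothetical MPI rank count

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

lab_steps() {
    # Step 1: APS on all ranks, then generate the summary report.
    run mpirun -n "$NP" aps --result-dir=aps_lab "$APP"
    run aps --report=aps_lab

    # Step 2: Hotspots with hardware sampling on rank 0 only;
    # the remaining ranks run unprofiled.
    run mpirun -n 1 vtune -collect hotspots -knob sampling-mode=hw \
        -result-dir vtune_hs -- "$APP" : -n "$((NP - 1))" "$APP"

    # Step 3: HPC Performance Characterization, again on rank 0 only.
    run mpirun -n 1 vtune -collect hpc-performance \
        -result-dir vtune_hpc -- "$APP" : -n "$((NP - 1))" "$APP"

    # Step 4: Advisor roofline, shown for a non-MPI run; an MPI program
    # would need an mpirun wrapper as in step 2.
    run advisor --collect=roofline --project-dir=advi_lab -- "$APP"
    run advisor --report=roofline --project-dir=advi_lab \
        --report-output=roofline.html
}

lab_steps
```

Run as is, the script only prints the six commands so they can be reviewed first; setting `DRY_RUN=0` executes them. Wrapping VTune around the first MPMD segment is one way to profile rank 0 alone; with Intel MPI, the `-gtool` option is an alternative.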