Tuning and Optimization for HPC
Compiler
Node Level Optimization
- See: Intel Software Documentation Library
- Quick-Reference Guide to Optimization with Intel® Compilers
- Tutorial: Using Auto Vectorization
- A Guide to Vectorization with Intel® C++ Compilers
- Guide to Auto-Vectorization
- Requirements for Vectorizable Loops
- Tutorial: Finding Hotspots - Fortran Sample Application, Linux*
- Get Started with Intel® VTune™ Amplifier
- Developing Multithreaded Applications: A Platform Consistent Approach
- Measuring and Understanding Memory Bandwidth
MPI
OpenMP
Improving OpenMP Scaling
- OpenMP home page: the central source of information about OpenMP
- LRZ/RRZE Courses: OpenMP_2day_course.pdf
Tools
- Intel Performance Tools
- Information on Hardware and Topology
- Timing and Profiling
- Hardware Performance Counters
- MPI, OpenMP, Parallelization, Vectorization, SIMD Analysis
- Memory Leaks