Programming Models - Examples
Introduction
This page illustrates some of the programming models available on Intel PVC GPUs. All examples were created and tested with the Intel HPC Toolkit 2025.2.0 (intel-toolkit/2025.2.0 from stack/24.5.0).
The currently recommended way to set up the environment is therefore:
> module sw stack/24.5.0
> module load intel-toolkit/2025.2.0
The examples are small numerical or computational physics applications and are most likely not implemented very efficiently. Their only goal is to illustrate the technical approach to GPU, multi-GPU, and multi-node multi-GPU programming and deployment.
OpenMP only - single-node multi-GPU
MPI Offload + (OpenMP/SYCL/DPC++/Kokkos) - multi-node multi-GPU
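As a first orientation, the single-node multi-GPU idea can be sketched with plain OpenMP offload. This is a minimal sketch under stated assumptions, not one of the examples from the linked pages. The function names `visible_devices`, `device_for_local_rank`, and `axpy_multi_device` are hypothetical. The modulo mapping of node-local MPI ranks to devices is a common convention assumed here, not something this page prescribes. The code falls back to the host if no device (or no OpenMP support) is available.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Number of offload devices visible to OpenMP. Returns at least 1 so that
// the code also runs on the host when no GPU (or no OpenMP) is available.
int visible_devices() {
    int ndev = 0;
#ifdef _OPENMP
    ndev = omp_get_num_devices();
#endif
    return std::max(ndev, 1);
}

// Hypothetical helper for the MPI + offload case (assumed convention):
// each node-local rank picks one device in round-robin fashion.
int device_for_local_rank(int local_rank, int ndev) {
    return local_rank % ndev;
}

// y = a*x + y, with one contiguous chunk of the vectors per device.
void axpy_multi_device(double a, const std::vector<double>& x,
                       std::vector<double>& y) {
    assert(x.size() == y.size());
    const int ndev = visible_devices();
    const long n = static_cast<long>(x.size());
    const long chunk = (n + ndev - 1) / ndev;  // ceiling division
    const double* xp = x.data();
    double* yp = y.data();

    for (int d = 0; d < ndev; ++d) {
        const long lo = std::min(n, d * chunk);
        const long len = std::min(n, lo + chunk) - lo;
        if (len <= 0) continue;
        // nowait makes each target region a deferred task, so the
        // devices can work on their chunks concurrently.
        #pragma omp target teams distribute parallel for device(d) nowait \
            map(to: xp[lo:len]) map(tofrom: yp[lo:len])
        for (long i = lo; i < lo + len; ++i) {
            yp[i] = a * xp[i] + yp[i];
        }
    }
    // Wait for all deferred target tasks before the host reads y.
    #pragma omp taskwait
}
```

Calling `axpy_multi_device(3.0, x, y)` with `x` filled with 1.0 and `y` filled with 2.0 leaves every element of `y` at 5.0, regardless of how many devices are found. With the Intel compiler, OpenMP offload is typically enabled with `icpx -fiopenmp -fopenmp-targets=spir64`; the exact flags used for the examples on the linked pages may differ.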
Learning Material
OpenMP
- Timothy G. Mattson, Beverly A. Sanders, Berna Massingill, "Patterns for Parallel Programming", 2004
- Ruud van der Pas, Eric Stotzer, Christian Terboven, "Using OpenMP - The Next Step: Affinity, Accelerators, Tasking, and SIMD", 2017
- Timothy G. Mattson, Yun (Helen) He, Alice E. Koniges, "The OpenMP Common Core: Making OpenMP Simple Again", 2019
- Tom Deakin, Timothy G. Mattson, "Programming Your GPU with OpenMP: Performance Portability for GPUs", 2023
SYCL/DPC++
- Data Parallel C++ - Programming Accelerated Systems Using C++ and SYCL (book; open access)
Kokkos
- Kokkos Lecture Series (videos, slides)