Fundamentals of Accelerated Computing with CUDA C/C++

Description

The CUDA computing platform enables the acceleration of CPU-only applications to run on the world’s fastest massively parallel GPUs. In this course, you will gain hands-on experience with C/C++ application acceleration by:

  • Accelerating CPU-only applications to run their latent parallelism on GPUs

  • Utilising essential CUDA memory management techniques to optimise accelerated applications

  • Exposing accelerated application potential for concurrency and exploiting it with CUDA streams

  • Leveraging command line and visual profiling to guide and check your work

Upon completion, you’ll be able to accelerate and optimise existing C/C++ CPU-only applications using the most essential CUDA tools and techniques. You’ll understand an iterative style of CUDA development that will allow you to ship accelerated applications fast.
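
As a rough illustration of what the first two points in the list above can look like in practice, the sketch below moves a simple CPU loop (y = a*x + y) into a CUDA kernel and uses Unified Memory so that the same pointers are valid on the CPU and the GPU. It is not taken from the course material. The names saxpy and saxpy_max_error, the array size, and the launch configuration are assumptions chosen purely for illustration.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Grid-stride loop: each thread starts at its global index and advances by
// the total number of threads in the grid, so any n is covered.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int index  = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;
    for (int i = index; i < n; i += stride)
        y[i] = a * x[i] + y[i];
}

// Hypothetical helper: runs saxpy on n elements and returns the largest
// absolute deviation from the expected value, or -1.0f if a CUDA call fails.
float saxpy_max_error(int n)
{
    float *x = nullptr;
    float *y = nullptr;

    // Unified Memory: accessible from both host and device code.
    if (cudaMallocManaged(&x, n * sizeof(float)) != cudaSuccess)
        return -1.0f;
    if (cudaMallocManaged(&y, n * sizeof(float)) != cudaSuccess) {
        cudaFree(x);
        return -1.0f;
    }

    // Initialise on the CPU.
    for (int i = 0; i < n; ++i) {
        x[i] = 1.0f;
        y[i] = 2.0f;
    }

    // Launch configuration is an arbitrary choice for this sketch.
    int threadsPerBlock = 256;
    int numberOfBlocks  = 32;
    saxpy<<<numberOfBlocks, threadsPerBlock>>>(n, 3.0f, x, y);

    // Kernel launches are asynchronous: check for launch errors, then wait
    // for the GPU to finish before reading the results on the CPU.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();

    float maxError = -1.0f;
    if (err == cudaSuccess) {
        maxError = 0.0f;
        for (int i = 0; i < n; ++i)
            maxError = fmaxf(maxError, fabsf(y[i] - 5.0f)); // 3*1 + 2 = 5
    } else {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    }

    cudaFree(x);
    cudaFree(y);
    return maxError;
}

int main()
{
    float maxError = saxpy_max_error(1 << 20);
    printf("max error: %f\n", maxError);
    return maxError == 0.0f ? 0 : 1;
}
```

Assuming the file is saved as saxpy.cu (a hypothetical name) and a CUDA toolkit is installed, it can be compiled with nvcc -o saxpy saxpy.cu and run on a machine with an NVIDIA GPU.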

The course is co-organised by LRZ, NHR@FAU and the NVIDIA Deep Learning Institute (DLI). All instructors are NVIDIA-certified University Ambassadors.

NVIDIA Deep Learning Institute

The NVIDIA Deep Learning Institute delivers hands-on training for developers, data scientists, and engineers. The program is designed to help you get started with training, optimizing, and deploying neural networks to solve real-world problems across diverse industries such as self-driving cars, healthcare, online services, and robotics.

Training Setup

To get started, follow these steps:

  1. Create an NVIDIA Developer account at http://courses.nvidia.com/join. Select "Log in with my NVIDIA Account" and then "Create Account".
  2. Make sure that WebSockets work for you:
    • Test your laptop at http://websocketstest.com
    • Under ENVIRONMENT, confirm that "WebSockets" is checked yes.
    • Under WEBSOCKETS (PORT 80), confirm that "Data Receive", "Send", and "Echo Test" are checked yes.
  3. If there are issues with WebSockets, try updating your browser.
    We recommend Chrome, Firefox, or Safari for optimal performance.
  4. Visit http://courses.nvidia.com/dli-event and enter the event code provided by the instructor.
  5. You're ready to get started. Please complete the survey at the end of the course to share your feedback.

To be able to visualise Nsight Systems profiler output during the course, please install the latest version of Nsight Systems on your local system beforehand. The software can be downloaded from https://developer.nvidia.com/nsight-systems.

Agenda

All times are in CET.

09:00-09:20    Intro

09:20-09:45    Intro CUDA C/C++

09:45-10:45    Accelerating Applications with CUDA C/C++

10:45-11:00    Coffee Break

11:00-12:00    Accelerating Applications with CUDA C/C++

12:00-13:00    Lunch Break

13:00-14:15    Managing Accelerated Application Memory with CUDA Unified Memory and nsys

14:15-14:30    Coffee Break

14:30-16:15    Asynchronous Streaming and Visual Profiling for Accelerated Applications with CUDA C/C++

16:15-16:30    Q&A, Final Remarks

Lecturers

Dr. Momme Allalen (LRZ and NVIDIA University Ambassador), Dr. Sebastian Kuckuk (NHR@FAU and NVIDIA University Ambassador)

Slides

intro-2022-v1.pdf

Intro-CUDA

AC_CUDA_C-whole.pdf

NVPROF_UM_all.pdf

NVVP-Streams-UM.pdf

Documentation

Survey

  • Please fill out the online survey at https://survey.lrz.de/index.php/822277?lang=en
  • This helps us to
    • increase the quality of the courses,
    • design the future training programme at LRZ, NHR@FAU and in Europe according to your needs and wishes,
    • get future funding for training events.

Next Steps

Visit the NVIDIA Deep Learning Institute's website at https://www.nvidia.com/en-us/training/ to access more training and resources.

  • Start online, self-paced training in deep learning and accelerated computing (using the account you created today).
  • View upcoming workshops around the world and request an onsite workshop at your company or organization.
  • Learn about the University Ambassador Program.

