Online Course: Accelerated Computing with OpenACC and Deep Learning (register via VSC)
Places available: 65
Date: 27.04.2021 – 29.04.2021
Price: € 0.00
Registration deadline: 20.04.2021 23:30


Please register via the VSC page

The registration deadline with priority rules is Wednesday, March 24, 2021. Acceptances will be confirmed on March 25, 2021. As long as seats are available, there will be an extended registration period without priority rules.


Learn how to accelerate your applications with OpenACC, how to train and deploy a neural network to solve real-world problems, and how to effectively parallelize the training of deep neural networks across multiple GPUs.

The workshop combines a lecture on Accelerated Computing with OpenACC with lectures on the Fundamentals of Deep Learning for a single GPU and for multiple GPUs.

You can attend the whole 3-day workshop, but registration for selected days is also possible.

This workshop is organized in cooperation with LRZ (Germany), IT4Innovations (Czechia), and NVIDIA.

NVIDIA Deep Learning Institute (DLI) offers hands-on training for developers, data scientists, and researchers looking to solve challenging problems with deep learning.

All instructors are NVIDIA-certified University Ambassadors.


1st day, April 27, 2021: Fundamentals of Accelerated Computing with OpenACC

On the 1st day you will learn the basics of OpenACC, a high-level, directive-based programming model for GPUs. Discover how to accelerate the performance of your applications beyond the limits of CPU-only programming with simple pragmas. You’ll learn:

  • How to profile and optimize your CPU-only applications to identify hot spots for acceleration
  • How to use OpenACC directives to GPU-accelerate your codebase
  • How to optimize data movement between the CPU and GPU accelerator

Upon completion, you'll be ready to use OpenACC to GPU-accelerate CPU-only applications.
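As a rough illustration of the directive-based approach described above, here is a minimal C sketch. It is not taken from the course material: the function name saxpy_then_sum and the choice of kernel are assumptions made for this example. It shows the two ideas named in the list: a loop marked for parallel execution with a pragma, and a data region that keeps the arrays on the accelerator across two loops so they are transferred only once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: computes y[i] = a * x[i] + y[i], then returns the
 * sum of the updated y. With an OpenACC compiler (for example nvc -acc or
 * gcc -fopenacc) the loops can be offloaded to a GPU; compilers without
 * OpenACC support ignore the pragmas and run the loops serially. */
float saxpy_then_sum(int n, float a, const float *x, float *y)
{
    float sum = 0.0f;

    assert(n >= 0 && x != NULL && y != NULL);

    /* Data region: x is copied to the device once, y is copied in and
     * copied back out at the end. Both loops reuse the device copies. */
    #pragma acc data copyin(x[0:n]) copy(y[0:n])
    {
        /* Iterations are independent, so the loop can run in parallel. */
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i) {
            y[i] = a * x[i] + y[i];
        }

        /* The reduction clause combines the per-thread partial sums. */
        #pragma acc parallel loop reduction(+:sum)
        for (int i = 0; i < n; ++i) {
            sum += y[i];
        }
    }

    return sum;
}
```

Without the enclosing data region, each parallel loop would typically move its arrays to and from the device on its own, which is the kind of redundant transfer the third bullet point refers to.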

2nd day, April 28, 2021: Fundamentals of Deep Learning

Explore the fundamentals of deep learning by training neural networks and using results to improve performance and capabilities.

During this day, you’ll learn the basics of deep learning by training and deploying neural networks. You’ll learn how to:

  • Implement common deep learning workflows, such as image classification and object detection
  • Experiment with data, training parameters, network structure, and other strategies to increase performance and capability
  • Deploy your neural networks to start solving real-world problems

Upon completion, you’ll be able to start solving problems on your own with deep learning.

3rd day, April 29, 2021: Fundamentals of Deep Learning for Multi-GPUs

The computational requirements of deep neural networks used to enable AI applications like self-driving cars are enormous. A single training cycle can take weeks on a single GPU, or even years for larger datasets like those used in self-driving car research. Using multiple GPUs for deep learning can significantly shorten the time required to train on large amounts of data, making it feasible to solve complex problems with deep learning.

On the 3rd day we will teach you how to use multiple GPUs to train neural networks. You'll learn:

  • Approaches to multi-GPU training
  • Algorithmic and engineering challenges of large-scale training
  • Key techniques used to overcome the challenges mentioned above

Upon completion, you'll be able to effectively parallelize training of deep neural networks using TensorFlow.


For the 1st day, basic C/C++ or Fortran programming skills are expected.

For the 2nd and 3rd days, a technical background and a basic understanding of machine learning concepts are expected.

For the 2nd day, basic knowledge of Python will be helpful. Since Python 2.7 is used, the following tutorial can be used to learn the syntax:

For the 3rd day, familiarity with TensorFlow (1.x) and Keras will be a plus, as both are used in the hands-on sessions. For those who have not used these before, you can find tutorials here: Even though the course still uses TensorFlow 1.x, the version is not critical for the content. We will mainly be using Horovod, which applies in the same way to TensorFlow 2.x.


The lectures are interleaved with many hands-on sessions using Jupyter Notebooks. The exercises will be done on a fully configured GPU-accelerated workstation in the cloud.




Juan Durillo Barrionuevo (LRZ), Volker Weinberg (LRZ), and Georg Zitzlsberger (IT4Innovations)


The workshop is free of charge for all academic participants.

Please note that the workshop is exclusively for verifiable students, staff, and researchers from any academic institution (industrial participants, please contact NVIDIA for industry-specific training).
