2022-04-19 Data Analytics, Big Data & AI Training Week (hdta1s22)

Online Course: Data Analytics, Big Data & AI Training Week
Number: hdta1s22
Places available: 276
Date: 19.04.2022 – 22.04.2022
Price: € 0.00
Place: ONLINE


Room
Registration deadline: 06.04.2022 23:55
E-mail: education@lrz.de

Contents

This course series on Data Analytics, Big Data & AI offers the course modules listed below. Some of them build on each other (see the prerequisites given for each module), and modules can be selected individually during registration, depending on the participants' previous knowledge and experience.


19.04.2022
  • 09:00-12:30 CEST: Module: Introduction to GNU/Linux and SSH
  • 13:30-17:00 CEST: Module: Introduction to Multiuser Cluster Systems at LRZ

20.04.2022
  • 09:00-12:30 CEST: Module: Introduction to Container Technology & Application to AI at LRZ
  • 13:30-17:00 CEST: Module: Introduction to the LRZ AI Systems

21.04.2022
  • 09:00-12:30 CEST: Intel® AI Workshop Module #1: Accelerated Machine Learning with Intel®
  • 13:30-17:00 CEST: Intel® AI Workshop Module #2: Accelerated Deep Learning with Intel®

22.04.2022
  • 09:00-12:30 CEST: Module: A Closer Look at the LRZ Linux Cluster and Compute Cloud
  • 13:30-17:00 CEST: Module: High Performance Data Analytics Using R at LRZ



Module: Introduction to GNU/Linux and SSH - Q&A

Date: 19.04.2022, 09:00-12:30 CEST
Lecturers: Dr. Johannes Albert-von der Gönna (LRZ), Dr. Martin Ohlerich (LRZ)

This course module provides an introduction to GNU/Linux, the Unix shell and how to work on remote systems using Secure Shell (SSH). GNU/Linux is a family of open source operating systems, powering all different kinds of hardware: wearable and mobile devices, desktop and notebook computers, the majority of web servers and cloud instances as well as most high performance computing clusters and supercomputers. The typical command line interface is a Unix-like shell. It serves as an interactive command and scripting language environment, allowing users to control the system and to automate tasks of varying complexity. SSH is a cryptographic network protocol which is typically used to log in to and execute commands on remote GNU/Linux systems.

This will be an interactive Q&A session. Registered participants will receive preparatory tutorial material in advance of the training event. They are expected to work with these self-study materials prior to the actual session, which will then be used to highlight certain content, for individual hands-on setup and to answer any remaining questions.

The course material provides a short historical overview of GNU/Linux and explains some common concepts and terminology. The focus then shifts to working with the Unix shell on a remote system, starting with guiding participants through installing and configuring an SSH client on their local systems. Different applications for remote access and file transfer will be introduced. Shell commands will then be used to navigate the file system and directories of (remote) systems, and the mechanisms of file manipulation and ownership will be explored. This is followed by the presentation of additional useful commands and concepts, as well as a discussion of certain characteristics of the shell environment. A conceptual and practical introduction to SSH keys will also be given.

Participants will gain essential knowledge and skills necessary to successfully connect to remote systems using SSH and to interact with the command line interface of a GNU/Linux system, a basic requirement when using the LRZ supercomputing and cloud infrastructure for their own projects.
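
As a rough illustration of what connecting to a remote system with SSH involves, the following Python sketch assembles the argument list for an OpenSSH client call that runs a single command on a remote host. It is not taken from the course material: the user name, the host name login.example.org and the key path are hypothetical placeholders, and the function only builds the command without opening a connection.

```python
import shlex
from pathlib import Path


def build_ssh_command(user, host, remote_command, identity_file=None, port=22):
    """Build the argument list for running one command on a remote host.

    The flags used here (-p for the port, -i for the private key file)
    are standard OpenSSH client options.
    """
    args = ["ssh", "-p", str(port)]
    if identity_file is not None:
        # -i selects the private key used for public-key authentication.
        args += ["-i", str(identity_file)]
    # The destination is given as user@host.
    args.append(f"{user}@{host}")
    # Everything after the destination is executed by the remote shell.
    args.append(remote_command)
    return args


if __name__ == "__main__":
    # Hypothetical user, host and key location, for illustration only.
    key = Path.home() / ".ssh" / "id_ed25519"
    cmd = build_ssh_command("jdoe", "login.example.org", "ls -l", identity_file=key)
    # Print a shell-quoted version that could be pasted into a terminal.
    print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` to actually execute the command, assuming an SSH client is installed and the account exists on the remote system.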


Module: Introduction to Multiuser Cluster Systems at LRZ

Date: 19.04.2022, 13:30-17:00 CEST
Lecturers: Dr. Johannes Albert-von der Gönna (LRZ), Florent Dufour (LRZ)

About 350 years separate Leibniz's original mechanical calculator of 1673 from today's facilities of the Leibniz Supercomputing Centre (LRZ) on the Garching campus. And yet, the spirit has not changed. To quote the German mathematician:

"It is beneath the dignity of excellent men to waste their time in calculation when any peasant could do the work just as accurately with the aid of a machine."

This course module will allow participants to live up to their dignity by providing a comprehensive walkthrough and usage guide to the contemporary successors of these machines, some of which fill whole buildings.
In a general overview, historical and current developments and trends in scientific computing and cluster systems will be presented. This will address, among others, the following questions: How do modern cluster systems work and how are they architected? How did we arrive at High Performance Computing, High Performance Data Analytics and High Performance AI? What makes a system suitable for a specific workload? How are these systems operated and how are they made available to their users?
In addition, typical interaction methods and usage patterns will be covered, including various possibilities for setting up user environments (e.g. environment variables and modules, user space package managers, containers) as well as tools for resource allocation (i.e. Slurm Workload Manager) and efficient parallelization (MPI, OpenMP, ...). Finally, an overview of the different compute clusters operated by LRZ and their background storage systems will be provided. The requirements for acquiring access to these systems will be covered as well.
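
To give a feel for how environment modules and the Slurm Workload Manager fit together, here is a minimal Python sketch that renders a Slurm batch script as text. It is an illustration under stated assumptions, not LRZ's recommended job template: the module name example-mpi and the program ./my_program are hypothetical, and site-specific settings such as cluster, partition or account names are deliberately left out.

```python
def render_slurm_script(job_name, nodes, tasks_per_node, time_limit, modules, command):
    """Render a simple Slurm batch script as a string.

    Only widely used sbatch options are emitted; any site-specific
    directives would have to be added according to local documentation.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={tasks_per_node}",
        f"#SBATCH --time={time_limit}",
        "",
    ]
    # Set up the user environment through the environment module system.
    lines += [f"module load {name}" for name in modules]
    # srun launches the parallel tasks inside the allocation.
    lines.append(f"srun {command}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical module and program names, for illustration only.
    script = render_slurm_script(
        job_name="demo",
        nodes=2,
        tasks_per_node=4,
        time_limit="00:10:00",
        modules=["example-mpi"],
        command="./my_program",
    )
    print(script)
```

Such a script would typically be saved to a file and submitted with `sbatch`; the scheduler then decides when and where the requested resources become available.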

Participants will gain a good understanding of the characteristics of multiuser cluster systems in general and will practise basic methods of typical interaction. They will familiarize themselves with the landscape of cluster systems available at LRZ and this will allow them to choose the right system for their own compute projects.

Prerequisites:

  • Module: Introduction to GNU/Linux and SSH (or comparable previous knowledge)


Module: Introduction to Container Technology & Application to AI at LRZ

Date: 20.04.2022, 09:00-12:30 CEST
Lecturers: Florent Dufour (LRZ), Dr. Johannes Albert-von der Gönna (LRZ)

Since the introduction of Docker back in 2013, container technology has become the industry standard for software packaging, distribution, and deployment.

Creating a container consists of bundling an application, its runtime, dependencies, libraries, settings etc. in one single unit that can later run independently of the underlying infrastructure. Unlike virtual machines, containers are lightweight and yield higher performance while providing greater versatility and interoperability. As containers offer an easy, safe, reliable, and scalable way to run applications and pipelines, they are an attractive candidate for high performance computing and artificial intelligence workloads.

In this module, we will showcase the most enticing features and niceties offered by containers. Not only will we explore their history and implementations, but we will also dive into current and cutting-edge uses with a particular emphasis on artificial intelligence tasks, reproducible biomedical pipelines, and automated workflows.

Participants will roll up their sleeves and get their hands on virtual machines in the LRZ Compute Cloud to set containers in action. By the end of the course module, participants will be able to transfer their experience and knowledge to their specific use-cases and requirements.

Prerequisites:

  • Module: Introduction to GNU/Linux and SSH (or comparable previous knowledge)


Module: Introduction to the LRZ AI Systems

Date: 20.04.2022, 13:30-17:00 CEST
Lecturers: Maja Piskac (LRZ), Florent Dufour (LRZ)

The aim of this course module is to give an overview of the LRZ AI Systems and to provide participants with the knowledge and skills necessary to utilise them efficiently. The course module consists of mini lectures, demos and hands-on sessions (breaks included) covering the following topics:

  • Resources overview of the LRZ AI Systems infrastructure

  • Fundamentals of ML training

  • Distributed ML training

Prerequisites:

  • Module: Introduction to GNU/Linux and SSH (or comparable previous knowledge)
  • Module: Introduction to Multiuser Cluster Systems at LRZ (or comparable previous knowledge)
  • Module: Introduction to Container Technology & Application to AI at LRZ (or comparable previous knowledge)


Intel® AI Workshop Module #1: Accelerated Machine Learning with Intel®

Date: 21.04.2022, 09:00-12:30 CEST
Lecturers: Roy Allela (Intel), Tobias Andreasen (SigOpt/Intel), Dr. Séverine Habert (Intel)

This workshop session led by Intel® experts will cover the following topics and Intel® tools & technologies:

  • Hardware acceleration for AI and Intel® oneAPI AI Analytics Toolkit: Introduction to Intel® hardware AI features and the Intel® oneAPI AI Analytics Toolkit
  • How to accelerate Classical Machine Learning on Intel Architecture: Intel® Distribution for Python and its optimizations, including Modin, Intel® Extension for Scikit-learn and XGBoost.
  • Enhance your Experimentation with SigOpt: A platform that empowers AI modelers to design experiments by asking the right questions, explore experiments to understand their modeling problems, and optimize their experiments to get the best results.


Intel® AI Workshop Module #2: Accelerated Deep Learning with Intel®

Date: 21.04.2022, 13:30-17:00 CEST
Lecturers: Dr. Séverine Habert (Intel), Vladimir Kilyazov (Intel), John Palazza (Intel), Walter Riviera (Intel)

This workshop session led by Intel® experts will cover the following topics and Intel® tools & technologies:

  • Optimize Deep Learning on Intel – Same code just faster! Deep Learning with the highly-optimized Intel® oneDNN library in order to get the best-in-class performance on Intel hardware, including Intel-optimized TensorFlow, Intel-optimized PyTorch and the Intel® Extension for PyTorch (IPEX) as well as Deep Learning quantization using Intel® Neural Compressor.
  • Federated Learning with Intel®: Build a real federation that is able to leverage distributed data to train a shared model and to solve potential data access problems due to physical constraints (e.g. remote locations) or regulations in place (e.g. GDPR, HIPAA, POPIA).
  • Easily speed up Deep Learning inference – Write once deploy anywhere! Intel® Distribution of OpenVINO™ Toolkit allows you to optimize models trained with TensorFlow* or PyTorch* for high-performance inference.
  • Simplify your AI Journey with cnvrg.io: Unify and manage all data science in one place using Intel® optimized containers with cnvrg.io. Communicate and reproduce results with interactive workspaces, dashboards, experiment tracking, and more. Simplify building end-to-end, production-ready AI pipelines.
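
The idea behind the federated learning topic above, training a shared model without moving the data, can be sketched with the widely used federated averaging scheme: each site trains locally and only the model parameters are combined, weighted by the amount of local data. The following plain Python sketch is an illustration only and does not use or represent the Intel® tools presented in the workshop; the function name and the toy numbers are made up.

```python
def federated_average(client_weights, client_sizes):
    """Combine per-client parameter vectors into one shared vector.

    Each client's parameters are weighted by the number of local
    training samples, as in federated averaging. Raw data never
    leaves the clients; only parameters are exchanged.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != n_params:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(weights):
            # Contribution of this client, proportional to its data share.
            averaged[i] += value * size / total
    return averaged


if __name__ == "__main__":
    # Two hypothetical sites with 1 and 3 local samples respectively.
    shared = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
    print(shared)
```

In a real federation this aggregation step would be repeated over many rounds, with the shared parameters sent back to the sites for further local training.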


Module: A Closer Look at the LRZ Linux Cluster and Compute Cloud

Date: 22.04.2022, 09:00-12:30 CEST
Lecturers: Dr. Johannes Albert-von der Gönna (LRZ), Florent Dufour (LRZ)

In this course module an overview of two additional systems operated by the Leibniz Supercomputing Centre (LRZ) will be provided: the LRZ Linux Cluster as well as the LRZ Compute Cloud.

First, the focus will be on the different compute and storage components that constitute the LRZ Linux Cluster. These will be explored in a dedicated hands-on session which will cover the characteristics of the system, including details of the environment module system and the Slurm Workload Manager in operation. This will prepare participants to successfully run their own compute jobs on LRZ HPC systems.

Additionally, the LRZ Compute Cloud will be introduced as a general-purpose cloud infrastructure. A brief introduction to the fundamentals of cloud computing and Infrastructure as a Service (IaaS) clouds will be followed by an overview of the LRZ Compute Cloud hardware resources. Finally, use of the LRZ Compute Cloud via web interface and command line will be demonstrated and participants will have the opportunity for hands-on interaction with the system. This will provide participants with the knowledge and skills necessary to efficiently utilize the LRZ Compute Cloud infrastructure for their own projects.

Prerequisites:

  • Module: Introduction to GNU/Linux and SSH (or comparable previous knowledge)
  • Module: Introduction to Multiuser Cluster Systems at LRZ (or comparable previous knowledge)


Module: High Performance Data Analytics Using R at LRZ

Date: 22.04.2022, 13:30-17:00 CEST
Lecturers: Dr. Johannes Albert-von der Gönna (LRZ), Maja Piskac (LRZ)

R is a highly popular and powerful programming language for data analysis and graphics, used in many research domains. The Leibniz Supercomputing Centre (LRZ) is addressing the needs of R users by facilitating various ways of working with R on LRZ systems.

R can be employed on the majority of LRZ compute systems like the HPC systems Linux Cluster and SuperMUC-NG as well as on specialized and GPU-accelerated machine learning & AI systems. Additionally, the use of RStudio IDE environments is facilitated, which provide a powerful interactive data analytics platform familiar to many R users.

In this course module, the different possibilities of using R at LRZ for high performance data analytics and machine learning projects will be demonstrated and practised in hands-on sessions. Guidelines and best practice examples for running R applications on the various systems will be provided. Special attention will be paid to different ways of parallelizing R code in order to utilize LRZ's HPC & AI infrastructure.

Prerequisites:

  • Basic knowledge of R
  • Module: Introduction to GNU/Linux and SSH (or comparable previous knowledge)
  • Module: Introduction to Multiuser Cluster Systems at LRZ (or comparable previous knowledge)
  • Module: Introduction to Container Technology & Application to AI at LRZ (or comparable previous knowledge)
  • Module: Introduction to the LRZ AI Systems (or comparable previous knowledge)
  • Module: A Closer Look at the LRZ Linux Cluster and Compute Cloud (or comparable previous knowledge)

Hands-On

The hands-on sessions will utilize the LRZ AI Systems, Linux Cluster and Compute Cloud.

Language

English

Lecturers

Dr. Johannes Albert-von der Gönna (LRZ), Roy Allela (Intel), Tobias Andreasen (SigOpt/Intel), Florent Dufour (LRZ), Dr. Séverine Habert (Intel), Vladimir Kilyazov (Intel), Dr. Martin Ohlerich (LRZ), John Palazza (Intel), Maja Piskac (LRZ), Walter Riviera (Intel)

Prices and Eligibility

The course is open and free of charge for academic participants from Germany.

Registration

Please register with your official e-mail address to prove your affiliation. You can select the course modules you wish to attend during registration.

Withdrawal Policy

See Withdrawal

Legal Notices

For registration for LRZ courses and workshops we use the service edoobox from Etzensperger Informatik AG (www.edoobox.com). Etzensperger Informatik AG acts as processor and we have concluded a Data Processing Agreement with them.

See Legal Notices


No. | Date | Time | Leader | Location | Room | Description
1 | 19.04.2022 | 09:00 – 12:30 | Johannes Albert-von der Gönna, Martin Ohlerich | ONLINE | | Module: Introduction to GNU/Linux and SSH
2 | 19.04.2022 | 13:30 – 17:00 | Johannes Albert-von der Gönna, Florent Dufour | ONLINE | | Module: Introduction to Multiuser Cluster Systems at LRZ
3 | 20.04.2022 | 09:00 – 12:30 | Johannes Albert-von der Gönna, Florent Dufour | ONLINE | | Module: Introduction to Container Technology & Application to AI at LRZ
4 | 20.04.2022 | 13:30 – 17:00 | Florent Dufour, Maja Piskac | ONLINE | | Module: Introduction to the LRZ AI Systems
5 | 21.04.2022 | 09:00 – 12:30 | Johannes Albert-von der Gönna | ONLINE | | Intel® AI Workshop Module #1: Accelerated Machine Learning with Intel®
6 | 21.04.2022 | 13:30 – 17:00 | Johannes Albert-von der Gönna | ONLINE | | Intel® AI Workshop Module #2: Accelerated Deep Learning with Intel®
7 | 22.04.2022 | 09:00 – 12:30 | Johannes Albert-von der Gönna, Florent Dufour | ONLINE | | Module: A Closer Look at the LRZ Linux Cluster and Compute Cloud
8 | 22.04.2022 | 13:30 – 17:00 | Johannes Albert-von der Gönna, Maja Piskac | ONLINE | | Module: High Performance Data Analytics Using R at LRZ