LRZ AI Systems
MAINTENANCE ANNOUNCEMENT
The LRZ AI Systems (including the MCML system segment) will undergo a maintenance procedure between July 1st and 3rd, 2024. On these days, the system will not be available to users. Normal user operation is expected to resume during the course of July 3rd.
NOTICE
This system is currently in pilot operation.
Multiple system components were updated and various user-facing changes were introduced during the maintenance procedure on March 11th-14th, 2024. Take note of the following breaking change. For the full list of changes, see the Maintenance 2024-01 Changelog.
Breaking: enroot start currently cannot be used directly with a sqsh container image. Instead, it requires an existing container. The following commands show an example of how to create a container and use enroot start:
    enroot import <container-tag>                        # when importing from a registry; skip if local image file is available
    enroot create --name <container-name> <image-file>   # -n; this step may have been skipped previously
    enroot start <container-name>
Alternatively, use the Pyxis --container-image option when using srun or in the preamble of your batch script (for additional details see Removed section below).
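The create-then-start sequence described above can be wrapped in a small shell function. The following is a minimal sketch, not an LRZ-provided tool: the helper names run and start_container and the DRY_RUN switch are hypothetical, and only the enroot create and enroot start invocations are taken from the announcement. With DRY_RUN=1 the commands are printed instead of executed, so the sequence can be inspected on a machine without Enroot.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: create a named container from a local image file,
# then start it. Returns early if the create step fails.
start_container() {
  name="$1"
  image="$2"
  run enroot create --name "$name" "$image" || return 1
  run enroot start "$name"
}

# Example call (placeholders as in the announcement):
#   start_container <container-name> <image-file>
```

Keeping the create step explicit mirrors the breaking change: enroot start is only ever given a container name, never an image file.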
SYSTEM ACCESS
Access to this system is only granted to existing Linux Cluster accounts upon additional request (see 3. Access and Getting Started). If you have not requested access, you will not be able to use the system. Additionally, the LRZ AI Systems are currently only reachable from within the Munich Scientific Network ("Münchner Wissenschaftsnetz", MWN; including VPN).
JOB SUBMISSION: --gres=gpu:X required
You must always specify the --gres=gpu option when requesting a GPU allocation. For example, to use 2 GPUs on a system, add --gres=gpu:2 when allocating resources.
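One way to avoid forgetting the flag is to derive it from the GPU count in a small shell function. This is a sketch under assumptions: gpu_srun_cmd is a hypothetical helper name, it only prints an srun command line rather than submitting anything, and any further options your job needs (such as a partition or time limit) are left to the caller.

```shell
# Hypothetical helper: print an srun command line that always carries
# --gres=gpu:N. Nothing is submitted; the caller decides whether to run it.
gpu_srun_cmd() {
  gpus="${1:-}"
  if [ "$#" -gt 0 ]; then
    shift
  fi
  case "$gpus" in
    ''|*[!0-9]*|0)
      # Reject empty, non-numeric, and zero GPU counts.
      echo "GPU count must be a positive integer" >&2
      return 1
      ;;
  esac
  echo "srun --gres=gpu:$gpus $*"
}

# Example: build the command line for a 2-GPU allocation.
#   gpu_srun_cmd 2 <command>
```

Printing the command instead of executing it keeps the sketch safe to try outside the cluster; the printed line can be reviewed before it is run on a login node.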
Documentation
- 1. General Description and Resources
- 2. Storage on the LRZ AI Systems
- 3. Access and Getting Started
- 4. Introduction to Enroot: The Software Stack Provider for the LRZ AI Systems
- 5. Using NVIDIA NGC Containers on the LRZ AI Systems
- 6. Running Applications as Interactive Jobs on the LRZ AI Systems
- 7. Running Applications as Batch Jobs on the LRZ AI Systems
- 8. Multi-GPU Jobs on the LRZ AI Systems
- 9. Creating and Reusing a Custom Enroot Container Image
- 10. Interactive Web Servers on the LRZ AI Systems
- 11. Public Datasets and Containers on the LRZ AI Systems
- 99. AI Systems Announcements