LRZ AI Systems


MAINTENANCE ANNOUNCEMENT

The LRZ AI Systems (including the MCML system segment) will undergo a maintenance procedure between July 1st and 3rd, 2024. On these days, the system will not be available to users. Normal user operation is expected to resume during the course of July 3rd.

NOTICE

This system is currently in pilot operation.

Multiple system components were updated and various user-facing changes were introduced during the maintenance procedure on March 11th-14th, 2024. Take note of the following breaking change. For the full list of changes, see Maintenance 2024-01 Changelog.

Breaking:

  • enroot start currently cannot be used directly with a sqsh container image. Instead, it requires an existing container. The following commands show an example of how to create a container and use enroot start:

    enroot import <container-tag>  # only when importing from a registry; skip if a local image file is available
    enroot create --name <container-name> <image-file>  # --name can be shortened to -n; this step could previously be skipped
    enroot start <container-name>

    Alternatively, use the Pyxis --container-image option with srun or in the preamble of your batch script (for additional details, see the Removed section below).
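
The Pyxis alternative can be sketched as follows. This is a minimal example under stated assumptions, not an LRZ-provided template: the image reference ubuntu:22.04, the file name pyxis_job.sbatch, and the grep payload are illustrative placeholders, and --gres=gpu:1 is included only because GPU allocations on this system require the --gres option. Adjust all of them to your own container and workload.

```shell
# Illustrative image reference (assumption); replace with your own <container-tag>.
IMAGE="ubuntu:22.04"

# Write a minimal batch script whose preamble selects the container image
# through the Pyxis --container-image option.
cat > pyxis_job.sbatch <<EOF
#!/bin/bash
#SBATCH --gres=gpu:1
#SBATCH --container-image=${IMAGE}
grep PRETTY_NAME /etc/os-release
EOF

# On the cluster, submit the script with:
#   sbatch pyxis_job.sbatch
# or pass the same option directly to srun:
#   srun --gres=gpu:1 --container-image="$IMAGE" grep PRETTY_NAME /etc/os-release
```

With this approach no separate enroot create step is needed, because Pyxis sets up the container for the job from the given image.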


SYSTEM ACCESS

Access to this system is only granted to existing Linux Cluster accounts upon additional request (see 3. Access and Getting Started). If you have not requested access, you will not be able to use the system. Additionally, the LRZ AI Systems are currently only reachable from within the Munich Scientific Network ("Münchner Wissenschaftsnetz", MWN; including VPN).

JOB SUBMISSION: --gres=gpu:X required

You must always specify the --gres=gpu:X option when requesting a GPU resource allocation, where X is the number of GPUs.

For example, to use 2 GPUs on a system, add --gres=gpu:2 when allocating resources.
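
As a rough sketch of a 2-GPU request, the snippet below writes a small batch script with the option in its preamble. The file name gpu_job.sbatch, the time limit, and the nvidia-smi payload are illustrative assumptions. A partition option is usually needed as well, but it is site-specific and is therefore omitted here.

```shell
# Write a minimal batch script that requests 2 GPUs.
# File name, time limit, and payload are illustrative assumptions.
cat > gpu_job.sbatch <<'EOF'
#!/bin/bash
#SBATCH --gres=gpu:2
#SBATCH --time=00:10:00
nvidia-smi
EOF

# On the cluster, the same option applies to every allocation command:
#   sbatch gpu_job.sbatch
#   salloc --gres=gpu:2
#   srun --gres=gpu:2 nvidia-smi
```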

Documentation