System Status (see also: Access and Overview of HPC Systems)
GREEN = fully operational
YELLOW = operational with restrictions (see messages below)
RED = not available
login nodes: skx.supermuc.lrz.de
archive nodes: skx-arch.supermuc.lrz.de
Globus Online File Transfer:
Detailed node status
serial partitions: serial
parallel partitions: cm2_(std,large)
interactive partition: cm2_inter
parallel partition: mpp3_batch
interactive partition: mpp3_inter
Compute Cloud and
Compute Cloud: (https://cc.lrz.de)
detailed status and free slots: https://cc.lrz.de/lrz
LRZ AI Systems
END OF LIFE
Messages for SuperMUC-NG
The problem with the software stack has been resolved.
As part of the recent maintenance, we moved the software stack to a new filesystem. Yesterday in the late afternoon, a degradation occurred that caused applications to segfault. The root cause is still under investigation.
As a temporary remedy, we switched back to the software stack on the previous filesystem yesterday evening (May 19). Today, we confirmed that most applications seem to work well.
We will keep you informed about progress toward a complete solution.
We apologize for any inconvenience.
Messages for Linux Clusters
A security maintenance impacting all cluster systems has been scheduled. Please read https://www.lrz.de/aktuell/ali00940.html for details.
Four "new" Remote Visualization (RVS_2021) nodes are available and in production mode. The nodes run Ubuntu and NoMachine. Usage is limited to 2 hours; if you need a longer period, please file an LRZ Service Request. For more details, please refer to the documentation.
Temporarily, as an experiment, the maximum usage time of a "new" RVS node has been increased to 8 hours.
Messages for Cloud and other HPC Systems
We have observed and addressed an issue with the LRZ AI Systems that affected some running user jobs. Newly started jobs should no longer be affected.
The work on the LRZ AI Systems to address the recently observed stability issues has been concluded. All users are invited to continue their work. We closely monitor system operation and will provide additional updates if needed. Thank you for your patience and understanding.
We have identified the likely root cause of the ongoing issues with the LRZ AI and MCML Systems following the latest maintenance downtime. We continue to work towards a timely resolution and cannot currently guarantee uninterrupted and stable system availability. For further details, please see LRZ AI Systems.
The LRZ AI and MCML Systems underwent maintenance from April 25th to April 27th (both inclusive). During this period, the systems were not available to users. Normal user operation resumed on 2022-04-27 at 16:30.