High Performance Computing
Forgot your Password? click here
Add new user (only for SuperMUC-NG)? click here
Add new IP (only for SuperMUC-NG)? click here
How to write good LRZ Service Requests? click here
How to set up two-factor authentication (2FA) on HPC systems? click here
End of Life: CoolMUC-2 and CoolMUC-3 will be switched off on Friday, December 13th
New: Virtual "HPC Lounge" to ask questions and get advice. Every Wednesday, 2:00 pm - 3:00 pm
For details and the Zoom link, see: HPC Lounge
System Status (see also: Access and Overview of HPC Systems)
GREEN = fully operational; YELLOW = operational with restrictions (see messages below); RED = not available (see messages below)
Supercomputer (SuperMUC-NG) | |
login nodes: skx.supermuc.lrz.de LOGIN | |
archive nodes: skx-arch.supermuc.lrz.de ARCHIVE | |
File Systems | |
Partitions/Queues: FAT TEST | |
Detailed node status | |
Details:
| |
Submit an Incident Ticket for the SuperMUC-NG; Add new user? click here; Add new IP? click here; Questions about 2FA on SuperMUC-NG? click here |
Linux Cluster | |||
CoolMUC-2 | see messages below | ||
lxlogin(1,2,3,4).lrz.de | ISSUES |
| |
serial partition serial_std | DOWN |
| |
serial partition serial_long | DOWN | ||
parallel partitions cm2_(std,large) | DOWN | ||
cluster cm2_tiny | DOWN | ||
interactive partition: cm2_inter | DOWN | ||
c2pap | MOSTLY UP |
| |
C2PAP Work filesystem: /gpfs/work | DOWN | ||
CoolMUC-3 | ||
lxlogin(8,9).lrz.de | 2FA ISSUES |
parallel partition: mpp3_batch | DOWN | ||
interactive partition: mpp3_inter | UP | ||
CoolMUC-4 | ||
lxlogin5.lrz.de | UP |
interactive partition: cm4_inter_large_mem | UP | ||
others | |||
teramem_inter | UP |
| |
kcs | MOSTLY UP |
| |
biohpc | UP |
| |
hpda | UP |
| |
File Systems HOME | ISSUES | | |
Details: | |||
|
Compute Cloud and other HPC Systems | ||
---|---|---|
Compute Cloud: (https://cc.lrz.de) detailed status: Status | UP | |
LRZ AI Systems | UP | |
Details: | ||
DSS Storage systems |
---|
For the status overview of the Data Science Storage please go to https://doku.lrz.de/display/PUBLIC/Data+Science+Storage+Statuspage |
Messages
see also: Aktuelle LRZ-Informationen / News from LRZ
Messages for all HPC Systems |
A new software stack (spack/23.1.0) is available on CoolMUC-2 and SuperMUC-NG. Release Notes of Spack/23.1.0 Software Stack |
Messages for SuperMUC-NG |
Maintenance of SuperMUC-NG: A hardware failure in the enclosure of the WORK file system requires maintenance on Tuesday, October 29, starting at 9:00 a.m. Login nodes will be closed before the start of the maintenance. We have set up a reservation in the SLURM scheduler to suspend job processing. All running jobs will terminate regularly beforehand. The system should be back online in the late afternoon. Maintenance finished. System is back in operation. |
Messages for Linux Clusters |
Legacy SCRATCH File System of CoolMUC-2/3 Broken: Severe hardware failures occurred on the CoolMUC clusters (SCRATCH filesystem, switches). The old SCRATCH file system ( |
End-of-Life Announcement for CoolMUC-2: After 9 years of operation, the hardware of CoolMUC-2 can no longer offer reliable service. The system is targeted to be turned off on Friday, December 13th, at the latest. Due to network degradation, we can only support single-node jobs on a best-effort basis until then. In case of further hardware problems, the shutdown date might be much earlier. |
End-of-Life Announcement of CoolMUC-3: Hardware and software support for the Knights Landing nodes and the Omni Path network on CoolMUC-3 (mpp3_batch) ended several years ago, and the system needs to be decommissioned. It is targeted to be turned off on Friday, December 13th, along with CoolMUC-2. Housing segments attached to CoolMUC-3 will stay in operation. |
New Cluster Segment CoolMUC-4: Hardware for a new cluster system, CoolMUC-4, has been delivered and is currently being installed and tested. The cluster comprises approximately 12,000 cores based on Intel® Xeon® Platinum 8480+ (Sapphire Rapids). We expect user operation to start at the beginning of December 2024. |
Messages for Compute Cloud and other HPC Systems |
The AI Systems will be affected by an infrastructure power cut scheduled for November 2024. The following system partitions will be unavailable for 3 days during the specified time frame. We apologise for the inconvenience. Calendar Week 46, 2024-11-11 to 2024-11-13
The AI Systems (including the MCML system segment) are under maintenance between September 30th and October 2nd, 2024. On these days, the system will not be available to users. Normal user operation is expected to resume during the course of Wednesday, October 2nd. The previously announced scheduled downtime between 2024-09-16 and 2024-09-27 (Calendar Week 38 & 39) has been postponed until further notice. The system will remain in user operation up to the scheduled maintenance at the end of September. |
HPC Services
Attended Cloud Housing |
More Links