
<< Back to the documentation start page

High Performance Computing

 

Forgot your password? Click here
Add a new user (only for SuperMUC-NG)? Click here
Add a new IP (only for SuperMUC-NG)? Click here
How to write good LRZ Service Requests: Click here


System Status (see also: Access and Overview of HPC Systems)

GREEN = fully operational · YELLOW = operational with restrictions (see messages below) · RED = not available



High-End Supercomputer (SuperMUC-NG)

System:

login nodes (skx.supermuc.lrz.de): UP

archive nodes (skx-arch.supermuc.lrz.de): UP

File Systems

HOME: UP
WORK: UP
SCRATCH: UP
DSS: UP
DSA: UP

Partitions/Queues:

micro, general, large: UP
fat, test: UP

Globus Online File Transfer: UP

Detailed node status


Details:

Submit an Incident Ticket for the SuperMUC-NG

Add a new user? Click here

Add a new IP? Click here


Linux Cluster

CoolMUC-2

login nodes (lxlogin(1,2,3,4).lrz.de): UP
serial partition (serial): UP
parallel partitions (cm2_std, cm2_large): UP
cluster cm2_tiny: UP
interactive partition (cm2_inter): UP
c2pap: MAINT

CoolMUC-3

login nodes (lxlogin(8,9).lrz.de): MAINT
parallel partition (mpp3_batch): MAINT
interactive partition (mpp3_inter): MAINT

teramem, kcs

teramem_inter: MAINT
kcs: MAINT

File Systems

HOME: UP
SCRATCH: UP
DSS: UP
DSA: UP

Detailed node status
Detailed queue status


Details:

Submit an Incident Ticket for the Linux Cluster

Messages for SuperMUC-NG

The problem with the software stack has been resolved.

Access to license servers from SNG compute nodes is available again (ANSYS, StarCCM+, NUMECA, Tecplot, ...).

As part of the recent maintenance, we moved the software stack to a new filesystem. Yesterday, during the late afternoon, a degradation occurred that caused applications to segfault. The root cause is still under investigation.

As a temporary remedy, we switched back to the software stack on the previous filesystem yesterday evening (May 19). Today, we confirmed that most applications appear to work correctly.

We will keep you informed about progress towards a complete solution of the problem.

We apologize for any inconvenience.

Messages for Linux Clusters

A security maintenance impacting all cluster systems has been scheduled. Please read https://www.lrz.de/aktuell/ali00940.html for details.

Four new Remote Visualization (RVS_2021) nodes are available and in production mode. The nodes run Ubuntu and NoMachine. Usage is limited to 2 hours; if you need a longer period of time, please file an LRZ Service Request. For more details, please refer to the documentation.

Messages for Cloud and other HPC Systems

We have observed and addressed an issue with the LRZ AI Systems that affected some running user jobs. Newly started jobs should no longer be affected.

The work on the LRZ AI Systems to address the recently observed stability issues has been concluded. All users are invited to continue their work. We are closely monitoring system operation and will provide additional updates if needed. Thank you for your patience and understanding.

We have identified the likely root cause of the ongoing issues with the LRZ AI and MCML Systems following the latest maintenance downtime. We continue to work towards a timely resolution and currently cannot guarantee uninterrupted and stable system availability. For further details, please see LRZ AI Systems.

The LRZ AI and MCML Systems underwent maintenance from April 25th to April 27th (both inclusive). During this period, the systems were not available to users. Normal user operation resumed on 2022-04-27 16:30.
