

High Performance Computing

Forgot your password? click here
Add new user (only for SuperMUC-NG)? click here
Add new IP (only for SuperMUC-NG)? click here
How to write good LRZ Service Requests? click here


System Status (see also: Access and Overview of HPC Systems)

Green = fully operational
Yellow = operational with restrictions (see messages below)
Red = not available

Please read https://www.lrz.de/aktuell/ali00856.html for news on the security issue and "ssh - Secure Shell on LRZ HPC Systems" for updated access information.



Supercomputer (SuperMUC-NG)

Due to a security issue, no login is possible at the moment.

System:

login nodes: skx.supermuc.lrz.de
Status: Red (DOWN) / Yellow (up)

archive nodes: skx-arch.supermuc.lrz.de
Status: Red (offline) / Yellow (up)
File Systems:

HOME: Green (up)
WORK: Red (DOWN) / Green (up)
SCRATCH: Red (DOWN) / Green (up)
DSS: Red (DOWN) / Green (up)
DSA: Red (DOWN) / Green (up)

Partitions/Queues:

micro, fat, general, large
Status: Green (up)

fat, test
Status: Red (DOWN) / Green (up)

Globus Online File Transfer:
Status: Red (offline) / Green (up)

Detailed node status: click here


Details:

Submit an Incident Ticket for the SuperMUC-NG

Add new user? click here

Add new IP? click here



Linux Cluster

Due to a security issue, no login is possible at the moment.

CoolMUC-2

lxlogin(1,2,3,4).lrz.de
Status: Green (up)

serial partitions: serial
Status: Green (up)

parallel partitions: cm2_(std,large)
Status: Green (up)

cluster: cm2_tiny
Status: Green (up)

interactive partition: cm2_inter
Status: Green (up)

c2pap
Status: Green (up)

CoolMUC-3

login nodes: lxlogin(8,9).lrz.de
Status: Red (offline) / Green (up)

parallel partition: mpp3_batch
Status: Red (offline) / Green (up)

interactive partition: mpp3_inter
Status: Red (offline) / Green (up)

teramem, ivymuc, kcs

login node: lxlogin10.lrz.de
Status: Red (offline) / Green (up)

ivymuc, teramem_inter, kcs
Status: Red (offline) / Green (up)

File Systems

HOME: Red (offline) / Green (up)
SCRATCH: Red (offline) / Green (up)
DSS: Green (up)
DSA: Green (up)

Detailed node status: click here


Detailed queue status


Details:

Submit an Incident Ticket for the Linux Cluster



Compute Cloud and other HPC Systems

Compute Cloud (https://cc.lrz.de)
Detailed status and free slots: https://cc.lrz.de/lrz
Status: Green (up)

GPU Cloud (https://datalab.srv.lrz.de)
LRZ AI Systems
Status: Green (up)

DGX-1
Status: Green (up)

DGX-1v
Status: Green (up)

RStudio Server (https://www.rstudio.lrz.de)
Status: Green (up) / Red (End of Life)

Details:

Documentation: RStudio Server (LRZ Service)

Submit an Incident Ticket for the Compute Cloud

Submit an Incident Ticket for RStudio Server



Messages for SuperMUC-NG

As part of the recent maintenance, we moved the software stack to a new filesystem. Late yesterday afternoon, a degradation occurred that caused applications to segfault. The root cause is still under investigation.

As a temporary remedy, we switched back to the software stack on the previous filesystem yesterday evening (May 19). Today, we confirmed that most applications seem to work well.

We will keep you informed of progress toward a complete solution.

We apologize for any inconvenience.

The maintenance has mostly concluded. Please read https://www.lrz.de/aktuell/ali00938.html for details.

On Monday, April 4, 2022, a new version of the spack-based development and application software stack will be rolled out.

The new spack version will be loaded by default starting April 11, 2022.

After that date, you will still be able to switch to the previous spack stack with

> module switch spack spack/21.1.1

We strongly recommend recompiling self-built applications after the roll-out. See also https://doku.lrz.de/display/PUBLIC/Spack+Modules+Release+22.2.1 for details.

The base core frequency of jobs has been set to 2.3 GHz. Higher frequencies are possible using EAR.

The new hpcreport tool is now available to check job performance and accounting on SuperMUC-NG. Please check out

https://doku.lrz.de/display/PUBLIC/HPC+Report

https://www.lrz.de/aktuell/ali00923.html



Messages for Linux Clusters

Four new Remote Visualization (RVS_2021) nodes are available and in production mode. The nodes run Ubuntu with NoMachine. Usage is limited to 2 hours; if you need a longer period, please file an LRZ Service Request. For more details, please refer to the documentation.



Messages for Cloud and other HPC Systems

We have observed and addressed an issue with the LRZ AI Systems that affected some running user jobs. Newly started jobs should no longer be affected.

The work on the LRZ AI Systems to address the recently observed stability issues has been concluded. All users are invited to continue their work. We are closely monitoring system operation and will provide additional updates if needed. Thank you for your patience and understanding.

We have identified the likely root cause of the ongoing issues with the LRZ AI and MCML Systems following the latest maintenance downtime. We continue to work towards a timely resolution and cannot currently guarantee uninterrupted and stable system availability. For further details, please see LRZ AI Systems.

The LRZ AI and MCML Systems underwent maintenance from April 25 to April 27 (both inclusive). During this period, the systems were not available to users. Normal user operation resumed on 2022-04-27 at 16:30.


