
High Performance Computing

Forgot your password? Click here.
Add a new user (SuperMUC-NG only)? Click here.
Add a new IP (SuperMUC-NG only)? Click here.
How to write good LRZ Service Requests? Click here.


System Status (see also: Access and Overview of HPC Systems)

Green = fully operational
Yellow = operational with restrictions (see messages below)
Red = not available



High-Performance Computer (SuperMUC-NG)

System:

login nodes: skx.supermuc.lrz.de
Status: Green (UP), Yellow (up)

archive nodes: skx-arch.supermuc.lrz.de
Status: Green (UP), Yellow (up)

File Systems:

HOME: Green (up)
WORK: Green (up)
SCRATCH: Green (up)
DSS: Green (up)
DSA: Green (up)

Partitions/Queues: micro, general, large, fat, test
Status: Green (up)

Globus Online File Transfer:
Status: Green (up)

Detailed node status


Details:

Submit an Incident Ticket for the SuperMUC-NG

Add new user? click here

Add new IP? click here



Linux Cluster

CoolMUC-2

login nodes: lxlogin(1,2,3,4).lrz.de
Status: Green (up)

serial partitions: serial
Status: Green (up)

parallel partitions: cm2_(std,large)
Status: Green (up)

cluster: cm2_tiny
Status: Green (up)

interactive partition: cm2_inter
Status: Green (up)

c2pap
Status: Yellow (TEST OP), Green (up)

CoolMUC-3

login node: lxlogin(8,9).lrz.de
Status: Green (up)

parallel partition: mpp3_batch
Status: Green (up)

interactive partition: mpp3_inter
Status: Green (up)

teramem, ivymuc, kcs

login node: lxlogin10.lrz.de
Status: Green (UP)

SLURM: ivymuc, teramem_inter, kcs
Status: Green (up)

File Systems

HOME: Green (up)
SCRATCH: Green (up)
DSS: Green (up)
DSA: Green (up)

Detailed node status

click here


Detailed queue status


Details:

Submit an Incident Ticket for the Linux Cluster



Compute Cloud and other HPC Systems

Compute Cloud: (https://cc.lrz.de)
Status: Green (UP)

GPU Cloud (

detailed status and free slots: https://

datalab

cc.

srv.

lrz.de

)

/lrz

Status
colourGreen
title

UPDGX-1

Status
colourGreen
titleup

DGX-1v

up

LRZ AI Systems
Status: Green (UP)

RStudio Server (https://www.rstudio.lrz.de)
Status: Green (UP), Red (End of Life)

Details:

Documentation: RStudio Server (LRZ Service)

Submit an Incident Ticket for the Compute Cloud

Submit an Incident Ticket for RStudio Server



Messages for SuperMUC-NG

A short (<1h) interruption of login services has been scheduled to start on at 18:00. A reboot is needed for a system configuration change to take effect.

Messages for Linux Cluster

If you encounter problems, please see: CoolMUC-2: Open issues after the Cluster Hardware and Software Upgrade

As part of the recent maintenance, we shifted the software stack to a new filesystem. Yesterday, in the late afternoon, a degradation occurred that caused applications to segfault. The root cause is still under investigation.

As a temporary remedy, we switched back to the software stack on the previous filesystem yesterday evening (May 19). Today, we confirmed that most applications seem to work well.

We will keep you informed about progress toward a complete solution.

Apologies for any inconvenience.

The maintenance has mostly concluded. Please read https://www.lrz.de/aktuell/ali00938.html for details.

On Monday, April 4, 2022, a new version of the spack-based development and application software stack will be rolled out.

The new spack version will be loaded by default starting April 11, 2022.

After that date, you will still be able to switch to the previous spack stack with:

> module switch spack spack/21.1.1

We strongly recommend recompiling self-built applications after the roll-out. See also https://doku.lrz.de/display/PUBLIC/Spack+Modules+Release+22.2.1 for details.
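
For job scripts that must keep using the previous stack for a while, the switch can be wrapped in a small guard. The following is a minimal sketch, not an official LRZ recipe: the function name `use_previous_spack` and the variable `PREVIOUS_STACK` are hypothetical, and only the `module switch spack spack/21.1.1` command itself comes from the announcement above.

```shell
#!/usr/bin/env bash
# Sketch: switch to the previous spack stack named in the announcement.
# PREVIOUS_STACK and use_previous_spack are illustrative names (assumptions).
PREVIOUS_STACK="spack/21.1.1"

use_previous_spack() {
    # `module` is normally a shell function set up by the modules environment;
    # on hosts without that setup it does not exist, so check first.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available; cannot switch to ${PREVIOUS_STACK}"
        return 1
    fi
    module switch spack "${PREVIOUS_STACK}"
}
```

Calling `use_previous_spack` near the top of a job script, before loading any other modules, keeps the choice of stack in one place so it is easy to remove once the application has been recompiled against the new release.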

The base core frequency of jobs has been set to 2.3 GHz. Higher frequencies are possible using EAR.

The new hpcreport tool is now available to check job performance and accounting on SuperMUC-NG. Please check out https://doku.lrz.de/display/PUBLIC/HPC+Report

https://www.lrz.de/aktuell/

ali00830

ali00923.html



Messages for Linux Clusters

See the announcement of a Linux Cluster Maintenance on March 17-25, 2020, affecting the operation of all LRZ Linux Clusters.

End of service for NAS systems

NAS paths (former HOME and PROJECT areas) were taken offline at the beginning of January 2020. Please contact the Service Desk if you have outstanding data migration issues.

There are 4 "new" Remote Visualization (RVS_2021) nodes available. The machines are in production mode. The nodes run Ubuntu with NoMachine. Usage is limited to 2 hours; if you need more time, please file an LRZ Service Request. For more details, please refer to the documentation.

Messages for Cloud and other HPC Systems

The RStudio Server maintenance has concluded. Make sure to read the RStudio Server (LRZ Service) documentation for a list of changes and actions required from users after this maintenance. Please submit an incident ticket if you have any questions or encounter any issues.

We have observed and addressed an issue with the LRZ AI Systems that affected some running user jobs. Newly started jobs should no longer be affected.

The work on the LRZ AI Systems to address the recently observed stability issues has been concluded. All users are invited to continue their work. We are closely monitoring system operation and will provide additional updates if needed. Thank you for your patience and understanding.

We have identified the likely root cause of the ongoing issues with the LRZ AI and MCML Systems following the latest maintenance downtime. We continue to work towards a timely resolution and cannot currently guarantee uninterrupted and stable system availability. For further details, please see LRZ AI Systems.

The LRZ AI and MCML Systems underwent maintenance from April 25 to April 27 (both inclusive). During this period, the systems were not available to users. Normal user operation resumed on 2022-04-27 at 16:30.


