aka: Höchstleistungsrechner in Bayern I, Bundeshöchstleistungsrechner in Bayern II

Hardware

The HLRB II is based on SGI's Altix 4700 platform. The system installed at LRZ was optimized for high application performance and high memory bandwidth.

The following table provides an overview of the hardware and characteristics of the HLRB II.

Overall Characteristics for both installation phases 


Phase 1

Phase 2

Total number of cores

4096

9728

Peak Performance of the entire system

26.2 TFlop/s

62.3 TFlop/s

Linpack Performance

24.5 TFlop/s

56.5 TFlop/s

Total size of memory for entire system

17.5 TByte

39 TByte

Direct Attached Disks

300 TByte

600 TByte

Network Attached Disks

40 TByte

60 TByte

Granularity

Number of compute partitions

16

19 (13 + 6 with high density blades)

Number of cores per compute partition

256

512

Number of blades (memory channels) per compute partition

256

128 (high density) or 256

Number of cores per socket

1

2

Number of cores per blade

1

2 or 4 (high density blades)

Processor

Processor type

Intel Itanium2 Madison 9M

Intel Itanium2 Montecito Dual Core

Clock rate

1.6 GHz

1.6 GHz

Number of Floating Point Operations per clock

4   (=2 FMAs)

4   (=2 FMAs)

Peak performance of a socket

6.4 GFlop/s

12.8 GFlop/s

Max. number of Instructions per clock tick

6

12 (6 per Core)

Peak number of instructions per second of a socket (Gip/s)

9.6 Gip/s

19.2 Gip/s (9.6 per Core)

Number of  FP Registers

128

256 (128 per core)

Memory

Memory per core

4 GByte (8 GByte on interactive node)

4 GByte per Core
(1st socket in Partition contains 16 GByte)

Clock rate of frontside bus (FSB)

533 MHz

533 MHz

Peak bandwidth to local memory

8.5 GByte/s per core

8.5 GByte/s shared between 2 or 4 cores (high density blades)

Total bandwidth to local memory of the entire system

34816 GByte/s

34816 GByte/s

Latency to local memory

approx. 210 cycles

approx. ??? cycles

Memory Hierarchy

L1 Data Cache (not used for floating point data)


               size

16 kByte

16 kByte

               cacheline size

64 Byte

64 Byte

               associativity

4-way

4-way

               latency

1 cycle

1 cycle

               Bandwidth

25.6 GByte/s

25.6 GByte/s

L2 Data Cache (per core)


               Size

256 kByte

256 kByte

               Cacheline size

128 Byte

128 Byte

               Associativity

8-way

8-way

               min. Latency

INT: 5 cycles,
FP: 6 cycles

INT: 5 cycles,
FP: 6 cycles

               Bandwidth

51.2 GByte/s (FP)  (+25.6 GByte/s (INT))

51.2 GByte/s (FP)  (+25.6 GByte/s (INT))

               Data banks

16 Bytes/bank

16 Bytes/bank

L2 Instr. Cache (per core)


               Size

n/a

1 MByte

L3 Cache (per core)


               Size

6 MByte

9 MByte

               Cacheline size

128 Byte

128 Byte

               Associativity

12-way

12-way

               min. Latency

14 cycles

14 cycles

               Bandwidth

51.2 GByte/s 

51.2 GByte/s 

               Fill  Bandwidth

128 Byte in 4 cycles 

128 Byte in 4 cycles 

L2 Data TLB


               Entries

128

128

               Latency

30 cycle penalty for TLB miss

30 cycle penalty for TLB miss

Internal Interconnect

Connection network type

NUMAlink 4

NUMAlink 4

Number of  (bidirectional) links per blade

2

2

Bandwidth of one link (bidirectional)

6.4 GByte/s

6.4 GByte/s

MPI latency

1-5  µs

1-5 µs

Disks

Direct attached disks



              Characteristics

few, but large files; high bandwidth;
Pseudo Temporary Files 

few, but large files; high bandwidth;
Pseudo Temporary Files,
Temporary Project Files

              Size

300 TByte

600 TByte

              aggr. bandwidth to disks

20 GByte/s

40 GByte/s

Network attached disks (Home Directories)

30 TByte

60 TByte

              Characteristics

many, but small files; high transaction rate

many, but small files; high transaction rate

              Size

40 TByte

60 TByte

              bandwidth to disks

600 MByte/s

800 MByte/s

Environment

Footprint

24 m x 12 m

24 m x 12 m

Total weight

103 metric tons

103 metric tons

Total electrical power

~1000 kVA

~1100 kVA
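
Several of the system-wide figures in the table follow from the per-core and per-blade values. The sketch below recomputes them as a consistency check. It assumes one 8.5 GByte/s memory channel per blade, which is how the table labels blades. The names `PHASES` and `derive` are illustrative and do not refer to any LRZ tool.

```python
# Recompute derived figures from the per-core / per-blade values in the table.
CLOCK_GHZ = 1.6             # clock rate, both phases
FLOPS_PER_CLOCK = 4         # 2 FMAs per clock per core
INSTR_PER_CLOCK = 6         # max. instructions per clock per core
BLADE_MEM_BW_GBYTE_S = 8.5  # assumed: one memory channel per blade

PHASES = {
    "Phase 1": {
        "partitions": 16,
        "cores_per_partition": 256,
        "cores_per_socket": 1,
        "blades": 16 * 256,
        "linpack_tflops": 24.5,
    },
    "Phase 2": {
        "partitions": 19,
        "cores_per_partition": 512,
        "cores_per_socket": 2,
        # 13 partitions with 256 blades, 6 high density partitions with 128
        "blades": 13 * 256 + 6 * 128,
        "linpack_tflops": 56.5,
    },
}


def derive(phase):
    """Return the derived system figures for one installation phase."""
    cores = phase["partitions"] * phase["cores_per_partition"]
    core_gflops = CLOCK_GHZ * FLOPS_PER_CLOCK
    system_peak_tflops = cores * core_gflops / 1000.0
    return {
        "total_cores": cores,
        "socket_peak_gflops": core_gflops * phase["cores_per_socket"],
        "socket_peak_gips": CLOCK_GHZ * INSTR_PER_CLOCK * phase["cores_per_socket"],
        "system_peak_tflops": system_peak_tflops,
        "aggregate_mem_bw_gbyte_s": phase["blades"] * BLADE_MEM_BW_GBYTE_S,
        "linpack_efficiency": phase["linpack_tflops"] / system_peak_tflops,
    }


if __name__ == "__main__":
    for name, phase in PHASES.items():
        print(name)
        for key, value in derive(phase).items():
            print(f"  {key}: {value:.4g}")
```

Both phases have 4096 blades in total, so under this assumption the aggregate bandwidth to local memory comes out the same for both, as the table states.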
