TUM-DSS System Architecture

In the following, we give some insight into the system architecture of the TUM-DSS system, which was purchased by the Technical University of Munich (TUM) to support its data-intensive research groups.

Hardware Overview

The TUM-DSS system is currently built from the following hardware:

  • 1 x Data Direct Networks SFA7700X Storage System
    • 150 x 4TB NL-SAS Disks
    • 150 x 6TB NL-SAS Disks
    • 24 x 800GB SSDs
  • 4 x DELL (OEM) R730 Single CPU Servers
    • 2 x Mellanox ConnectX-3 FDR-IB/40GE network adapters

Two of the servers implement the IBM Spectrum Scale shared file system and the other two implement the NFS gateways. Each server is connected with 2 x 40 Gbit/s to our HPC core Ethernet infrastructure, which interconnects our DSS storage with our compute resources.

File System Overview

The system implements two different file systems, each of which is optimized for a specific workload. For both file systems, the file system metadata is stored separately on the SSDs, while the actual data resides on the NL-SAS disks.
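As a rough, hypothetical illustration of how such a metadata/data separation is typically declared in Spectrum Scale (all NSD names, device paths and server names below are placeholders and not the actual TUM-DSS configuration), the SSD NSDs carry usage=metadataOnly while the NL-SAS NSDs carry usage=dataOnly:

  # nsd_stanzas.txt (hypothetical): SSDs hold metadata only, NL-SAS disks hold data only
  %nsd: nsd=ssd01 device=/dev/mapper/ssd01 servers=nsdserver01,nsdserver02 usage=metadataOnly pool=system
  %nsd: nsd=hdd01 device=/dev/mapper/hdd01 servers=nsdserver01,nsdserver02 usage=dataOnly pool=system

  # Sketch of creating a file system with an 8 MB data block size from these NSDs
  # (shown for illustration only; the real TUM-DSS file systems were set up and tuned by LRZ)
  mmcrnsd -F nsd_stanzas.txt
  mmcrfs tumdssfs01 -F nsd_stanzas.txt -B 8M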

The first file system, tumdssfs01, is implemented on the 6 TB drives and is optimized for large files of more than 8 MB. The minimum allocation unit is 256 KB, while the file system block size is 8 MB. The maximum sequential throughput* is about 5000 MB/s write and about 7000 MB/s read. The total usable capacity is about 650 TB and the maximum number of file system objects that can be created is about 300 million.

The second file system, tumdssfs02, is implemented on the 4 TB drives and is optimized for small to medium-sized files of about 512 KB to 8 MB. The minimum allocation unit is 32 KB, while the file system block size is 1 MB. The maximum sequential throughput* is about 2000 MB/s write and 1900 MB/s read. The total usable capacity is about 430 TB and the maximum number of file system objects that can be created is about 600 million.

*Measured using iozone -i 0 -i 1 -r 256k -s 100g -t 10 via the IBM Spectrum Scale NSD protocol from 6 nodes in parallel, each with sufficient network bandwidth.
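For reference, the per-node invocation behind these numbers looks as follows (the benchmark directory is a placeholder; the flag meanings are annotated as comments):

  # Per-node iozone invocation used for the throughput numbers above:
  #   -i 0      write/rewrite test
  #   -i 1      read/re-read test
  #   -r 256k   256 KB record (transfer) size
  #   -s 100g   100 GB file size per thread
  #   -t 10     throughput mode with 10 parallel threads
  # The benchmark directory on the DSS file system is a placeholder.
  cd /path/to/dss/benchmark && iozone -i 0 -i 1 -r 256k -s 100g -t 10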

Note that the actual achievable performance from a certain HPC system or HPC node will also be limited by its network connection to the LRZ HPC core ethernet infrastructure.
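To get a first idea of which of the two file systems fits a given dataset, it can help to look at the average file size before the data is migrated. The following sketch (the dataset path is a placeholder) uses standard GNU find and awk:

  # Rough average file size of an existing dataset, as a hint whether it
  # matches the large-file profile of tumdssfs01 (> 8 MB) or the
  # small/medium-file profile of tumdssfs02 (about 512 KB - 8 MB).
  # /path/to/dataset is a placeholder.
  find /path/to/dataset -type f -printf '%s\n' \
    | awk '{ sum += $1; n++ } END { if (n) printf "%d files, average size %.1f MB\n", n, sum / n / 1048576 }'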

NFS Overview

Besides the proprietary, high-performance IBM Spectrum Scale Network Shared Disk (NSD) protocol, which is used to attach the DSS systems to our HPC clusters, DSS containers can also be exported via a number of highly available, active/active NFS gateways to VMs in our HPC cloud or VMware cluster, or to bare-metal machines housed at LRZ. The total achievable bandwidth* over these gateways is about 1450 MB/s write and 3000 MB/s read, while a single NFS connection is usually limited to about 700 MB/s write and 1000 MB/s read.

*Measured using iozone -i 0 -i 1 -r 256k -s 100g -t 10 via the NFS v3 protocol from 4 nodes in parallel, each with sufficient network bandwidth.

Note that the actual achievable performance from a certain HPC system or HPC node will also be limited by its network connection to the LRZ HPC core ethernet infrastructure.
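As a rough illustration (the server name and export path below are placeholders; the actual values are communicated when an NFS export is configured for a container), such an export can be mounted on a Linux client along these lines:

  # Mount a DSS container via NFS v3 (host name and export path are placeholders)
  sudo mkdir -p /mnt/dss
  sudo mount -t nfs -o vers=3,hard,rsize=1048576,wsize=1048576 \
      dss-nfs.example.lrz.de:/dss/containerXY /mnt/dss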