DSS Release 2.0

General Availability: 2020-01-25

Overview

It has been a while since we last released an update of LRZ Data Science Storage (DSS) and the DSSWeb Self Service Portal. The reason is that we have been busy working on a major new feature, the Data Science Archive (DSA). The DSA relates to DSS like AWS Glacier relates to AWS S3. Data you put into the DSA will eventually "freeze", and frozen data must be thawed explicitly, either via a special command or via Globus Online. Data you put in also becomes immutable for 10 years, which means you cannot modify, delete or rename it. After 10 years, the data can be deleted but still not modified. Because data in the DSA is moved from disk to tape when it "freezes", the DSA can be used to retain large amounts of scientific data at a very attractive price point. The first DSA system will be built for our flagship HPC system, SuperMUC-NG, with a total usable capacity of 260 PB and about 25 GB/s of native tape drive throughput. Currently, a storage system under DSSWeb control can be either DSS-type or DSA-type, but as the solution matures, we may also try blending the two into a single storage system.
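The DSA retention rules described above can be illustrated with a small Python sketch. This is only a model of the stated policy, not DSSWeb code: the names `Operation`, `retention_end` and `is_allowed` are hypothetical, the 10-year window is assumed to start on the day the data is put into the DSA, and renaming is assumed to stay forbidden after the window ends because it counts as a modification. Reading is modelled as always permitted, although frozen data still has to be thawed first.

```python
from datetime import date
from enum import Enum


class Operation(Enum):
    READ = "read"
    MODIFY = "modify"
    RENAME = "rename"
    DELETE = "delete"


RETENTION_YEARS = 10  # immutability period stated in the release notes


def retention_end(ingested: date) -> date:
    """First day on which deletion is allowed again (assumption: window starts at ingest)."""
    try:
        return ingested.replace(year=ingested.year + RETENTION_YEARS)
    except ValueError:
        # Ingested on 29 February and the target year has no leap day: use 28 February.
        return ingested.replace(year=ingested.year + RETENTION_YEARS, day=28)


def is_allowed(op: Operation, ingested: date, today: date) -> bool:
    """Return True if the operation is permitted under the modelled DSA policy."""
    if op is Operation.READ:
        return True  # thawing of frozen data is not modelled here
    if op is Operation.DELETE:
        return today >= retention_end(ingested)
    # MODIFY and RENAME are never permitted in this model (RENAME is an assumption).
    return False
```

For example, `is_allowed(Operation.DELETE, date(2020, 1, 25), date(2025, 1, 1))` is false in this model, while the same call with `date(2030, 1, 25)` as the current date is true.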

In addition to that, we are collaborating with the TUM-Workbench Team to integrate DSS Containers into the TUM-Workbench (https://www.ub.tum.de/workbench). We expect that the "Friendly User Phase" of this integration will start later this year and that general availability will come in 2021. So stay tuned.

To make it easier for data curators to integrate their container with certain other systems, such as the LRZ R-Studio servers or the TUM-Workbench, we also introduce a new feature called Container Configs, which allows data curators to set up the integration automatically with a single mouse click or command.

Besides the new DSA and Container Config features, we also added some minor improvements, such as managing NFS exports via the GUI and an improved integration of Globus Sharing.

New Features

DSS System Layer

  • Added support for IBM Spectrum Protect for Space Management to automatically migrate data to/from tape for DSA-type storage systems.

DSSWeb Self Service Portal

  • DSA
    • Added new DSA-type Storage Pools and Containers.
    • Introduced a tape-optimised staging service called DSA Recall Director to thaw frozen data.
    • Introduced a connection between Globus Connect Server and DSA Recall Director to automatically thaw frozen data if it is transferred by Globus Online.
  • DSS On Demand Quoting and Invoicing
    • LRZ personnel can now create quotes and invoice documents for DSS On Demand automatically.
  • NFS Exports
    • Create container NFS exports to arbitrary machines in the LRZ datacenter using the GUI.
    • Update container NFS exports to arbitrary machines in the LRZ datacenter using the GUI.
    • Delete container NFS exports to arbitrary machines in the LRZ datacenter using the GUI.
    • Set an expiry date for NFS exports using the GUI, after which they are automatically disabled and deleted.
  • Container Configs
    • Apply and remove prepared configs for certain use cases, like integrating a container with LRZ's R-Studio or the TUM-Workbench, via the GUI, CLI and API.
  • Globus
    • Globus Sharing permissions are now managed entirely through the Globus Web App, which gives Data Curators access to advanced sharing features like Groups or Public access.
    • Globus transfers to a container can be monitored, and upon completion of a transfer, events can be sent to a Message Bus Consumer, which can then act on them.
  • Statistics
    • Record container and pool usage and some other statistical data once a day for historic reporting and trend analysis.
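
The NFS export expiry listed above can be pictured with a minimal Python sketch. It is not taken from DSSWeb: `NfsExport` and `split_expired` are hypothetical names, and the sketch assumes an export stays active on its expiry date and is treated as expired from the following day.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class NfsExport:
    container: str                  # hypothetical container name
    client: str                     # machine the container is exported to
    expires: Optional[date] = None  # None means the export has no expiry date


def split_expired(
    exports: List[NfsExport], today: date
) -> Tuple[List[NfsExport], List[NfsExport]]:
    """Split exports into (still active, expired and due for disabling and deletion)."""
    active = [e for e in exports if e.expires is None or today <= e.expires]
    expired = [e for e in exports if e.expires is not None and today > e.expires]
    return active, expired
```

A periodic job could call `split_expired` with the current date and then disable and delete everything in the second list.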

Deprecated Features

Known Issues and Limitations

  • Restore of backed-up/archived data is currently only possible via a service request on the LRZ Servicedesk.
  • DSSWeb can only be accessed from within the Munich Scientific Network (MWN).
  • The DSSWeb web interface may not work correctly when opened in multiple browser tabs.
  • After accepting an invitation or completing the first-time Globus registration, it can take up to two hours until the user is enabled for the Globus platform.
  • After accepting an invitation, it can take up to an hour until the systems reflect this change if the container group is held in the name server cache.
  • The "Edit Container" button is enabled for Container Managers although it should not be. Clicking the button results in a 404 Page Not Found error.
  • Currently only Data Curators and Container Managers are allowed to directly operate the DSA Recall Director. Normal users must use Globus Online or ask their respective DCs or CMs.