DSA documentation additions for users
Overview
This document is an addition to the DSS documentation for users. It outlines the specific differences between the usage of a Data Science Storage container and a Data Science Archive container.
Please note that our Data Science Archive is still in an early production phase, so its capabilities may still change over the next few months. It is therefore a good idea to revisit this document from time to time, or at least review our DSS Release Notes regularly.
DSA - The A is for Archive
At first sight, a Data Science Archive container does not look different from a Data Science Storage container to you as a user. However, the purpose of the Data Science Archive is to safely store large amounts of cold research data to allow you to comply with the rules for good scientific practice. Therefore you will soon notice that files stored in a DSA container behave a little differently: first, their content is eventually moved from the disk partition of DSA to tape (we like the analogy that the data freezes like water in a glacier), and second, they are protected against accidental data loss in a very special way.
The life-cycle of a DSA file
All files you put in a DSA container go through the following life cycle:
- A few hours after you have copied a file over to DSA, DSA will create a copy of this file on tape in two different data centres. (At this point in time, the data will still also be on the disk partition.)
- Approximately 24h* after the file has been created in DSA, it will be made immutable and a deletion hold of 10 years will be placed on it. This means you will never be able to modify, append to or rename the file again, and you will also not be able to delete it for the next 10 years.
- At some point in time, when the disk partition is filled up to a certain watermark, DSA will begin to purge the data of files which have a copy on tape from the disk partition. The file metadata is still kept, so you will still see your files in the DSA container; however, as the file is now frozen, you will get a Permission denied error when you try to access it.
- When you want to access your file again after it has been frozen, you first have to thaw it. This can be done either implicitly by initiating a Globus Online transfer or explicitly by sending a stage request to the DSA Recall Director service. We will cover both methods later. After the files have been thawed, they usually stay accessible for at least 7 days, unless there is very high pressure on the disk partition. A file can be frozen and thawed an unlimited number of times.
- After 10 years the deletion hold will be released, and you can then delete the file again if you want to. However, you will still not be able to modify, append to or rename the file.
*For files smaller than 1GB, this time period is extended to 7 days, as we consider storing a large number of files smaller than 1GB an anti-pattern for DSA and we want to give you a little more time to clean up files that were stored in DSA by mistake.
Getting data in and out
There are basically two ways to get data in and out of DSA. The first one is via Globus Online and the second one is via the Login Nodes of SuperMUC-NG and the LRZ Linux Cluster. In the following, we describe both ways and outline their specific advantages and disadvantages.
Getting data in
As you will see in this section, getting data into DSA is relatively straightforward. However, there is one important factor to keep in mind: Your data will eventually end up on tape, which is a storage medium that only allows sequential access. So in order to get your data back in a timely manner later on, the files you put into the archive must be large enough to allow the tape drives to operate efficiently. As a general rule of thumb, files should not be smaller than 1GB, and files of 100GB or more are even better. As an upper boundary, files should, if possible, not be larger than 6TB. If you have many small files, use tar or zip to combine them into a larger archive file.
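For example, a minimal sketch of combining a directory of small result files into one larger archive file before copying it to DSA (directory and archive names are just placeholders):

    # pack many small files into a single large archive file
    tar -cf run_042_results.tar run_042_results/
    # or, if the data compresses well, create a compressed archive instead
    tar -czf run_042_results.tar.gz run_042_results/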
Please also note that we only grant a very strict quota on the number of files that can be stored by each project, which cannot be increased (usually a low 5-digit number). So if you store too many small files, you will run out of quota very soon and may then need to rework your whole archive.
When working with tar or zip archives, consider putting a "META" file next to each archive that describes its content. Files smaller than 20MB usually never get purged from the disk partition and can therefore easily be searched through without having to stage them first.
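A simple way to create such a META file is to store the archive's file listing next to it, for example (file names are placeholders):

    # create the archive and a small META file that lists its content
    tar -cf run_042_results.tar run_042_results/
    tar -tf run_042_results.tar > run_042_results.META

Since the META file stays far below 20MB, it will normally remain on the disk partition and can be searched without staging the archive itself.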
Using Globus Online
The Data Science Archive is available as a Globus Online Endpoint here. In order to access it, make sure you log in with the LRZ username under which you were invited to access your DSA container. You can find the user ID you were invited with in the DSA invitation mail we sent you.
For information on how to use Globus Online, please read their fine Getting Started Guide.
In order to transfer data from SuperMUC-NG or LRZ Linux Cluster, you can use the following endpoints to move data into DSA:
If you want to transfer data from a remote system that does not yet provide a Globus Endpoint, you can use Globus Connect Personal to turn virtually any system into a Globus Online endpoint within minutes.
Using Globus Online to put data into DSA has the following advantages:
- It will automatically calculate checksums of your data after transfer to make sure no silent data corruption occurred
- It will automatically copy data in parallel, so performance will be much better than with cp or rsync, for example
- It will take care of your data transfer and inform you by mail once it has succeeded
Using HPC Login Nodes
As the frontend of the Data Science Archive looks like a normal file system, it is also mounted on the Login Nodes of your HPC systems. So you can also put data into DSA just by using whatever file copy or archiving utility is available on the login nodes. The path to your DSA container directory can be found in the invitation mail.
Beware of tools or options that try to preserve the owning group and/or access rights, like cp -a, rsync -a, or rsync -p -g. The concept of a DSA container is that everyone who has access to the container has access to all data, and we work hard to enforce this semantic. However, tools that try to mess with ownership or access rights may break our effort, and this could eventually cause you to lose access to your data.
You can directly create tar archives from your data on WORK or SCRATCH into a DSA container.
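For example, a minimal sketch that writes such an archive directly into the example container used later in this document (paths are placeholders; use the container path from your invitation mail):

    # archive a directory from WORK directly into the DSA container,
    # deliberately without options that would preserve ownership or permissions
    tar -cf /dss/dsafs01/0001/pr74qo-dss-0007/run_042.tar -C $WORK run_042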
Getting data out
We have recently identified a lock contention issue when multiple staging jobs that compete for the same tape volumes are started in parallel, which can lead to incomplete stages. In order to avoid that, please start stage jobs for a single container only one after another. We plan to have a fix for this available in Q1/2023. In any case, it is generally better and more performant to issue fewer but larger stage jobs than many small ones.
Using Globus Online
Getting data out with Globus Online is basically as easy as getting it in. Just start a transfer and Globus Online will take care of the rest.
The way Globus Online currently handles staging is not optimal. It works reasonably well with files larger than 10GB or when you want to transfer only a few files. However, when you need to access many files smaller than 10GB, consider staging the files manually first (see next section) and transferring them after staging.
Using HPC Login Nodes
When you try to access a file in a DSA container that is currently offline, meaning on tape only, you will get a Permission denied error. In order to be able to access it again, it needs to be staged explicitly using the DSA CLI utility, which is installed on all HPC login nodes.
The dsacli tool has the option to specify various output formats like csv, json, yaml and so on that may come in handy when using the tool in your own scripts and programs. Just use the base command together with the -h switch to get an overview of the output options (e.g. dsacli stage job list -h).
In order to start a new stage job, use the dsacli stage job create command. In its simplest form, it expects the name of the DSA container and a file or directory name, either relative to the container base directory or as an absolute path, as arguments. Alternatively, it also accepts a file that contains all files to stage, again either with their absolute or relative paths. Additionally, it provides the -w switch that allows you to use GLOB wildcards in file names. Last but not least, you can add the -n switch to get an email notification when staging is done.
In the following examples we show the various ways the dsacli stage job create command can be used. For these examples, we use the DSA container pr74qo-dss-0007, which is available under the path /dss/dsafs01/0001/pr74qo-dss-0007.
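The following invocations are sketches based on the options described above; the container is real, but the staged paths and switch placement are illustrative, so please check dsacli stage job create -h for the authoritative syntax:

    # stage a single directory, given relative to the container base directory
    dsacli stage job create pr74qo-dss-0007 simulations/run_042

    # the same directory, given as an absolute path, with mail notification (-n)
    dsacli stage job create -n pr74qo-dss-0007 /dss/dsafs01/0001/pr74qo-dss-0007/simulations/run_042

    # stage all tar archives below a directory using GLOB wildcards (-w)
    dsacli stage job create -w pr74qo-dss-0007 "simulations/*.tar"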
In order to get an overview of your recent stage jobs, you can use the dsacli stage job list command. We keep information about your stage jobs for the last 30 days.
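For example:

    # list your stage jobs of the last 30 days; see -h for output format options
    dsacli stage job list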
In order to get some details and the current status of a particular stage job, you can use the dsacli stage job show command. It takes the ID of the stage job you want to view as argument.
The command will show you the container the job is operating on, the number of tapes that need to be touched in order to fulfil the request, the number of tapes that have already been finished, as well as the creation and end time and the current job status.
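For example, where 4711 stands in for a job ID taken from the dsacli stage job list output (the ID is purely illustrative):

    # show container, tape counters, timestamps and status of one stage job
    dsacli stage job show 4711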
A stage job goes through the following states:
- New pending: The stage job has entered the system but has not yet been handed over to the worker nodes
- Preparing stage list: The stage job is sorting out already staged files and computing the optimal stage order
- Waiting for staging slots: The stage job is waiting for free tape drives
- Staging in progress: The stage job has begun to move data from tape to disk
- Staging completed: The stage job has completed
- Staging aborted by user: The stage job has been aborted by a user request
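If a script should wait until a job has finished, a rough sketch like the following can work, assuming the status string appears verbatim in the dsacli stage job show output (the job ID is illustrative; a real script should also check for the aborted state):

    # poll the job status every 10 minutes until staging has completed
    while ! dsacli stage job show 4711 | grep -q "Staging completed"; do
        sleep 600
    done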
In order to get a list of the files and directories the job is going to stage, you can use the dsacli stage job show list command. It again takes the ID of the stage job you want to view as argument. Note that the file and directory names will always be shown relative to the container base directory.
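For example (job ID illustrative):

    # list the files and directories stage job 4711 is going to stage
    dsacli stage job show list 4711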
You can also retrieve a list of the already staged files of a job in real time using the dsacli stage job show staged command. It takes the ID of the stage job you want to view as argument. Additionally, you can limit the number of files that are displayed at once using the --number argument; with each call, the next N elements are displayed until the end of the list is reached and the listing starts from the beginning again. You may also want to use the --consume switch, with which files that have been displayed once are removed from the list.
As the staged file list is created in real time, this command may be very handy when you want to start processing files while the overall staging job is still running.
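A rough sketch of such a pipeline, assuming the command prints one file name per line and that the names are relative to the container base directory as described above (job ID, batch size and the processing script are placeholders):

    # fetch the next 10 freshly staged files, remove them from the list (--consume)
    # and hand each of them over to your own processing step
    dsacli stage job show staged --consume --number 10 4711 | while read -r f; do
        ./process_file.sh "/dss/dsafs01/0001/pr74qo-dss-0007/$f"
    done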
Last but not least, you can use the dsacli stage job abort command to abort running stage jobs. The command again takes the ID of the stage job you want to abort as argument. Note that it may take some time until the job is actually aborted, as this is done in a coordinated fashion and may need to wait for some tasks to reach a sane state.
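For example (job ID illustrative):

    # abort stage job 4711; it may take a while until the job actually stops
    dsacli stage job abort 4711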
Sometimes it can also be helpful to find out directly whether a given file is on the online partition of DSA or not. For this, you can use the command /usr/lpp/mmfs/bin/mmlsattr, which will also tell you some more information about the file.
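For example, to check one of the files from the example container above (the file name is a placeholder; -L requests the long attribute listing, in which the online/offline state of the file content is typically visible):

    /usr/lpp/mmfs/bin/mmlsattr -L /dss/dsafs01/0001/pr74qo-dss-0007/run_042_results.tar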