Info: Please note that the Data Transfer System for SuperMUC-NG is still under development, so this document, as well as the available data transfer options, will grow over the next few months. Check back from time to time to see what's new.
Overview
SuperMUC-NG provides a high-performance network connection to the outside world. To enable easy and fast data staging to/from SuperMUC-NG, we provide various data transfer methods. However, because of limited support resources, we distinguish between what is available for transfers to/from arbitrary external sites and what is available to/from our GCS and PRACE partner sites. In the following, we give you the necessary information on using the available data transfer options.
SuperMUC-NG DTN Setup
The diagram below shows the SuperMUC-NG DTN topology. As you can see, we currently operate four login nodes, two Globus Online and two special-purpose Data Transfer Nodes (DTNs), each connected via a 100GE network link to our data center network. The data center network is connected via four 100GE links, operated in pairwise active/passive mode, to the German Research Network (DFN). The DFN in turn provides high-bandwidth connections to the internet and, via the European research network GÉANT, to other research networks all over the world.
Tip: Note that the Data Transfer Nodes (DTNs) only provide server-side functionality. Transfers to/from the DTNs are always initiated by the user from the login nodes or via the Globus Online web interface. User login to the DTNs is not possible.
[Diagram: SuperMUC-NG DTN network topology]
Data Transfer to/from arbitrary external sites
In the following, we outline the two data transfer methods we currently support to/from arbitrary external endpoints.
Warning: To transfer data to/from an arbitrary external site, the IP address(es) of the external endpoint(s) must be registered as trusted for your project in the SuperMUC-NG firewall. If this has not been done already, please ask the Master User of your project to submit a Service Request to register your IP address(es) in the SuperMUC-NG firewall.
Globus Online
The preferred data transfer method to/from arbitrary external sites on SuperMUC-NG is Globus Online. LRZ operates a dedicated Globus endpoint for SuperMUC-NG called LRZ SuperMUC-NG Globus DTN. To get started with Globus Online, check out this short tutorial. Note that you can use your SuperMUC-NG credentials to log in to Globus Online and must use them to authenticate against the LRZ SuperMUC-NG Globus DTN endpoint. Just search for LRZ or Leibniz in the list of organisations on the login page(s).
Tip: In general, Globus Online performs best if you transfer multiple (>8) large (>10 GB) files, so that it can efficiently parallelise data transfers. This is especially important over long-distance links.
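If your data consists of many small files, it can therefore pay off to pack them into a few large archives before the transfer. A minimal sketch, assuming your results live in hypothetical per-run directories under $SCRATCH/results:

    # Pack many small files into a handful of large archives (aim for >10 GB each)
    cd $SCRATCH/results
    tar cf chunk01.tar run_0*   # hypothetical run directories
    tar cf chunk02.tar run_1*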
Secure Copy
For smaller amounts of data you can also use secure copy (scp), secure FTP (sftp) or rsync (using ssh as the remote shell) from your external endpoint to the SuperMUC-NG login nodes. Note that the direction from external to SuperMUC-NG is important, as the firewall of SuperMUC-NG does not permit outgoing ssh connections.
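As a minimal sketch, run on your external machine, where USERNAME stands for your SuperMUC-NG account and skx.supermuc.lrz.de stands in for a login node (an assumption of this example; check the current login node names in the SuperMUC-NG documentation):

    # Copy a single file into your HOME at SuperMUC-NG
    scp ./results/output.dat USERNAME@skx.supermuc.lrz.de:~/

    # Recursively synchronise a directory; rsync can resume interrupted transfers
    rsync -av --partial ./results/ USERNAME@skx.supermuc.lrz.de:~/results/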
Data Transfer to/from GCS sites
The three GCS sites (HLRS in Stuttgart, JSC in Jülich and LRZ in Garching) operate a network of Data Transfer Nodes (DTNs) that have been optimised to let you transfer data at up to 10 GB/s between the three German national flagship supercomputers.
UNICORE File Transfer (UFTP)
UFTP is a data streaming library and file transfer tool based on a client/server architecture. To enable transfers between JSC, HLRS and LRZ, all three sites operate UNICORE authentication services and data access servers (uftpd). To transfer files, the user logs in to one of the corresponding hosts at one site and then authenticates with a public/private ssh key against the authentication service of another site.
At LRZ, the two authentication services are located at

    https://datagw03.supermuc.lrz.de:9000/rest/auth/DATAGW
    https://datagw04.supermuc.lrz.de:9000/rest/auth/DATAGW
Setting up the Client
To transfer files from SuperMUC-NG at LRZ to another site with the uftp client, you need to log in to

    skx-arch.supermuc.lrz.de
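For example (USERNAME being a placeholder for your SuperMUC-NG account):

    # Log in to the archive/transfer node from which uftp transfers are initiated
    ssh USERNAME@skx-arch.supermuc.lrz.de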
Then you have to load the uftp-client module:

    module use -a /lrz/sys/share/modules/extfiles
    module load uftp-client
For more information on the uftp-client, see the examples below or refer to https://www.unicore.eu/docstore/uftpclient-1.3.2/uftpclient-manual.html
SSH Key Management
To use uftp, you need to create and use separate ssh keys. NOTE: Due to security concerns, you are not allowed to use these keys for anything other than uftp transfers, and especially not for SSH-based access (including SCP and SFTP) to any system. To create a new ssh key for uftp, use the following commands:
    mkdir -p ~/.uftp
    ssh-keygen -a 100 -t ed25519 -f ~/.uftp/id_uftp_to_jsc
and choose a secure(!) passphrase. Keys without a passphrase violate the security regulations of LRZ and their use is strictly forbidden.
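You can display the resulting public key, e.g. to paste it into the appropriate authorized_keys file at the remote site as described in the sections below, with:

    cat ~/.uftp/id_uftp_to_jsc.pub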
Transfer between LRZ and JSC
Log in to judac.fz-juelich.de and copy the contents of ~/.uftp/id_uftp_to_jsc.pub (located at LRZ) into the file ~/.uftp/authorized_keys (located at JUDAC). Now you should be able to get some information from the uftp service at JSC by executing on skx-arch:
    uftp info -i ~/.uftp/id_uftp_to_jsc -u YOUR_USERNAME_AT_JSC https://uftp.fz-juelich.de:9112/UFTP_Auth/rest/auth/JUDAC
where you need to replace YOUR_USERNAME_AT_JSC with your username at JUDAC/JSC. The output should look similar to this:
    Client identity:    CN=YOUR_USERNAME_AT_JSC, OU=ssh-local-users
    Client auth method: SSHKEY
    Auth server type:   AuthServer

    Server: JUDAC
    URL base: https://uftp.fz-juelich.de:9112/UFTP_Auth/rest/auth/JUDAC:
    Description: JUDAC
    Remote user info: uid=YOUR_USERNAME_AT_JSC;gid=N/A
    Sharing support: enabled
    Server status: OK [connected to UFTPD judacsrv.fz-juelich.de:64433]

    Server: JUDAC-PRACE
    URL base: https://uftp.fz-juelich.de:9112/UFTP_Auth/rest/auth/JUDAC-PRACE:
    Description: JUDAC via PRACE Network
    Remote user info: uid=YOUR_USERNAME_AT_JSC;gid=N/A
    Sharing support: not available
    Server status: OK [connected to UFTPD judacsrv.fz-juelich.de:64433]
Note: If you receive a warning at this point, it can be SAFELY IGNORED and will disappear with newer versions.
To list the contents of a remote directory, use
    uftp ls -i ~/.uftp/id_uftp_to_jsc -u YOUR_USERNAME_AT_JSC https://uftp.fz-juelich.de:9112/UFTP_Auth/rest/auth/JUDAC:/PATH/AT/JUDAC
To download a file from JUDAC to LRZ:
    uftp cp -i ~/.uftp/id_uftp_to_jsc -u YOUR_USERNAME_AT_JSC https://uftp.fz-juelich.de:9112/UFTP_Auth/rest/auth/JUDAC:/PATH/TO/FILE/AT/JUDAC /LOCAL/PATH/AT/LRZ
To upload a file from LRZ to JUDAC, you just need to reverse the order of the last two arguments:
    uftp cp -i ~/.uftp/id_uftp_to_jsc -u YOUR_USERNAME_AT_JSC /LOCAL/PATH/AT/LRZ https://uftp.fz-juelich.de:9112/UFTP_Auth/rest/auth/JUDAC:/PATH/TO/FILE/AT/JUDAC
You can also use multiple streams to potentially speed up the transfer, using the option "-t 10":
    uftp cp -t 10 -i ~/.uftp/id_uftp_to_jsc -u YOUR_USERNAME_AT_JSC /LOCAL/PATH/AT/LRZ https://uftp.fz-juelich.de:9112/UFTP_Auth/rest/auth/JUDAC:/PATH/TO/FILE/AT/JUDAC
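The client can also copy whole directories. A sketch, assuming the recursive flag -r is available in the installed client version (see the manual linked above):

    # Recursively copy a local directory to JUDAC (hypothetical paths)
    uftp cp -r -t 10 -i ~/.uftp/id_uftp_to_jsc -u YOUR_USERNAME_AT_JSC /LOCAL/DIR/AT/LRZ https://uftp.fz-juelich.de:9112/UFTP_Auth/rest/auth/JUDAC:/REMOTE/DIR/AT/JUDAC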
Transfer between LRZ and JSC (with client at JUDAC)
To use the client at JUDAC, please also refer to https://apps.fz-juelich.de/jsc/hps/judac/uftp.html. The procedure to enable access to LRZ is similar to the one in the reverse direction:
At JUDAC, create an ssh key with a passphrase using
    mkdir -p ~/.uftp
    ssh-keygen -a 100 -t ed25519 -f ~/.uftp/id_uftp_to_lrz
At LRZ, copy the contents of ~/.uftp/id_uftp_to_lrz.pub (located at JUDAC) into the file ~/.ssh/authorized_keys (located at LRZ), and replace the authentication URL https://uftp.fz-juelich.de:9112/UFTP_Auth/rest/auth/JUDAC with https://datagw03.supermuc.lrz.de:9000/rest/auth/DATAGW or https://datagw04.supermuc.lrz.de:9000/rest/auth/DATAGW, e.g.
    uftp cp -t 10 -i ~/.uftp/id_uftp_to_lrz -u YOUR_USERNAME_AT_LRZ /LOCAL/PATH/AT/JUDAC https://datagw03.supermuc.lrz.de:9000/rest/auth/DATAGW:/PATH/TO/FILE/AT/LRZ
Transfer between LRZ and HLRS
To transfer from/to HLRS, create another(!) ssh key at skx-arch:
    mkdir -p ~/.uftp
    ssh-keygen -a 100 -t ed25519 -f ~/.uftp/id_uftp_to_hlrs
To enable the public key for transfers, open a service request at HLRS, provide your public key (the contents of ~/.uftp/id_uftp_to_hlrs.pub) and your username, and ask them to enable your key for uftp. A sample command on skx-arch then looks like
    uftp cp -t 10 -i ~/.uftp/id_uftp_to_hlrs -u YOUR_USERNAME_AT_HLRS /LOCAL/PATH/AT/LRZ https://gridftp-fr1.hww.hlrs.de:9000/rest/auth/HLRS:/PATH/TO/FILE/AT/HLRS
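Before the first transfer, you can check that your key has been enabled by querying the HLRS authentication service, following the same pattern as the uftp info example for JSC above:

    # Query the HLRS auth service; should report server status OK if the key is enabled
    uftp info -i ~/.uftp/id_uftp_to_hlrs -u YOUR_USERNAME_AT_HLRS https://gridftp-fr1.hww.hlrs.de:9000/rest/auth/HLRS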
If you encounter any problems with UFTP, please contact our Servicedesk.
Grid Community Toolkit (GridFTP)
The recommended way to transfer data between LRZ, JSC and HLRS is uftp; however, all three sites also offer an X.509-certificate-based GridFTP service. To use GridFTP, you need a personal X.509 certificate. If you do not have an X.509 grid user certificate already, please follow the steps here: https://www.lrz.de/services/compute/grid_en/certificate_en/person-certificate_en/
LRZ generally provides two GridFTP servers with the following URIs:

    gsiftp://datagw03.supermuc.lrz.de
    gsiftp://datagw04.supermuc.lrz.de
Associate the DN from your personal certificate with your LRZ username
In the following we assume that you successfully obtained your signed certificate as a .p12 file called SignedGridCert.p12. As a next step, you need to extract your DN (Distinguished Name) from the certificate. This can be done via
    openssl pkcs12 -in SignedGridCert.p12 -nodes | openssl x509 -noout -subject
    Enter Import Password:
    subject=C = DE, O = GridGermany, OU = Leibniz-Rechenzentrum, CN = John Doe
Afterwards, please follow the instructions on https://www.lrz.de/services/compute/grid_en/certificate_en/person-certificate_en/register_cert_en/ to associate your DN with your LRZ account.
It may take up to thirty minutes until the association becomes valid.
Note: Please note the reversed order of the components in the DN. From the example above, the DN you need to enter into the IDM portal would be CN=John Doe,OU=Leibniz-Rechenzentrum,O=GridGermany,C=DE
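If you want to avoid reordering the components by hand, a small shell pipeline can do it. This is only a sketch, assuming the subject line has exactly the format shown above:

    # Extract the subject, reverse the comma-separated components, drop the spaces around '='
    openssl pkcs12 -in SignedGridCert.p12 -nodes 2>/dev/null \
      | openssl x509 -noout -subject \
      | sed 's/^subject=//' \
      | awk -F', ' '{ for (i = NF; i > 0; i--) printf "%s%s", $i, (i > 1 ? "," : "\n") }' \
      | sed 's/ = /=/g'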
Note: To use GridFTP with HLRS and JSC, you also need to associate your DN with your corresponding usernames at these sites. For JSC, this can be done at https://judoor.fz-juelich.de/ under "Change data"; for HLRS, you need to contact the colleagues there directly.
Getting a proxy certificate to use GridFTP
The only node at LRZ from which transfers between the sites can be initiated is login05, also known as

    skx-arch.supermuc.lrz.de
After login, you need to execute the following commands:

    mkdir -p ~/.globus
    mv SignedGridCert.p12 ~/.globus/usercred.p12
    chmod 600 ~/.globus/usercred.p12
Then you need to load the GridFTP module:

    module use -a /lrz/sys/share/modules/extfiles
    module load gridftp-client
Now you need to generate a proxy certificate with a limited lifetime. This is done via

    grid-proxy-init
After entering your passphrase, you should have obtained a proxy certificate which is valid for several hours.
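To check whether a valid proxy exists and how long it remains valid, the toolkit also provides grid-proxy-info:

    grid-proxy-info    # prints subject, issuer, type, and the remaining lifetime (timeleft)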
Copying data between the sites
With your valid proxy certificate, you can copy files using globus-url-copy. For example, to copy files from LRZ to JSC:
    globus-url-copy -vb -p 6 gsiftp://datagw03.supermuc.lrz.de/PATH/TO/FILE/AT/LRZ gsiftp://judacsrv.fz-juelich.de/PATH/TO/FILE/AT/JSC
From HLRS to LRZ (note the different port 2812 in the HLRS URI):
    globus-url-copy -vb -p 6 gsiftp://gridftp-fr1.hww.de:2812/PATH/TO/FILE/AT/HLRS gsiftp://datagw04.supermuc.lrz.de/PATH/TO/FILE/AT/LRZ
You can also initiate third-party transfers from JSC to HLRS from LRZ (also on login05):
    globus-url-copy -vb -p 6 gsiftp://judacsrv.fz-juelich.de/PATH/TO/FILE/AT/JSC gsiftp://gridftp-fr2.hww.de:2812/PATH/TO/FILE/AT/HLRS
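To copy whole directories, globus-url-copy offers a recursive flag. A sketch, assuming -r is supported by the installed version (directory URLs must end with a trailing slash):

    # Recursively copy a directory from LRZ to JSC (hypothetical paths)
    globus-url-copy -vb -p 6 -r gsiftp://datagw03.supermuc.lrz.de/PATH/TO/DIR/AT/LRZ/ gsiftp://judacsrv.fz-juelich.de/PATH/TO/DIR/AT/JSC/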
For more information on the usage of globus-url-copy, see https://gridcf.org/gct-docs/6.2/gridftp/user/index.html.
If you encounter any problems with GridFTP, please contact our Servicedesk.
Data Transfer to/from PRACE sites
Note: Coming soon.
Grid Community Toolkit (GridFTP)
Beyond Data Transfer: Sharing and Public Access
If you want to share data generated on SuperMUC-NG with external parties or even make this data publicly available, have a look at our LRZ Data Science Storage (DSS). GCS has funded 20 PB of DSS storage for SuperMUC-NG. For details on how SuperMUC-NG projects can apply for storage space on DSS, see: Data Science Storage for SuperMUC.