File Transfer (Globus and SFTP)¶
Overview¶
For a quick start guide please visit: Globus: How to Get Started.
If you need further assistance beyond this documentation, please contact our helpdesk.
Review https://www.globus.org/researchers/getting-started for a step-by-step guide on using Globus Online.
Note: For a video on how to utilize Globus Online with Google Drive, please see https://www.youtube.com/watch?v=tDdVsNVK3ko&feature=youtu.be.
Transfer Data¶
Use the Globus Online web interface to transfer data files between your PC/workstation and the WVU HPC systems. Two dedicated servers, called data transfer nodes (DTNs), are connected to WVU’s Science DMZ (REX) and allow data to be transferred with low latency and high bandwidth across the WVU campus as well as to other locations around the globe. This space is available to all researchers at WVU who need to temporarily store and transfer data.
Logging in and transferring files¶
Visit Globus Online to log in.
Select West Virginia University from the organizational list and log in using your WVU Login username and password.
Transferring data to your local workstations/PC¶
To transfer data to your local workstation, you will need to install Globus Connect Personal.
Archive Data¶
Globus Online provides an easy mechanism for researchers to archive their data sets for free to their MIX Google Drive account via a Google Drive Connector. All WVU faculty and students have unlimited storage in Google Drive through their MIX account. Staff can request a MIX account by contacting helpdesk@hpc.wvu.edu. Google Drive gives researchers a safe environment to archive their data sets, which are replicated between multiple data centers to prevent data loss.
Note: You can also use the Google Drive web interface or the Google Drive Sync Client to transfer files to your Google Drive MIX account; however, the sync client and web interface are best used for small files (for example .docx, .pdf, etc.). The Globus Google Drive Connector is optimized for transferring a large number of files as well as very large files, which cannot be done via the Google Drive web interface or Sync Client. You will see the best performance with the Google Drive Connector when transferring several large files at once. If you have many small files, it might be best to zip/tar these files first before transferring them to Google Drive.
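The advice to zip/tar many small files before a transfer can be sketched with Python's standard `tarfile` module. This is a minimal illustration, not an official WVU tool; the function name `bundle_directory` and any paths you pass to it are hypothetical.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir, archive_path):
    """Pack every regular file under src_dir into one gzip-compressed tar file.

    Returns the number of files added. Write the archive OUTSIDE src_dir,
    otherwise the partially written archive could be picked up by the walk.
    """
    src = Path(src_dir).resolve()
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store names relative to the parent directory so the archive
                # unpacks into a single top-level folder named after src_dir.
                tar.add(path, arcname=str(path.relative_to(src.parent)))
                count += 1
    return count
```

For example, `bundle_directory("run01", "run01.tar.gz")` would produce one archive that can then be transferred as a single large file instead of many small ones.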
Google Drive is not approved for storing sensitive or protected data sets (e.g. HIPAA, FERPA, ITAR, data subject to Export Control laws, etc.). If you need assistance archiving these types of data sets, please contact helpdesk@hpc.wvu.edu.
To archive data to Google Drive, follow the Globus Online Google Drive Connector Instructions, which will guide you through creating a Shared Endpoint that you can attach to your MIX Google Drive Account to transfer data to/from.
Globus Online Google Drive Connector Instructions¶
Note: In step one, when searching all endpoints you will need to search for ‘wvu#googledrive’.
To transfer files to your MIX Google Drive Account, review the “Transfer Data” Section above. You will want to select the name of the Shared Endpoint you just created in step 3 as one of the two endpoints. The other endpoint could be a WVU endpoint, such as wvu#hpcdtn or a personal endpoint you created that connects to your workstation/PC.
Publish Data¶
Use Globus Online to publish your datasets, possibly as a requirement from a granting agency. Published datasets are organized by communities and their member collections and are searchable by other Globus users.
Note: If you are not using an endpoint that starts with wvu#, you must contact Research Computing and request that a shared endpoint be created for you before you can start sharing data using Globus.
For more information on how to publish your data sets using Globus Online, visit Globus, Data Publication User Guide.
Because users usually need to transfer large amounts of data to/from HPC systems, reliable, secure, and highly optimized transfer methods need to be used. As a result, WVU offers dedicated Data Transfer Nodes (DTNs) at each location it supports. Transfers to our systems should only be done through one of these DTNs for several reasons:
DTNs are dedicated to transferring data only and are built specifically to let users transfer data as quickly, efficiently, and securely as possible. Only DTNs are connected to WVU’s Science DMZ network, which is a network dedicated to science/research data across campus.
Users who transfer data using a login node can put the login node under strain, causing it to slow down, affecting other users, and possibly even causing the node to fail.
WVU’s recommended method for transferring data is Globus Online. The other supported method is sftp.
Data Transfer Nodes¶
Research Computing offers dedicated servers for transferring files at each one of its supported systems/locations (i.e. Spruce Knob, Thorny Flat, and the Data Depot). The following table describes the services available at each location as well as the associated Globus Endpoint Name or sftp Hostname.
| | Spruce Knob | Thorny Flat | Data Depot | MIX Google Drive |
|---|---|---|---|---|
| Globus Endpoint | wvu#hpcdtn | wvu#thornydtn | wvu#datadepot | wvu#GoogleDrive |
| sftp Hostname | data.hpc.wvu.edu | tf-data.hpc.wvu.edu | datadepot-sftp.hpc.wvu.edu | NA |
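As a rough sketch of how the sftp hostnames in the table above might be used from a script, the following Python snippet builds the argument list for a non-interactive `sftp -b` session. The helper name `sftp_batch_command`, the username, and the batch file name are hypothetical. OpenSSH's batch mode also needs non-interactive authentication (for example SSH keys), and whether the DTNs accept that is an assumption not confirmed by this page.

```python
from typing import List

# Hostnames copied from the table above; MIX Google Drive has no sftp host.
SFTP_HOSTS = {
    "Spruce Knob": "data.hpc.wvu.edu",
    "Thorny Flat": "tf-data.hpc.wvu.edu",
    "Data Depot": "datadepot-sftp.hpc.wvu.edu",
}


def sftp_batch_command(system: str, username: str, batch_file: str) -> List[str]:
    """Return an argv list for running sftp in batch mode against a DTN.

    batch_file is a text file of sftp commands, one per line
    (for example a single line such as: put run01.tar.gz).
    """
    host = SFTP_HOSTS[system]  # raises KeyError for systems without sftp
    return ["sftp", "-b", batch_file, f"{username}@{host}"]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Building the command as a list rather than a single string avoids shell quoting problems with unusual file names.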
Alternative Transfer Methods¶
To transfer files to Research Computing systems without Globus Online, you can use sftp with the hostnames in the table above. Basic instructions on how to use SFTP can be found here. If you are more comfortable with graphical user interfaces, you may want to use one of the following clients: