.. _bs-file-transfer: File Transfer (Globus and SFTP) =============================== Overview -------- .. `Globus Online `__ is a tool that allows WVU researchers to `transfer data `__ between workstations, `share data `__ with colleagues, `publish datasets `__, and archive data in Google Drive for Education. For a quick start guide please visit: `Globus: How to Get Started `__. If you need further assistance beyond this documentation, please contact `iour helpdesk `__. **Review** https://www.globus.org/researchers/getting-started for a step by step guide on use Globus Online. Note: For a video on how to utilize Globus Online with Google Drive, please see https://www.youtube.com/watch?v=tDdVsNVK3ko&feature=youtu.be. Transfer Data ------------- Use Globus Online web browser to transfer data files between your PC/workstation and the WVU HPC systems. There are two dedicated servers, data transfer nodes (DTNs), that are connected to `WVU's Science DMZ (REX) `__ that allow data to be transferred with low latency and high bandwidth across WVU campus as well as to other locations around the globe. This space is available for all researchers at WVU who need to temporarily store and transfer data. .. In addition to faster transfer speeds, Globus Online also includes restartable file transfers in case a connection fails and includes data checksumming to ensure data is correctly transferred.   Logging in and transferring files ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Visit the URL `Globus Online `__ to login. .. `__ to login. Select West Virginia University from the organizational list and login using your WVU Login username and password. .. Navigate to the `Transfer Files `__ page to start transferring files. .. Additional information about how to Login and Transfer Files can be found at `Globus: How to Get Started `__. .. **Note:** WVU's High Performance Computing (HPC) End Point is named wvu#hpcdtn. You can search for other endpoints in the "Endpoint" Dialog Box. Transferring data to your local workstations/PC ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To learn how to transfer data to your local workstation, you will need to install Globus Connect Personal. .. Instructions on how to install Globus Connect Personal is located at  `Globus: Globus Connect Personal `__. .. *Note*: Globus Connect Personal is only needed to transfer files to your personnel workstation/PC. .. Most major academic institutions already have a Globus Connect Server installed, which allows you to transfer easily to the remote institution.  Share Data ---------- Use Globus Online to share your data files with researchers located at other institutions around the world without needing administrative assistance. *Note* If you are not using an endpoint that starts with ``wvu\#``, before you can start sharing data using Globus, you must contact Research Computing and request that a shared endpoint be created for you. .. Globus, visit `Globus: How to Share Data Using Globus `__. Archive Data ------------ Globus Online provides an easy mechanism for researchers to archive their data sets for free to their MIX Google Drive account via a `Google Drive Connector `__. All WVU faculty and students have unlimited storage in Google Drive through their MIX account. Staff can request a MIX account by contacting helpdesk@hpc.wvu.edu. Google Drive gives researchers a safe environment to archive their data sets, which replicates between multiple data centers to prevent data loss. Note: You can also use the Google Drive web interface as well as the `Google Drive Sync Client `__ to transfer files to your Google Drive MIX account; however, the sync client and web interface is best used for small files (Example .docx, .pdf, etc.). Globus Google Drive Connector is optimized for transferring a large number of files as well as very large files, which cannot be done, via the Google Drive Web Interface on Sync Client. You will see the best performance with Google Drive Connector with transferring several large files at once. If you have many small files, it might be best to zip/tar these files first before transferring to Google Drive. **Google Drive is not approved for use of sensitive or protected data sets (i.e. HIPPA, FERPA, ITAR, subject to Export Control laws, etc.). If you need assistance archiving these types of data sets, please contact helpdesk@hpc.wvu.edu.** To archive data to a Google Drive you will need to follow the instructions in the Globus Online Google Drive Connector Instructions, which will guide you on how to make a Shared Endpoint that you can attach to your MIX Google Drive Account to transfer data to/from. `Globus Online Google Drive Connector Instructions `__ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - **Note:** In step one, when searching all endpoints you will need to search for ‘wvu#googledrive’. To transfer files to your MIX Google Drive Account, review the “Transfer Data” Section above. You will want to select the name of the Shared Endpoint you just created in step 3 as one of the two endpoints. The other endpoint could be a WVU endpoint, such as wvu#hpcdtn or a personal endpoint you created that connects to your workstation/PC. Publish Data ------------ Use Globus Online to publish your datasets, possibly as a requirement from a granting agency. Published datasets are organized by communities and their member collections and are searchable by other Globus users. *Note* If you are not using an endpoint that starts with ``wvu\#``, before you can start sharing data using Globus, you must contact Research Computing and request that a shared endpoint be created for you. For more information on how to publish your data sets using Globus Online, visit `Globus, Data Publication User Guide `__. Due to a large amount of data users usually need to transfers to/from HPC systems, reliable, secure, and highly optimized transfer methods needs to be utilized. As a result, WVU offers dedicated Data Transfer Nodes (DTNs) at each location it supports. Transfers to our systems should only be done through one of these DTNs for several reasons: * DTNs are dedicated to transferring data only and built specifically for allowing users to transfers data as quickly, efficiently, and securely as possible. Only DTN's are connected to `WVU's Science DMZ network `__ which is a dedicated network to science/research data across campus. * Users who transfer data using a login node can put the login node under strain causing it to slow down affecting other users and even possibly cause the node to fail. WVU's recommended method for transferring data is `Globus Online `__. The other supported method is sftp. Data Transfer Nodes ------------------- Research Computing offers dedicated servers for transferring files at each one of its supported systems/locations (i.e. Spruce Knob, Thorny Flat, and the Data Depot). The following table describes the services available at each location as well as the associated Globus Endpoint Name or sftp Hostname. +-----------------+------------------+----------------------+----------------------------+--------------------+ | | Spruce Knob | Thorny Flat | Data Depot | MIX Google Drive | +=================+==================+======================+============================+====================+ | Globus Endpoint | wvu#hpcdtn | wvu#thornydtn | wvu#datadepot | wvu#GoogleDrive | +-----------------+------------------+----------------------+----------------------------+--------------------+ | sftp Hostname | data.hpc.wvu.edu | tf-data.hpc.wvu.edu | datadepot-sftp.hpc.wvu.edu | NA | +-----------------+------------------+----------------------+----------------------------+--------------------+ Alternative Transfer Methods ---------------------------- To transfer files to Research Computing systems without Globus Online, users may choose to utilize sftp to the hostnames in the above table. Basic instructions on how to use SFTP can be found `here `__. For users who are more comfortable with Graphical User Interfaces, you may want to utilize one of the following clients: * https://winscp.net/eng/index.php * https://filezilla-project.org/