Globus Online for GM/CA @ APS Users
Globus Online is a free service sponsored by DOE, NIH, NSF, Argonne, and
the University of Chicago (see the
list of sponsors).
It addresses the challenges faced by researchers in moving, sharing, and archiving
large volumes of data among distributed sites. With Globus Online, you hand off
data movement tasks to a hosted service that manages the entire operation:
monitoring performance and errors, retrying failed transfers, correcting
problems automatically whenever possible, and reporting status to keep you
informed so that you can focus on your research. Our tests show that Globus
Online is 2x faster than scp and rsync. This makes a big difference, reducing
data transfer times from, e.g., 6 hours to 3 hours. Also, a transfer
can be started and monitored from any place with an Internet connection, e.g. from the ANL
Guest House, an airport, or home.
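To get a feel for what a 2x speedup means for your own dataset, the arithmetic can be sketched in a few lines of Python. The dataset size and the two transfer rates below are hypothetical values, chosen only to reproduce the 6-hour versus 3-hour example; they are not measurements from GM/CA.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb gigabytes at a sustained rate in MB/s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = size_gb * 1000 / rate_mb_per_s  # 1 GB = 1000 MB (decimal units)
    return seconds / 3600


# Hypothetical numbers, used only for illustration.
dataset_gb = 1080     # assumed size of the collected data
scp_rate = 50         # MB/s, assumed sustained scp/rsync rate
globus_rate = 100     # MB/s, assumed Globus rate (2x the scp rate)

print(f"scp/rsync: {transfer_hours(dataset_gb, scp_rate):.1f} h")
print(f"Globus:    {transfer_hours(dataset_gb, globus_rate):.1f} h")
```

Substituting your own dataset size and an observed transfer rate gives a rough estimate of how long a transfer will take.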
Transferring Data Between GM/CA @ APS and Your Machine
- Log in to Globus. There are at least three ways to do it:
- If your institution has an association with Globus, you can log in
to Globus using your existing organizational login. Navigate to
https://auth.globus.org,
choose your institution from the drop-down list, and click "Continue". This will
bring up your institutional login web page, where you will need to enter your
institutional username and password. Your institution will then authenticate
you with Globus.
- If you have a Google account, you can use it too. Navigate to
https://auth.globus.org
and click on "Sign in with Google". Then follow the prompts.
- You can create a personal Globus ID account (a one-time step) at
https://www.globusid.org/create.
Then, navigate to https://auth.globus.org,
choose "Globus ID" from the drop-down list, and click "Continue".
- Once you log in, you will see the File Manager interface. Choose the two-panel option at the top, so that
one panel (normally the left) shows your home directory at GMCA and the other
shows a directory on your computer.
- If you already have the globusconnect application installed on the computer
that will be receiving data, and the computer already has an encryption key
registered with Globus under your account, then skip to Step 4. Otherwise,
install globusconnect on the computer and obtain a unique computer Setup
Key from Globus. Click in either of the two "Collection" fields to reveal the globusconnect
download link:
The link brings you to a page where you can generate your personal Setup Key (if needed)
and download the Globus client for MacOS, Linux, and Windows:
- Name your data-receiving computer as a new Globus Endpoint. Then generate the computer Setup
Key by pressing the "Generate Setup Key" button and copy the key to the clipboard. You
will need to provide this key to the Globus client after installation.
- Download the one-click Globus Connect Setup for the operating system of the data-receiving computer.
The application is available for MacOS, Linux, and Windows. Note the place
where you saved the installer (e.g. on the Desktop or in the "Downloads" folder).
- Unpack and install Globus Connect by running the downloaded Setup application.
When Setup asks for a setup key, paste the
previously copied Setup Key from the clipboard. More detailed instructions can be found on the
Globus website.
- Start the globusconnect
application on the data-receiving computer. For example, on Linux, starting
globusconnect is as simple as typing
"./globusconnect" in the directory where it is installed:
On Windows, the application can be started via the Start -> Programs ->
Globus Connect menu. Normally no administrative privileges are required.
NOTE: The globusconnect client should be
started each time you want to transfer files to or from your data-receiving
computer. This application makes the computer visible among the available endpoints
on the Globus Online web page. The client has a graphical interface, which looks like
this:
- On the Globus Connect web page, start typing gmca in the left "Collection" field to display
the GMCA endpoints. Then, select one of the following endpoints:
GMCA 23ID/APS Data Collection
or:
GMCA 23ID/APS Data Collection 2
You can select either of the two, regardless of the GMCA beamline you used.
- Enter your GMCA account credentials as prompted to access data at GMCA. Please note that the account is
normally active for 2 days after your beamtime to avoid interference with other users. However, if there
is an unforeseen reason for an extension, please contact your host.
- Once the credentials are accepted, you should see your folders at GMCA:
- Click on the right "Collection" pane and choose your local endpoint on the "Your Collections" tab:
- After that, you should see a listing of the local directory in the right pane.
- Select a file or directory and click on the highlighted "arrow button"
to initiate the transfer:
- To watch the progress of a file transfer, or to cancel it, choose
"View Transfers" from the drop-down menu in the top bar of the
Globus Online web page. The screen will look like this:
NOTE-1: In its present form, Globus Online offers a directory synchronization
option in the "Transfer Files" drop-down box, but no continuous synchronization.
Although the lack of continuous synchronization is an inconvenience, it is
outweighed by the speed, which has been tested to be at least 2x faster than
traditional scp or rsync (rsync normally runs over ssh, as scp does). We recommend
using Globus Online for transferring large amounts of data and possibly scp/rsync
for post-transfer synchronizations.
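As an illustration of the post-transfer synchronization mentioned in NOTE-1, the sketch below assembles an rsync command that copies only new or changed files. The user name, host name, and paths are hypothetical placeholders, not real GM/CA values, and the sketch assumes that ssh access to the data host is available to you. It only builds and prints the command; nothing is executed.

```python
import shlex


def build_rsync_command(remote_user, remote_host, remote_dir, local_dir):
    """Return an rsync argument list that pulls remote_dir into local_dir."""
    # A trailing slash on the source copies the directory contents,
    # not the directory itself.
    source = f"{remote_user}@{remote_host}:{remote_dir.rstrip('/')}/"
    return [
        "rsync",
        "-a",          # archive mode: recurse, preserve times and permissions
        "-v",          # list files as they are transferred
        "--partial",   # keep partially transferred files so a rerun can resume
        "-e", "ssh",   # use ssh as the transport
        source,
        local_dir,
    ]


# Hypothetical placeholders: replace with your own account, host, and paths.
cmd = build_rsync_command("username", "gmca-host.example.org",
                          "/path/to/data", "./data")
print(shlex.join(cmd))
```

The printed line can be reviewed and then pasted into a terminal. Because rsync skips files that are already up to date, rerunning it after the bulk Globus transfer picks up only what changed.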
NOTE-2: All workstations serving the GM/CA Globus nodes have a 100 Gb/s fiber
network connection to the Argonne border. If you find download speeds insufficient,
the problem is almost certainly not at the Argonne site. Check the speed of your
connection to Argonne with one of the tools suggested here.
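To judge whether an observed download speed is limited by your own connection, it can help to convert a completed transfer into an average throughput and compare it with the Argonne link. The following is a minimal sketch: it assumes the quoted link speed means 100 gigabits per second, and the byte count and duration are made-up example values.

```python
def throughput_mbit_per_s(bytes_transferred: int, elapsed_s: float) -> float:
    """Average throughput in megabits per second (decimal units)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return bytes_transferred * 8 / elapsed_s / 1_000_000


# Assumption: the link figure quoted above means 100 gigabits per second.
ARGONNE_LINK_MBIT_PER_S = 100_000

# Hypothetical observation: 4.5 GB moved in 60 seconds.
rate = throughput_mbit_per_s(4_500_000_000, 60)
share = rate / ARGONNE_LINK_MBIT_PER_S
print(f"{rate:.0f} Mbit/s, {share:.1%} of the Argonne link")
```

With these example numbers the average rate is 600 Mbit/s, a small fraction of the Argonne link, which would point to a bottleneck elsewhere along the path, such as your local network.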
Additional Learning Resources
- Video guide: "Remote data
transfers with Globus GridFTP" presented by Raj Kettimuthu, MCS. This
12-minute video, courtesy of one of the core Globus developers, guides you through the
steps of setting up Globus transfers with GMCA servers.
- In-depth video guide
"Globus for Research Data Management" by Rachana Ananthakrishnan, Globus;
presented at the Argonne Training Program on Extreme-Scale Computing, Summer 2015
(53 minutes).