Transferring Data To and From MSI

This page outlines common ways to move data between your local computer, external services, and MSI storage.

Choose the transfer method that matches your source and destination:

  • Google Drive and Tier 1: use rclone

  • Local computer and Tier 1: use SFTP

  • Local computer and Tier 1 or Tier 2: use Globus

  • External Globus user and MSI: use the globus-incoming workflow on the MSI Globus endpoint

Before starting any large transfer:

  • confirm that the destination path is correct

  • make sure you have enough quota available at the destination

  • verify the copied data before deleting the original copy
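The last check can be scripted. The following is a minimal sketch, assuming a Linux shell with GNU sha256sum and find available; the helper names manifest and same_tree are hypothetical, not MSI tools. It compares two directory trees by file checksum, so it only applies when both copies are visible as local paths (for example, two Tier 1 directories).

```shell
# Hypothetical helper: print a sorted checksum manifest for every file under a directory.
manifest() {
    # Run in a subshell so the cd does not affect the caller and paths stay relative
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# Hypothetical helper: succeed only if both directories exist and hold
# the same relative file paths with the same contents.
same_tree() {
    [ -d "$1" ] && [ -d "$2" ] && [ "$(manifest "$1")" = "$(manifest "$2")" ]
}

# Example (paths are placeholders):
# same_tree ./data /projects/standard/PROJECT_NAME/shared/data && echo "copies match"
```

Empty directories are not part of the manifest, so this check covers files only.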

Transfer Between Google Drive and Tier 1 with rclone

For Google Drive transfers, MSI recommends rclone.

Routine transfers run from the command line on MSI; for larger copies, run them inside an interactive job. The one-time Google authentication step may require a browser session, so many users complete the initial setup from an Open OnDemand desktop session and then use the command line for later transfers.

Load the MSI module:

module load rclone/1.64.1

One-Time Configuration

You only need to configure the remote once.

Start the configuration tool:

rclone config

Use answers like these during the setup:

  • n to create a new remote

  • provide a name such as your_umn_gdrive

  • choose Google Drive for the storage type

  • leave client_id blank

  • leave client_secret blank

  • choose full access when prompted for scope

  • leave root_folder_id blank

  • leave service_account_file blank

  • choose No for advanced config

  • choose No for auto config on the headless MSI system

  • open the provided URL in a browser, sign in to your UMN Google account, authorize access, and paste the verification code back into the terminal

  • choose No for team drive unless you specifically need one

  • confirm the configuration and quit

The resulting rclone configuration is stored in your home directory for later use.

If you cannot complete the browser-based authorization from a plain SSH session, start an Open OnDemand desktop session temporarily, run rclone config there, and then return to the CLI for routine use.
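To confirm the setup worked, you can ask rclone which remotes it knows about. This is a sketch that assumes the example remote name your_umn_gdrive used above; remote_configured is a hypothetical helper, while rclone listremotes, rclone config file, and rclone lsd are standard rclone subcommands.

```shell
# Hypothetical helper: succeed if the named remote appears in `rclone listremotes`.
# rclone prints each configured remote as "name:", one per line.
remote_configured() {
    rclone listremotes | grep -Fxq "$1:"
}

# Example (not run here):
# rclone config file                      # show where the configuration is stored
# remote_configured your_umn_gdrive && rclone lsd your_umn_gdrive:
```

If the helper fails, rerun rclone config and check the remote name for typos.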

Routine Transfers

Copy a Google Drive folder into a new directory in your current Tier 1 location:

rclone copy -v your_umn_gdrive:Work_MSI/Projects/data ./data

Copy the contents of a Google Drive folder into your current Tier 1 directory:

rclone copy -v your_umn_gdrive:Work_MSI/Projects/data .

Copy a single file from Google Drive into your current Tier 1 directory:

rclone copy -v your_umn_gdrive:Work_MSI/Projects/file.txt .

rclone also works in the opposite direction: swap the source and destination to copy data from Tier 1 back to Google Drive.
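As an illustration of the Tier 1 to Google Drive direction, the sketch below wraps three standard rclone subcommands in a hypothetical helper named push_to_drive. The directory and remote paths in the example are placeholders, and the preview-then-copy-then-check sequence is one cautious pattern rather than an MSI requirement.

```shell
# Hypothetical helper: copy a local Tier 1 directory to a folder on a configured remote.
# Usage: push_to_drive LOCAL_DIR REMOTE:PATH
push_to_drive() {
    # --dry-run lists what would be copied without transferring anything
    rclone copy -v --dry-run "$1" "$2" || return 1
    # The real copy
    rclone copy -v "$1" "$2" || return 1
    # rclone check compares source and destination and exits non-zero on differences
    rclone check "$1" "$2"
}

# Example (not run here):
# push_to_drive ./results your_umn_gdrive:Work_MSI/Projects/results
```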

Transfer Between Your Local Computer and Tier 1 with SFTP

SFTP is a straightforward option when you want to move files between your local computer and MSI Tier 1 storage over SSH.

For a graphical client, see Transfer Between Your Local Computer and Tier 1 with FileZilla. For large or long-running transfers, Globus is usually the better choice.

Connect to MSI with SFTP:

sftp INTERNETID@agate.msi.umn.edu

Common SFTP commands:

  • lpwd shows your current local directory

  • pwd shows your current remote MSI directory

  • lcd LOCAL_PATH changes your local directory

  • cd REMOTE_PATH changes your remote MSI directory

  • put myfile.txt uploads a file from your computer to MSI

  • put -r mydir uploads a directory from your computer to MSI

  • get myfile.txt downloads a file from MSI to your computer

  • get -r mydir downloads a directory from MSI to your computer

Example session:

sftp INTERNETID@agate.msi.umn.edu
sftp> cd /projects/standard/PROJECT_NAME/shared/INTERNETID
sftp> lcd ~/Desktop
sftp> put -r analysis-output

Use SFTP for Tier 1 locations such as your home directory or project space. SFTP is not an access method for Tier 2 object storage.
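The example session above can also be written as an sftp batch file and run with the standard -b option. This is a sketch only: write_upload_batch is a hypothetical helper, and batch mode expects non-interactive authentication, which may not be available for your MSI account. If it is not, use the interactive session instead.

```shell
# Hypothetical helper: write an sftp batch file that uploads one local directory.
# Usage: write_upload_batch BATCH_FILE LOCAL_PARENT REMOTE_DIR DIR_NAME
write_upload_batch() {
    # Paths are wrapped in double quotes so names containing spaces survive sftp parsing
    cat > "$1" <<EOF
lcd "$2"
cd "$3"
put -r "$4"
EOF
}

# Example (not run here):
# write_upload_batch upload.batch ~/Desktop /projects/standard/PROJECT_NAME/shared/INTERNETID analysis-output
# sftp -b upload.batch INTERNETID@agate.msi.umn.edu
```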

Transfer Between Your Local Computer and Tier 1 or Tier 2 with Globus

Globus is often the easiest choice for large or long-running transfers because it manages retries and continues transfers even if your browser session ends.

At MSI, Globus can be used with:

  • UMN MSI Home for Tier 1 storage

  • UMN MSI Tier2 for Tier 2 storage

  • your local system through Globus Connect Personal or another Globus endpoint you can access

Typical Globus workflow:

  1. Sign in to Globus.

  2. Open the Files view.

  3. Select your local endpoint in one pane.

  4. Select UMN MSI Home or UMN MSI Tier2 in the other pane.

  5. Browse to the source and destination folders.

  6. Start the transfer.

  7. Review transfer status in Globus and confirm the files arrived where expected.
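The same workflow can be driven from a terminal with the separate Globus CLI. This page does not cover installing it, so treat the following as a sketch under assumptions: the globus command is installed and logged in, and you have looked up the collection IDs yourself. submit_transfer is a hypothetical helper and the label text is arbitrary.

```shell
# Hypothetical helper: submit a recursive Globus transfer between two collections.
# Usage: submit_transfer SRC_COLLECTION_ID SRC_PATH DST_COLLECTION_ID DST_PATH
submit_transfer() {
    # --recursive copies a whole directory; --label names the task in the Globus activity list
    globus transfer --recursive --label "local-to-msi" "$1:$2" "$3:$4"
}

# Example (not run here; the IDs and paths are placeholders):
# submit_transfer LOCAL_COLLECTION_ID /~/analysis-output/ MSI_COLLECTION_ID /~/analysis-output/
```

The command prints a task ID, which you can pass to globus task show to review status.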

For more on MSI’s Globus setup, see MSI’s Globus transfer page.

Receive Data from an External Globus User

If someone outside MSI needs to send you data through Globus, use the MSI Globus endpoint’s globus-incoming location described on the MSI Globus page.

Typical staging locations look like this:

  • Tier 1: /projects/standard/PROJECT_NAME/shared/globus-incoming

  • Tier 2: s3://BUCKET_NAME/globus-incoming

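Create the Tier 1 directory from the CLI if it does not already exist. This command assumes $SHARED is set to the shared directory of your project, such as /projects/standard/PROJECT_NAME/shared: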

mkdir -vp "$SHARED/globus-incoming"

For Tier 2, create the globus-incoming prefix from the Globus interface when you set up the transfer destination.
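For the Tier 1 case, a per-collaborator subdirectory keeps separate deliveries apart. The sketch below uses a hypothetical helper, make_incoming, that takes the shared directory as an explicit argument instead of relying on an environment variable; the subdirectory name in the example is a placeholder.

```shell
# Hypothetical helper: create a collaborator's staging directory under globus-incoming
# and print its full path.
# Usage: make_incoming SHARED_DIR SUBDIR_NAME
make_incoming() {
    # -p creates missing parents and succeeds if the directory already exists
    mkdir -p "$1/globus-incoming/$2" && printf '%s\n' "$1/globus-incoming/$2"
}

# Example (not run here):
# make_incoming /projects/standard/PROJECT_NAME/shared collaborator-run1
```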

Typical workflow:

  1. Coordinate with the external collaborator on the directory name and expected contents.

  2. Have the collaborator transfer the data into your designated location under globus-incoming on the MSI endpoint.

  3. After the transfer completes, log in to MSI and verify the files.

  4. Move the data from globus-incoming into its long-term destination in Tier 1 or transfer it onward to Tier 2 if that is the better storage location.

Treat globus-incoming as a staging area rather than a permanent storage location.
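Step 4 of the workflow above can be sketched as a small shell helper for the Tier 1 case. promote_incoming is a hypothetical name, and the refusal to overwrite an existing destination is a design choice of this sketch, not MSI policy. Run it only after you have verified the delivered files.

```shell
# Hypothetical helper: move a finished delivery out of globus-incoming.
# Usage: promote_incoming STAGING_DIR DEST_PARENT
promote_incoming() {
    local target
    target="$2/$(basename "$1")"
    if [ ! -d "$1" ]; then
        echo "missing staging directory: $1" >&2
        return 1
    fi
    # Never overwrite data that is already in the long-term location
    if [ -e "$target" ]; then
        echo "refusing to overwrite: $target" >&2
        return 1
    fi
    mkdir -p "$2" && mv "$1" "$2/"
}

# Example (not run here):
# promote_incoming "$SHARED/globus-incoming/collaborator-run1" "$SHARED/datasets"
```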

Choosing the Right Transfer Method

Use this quick guide when deciding how to move data:

  • Use rclone for Google Drive and Tier 1 transfers

  • Use SFTP for simple local-to-Tier-1 copies over SSH

  • Use Globus for large or long-running local transfers and for transfers that involve Tier 2

  • Use globus-incoming when receiving data from external Globus users