File Storage

MSI provides two storage tiers for different stages of research workflows. For a service-level overview, capacity guidance, and additional policy information, see the MSI Storage page.

Choosing the right storage location

Home directory

  • Best for: Personal configuration files, source code, small working files, and lightweight software environments.

  • How you access it: POSIX filesystem access on MSI systems, Open OnDemand, SSH.

  • Important limits: Limited capacity and file count. Use project space rather than $HOME for shared or fast-growing research data.

Project space

  • Best for: Active shared research data that a PI group needs to keep on the high-performance filesystem.

  • How you access it: POSIX filesystem access on MSI systems, Open OnDemand, SSH.

  • Important limits: Intended for active work. Snapshots are available, but groups still need their own retention plan for important data.

Global scratch

  • Best for: Temporary job data, staging, and short-lived intermediate outputs.

  • How you access it: POSIX filesystem access on MSI systems, Open OnDemand, SSH.

  • Important limits: Data older than 30 days is deleted automatically. No snapshots or backups.

Tier 2

  • Best for: Large-scale shared storage, inactive data, data sharing, and S3-compatible workflows.

  • How you access it: s3cmd, rclone, and Globus.

  • Important limits: Not a mounted filesystem. No snapshots or MSI-managed backups.
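
Because global scratch data is deleted after 30 days, it can help to list files that are approaching that age before they disappear. The following is a minimal sketch, not an MSI-provided tool: the function name list_purge_candidates and the 25-day default are hypothetical, and it assumes file age is judged by modification time, which the policy above does not specify.

```shell
# list_purge_candidates DIR [DAYS]
# Print regular files under DIR that were last modified more than DAYS
# whole days ago (find rounds file ages down to full 24-hour periods).
# Assumption: age is measured by modification time.
list_purge_candidates() {
    dir=$1
    days=${2:-25}
    find "$dir" -type f -mtime +"$days" -print
}

# Example (on an MSI system): list_purge_candidates "/scratch.global/$USER" 25
```

Anything the function reports that you still need can be copied to project space or Tier 2 before the 30-day limit is reached.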

Tier 1

Tier 1 is MSI’s high-performance filesystem for active research data. It includes private home directories, shared project space, and global scratch.

For common workflows that move data between MSI, local systems, Google Drive, and external collaborators, see Transferring Data To and From MSI.

Default allocations

  • Home directory: 200 GB and 1 million files per user

  • Project space: 150 GB and 5 million files per group by default

  • Global scratch: 40 TB and 10 million files per group
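
To see how close a directory is to one of the file-count limits above, a rough check can be scripted with standard tools. This is only a sketch: count_files and check_file_count are hypothetical helper names, the 80 percent warning threshold is an arbitrary choice, and the count covers regular files as seen by find, which may differ from how MSI's own quota accounting counts.

```shell
# count_files DIR: print the number of regular files under DIR.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# check_file_count DIR LIMIT [PERCENT]
# Print OK or WARNING depending on whether the file count exceeds
# PERCENT (default 80) of LIMIT. Returns 1 when the warning fires.
check_file_count() {
    dir=$1
    limit=$2
    pct=${3:-80}
    n=$(count_files "$dir")
    threshold=$(( limit * pct / 100 ))
    if [ "$n" -gt "$threshold" ]; then
        echo "WARNING: $dir holds $n files (limit $limit)"
        return 1
    fi
    echo "OK: $dir holds $n files (limit $limit)"
}

# Example (on an MSI system): check_file_count "$HOME" 1000000
```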

Common Tier 1 paths

# Private home directory
/users/[0-9]/$USER

# Project space
/projects/standard/PROJECT_NAME
/projects/regulated/PROJECT_NAME

# Shared and public project directories
$SHARED
$PUBLIC

# Global scratch
/scratch.global/$USER
$SCRATCH

Project directories are organized for shared group access, so MSI does not create private home directories directly under the root of a project. If you need a personal working area inside project space, create a directory inside the project’s shared area:

mkdir -v "$SHARED/$USER"

This keeps the top-level project directory organized while still giving each user a predictable place for personal working files inside the group’s Tier 1 space.

Data Protection

MSI takes several precautions to reduce the risk of data loss on Tier 1, but those protections are not a substitute for your own backups of irreplaceable data.

Snapshots

Snapshots retain the same file structure and permissions that existed when the snapshot was taken. They are available for:

  • Home directories under /users/*/*

  • Project directories under /projects/*/PROJECT_NAME, including subdirectories and SURFs folders

Snapshots are not available for /scratch.global or Tier 2 storage.

Snapshot schedule

  • Daily snapshots are typically named like snapshot_2026-04-15_05_00_00_UTC

  • Weekly snapshots are typically named like snapshot_2026-04-12_00_00_00_UTC

  • Daily snapshots are retained for 6 days

  • Weekly snapshots are retained for 4 weeks

Access and list snapshots

# List snapshots for your home directory
ls -1 "$HOME/.snapshot"

# Example output
snapshot_2026-03-22_00_00_00_UTC
snapshot_2026-03-29_00_00_00_UTC
snapshot_2026-04-05_00_00_00_UTC
snapshot_2026-04-09_05_00_00_UTC
snapshot_2026-04-10_05_00_00_UTC
snapshot_2026-04-11_05_00_00_UTC
snapshot_2026-04-12_00_00_00_UTC
snapshot_2026-04-12_05_00_00_UTC
snapshot_2026-04-13_05_00_00_UTC
snapshot_2026-04-14_05_00_00_UTC
snapshot_2026-04-15_05_00_00_UTC

# List snapshots for your project's shared directory
ls -1 "$SHARED/.snapshot"

# Browse a specific snapshot
ls -lah "$SHARED/.snapshot/snapshot_2026-04-12_00_00_00_UTC"

Copy data back from a snapshot

# Restore a file from a home directory snapshot
cp -a "$HOME/.snapshot/snapshot_2026-04-14_05_00_00_UTC/example.txt" "$HOME/"

# Restore a project file into your personal work directory in shared space
cp -a "$SHARED/.snapshot/snapshot_2026-04-12_00_00_00_UTC/results/output.csv" "$SHARED/$USER/"

# If you are already in the directory that contains the deleted file,
# you can inspect the matching path inside a snapshot first
ls -lah .snapshot/snapshot_2026-04-14_05_00_00_UTC

You must still have permission to write to the destination where you are restoring data.
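
When it is unclear which snapshot still holds a deleted file, the search can be automated. The sketch below walks the snapshot directories and prints the newest copy it finds. The function name latest_snapshot_copy is hypothetical, and the approach assumes every snapshot follows the snapshot_YYYY-MM-DD_HH_MM_SS_UTC naming shown above, so that name order matches chronological order.

```shell
# latest_snapshot_copy BASE REL_PATH
# BASE is a directory with a .snapshot subdirectory (for example "$HOME").
# Prints the path of REL_PATH inside the newest snapshot that contains it,
# or returns 1 if no snapshot has it.
latest_snapshot_copy() {
    base=$1
    rel=$2
    found=
    # Snapshot names share one fixed-width timestamp format, so the
    # glob expands in chronological order and the last match is newest.
    for snap in "$base"/.snapshot/snapshot_*; do
        if [ -e "$snap/$rel" ]; then
            found=$snap/$rel
        fi
    done
    if [ -n "$found" ]; then
        printf '%s\n' "$found"
    else
        return 1
    fi
}

# Example: cp -a "$(latest_snapshot_copy "$HOME" example.txt)" "$HOME/"
```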

Disaster recovery

MSI also maintains periodic tape backups for disaster recovery from these Tier 1 locations:

  • /projects/standard/GROUP/public/disaster_recovery

  • /projects/standard/GROUP/shared/disaster_recovery

These tape backups are intended for rare cases where snapshots are not usable, such as a catastrophic storage or data center event. They are not intended for routine, user-directed file restores, and recovered data may not represent the exact point in time you want. Users should still maintain their own secondary copies of difficult-to-recreate data.

As of December 2, 2025, MSI notes that tape backups are not yet available for /projects/regulated/GROUP/shared/disaster_recovery.

Tier 2

Tier 2 is MSI’s object storage platform for large-scale data management, collaboration, and workflows that benefit from S3-compatible access. For service details, see MSI’s Tier 2 storage page.

Typical Tier 2 workflow

Tier 2 is commonly used as shared lab storage managed at the PI or group level:

  • A PI can request help creating an initial shared bucket and receives a default Tier 2 allocation of 120 TB

  • Group members have a default personal quota of 5 GB

  • The PI or bucket administrator can grant collaborators read and write access to the shared bucket

  • Lab members can then upload, retrieve, and share data without placing all long-term storage pressure on Tier 1

Access methods

Tier 2 buckets use S3-style object storage rather than a mounted filesystem. Common access methods include:

  • s3cmd for command-line access on MSI systems

  • rclone for scripting and syncing workflows

  • Globus for managed transfers

Bucket names are typically referenced like this:

s3://BUCKET_NAME
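
As an illustration of the s3cmd access method, the sketch below wraps three common operations (upload, list, download) in small helper functions. It assumes s3cmd is already configured with your Tier 2 credentials. The helper names and the DRY_RUN switch are hypothetical conveniences, and BUCKET_NAME remains a placeholder for your group's bucket.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# tier2_upload LOCAL_FILE BUCKET [PREFIX]: copy a file into a bucket.
tier2_upload() {
    run s3cmd put "$1" "s3://$2/${3:-}"
}

# tier2_list BUCKET [PREFIX]: list objects in a bucket or under a prefix.
tier2_list() {
    run s3cmd ls "s3://$1/${2:-}"
}

# tier2_download BUCKET OBJECT LOCAL_FILE: fetch one object.
tier2_download() {
    run s3cmd get "s3://$1/$2" "$3"
}

# Example: DRY_RUN=1; tier2_upload results.tar.gz BUCKET_NAME archive/
```

Setting DRY_RUN=1 first is a low-risk way to confirm the bucket name and prefix before any data moves.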

Additional capacity and backups

Tier 2 data is not protected by snapshots or MSI-managed backups. If data stored there is deleted or lost, MSI may not be able to recover it. PIs and research groups should plan their own backup strategy for important Tier 2 data.
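
One possible building block for a group's own backup strategy is a checksum manifest written before data is uploaded, so that any later copy can be verified against it. The sketch below assumes GNU sha256sum is available. The function names are hypothetical, and the manifest should be stored outside the data directory so it is not hashed along with the data.

```shell
# write_manifest DATA_DIR MANIFEST_FILE
# Record a SHA-256 checksum for every regular file under DATA_DIR,
# using paths relative to DATA_DIR and sorted by path.
write_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + ) | sort -k 2 > "$2"
}

# verify_manifest DATA_DIR MANIFEST_FILE
# Re-check the files under DATA_DIR against the manifest. Prints only
# failures and returns nonzero if any file is missing or has changed.
verify_manifest() {
    ( cd "$1" && sha256sum --quiet -c - ) < "$2"
}

# Example:
#   write_manifest ./dataset ./dataset.sha256
#   verify_manifest ./restored_dataset ./dataset.sha256
```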

Groups that need more Tier 2 capacity can request additional storage through the MSI Service Catalog.