Scratch

Scratch Space or Scratch Storage refers to any storage at MSI that is impermanent, meaning all data it contains is deleted periodically or after a given event.

Global Scratch

Global Scratch (/scratch.global) is a large storage space that is available to all MSI users collectively. It has a total storage capacity of 910 TB, and quotas are on a per-group basis at a limit of 40 TB or 10 million files.

To use Global Scratch, users must create a directory named after their username under /scratch.global. The following command is a good starting point: it creates the user directory, changes into it, and prints the full path:

mkdir -p "/scratch.global/$USER" && cd "/scratch.global/$USER" && pwd

Data in Global Scratch becomes eligible for deletion 30 days after file creation. For this reason, users should not store important data in Global Scratch. Users are advised to keep larger datasets in Tier 2 storage so that they can be restored to Global Scratch as needed.
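As a rough way to see which files are approaching the 30-day limit, the sketch below lists files by age. It assumes that the modification time is a reasonable proxy for the timestamp used by the deletion policy, which this page does not confirm. The function name list_expiring and the 25-day default are illustrative choices, not MSI tools or settings.

```shell
# list_expiring DIR [DAYS]: print regular files under DIR whose modification
# time is more than DAYS days in the past (hypothetical helper, default 25).
# Note: find counts whole 24-hour periods, so "+25" matches files that are
# at least 26 days old.
list_expiring() {
    local dir="$1" days="${2:-25}"
    find "$dir" -type f -mtime +"$days" -print
}
```

For example, list_expiring "/scratch.global/$USER" 25 would print the files in the user's Global Scratch directory that are close to the limit under that assumption.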

Local Scratch

Local Scratch (/scratch.local) is temporary storage space allocated on a job’s compute node(s) while the job is running. It must be explicitly requested in the job’s Slurm script with the --tmp flag.

Any data in the allocated storage is deleted immediately after the job finishes running. It is crucial that users include commands in the job script to transfer any job-generated data they need to Tier 1 storage before the job terminates.
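A minimal sketch of a job script that requests Local Scratch and copies results out before it ends might look like the following. Several details are assumptions rather than documented MSI behavior: the 10G size and time limit are illustrative, the per-job directory layout under /scratch.local is a guess, $HOME/results stands in for whatever Tier 1 directory the group actually uses, and copy_results is a hypothetical helper.

```shell
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --tmp=10G    # request local scratch on the node (size is illustrative)

# copy_results SRC DEST: copy the contents of SRC into DEST,
# creating DEST if it does not exist (hypothetical helper).
copy_results() {
    mkdir -p "$2" && cp -a "$1"/. "$2"/
}

# Run the job body only under Slurm, so sourcing this file elsewhere is harmless.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    # Assumed layout: a per-job working directory under /scratch.local.
    workdir="/scratch.local/$USER/$SLURM_JOB_ID"
    # Hypothetical Tier 1 destination; replace with your group's directory.
    results="$HOME/results/$SLURM_JOB_ID"

    mkdir -p "$workdir" && cd "$workdir" || exit 1

    # Placeholder for the real workload.
    echo "example output" > output.txt

    # Copy results off local scratch before the job ends.
    copy_results "$workdir" "$results"
fi
```

Because the copy is the last step of the script, it will not run if the job is killed earlier, for example at its time limit, so the requested time should leave room for the transfer.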

Data Backups

No scratch space has any kind of backup system. Any data that is deleted from scratch space at any point is gone forever. This reinforces our recommendation never to store important data in scratch storage.

Quirks and Edge Cases

  • Some data transfer methods, such as s3cmd sync with default settings, will preserve timestamps on transferred data. Accurate timestamps are important for correctly determining whether items have exceeded a 30-day period in scratch space, so it is recommended that users favor methods that update timestamps after transfers. With s3cmd sync, for example, the --no-preserve flag can be passed to enable this.
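As an illustration of the s3cmd case above, the sketch below wraps a sync from Tier 2 into Global Scratch with the --no-preserve flag. The function name restore_dataset and the bucket and path in the usage note are hypothetical, and the sketch assumes s3cmd is already configured for the user's Tier 2 credentials.

```shell
# restore_dataset SRC DEST: sync an S3 prefix into a local directory without
# preserving the stored file attributes, so the local copies get fresh
# timestamps (hypothetical helper).
restore_dataset() {
    mkdir -p "$2" && s3cmd sync --no-preserve "$1" "$2"
}
```

A call such as restore_dataset "s3://my-bucket/dataset/" "/scratch.global/$USER/dataset/" would then restore the data, where my-bucket/dataset is a placeholder for the user's own Tier 2 location.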