Tier 2 Data Management

This page describes some lesser-known use cases for the Tier 2 storage platform available on MSI systems.

Multipart Uploads configuration

S3 clients typically use multipart uploads to improve write performance: a large object is split into parts that are uploaded separately. If a multipart upload fails unexpectedly, the parts that were already uploaded can linger on the storage system as an incomplete multipart upload.

These incomplete multipart uploads count against a user’s quota and, in large numbers, can affect the performance of list operations on the bucket. MSI recommends applying a data lifecycle policy to automatically delete incomplete multipart uploads. The following is an example lifecycle policy:

<LifecycleConfiguration>
  <Rule>
    <ID>abort-multipartupload-3days</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>3</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
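
If you would rather create the policy file yourself, the following is a minimal sketch that writes the XML above to a path of your choice. The function name write_abort_policy is a hypothetical helper, not part of s3cmd or of any MSI tooling.

```shell
# Hypothetical helper: write the example lifecycle policy to the file given as $1.
write_abort_policy() {
    # The quoted 'EOF' delimiter makes the shell copy the XML verbatim,
    # with no variable or command expansion.
    cat > "$1" <<'EOF'
<LifecycleConfiguration>
  <Rule>
    <ID>abort-multipartupload-3days</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>3</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
EOF
}

# Example call (the path is only an illustration):
# write_abort_policy "$HOME/abort-lc-mp-3days.xml"
```

The resulting file can be passed to s3cmd setlifecycle in the same way as a downloaded copy.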

To apply this policy to a bucket:

  1. Download the example XML file from the Tier 2 bucket s3://helpdesk.

    s3cmd get s3://helpdesk/abort-lc-mp-3days.xml $HOME
    
  2. Apply it to your existing bucket, replacing bucketname with the name of your bucket.

    s3cmd setlifecycle $HOME/abort-lc-mp-3days.xml s3://bucketname
    

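With this policy in place, Tier 2 automatically aborts incomplete multipart uploads three days after they were initiated and removes their parts. Tier 2 enforces lifecycle policies once a day, late at night, so cleanup does not happen immediately.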
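
To confirm that the policy was applied, s3cmd can print a bucket's current lifecycle configuration with its getlifecycle command. The sketch below wraps that in a hypothetical helper, has_abort_rule, which only checks that the returned XML mentions an AbortIncompleteMultipartUpload rule. It assumes the policy is printed to standard output and does not validate the rule's contents.

```shell
# Hypothetical helper: exits 0 if the lifecycle policy of the bucket given
# as $1 mentions an AbortIncompleteMultipartUpload rule, non-zero otherwise.
has_abort_rule() {
    # getlifecycle prints the bucket's current policy XML; errors (for
    # example, no policy set) are hidden so only the exit status matters.
    s3cmd getlifecycle "$1" 2>/dev/null | grep -q 'AbortIncompleteMultipartUpload'
}

# Example call; replace bucketname with the name of your bucket:
# has_abort_rule s3://bucketname && echo "abort rule is in place"
```

Running s3cmd getlifecycle s3://bucketname on its own shows the full policy if you want to inspect it by eye.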

s3cmd also has a command to list incomplete multipart uploads:

s3cmd multipart s3://bucketname

This command can be very slow on buckets that contain a large number of objects and incomplete multipart uploads.
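
When only a count is needed, the listing can be summarized with awk. This is a sketch under an assumption about the output format: that s3cmd multipart prints one tab-separated line per upload (Initiated, Path, Id) with the s3:// path in the second field. Check the output of your s3cmd version before relying on it. The function name count_incomplete_uploads is hypothetical.

```shell
# Hypothetical helper: print the number of incomplete multipart uploads
# in the bucket given as $1.
count_incomplete_uploads() {
    # Count only lines whose second tab-separated field is an s3:// path;
    # this skips the bucket line and the column header.
    s3cmd multipart "$1" | awk -F'\t' '$2 ~ /^s3:\/\// { n++ } END { print n + 0 }'
}

# Example call; replace bucketname with the name of your bucket:
# count_incomplete_uploads s3://bucketname
```

A single upload can also be aborted by hand with s3cmd abortmp s3://bucketname/object Id, using the Id shown in the listing.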