Tier 2 Data Management
This page aims to surface some lesser-known use cases for the Tier 2 storage platform available on MSI systems.
Multipart Uploads configuration
Multipart upload is a feature that S3 clients typically use to improve write performance: a large object is split into parts that are uploaded separately. If a multipart upload fails unexpectedly, the parts that were already uploaded can linger on the storage system as an incomplete multipart upload.
These incomplete multipart uploads count against a user’s quota and, in large numbers, can affect the performance of list operations on the bucket. MSI recommends applying a data lifecycle policy to automatically delete incomplete multipart uploads. The following is an example lifecycle policy:
<LifecycleConfiguration>
<Rule>
<ID>abort-multipartupload-3days</ID>
<Prefix></Prefix>
<Status>Enabled</Status>
<AbortIncompleteMultipartUpload>
<DaysAfterInitiation>3</DaysAfterInitiation>
</AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>
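The policy above can also be written to a local file straight from the shell, which is handy if you want a different number of days. The following is a minimal sketch, not an MSI-provided tool: the function name write_abort_policy and its arguments are hypothetical, and the XML it writes is the example policy with only the day count and rule ID changed.

```shell
# write_abort_policy FILE [DAYS]
# Hypothetical helper: writes the example lifecycle policy to FILE.
# DAYS defaults to 3, matching the example policy above.
write_abort_policy() {
    policy_file="$1"
    days="${2:-3}"
    cat > "$policy_file" <<EOF
<LifecycleConfiguration>
<Rule>
<ID>abort-multipartupload-${days}days</ID>
<Prefix></Prefix>
<Status>Enabled</Status>
<AbortIncompleteMultipartUpload>
<DaysAfterInitiation>${days}</DaysAfterInitiation>
</AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>
EOF
}
```

For example, write_abort_policy "$HOME/abort-lc-mp-3days.xml" 3 produces a file with the same content as the example policy.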
Apply this policy to a bucket with the following steps:
Download the example XML file from the Tier 2 bucket s3://helpdesk.
s3cmd get s3://helpdesk/abort-lc-mp-3days.xml $HOME
Apply it to your existing bucket.
s3cmd setlifecycle $HOME/abort-lc-mp-3days.xml s3://bucketname
Once applied, the policy tells Tier 2 to clean up incomplete multipart uploads for you 3 days after they were initiated. Tier 2 enforces lifecycle policies once a day, late at night, so the cleanup is not immediate.
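To confirm that a bucket actually carries such a rule, the policy can be read back with s3cmd getlifecycle. The sketch below wraps that check in a small function. The name has_abort_rule is hypothetical, and the check assumes that getlifecycle prints the policy XML to standard output.

```shell
# has_abort_rule BUCKET
# Hypothetical helper: exits with status 0 if the bucket's lifecycle policy
# contains an AbortIncompleteMultipartUpload rule, non-zero otherwise.
# Assumption: "s3cmd getlifecycle" prints the policy XML on stdout.
has_abort_rule() {
    s3cmd getlifecycle "s3://$1" 2>/dev/null | grep -q 'AbortIncompleteMultipartUpload'
}
```

For example, has_abort_rule bucketname && echo "abort rule is set" prints the message only when the rule is present.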
s3cmd also has a command to list incomplete multipart uploads:
s3cmd multipart s3://bucketname
This command can be very slow when the bucket contains a large number of objects and incomplete multipart uploads.
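Because the listing can be slow, it may be worth running it once and saving the output, then aborting individual uploads by hand with s3cmd abortmp rather than waiting for the nightly lifecycle run. The following is a rough sketch. The function names are hypothetical, and the object URI and upload ID passed to abort_upload are values you would copy from the saved listing.

```shell
# save_multipart_listing BUCKET FILE
# Hypothetical helper: runs the (possibly slow) listing once and keeps the
# output in FILE so it can be inspected without listing again.
save_multipart_listing() {
    s3cmd multipart "s3://$1" > "$2"
}

# abort_upload OBJECT_URI UPLOAD_ID
# Hypothetical helper: aborts one incomplete multipart upload.
# OBJECT_URI has the form s3://BUCKET/OBJECT.
abort_upload() {
    if [ "$#" -ne 2 ]; then
        echo "usage: abort_upload s3://BUCKET/OBJECT UPLOAD_ID" >&2
        return 2
    fi
    s3cmd abortmp "$1" "$2"
}
```

A typical session would be save_multipart_listing bucketname "$HOME/incomplete-uploads.txt", followed by one abort_upload call per upload you want to remove.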