AWS Glacier – How to Move Data from Glacier Vault to Glacier Deep Archive

amazon-glacier, amazon-web-services

I have a few Glacier data vaults with data in them.
I would like to move that data to a new storage class – "Glacier Deep Archive".

How do I do that? I cannot see such an option in the vault preferences in the console.

Best Answer

I looked at this when Glacier Deep Archive came out, and I posted a comment on the AWS blog that was never answered.

Best I can tell, there is no migration path from Glacier to Glacier Deep Archive. You will have to migrate the data manually.

I have two suggested approaches:

Local Upload

If you still have the data locally and are confident of its integrity, simply upload it with the AWS CLI or any tool you prefer. You may want to tweak the S3 parameters in your AWS config file to speed this up, since more concurrent threads make better use of your internet bandwidth. This matters most when you have a lot of small files; with large files you could potentially max out your bandwidth anyway.
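For instance, a one-off direct upload into the Deep Archive storage class could look like this (bucket name and source path are placeholders; a profile-based variant appears further down):

aws s3 cp C:\Source\Folder\ s3://bucket-name/ --recursive --storage-class DEEP_ARCHIVE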

Download then Upload

The second approach is to:

  • Restore the data from Glacier (see the sketch after this list)
  • Download the data to a computer – either local or, ideally, an on-demand EC2 instance (not spot, as you may lose your data if the instance is terminated)
  • Upload the data to S3 using the Deep Archive storage class
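Restoring from a legacy Glacier vault works through asynchronous jobs rather than a plain download. A minimal sketch, assuming a vault named my-vault; the archive ID is a placeholder you would get from an earlier inventory-retrieval job:

# Ask Glacier to stage one archive for download (this can take hours)
aws glacier initiate-job --account-id - --vault-name my-vault --job-parameters '{"Type": "archive-retrieval", "ArchiveId": "<archive-id>"}'

# When the job completes, fetch the staged data to a local file
aws glacier get-job-output --account-id - --vault-name my-vault --job-id <job-id> output.bin

The "--account-id -" form simply means "use the current account".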

Create a User

Below is the S3 command I use for uploads from Windows. Note that it relies on a named profile, "glacier-writer".

You'll need an IAM user with access to that bucket and to any other resources involved; have its access key and secret key handy. If you need to do this with a role instead, it's a bit more work but not difficult, and it's documented online.
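As a rough illustration, a user policy along these lines would cover the upload (the bucket name is a placeholder; attach it via the IAM console or "aws iam put-user-policy"):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::bucket-name",
                "arn:aws:s3:::bucket-name/*"
            ]
        }
    ]
}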

aws configure --profile glacier-writer

You can then edit your config file to include this or something similar. This works well on my home internet connection, which has 20Mbps upload. If you have high bandwidth and a fast machine you can increase the number of concurrent requests; I've successfully used up to 80 threads on high-bandwidth corporate connections, which takes one to two Xeon cores.

[profile glacier-writer]
region = us-west-2
output = json
s3 =
    max_concurrent_requests = 10
    max_queue_size = 100
    multipart_chunksize = 75MB
    multipart_threshold = 200MB
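
If you'd rather not edit the file by hand, the same values can be set from the command line, for example:

aws configure set s3.max_concurrent_requests 10 --profile glacier-writer
aws configure set s3.multipart_chunksize 75MB --profile glacier-writer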

On Windows this is in

C:\Users\username\.aws\config

On Linux it's in

~/.aws/config

Do the Upload

A simple "s3 sync" is what I do, but you can also use "s3 cp" to upload straight to S3.

aws s3 sync C:\Source\Folder\ s3://bucket-name/ --profile glacier-writer --storage-class DEEP_ARCHIVE --exclude "*.tmp"
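
Once the sync finishes, you can spot-check that an object actually landed in the Deep Archive tier (the key here is a placeholder):

aws s3api head-object --bucket bucket-name --key some/file.dat --profile glacier-writer

The response should report "StorageClass": "DEEP_ARCHIVE".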