Alright, so I'm setting up an off-site backup of my S3 data and have configured Cross-Region Replication to a new bucket in another region. However, replication only applies to objects written after it's enabled, so I still have a lot of existing data in the original bucket that needs to be copied: over 100 TB across more than 20 million objects. My first thought was to just run:
aws s3 sync s3://source-bucket s3://destination-bucket
on an EC2 instance. But that's taking way longer than I anticipated, and, with all the PUT/LIST requests it makes, it's also costing more than I anticipated.
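One common workaround for a single slow `sync` over tens of millions of keys (a sketch, not something from the question) is to shard the job by key prefix and run the shards in parallel. The hex prefixes below are an assumption about the key layout; substitute whatever top-level prefixes your keys actually use.

```python
# Sketch: split one huge `aws s3 sync` into per-prefix jobs that can be
# run in parallel (e.g. via GNU parallel or separate shells/instances).
# The hex prefixes are an assumption about the bucket's key layout.
SRC = "s3://source-bucket"
DST = "s3://destination-bucket"

prefixes = [f"{i:x}" for i in range(16)]  # "0" .. "f"

# --exclude '*' followed by --include '<prefix>*' restricts each job
# to keys starting with that prefix.
commands = [
    f"aws s3 sync {SRC} {DST} --exclude '*' --include '{p}*'"
    for p in prefixes
]

for cmd in commands:
    print(cmd)
```

Each generated command lists and copies only its own slice of the keyspace, so the LIST work is divided as well as the copies.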
Reading the AWS docs, it looks like they recommend AWS Snowball for this kind of operation. From the FAQs:
As a rule of thumb, if it takes more than one week to upload your data
to AWS using the spare capacity of your existing Internet connection,
then you should consider using Snowball.
However, it looks like those are intended for either import or export, not both at once. Would I need to run two separate jobs with the same Snowball? And won't I still be charged for all those PUT/LIST requests anyway when loading the data onto the Snowball? The pricing mentions $0.03/GB for data transfer, but says nothing about API requests.
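For a rough sense of the request-cost side of the question, the arithmetic is straightforward. The rates below are assumptions (typical S3 Standard us-east-1 pricing at time of writing); check the current pricing page before relying on them.

```python
# Back-of-the-envelope S3 request charges for copying ~20 million objects.
# Prices are assumptions (typical S3 Standard us-east-1 rates); verify
# against the current AWS pricing page.
PUT_PRICE_PER_1000 = 0.005   # USD per 1,000 PUT/COPY/POST/LIST requests
GET_PRICE_PER_1000 = 0.0004  # USD per 1,000 GET requests

objects = 20_000_000

# Each object copied needs at least one PUT/COPY on the destination
# and one GET on the source; LIST pages (1,000 keys each) add little.
put_cost = objects / 1000 * PUT_PRICE_PER_1000
get_cost = objects / 1000 * GET_PRICE_PER_1000

print(f"PUT/COPY: ${put_cost:,.2f}, GET: ${get_cost:,.2f}")
# With these assumed rates: about $100 of PUT/COPY requests and $8 of
# GETs -- real money, but small next to any per-GB transfer charges.
```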
Best Answer
If you're copying data that is already in AWS to another region, Snowball doesn't seem like a good option: the appliance is designed for moving data between your premises and AWS, so it would have to be shipped to you and back even though the data never needs to leave the cloud. Running sync from the command line of an EC2 instance within AWS is the right approach.
Have you tried raising max_concurrent_requests in the CLI's S3 transfer configuration? That may well address the performance issue. This article also has suggestions for your situation:
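The CLI's S3 transfer settings live in ~/.aws/config and can be set with `aws configure set`. A minimal sketch; the value 50 is an assumption to tune against your instance's network and CPU capacity, not a recommendation:

```shell
# Raise the AWS CLI's S3 transfer concurrency (the default is 10).
# 50 is an assumed starting point -- tune it for your instance.
aws configure set default.s3.max_concurrent_requests 50
aws configure set default.s3.max_queue_size 10000

# Then re-run the copy. For bucket-to-bucket sync the CLI issues
# server-side copies, so object data doesn't flow through the instance.
aws s3 sync s3://source-bucket s3://destination-bucket
```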
How can I copy objects between Amazon S3 buckets?