> I have read about the versioning feature for S3 buckets, but I cannot seem to find if recovery is possible for files with no modification history. See the AWS docs here on versioning:
I've just tried this. Yes, you can restore from the original version. When you delete the file, S3 creates a delete marker, and you can restore the version before that marker, i.e. the single, only revision.
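For what it's worth, here is a rough sketch of that restore in Python. It assumes `s3` is a boto3 S3 client (`boto3.client("s3")`); the function name `undelete` is my own, not anything from AWS. Removing the delete marker itself is what makes the previous version current again.

```python
def undelete(s3, bucket, key):
    """Remove the current delete marker for `key`, if there is one.

    `s3` is assumed to be a boto3 S3 client. Returns the VersionId of the
    removed marker, or None if the key is not hidden behind a delete marker.
    """
    # A single call returns a limited page of entries; a key with a very
    # long history would need pagination, which this sketch skips.
    resp = s3.list_object_versions(Bucket=bucket, Prefix=key)
    for marker in resp.get("DeleteMarkers", []):
        # Prefix matching can return other keys too, so compare exactly.
        if marker["Key"] == key and marker["IsLatest"]:
            # Deleting the marker (by its VersionId) un-hides the newest
            # real version, even if that version is the only one.
            s3.delete_object(Bucket=bucket, Key=key, VersionId=marker["VersionId"])
            return marker["VersionId"]
    return None
```

Called as `undelete(boto3.client("s3"), "my-bucket", "path/to/file")`, where the bucket and key are placeholders.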
> Then, we thought we may just backup the S3 files to Glacier using object lifecycle management:
> But, it seems this will not work for us, as the file object is not copied to Glacier but moved to Glacier (more accurately it seems it is an object attribute that is changed, but anyway...).
Glacier is really meant for long-term storage that is accessed very infrequently. It can also get very expensive to retrieve a large portion of your data in one go, as it's not meant for point-in-time restoration of a large percentage of your data.
> Finally, we thought we would create a new bucket every month to serve as a monthly full backup, and copy the original bucket's data to the new one on Day 1. Then using something like duplicity (http://duplicity.nongnu.org/) we would synchronize the backup bucket every night.
Don't do this. By default you can only have 100 buckets per account, so in 3 years (36 monthly buckets) you'll have taken up about a third of your bucket allowance with just backups.
> So, I guess there are a couple of questions here. First, does S3 versioning allow recovery of files that were never modified?
Yes.
> Is there some way to "copy" files from S3 to Glacier that I have missed?
Not that I know of.
Use EC2 instances behind an ELB.
Upon launch, your nodes should download and install the latest security updates and do whatever other configuration is necessary to get your application running.
As for cycling out your instances, once a day:
- Create a second EC2 node
- Wait for it to configure itself and become available
- Add the second node to the ELB
- Remove the old node from the ELB
- Shoot the old node in the head
All of the above can be trivially automated using various AWS APIs, perhaps even as a Lambda job.
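As a rough sketch of what that automation could look like in Python: this assumes `ec2` is a boto3 EC2 client and `elb` is a boto3 Classic Load Balancer client (`boto3.client("elb")`). The function name `rotate_node` and the `launch_params` dictionary are placeholders of mine. "Configured and available" is approximated here by waiting for the ELB health check to pass, which is an assumption about how your health check is set up.

```python
def rotate_node(ec2, elb, lb_name, old_instance_id, launch_params):
    """Replace one instance behind a Classic ELB with a freshly launched one.

    `launch_params` holds whatever run_instances needs (ImageId,
    InstanceType, UserData, ...). Returns the new instance ID.
    """
    # 1. Create a second EC2 node.
    resp = ec2.run_instances(MinCount=1, MaxCount=1, **launch_params)
    new_id = resp["Instances"][0]["InstanceId"]

    # 2. Wait until the instance is running. This does not prove the
    #    application is configured; the ELB health check below covers that.
    ec2.get_waiter("instance_running").wait(InstanceIds=[new_id])

    # 3. Add the new node to the ELB and wait until it reports InService.
    new_ref = [{"InstanceId": new_id}]
    elb.register_instances_with_load_balancer(LoadBalancerName=lb_name, Instances=new_ref)
    elb.get_waiter("instance_in_service").wait(LoadBalancerName=lb_name, Instances=new_ref)

    # 4. Remove the old node from the ELB.
    elb.deregister_instances_from_load_balancer(
        LoadBalancerName=lb_name, Instances=[{"InstanceId": old_instance_id}]
    )

    # 5. Terminate the old node.
    ec2.terminate_instances(InstanceIds=[old_instance_id])
    return new_id
```

Passing the clients in as arguments keeps the function easy to try against stubs before pointing it at a real account. A scheduled job could then call it once a day.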
Best Answer
We use S3 Lifecycle Policies for the Athena temp files cleanup.
Our AthenaStagingDir is s3://.../tmp/ and we've got a Lifecycle rule for that /tmp/ prefix that handles the cleanup.

I haven't found a way to immediately delete objects after 1 day, but I haven't tried too hard, to be honest. This 2-step / 2-day approach works well.
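The exact rule isn't reproduced here, so the following is only a guess at what a two-step rule of this kind could look like on a versioned bucket: expire current objects after 1 day, then remove the noncurrent versions a day later. The helper name, rule ID, default prefix, and day counts are all assumptions. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts.

```python
def athena_tmp_cleanup_rule(prefix="tmp/", days=1):
    """Build a lifecycle configuration for cleaning up Athena temp files.

    Hypothetical helper: the values are illustrative, not a real setup.
    """
    return {
        "Rules": [
            {
                "ID": "athena-tmp-cleanup",  # assumed rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Step 1: current objects expire `days` days after creation.
                "Expiration": {"Days": days},
                # Step 2: once noncurrent, versions are removed `days` days later.
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            }
        ]
    }
```

It would be applied with something like `boto3.client("s3").put_bucket_lifecycle_configuration(Bucket="my-bucket", LifecycleConfiguration=athena_tmp_cleanup_rule())`, where the bucket name is a placeholder.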
Hope that helps :)