Amazon EC2 Backup Strategy

amazon ec2 backup

I have a couple of web server/DB server setups running on Amazon EC2. I currently take daily snapshots of all my system and EBS drives, which contain all of my application files, DB files, source code, and DB backups. A console application creates the backups on a schedule. My images are EBS-backed.

I am working on a task that will delete my snapshots after a set number of days. My question is: should I (and can I) also schedule a task that creates a complete image of each EBS-backed server? That way, if a server fails or is corrupted, I can just launch the latest image and then apply the latest snapshot.

While I work on my backup strategy, I am also using Jungle Disk to back up my data disks.

Best Answer

My recommendations:

  1. Always document and/or script the setup of each new instance so that you can reproduce the software installation and system configuration in the event you lose the instance. Test this by starting a new instance and following the procedure. You can use a custom, private AMI if the installation takes a long time and you need to start instances quickly, but that AMI itself should be built using a documented and/or scripted procedure.

  2. Keep your important data on separate EBS volume(s) and not on the root EBS volume. This has many benefits including making it easier to port your data to new instances (e.g., based on different AMIs) and making it easier to get copies of your data on other instances (e.g., with snapshots and new volumes).

  3. Create regular snapshots of the EBS data volumes. If possible/applicable, use a tool like my ec2-consistent-snapshot to improve the chances that you are taking a snapshot of a consistent filesystem / database. Back up the data outside of AWS/EC2, as your AWS account itself is a single point of failure.

  4. Create snapshots of the root EBS volume from time to time on important instances. Though this may help you in the event of instance or EBS volume failure, that part is not so critical because of #1 and #2 above. The main reason I do this is that creating snapshots reduces the risk of failure of the root EBS volume itself.

The rate of failure of an EBS volume is directly related to the number of blocks that have been modified on that volume since the last EBS snapshot.
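
For the snapshot expiry task mentioned in the question, here is a minimal sketch of one way to do it. It assumes Python with boto3 (the question does not say what the console application is written in), and the function names `expired_snapshots` and `prune_snapshots` are hypothetical. The selection logic is kept separate from the AWS calls so it can be checked without touching a real account, and it always spares the newest snapshot of each volume even if that snapshot is older than the cutoff.

```python
from datetime import datetime, timedelta, timezone


def expired_snapshots(snapshots, keep_days, now=None, keep_latest=1):
    """Return snapshots older than keep_days, sparing the newest keep_latest per volume.

    Each snapshot is a dict with 'SnapshotId', 'VolumeId', and a timezone-aware
    'StartTime', the same keys boto3's describe_snapshots returns.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=keep_days)

    # Group snapshots by the volume they were taken from.
    by_volume = {}
    for snap in snapshots:
        by_volume.setdefault(snap["VolumeId"], []).append(snap)

    expired = []
    for snaps in by_volume.values():
        snaps.sort(key=lambda s: s["StartTime"], reverse=True)  # newest first
        # Skip the newest keep_latest snapshots, then apply the age cutoff.
        for snap in snaps[keep_latest:]:
            if snap["StartTime"] < cutoff:
                expired.append(snap)
    return expired


def prune_snapshots(ec2, keep_days, dry_run=True):
    """Delete this account's expired snapshots using a boto3 EC2 client.

    With dry_run=True (the default) nothing is deleted; the IDs are only printed.
    """
    snapshots = []
    paginator = ec2.get_paginator("describe_snapshots")
    for page in paginator.paginate(OwnerIds=["self"]):
        snapshots.extend(page["Snapshots"])

    doomed = expired_snapshots(snapshots, keep_days)
    for snap in doomed:
        if dry_run:
            print("would delete", snap["SnapshotId"])
        else:
            ec2.delete_snapshot(SnapshotId=snap["SnapshotId"])
    return [s["SnapshotId"] for s in doomed]
```

A scheduled job could call `prune_snapshots(boto3.client("ec2"), keep_days=14)` with `dry_run` left on until the output looks right; the 14-day window is only an example. Snapshots that back a registered AMI cannot be deleted until the AMI is deregistered, so expect the delete call to fail for those.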