Amazon EC2 Data Persistence

amazon-ec2 amazon-ebs

According to the Amazon EC2 FAQ, when an instance is terminated the data is gone. What steps can I take to preserve data in the event my instance is rebooted? I've been looking into EBS and S3 – would either of these be useful to store an active database? How often are instances rebooted anyways?

Best Answer

Like others have said, use EBS (Elastic Block Store). I am using it myself now that it is released to the general public. It is better than S3 on several points:

  • EBS volumes are fast. Faster than even the local mounts, according to Amazon.
  • EBS volumes mount as proper block devices. With S3 you need custom S3 object access logic in your code, or middleware (JungleDisk, ElasticDisk, et al.), which present their own problems and costs.
  • EBS volumes are easy to back up. Amazon lets you take snapshots, which are saved on S3.
  • EBS volumes are portable between instances: a volume can be unmounted and detached from one instance, then attached to another.
  • EBS volumes can even be RAIDed together for improved reliability.
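To make the snapshot and "move a volume" points concrete, here is a minimal sketch in Python that only builds the AWS CLI command lines involved; it does not run them. It assumes the `aws` command-line tool is what you use to manage volumes, and the helper names (`snapshot_cmd`, `move_volume_cmds`), the volume and instance IDs, and the device name are all made up for illustration.

```python
import shlex
from typing import List


def snapshot_cmd(volume_id: str, description: str) -> List[str]:
    """Build the CLI call that snapshots an EBS volume (stored in S3 by AWS)."""
    return [
        "aws", "ec2", "create-snapshot",
        "--volume-id", volume_id,
        "--description", description,
    ]


def move_volume_cmds(volume_id: str, target_instance_id: str, device: str) -> List[List[str]]:
    """Build the two CLI calls that move a volume to another instance.

    Assumption: the filesystem was already unmounted on the old instance,
    and the caller waits for the detach to finish before attaching.
    """
    return [
        ["aws", "ec2", "detach-volume", "--volume-id", volume_id],
        [
            "aws", "ec2", "attach-volume",
            "--volume-id", volume_id,
            "--instance-id", target_instance_id,
            "--device", device,
        ],
    ]


if __name__ == "__main__":
    # Hypothetical IDs; substitute your own before running any of these.
    commands = [snapshot_cmd("vol-0123456789abcdef0", "nightly db backup")]
    commands += move_volume_cmds("vol-0123456789abcdef0", "i-0123456789abcdef0", "/dev/sdf")
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; you could pass each list to `subprocess.run` once you have checked the IDs.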

My experience with EBS so far has been the most positive thing about AWS I've dealt with to date.


Update: While my experience with EBS has been positive, others have had issues. Specifically, EBS does not implement fsync() correctly. Ted Dziuba has some interesting words about this in his blog post Amazon — The Purpose of Pain: Myth 2: Architecture Will Save You from Cloud Failures:

This gets even more entertaining with Amazon Elastic Block Store, which, as the Reddit administrators have found, will happily accept calls to fsync(), and lie to your face, saying that the data has been written to disk, when it may not have been.
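For context on what is at stake, this is roughly the pattern a database or application uses when it relies on fsync(). It is a minimal Python sketch, and `durable_write` is a hypothetical helper name. Note that a successful return only tells you the operating system and the storage layer reported the write as complete; the application has no way to check whether the underlying device was telling the truth, which is exactly the complaint in the quote above.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data and ask the OS to push it all the way to the device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # move Python's user-space buffer into the OS
        os.fsync(f.fileno())   # ask the OS to flush its cache to the device


if __name__ == "__main__":
    # Write to a temporary directory so the sketch is safe to run anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "record.bin")
        durable_write(target, b"committed transaction")
        with open(target, "rb") as f:
            print(f.read())
```

Reading the file back, as the demo does, only proves the data is visible through the OS cache; it says nothing about whether it would survive a crash, so regular snapshots remain a sensible safety net.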