Method for offsite backup of EC2 servers

amazon-web-services backup disaster-recovery

We run a dozen or so production Ubuntu Linux web server instances in an Amazon VPC. The instances are bootstrapped and managed via Puppet. Most management is done via the AWS Console.

Our AWS credentials are pretty secure. The master account is hardly ever needed, and has a strong password and two-factor auth. A few trusted admins have access to most services via their own IAM accounts, also with strong passwords and two-factor auth. A few IAM accounts have very limited access for specific purposes, such as writing files to S3. Access by other employees to any high-level credentials is very limited. Overall, the chance of someone gaining access to the Console or APIs seems low.

The recent Code Spaces debacle, in which someone gained high-level access to their AWS Console and deleted instances, volumes, and EBS snapshots, effectively making it impossible for Code Spaces to recover their business, prompted me to investigate methods for backing up our data offline/offsite (i.e. out of reach of our main AWS account).

How can I ensure our customer data is safe from being wiped out by someone who gains access to our AWS credentials, or by some disaster at AWS? The solution should be automatic, stable, and reasonably priced.

After searching for a few hours, I can't seem to find an 'easy' way. Copying EBS snapshots to another AWS account doesn't seem possible, and I can't export EBS snapshots to S3 objects. I could rsync all important data by pulling from a third-party server, but I'd need to script it to handle things like varying numbers of servers, retention, error handling, etc. That seems like a lot of work, and I found no ready-to-go software for it.
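A rough sketch of what that pull script might look like, in Python (hostnames, paths, and the retention count below are hypothetical placeholders; assumes key-based SSH from the backup host):

```python
#!/usr/bin/env python3
"""Pull rsync backups from a list of servers, keeping N dated copies.

Hostnames, paths, and retention count are placeholders. Assumes
key-based SSH access from this third-party backup host.
"""
import datetime
import shutil
import subprocess
from pathlib import Path

SERVERS = ["web1.example.com", "web2.example.com"]  # hypothetical hosts
REMOTE_PATH = "/var/www/"                           # data to back up
BACKUP_ROOT = Path("/backups")
RETENTION = 14                                      # dated copies to keep

today = datetime.date.today().isoformat()

for host in SERVERS:
    dest = BACKUP_ROOT / host / today
    dest.mkdir(parents=True, exist_ok=True)

    # --link-dest hard-links unchanged files against the previous copy,
    # so each dated directory only costs the space of what changed
    previous = sorted((BACKUP_ROOT / host).iterdir())[-2:-1]
    cmd = ["rsync", "-a", "--delete"]
    if previous:
        cmd.append(f"--link-dest={previous[0]}")
    cmd += [f"{host}:{REMOTE_PATH}", str(dest)]

    result = subprocess.run(cmd)
    if result.returncode != 0:
        print(f"WARNING: rsync from {host} exited {result.returncode}")

    # Retention: drop the oldest dated directories beyond RETENTION
    copies = sorted((BACKUP_ROOT / host).iterdir())
    for old in copies[:-RETENTION]:
        shutil.rmtree(old)
```

Even this minimal version would need hardening (locking, alerting, handling of partial transfers) before I'd trust it in production, which is exactly the work I was hoping to avoid.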

Our current backup strategy consists of nightly automated EBS snapshots of all volumes, as well as uploading compressed mysqldump output to S3. All source code and Puppet code is deployed from external version control, but our customers' files and MySQL databases are stored only on the EBS volumes and their snapshots, i.e. inside the AWS ecosystem.
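For reference, the dump-and-upload part of that is a small script along these lines (bucket and database names are placeholders; assumes boto3 is installed and mysqldump can authenticate locally, e.g. via ~/.my.cnf):

```python
#!/usr/bin/env python3
"""Nightly MySQL dump pushed to S3: a minimal sketch.

Bucket, database name, and paths are hypothetical. Assumes boto3 and
that AWS credentials come from the usual mechanisms (instance role,
environment, or ~/.aws/credentials).
"""
import datetime
import subprocess

import boto3

BUCKET = "example-backups"   # hypothetical bucket
DATABASE = "appdb"           # hypothetical database
DUMP_FILE = "/tmp/appdb.sql.gz"

# Dump and compress in one pipeline
with open(DUMP_FILE, "wb") as out:
    dump = subprocess.Popen(["mysqldump", DATABASE], stdout=subprocess.PIPE)
    subprocess.run(["gzip", "-c"], stdin=dump.stdout, stdout=out, check=True)
    dump.stdout.close()
    if dump.wait() != 0:
        raise RuntimeError("mysqldump failed")

# Upload under a dated key so older dumps aren't overwritten
key = f"mysql/{DATABASE}-{datetime.date.today().isoformat()}.sql.gz"
boto3.client("s3").upload_file(DUMP_FILE, BUCKET, key)
```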

Best Answer

A lot of people tend to overthink this. Just think of these servers as if they were deployed in a colo or in a corporate datacenter. In that case, how would you back them up?

Likely it would be via a "legacy" backup product (NetBackup, Amanda, Bareos, etc.) connected to a tape library or virtual tape library (VTL).

This is something you should consider doing for your AWS infrastructure as well. Set up a backup server and tape library outside of Amazon somewhere and use that as your "doomsday" restoration method.

Tape is one of the most reliable data storage mechanisms and, unlike cloud backup systems, is not vulnerable to the type of thing that happened to Code Spaces. Your backup data is truly offline, and you can keep the tapes in as secure a location as you choose, anywhere from a fire safe in the office to a rented safe deposit box at your local bank. Getting that kind of protection from a cloud storage provider is impossible.

You already have configuration management in place (yay!), so in the event of a disaster you'll be able to rebuild your servers reasonably quickly. The tape backup (or VTL) will mostly be for your data: databases, uploaded files, and other things that aren't covered by your Puppet manifests.

If this isn't an option, the next best thing would be to create a completely separate AWS account for backup purposes. Within that account, create IAM credentials for S3 that have upload-only permissions, and use those from your production environment to push backups. Ensure these credentials are kept in a completely separate location from your production credentials, to limit the possibility that both get compromised at the same time.
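As a sketch of what "upload-only" could look like, here is a hypothetical boto3 snippet attaching such a policy to a backup user in the separate account (user and bucket names are made up). Note that s3:PutObject can still overwrite existing keys, so enabling versioning on the backup bucket is a sensible extra safeguard:

```python
#!/usr/bin/env python3
"""Attach an upload-only S3 policy to a backup user: a minimal sketch.

User and bucket names are hypothetical. Run this with credentials for
the *separate* backup account, not the production one.
"""
import json

import boto3

BUCKET = "example-offsite-backups"  # hypothetical bucket in the backup account
USER = "backup-uploader"            # hypothetical IAM user

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",  # upload only
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        }
        # No Get/List/Delete: a compromised production host can add
        # backups but cannot read or destroy what is already there.
    ],
}

boto3.client("iam").put_user_policy(
    UserName=USER,
    PolicyName="upload-only-backups",
    PolicyDocument=json.dumps(policy),
)
```

The production side then uses only that user's access key to push objects; deleting or tampering with existing backups would require credentials from the backup account itself.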