Linux server sync to an Amazon S3 bucket

amazon s3, rsync, s3cmd, s3fs

I am looking for a stable solution to replace a classic server backup to another server using rsync.
I have to sync a whole filesystem (more than 1 TB) to Amazon S3.

Where I am so far:

Solution 1:
I mapped the S3 bucket to a mount point in the system using s3fs.
The system becomes unstable and transfers are really slow. This is in no way a solution.

Solution 2:
Using the s3cmd sync command. Everything goes smoothly at good speeds (at least for folders under 2 GB).
The problem comes when I try to sync the whole filesystem on the server (with some exclusions): the process just hangs.

Any hints?

Best Answer

This is a bad way to do backups. You should separate your OS configuration from your valuable data. None of your permissions will be transferred, and in the Linux world they are a necessity if you plan to restore from your backups (which you should: backups without verified restorations are pointless).
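If the permissions have to survive a round trip through S3, one option is to record them in a small manifest that is synced alongside the data. The following is only a rough sketch in Python: the function names and the JSON layout are hypothetical, not part of s3cmd or any other tool mentioned here.

```python
import json
import os
import stat


def build_permission_manifest(root):
    """Walk `root` and record mode, uid and gid for every file and directory."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat: describe symlinks themselves, do not follow them
            rel = os.path.relpath(path, root)
            manifest[rel] = {
                "mode": oct(stat.S_IMODE(info.st_mode)),  # permission bits only, e.g. "0o640"
                "uid": info.st_uid,
                "gid": info.st_gid,
            }
    return manifest


def write_manifest(root, manifest_path):
    """Save the manifest as JSON so it can be uploaded next to the data."""
    with open(manifest_path, "w") as fh:
        json.dump(build_permission_manifest(root), fh, indent=2, sort_keys=True)
```

A restore script would read the manifest back and apply `os.chmod` and `os.chown` to each path; that half is left out here and would need testing as part of a verified restoration.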

Firstly, you can synchronise your valuable instance data (e.g. /var/www) to S3 using s3cmd sync as you've stated.
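As a minimal sketch of that first step, the snippet below builds one `s3cmd sync` command per data directory, so each run stays small instead of covering the whole filesystem. The bucket name `example-backup-bucket`, the function names and the exclude pattern are assumptions made up for illustration; only `s3cmd sync` and its `--exclude` option come from s3cmd itself.

```python
import subprocess


def build_sync_command(local_dir, bucket, excludes=()):
    """Return the argv list for one `s3cmd sync` run of `local_dir`."""
    source = local_dir.rstrip("/") + "/"  # trailing slash: sync the directory contents
    target = "s3://{}/{}/".format(bucket, local_dir.strip("/"))
    command = ["s3cmd", "sync"]
    for pattern in excludes:
        command.append("--exclude={}".format(pattern))
    command += [source, target]
    return command


def sync_directories(local_dirs, bucket, excludes=(), dry_run=True):
    """Sync each directory in its own s3cmd run; with dry_run, only return the commands."""
    commands = [build_sync_command(d, bucket, excludes) for d in local_dirs]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)  # stop at the first failing run
    return commands
```

Calling `sync_directories(["/var/www"], "example-backup-bucket", excludes=["*.tmp"])` only returns the command lists, which makes it easy to inspect them before passing `dry_run=False`.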

Secondly, using a configuration management utility such as Puppet or Chef, you can spin up a new instance of your OS with minimal effort, ensuring a fresh and reliable set of configurations.

There are no details of your underlying architecture in your question (EC2? VMware? KVM? Xen? Physical hardware?), so I can't recommend any specific tools (e.g. architecture-specific snapshotting). If you're running on a virtual platform (e.g. EC2, VMware, KVM), you should be using that platform's snapshotting facility.
