Best way to mirror data across two-node apache cluster

amazon ec2, amazon s3, apache-2.2, drbd, rsync

I'm architecting a new server structure and plan to have two Apache workhorses with a SQL database behind both of them. What is the best way to mirror the data between the Apache servers? User data on these servers should be limited, as most of it will be in the cloud on S3.

From the preliminary research I've done so far, I've read about GlusterFS and DRBD, but would a simple rsync script do the trick?

Best Answer

Honestly, I wouldn't suggest doing real-time replication between the Apache machines. Let each one keep its own copy of the code, or rsync from the "main" server every so often. Real-time (or near real-time) replication means a lot of file inspection that just isn't necessary 99% of the time.
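As a rough illustration of the "rsync from the main server every so often" option, here is a minimal Python sketch that builds and runs a one-way rsync pull, meant to be called from cron or a similar scheduler. The hostname web1, the paths, and the function names are assumptions made for the example, not something your setup has to match.

```python
import subprocess


def build_rsync_command(source, dest, delete=True, dry_run=False):
    """Return the rsync argument list for a one-way mirror of source into dest."""
    cmd = ["rsync", "-az"]  # -a: archive mode, -z: compress during transfer
    if dry_run:
        cmd.append("--dry-run")  # show what would change without copying
    if delete:
        cmd.append("--delete")  # remove files on dest that are gone from source
    # A trailing slash on the source copies the directory contents,
    # not the directory itself.
    cmd += [source.rstrip("/") + "/", dest.rstrip("/") + "/"]
    return cmd


def sync(source, dest, dry_run=False):
    """Run the mirror; raises CalledProcessError if rsync exits non-zero."""
    return subprocess.run(build_rsync_command(source, dest, dry_run=dry_run), check=True)


# Hypothetical usage on the second web server, pulling code from "web1" over SSH:
# sync("web1:/var/www/website/www", "/var/www/website/www")
```

Running it with dry_run=True first is a cheap way to see what --delete would remove before letting it loose on a live docroot.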

Personally, I would recommend having three layers:

Load Balancing / Web Servers (your http/php processes)

File Servers (code/files that are needing to be shared across all web nodes)

Database Servers (your backend databases)

That is a lot simpler than doing full replication between servers.

If you don't have the ability to run a dedicated file server (NFS, etc.), have "Web2" mount "Web1's" user uploads directory. Both web servers will be able to read and write to the shared area, with no syncing required unless you're updating website code.

Web1
/var/www/website/www
/var/www/website/_files

Web2
/var/www/website/www
/var/www/website/_files (NFS mounted from Web1)
/var/www/website/_files.bak (rsync copy from Web1 in case Web1 explodes)

Both servers see the shared storage in near real-time, Web2 keeps a fallback copy of the uploads, and you don't have to add any complex syncing nonsense.
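The _files.bak copy on Web2 could be refreshed with something like the following Python sketch. It assumes the paths from the layout above; the function names and the mount check are my own additions for illustration. The check matters because if the NFS mount has dropped, _files is just an empty local directory, and you don't want to treat that as the source of truth.

```python
import os
import subprocess

FILES = "/var/www/website/_files"       # NFS mount of Web1's uploads
BACKUP = "/var/www/website/_files.bak"  # local fallback copy on Web2


def backup_command(src=FILES, dst=BACKUP):
    """Return the rsync argument list that copies the mounted uploads locally."""
    # Deliberately no --delete: a file lost on Web1 stays in the backup.
    return ["rsync", "-a", src + "/", dst + "/"]


def run_backup(src=FILES, dst=BACKUP, is_mounted=os.path.ismount):
    """Copy src to dst only while src is a live mount point.

    Returns False (and copies nothing) when the mount is missing.
    """
    if not is_mounted(src):
        return False
    subprocess.run(backup_command(src, dst), check=True)
    return True
```

Leaving out --delete means the backup can accumulate files that users have since removed, which seems an acceptable trade for a copy whose whole purpose is surviving a dead Web1.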

Edit: here is a guide on how to use NFS on an EC2 instance:

http://www.migrate2cloud.com/blog/how-to-setup-nfs-server-on-aws-ec2