How to achieve redundancy across data centers

failover redundancy

I have a LAMP server with a lot of hardware redundancy built in. I am not worried about the server becoming unavailable. What I am worried about, however, are potential network issues in the data center the server is in. What I would like to have is another server in another data center for redundancy. Load balancing is less of a concern.

With that said, I am relatively clueless on two points:

  1. How to have two servers in two geographically separate data centers that have exactly the same data, in terms of both files and MySQL databases.
  2. How to ensure that all traffic coming into one data center is automatically redirected to the other data center in the case of a network or server failure at the first one.

Any guidance on how to accomplish the above two problems would be greatly appreciated.

Best Answer

Your question has many parts!

For MySQL, you will want two servers that replicate from each other, so that whichever one is the active master, the other gets its data. You can monitor that they're getting the same data by checking how many seconds the replica is behind the master (the Seconds_Behind_Master value in the output of SHOW SLAVE STATUS). You should write only to the active master; if both servers accept writes, they can hand out the same auto_increment values and you'll probably wind up horking your database.
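As a rough illustration of that lag check, here is a minimal Python sketch. It assumes you have already fetched one row of SHOW SLAVE STATUS as a dictionary using whatever MySQL client library you prefer. The function name and the 30-second threshold are made up for this example, and newer MySQL releases rename these columns (for example Seconds_Behind_Source), so adjust the keys to your version.

```python
from typing import Mapping


def replica_is_healthy(status: Mapping[str, object], max_lag_seconds: int = 30) -> bool:
    """Return True if a replica looks caught up, given one SHOW SLAVE STATUS row.

    `status` is assumed to be a dict-like row keyed by column name.
    `max_lag_seconds` is an arbitrary example threshold, not a recommendation.
    """
    # Both replication threads must be running for the lag value to mean anything.
    if status.get("Slave_IO_Running") != "Yes":
        return False
    if status.get("Slave_SQL_Running") != "Yes":
        return False

    lag = status.get("Seconds_Behind_Master")
    # MySQL reports NULL here when the lag is unknown; treat that as unhealthy.
    if lag is None:
        return False
    return int(lag) <= max_lag_seconds


if __name__ == "__main__":
    row = {
        "Slave_IO_Running": "Yes",
        "Slave_SQL_Running": "Yes",
        "Seconds_Behind_Master": 4,
    }
    print(replica_is_healthy(row))
```

Running something like this from cron or a monitoring agent on both servers would tell you whether the standby is safe to promote before you fail over to it.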

For files, you might want to run rsync periodically to propagate changes back and forth. It might be enough just to monitor whether the two servers have the same files; there are a variety of ways to do that.
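One of those ways is to compare checksums. The sketch below is only an illustration, with hypothetical function names: it builds a manifest of SHA-256 digests for a directory tree and reports the paths that differ between two manifests. In practice you would build the manifest on each server and ship one of them across (for example as JSON) to compare.

```python
import hashlib
import os
from typing import Dict, Set


def build_manifest(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest: Dict[str, str] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(full_path, "rb") as handle:
                # Read in chunks so large files do not have to fit in memory.
                for chunk in iter(lambda: handle.read(65536), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full_path, root)] = digest.hexdigest()
    return manifest


def differing_paths(a: Dict[str, str], b: Dict[str, str]) -> Set[str]:
    """Return paths that are missing on one side or whose checksums differ."""
    return {path for path in a.keys() | b.keys() if a.get(path) != b.get(path)}
```

An empty result from differing_paths means the two trees matched at the time the manifests were built. Files that change between the two scans will show up as differences, so expect some noise on a busy site.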

Your second problem is trickier. Suppose the primary data center just suddenly goes offline. How does the backup data center know what to do about it? There generally needs to be a manual component in deciding that a failure has occurred and whether to fail over to the backup. To have client processes automatically move to the backup would usually require a third site that is either able to proxy connections to the active data center or to provide fast DNS switchover for names. I recommend the former, because using DNS for data center failover requires low TTL values, and clients only notice the switch once their cached records expire.
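To make the third-site idea a little more concrete, here is a hedged Python sketch of the detection half only. The class name, the threshold of three consecutive failures, and the plain TCP probe are all assumptions for illustration. It only reports that a failure is suspected; whether to actually switch is left to a person or to your proxy's own tooling.

```python
import socket
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


class FailoverMonitor:
    """Count consecutive failed probes of the primary site (hypothetical helper)."""

    def __init__(self, probe: Callable[[], bool], failure_threshold: int = 3) -> None:
        self.probe = probe
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def check(self) -> bool:
        """Run one probe; return True once failures reach the threshold."""
        if self.probe():
            # Any success resets the count, so a single blip does not trigger failover.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold
```

From the third site you might construct it as FailoverMonitor(lambda: tcp_probe("primary.example.com", 80)), where the hostname is a placeholder, and call check() on a timer. Requiring several failures in a row is a simple guard against flapping on a brief network hiccup.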
