MySQL – Architecture for Highly Available MySQL with Automatic Failover

disaster-recovery, heartbeat, linux, linux-ha, MySQL

I have been researching high availability (HA) solutions for MySQL between data centers.

For servers located in the same physical environment, I have preferred dual master with Heartbeat (floating VIP) in an active/passive arrangement. The heartbeat runs over both a serial connection and an Ethernet connection.
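
Roughly, the Heartbeat side of that setup looks like the sketch below; node names, interfaces and the VIP are placeholders rather than my real values:

    # /etc/ha.d/ha.cf (sketch) -- heartbeat over serial plus a dedicated Ethernet link
    serial  /dev/ttyS0
    bcast   eth1
    keepalive 2
    deadtime 30
    auto_failback off
    node    db1 db2

    # /etc/ha.d/haresources (sketch) -- db1 normally owns the floating VIP and MySQL
    db1 IPaddr::192.168.0.100/24/eth0 mysql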

Ultimately, my goal is to maintain this same level of availability but between data centers. I want to dynamically failover between both data centers without manual intervention and still maintain data integrity.

There would be BGP on top, with web clusters in both locations that could route to the databases on either side. If the Internet connection went down at site 1, clients would route through site 2 to its web cluster, and then to the database at site 1 if the link between the two sites were still up.

In this scenario, the lack of a physical (serial) link makes split brain more likely. If the WAN between the sites went down, the VIP would end up active at both, and a variety of unpleasant scenarios could desynchronise the data.

Another potential issue I see is difficulty scaling this infrastructure to a third data center in the future.

The network layer is not a focus. The architecture is flexible at this stage. Again, my focus is a solution for maintaining data integrity as well as automatic failover with the MySQL databases. I would likely design the rest around this.

Can you recommend a proven solution for MySQL HA between two physically diverse sites?

Thank you for taking the time to read this. I look forward to reading your recommendations.

Best Answer

You will face the CAP theorem problem: you cannot have consistency, availability and partition tolerance at the same time.

DRBD-based MySQL HA relies on synchronous replication at the block-device level. This is fine while both nodes are available, or if one suffers a temporary fault, is rebooted and so on, then comes back. The problems start when you get a network partition.
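
As a rough sketch (hostnames, devices and addresses below are placeholders), the DRBD resource backing the MySQL data directory would be defined with protocol C, DRBD's fully synchronous mode, which is what provides the block-level consistency described above:

    # /etc/drbd.d/mysql.res (sketch)
    resource mysql {
        protocol C;                  # fully synchronous: primary waits for the peer's ack
        on db1 {
            device    /dev/drbd0;
            disk      /dev/sdb1;     # backing block device (placeholder)
            address   10.0.0.1:7788;
            meta-disk internal;
        }
        on db2 {
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   10.0.0.2:7788;
            meta-disk internal;
        }
    }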

Network partitions are extremely likely when you're running at two datacentres. Essentially, neither party can distinguish a partition from the other node failing. The secondary node doesn't know whether it should take over (the primary has failed) or not (the link is gone).

While your machines are in the same location, you can add a secondary channel of communication (typically a serial cable or crossover Ethernet) to get around this problem - so the secondary knows when the primary is GENUINELY down, and that it's not a network partition.


The next problem is performance. While DRBD can give decent** performance when your machines have a low-latency connection (e.g. gigabit Ethernet - though some people use dedicated high-speed networks), the more latency the network has, the longer it takes to commit a transaction***. This is because DRBD needs to wait for the secondary server (when it's online) to acknowledge all the writes before saying "OK" to the application, in order to guarantee durability of writes.

If you do this in different datacentres, you typically have several more milliseconds of latency, even if they are close by.
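
To put rough numbers on it: with, say, a 5 ms round trip between sites and fully synchronous replication, every commit picks up at least that 5 ms on top of local disk latency, so a single connection issuing serialized commits tops out at roughly 200 transactions per second before the database has done any real work.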

** Still much slower than a decent local IO controller

*** You cannot use MyISAM for a highly available DRBD system because it does not recover properly or automatically from an unclean shutdown, and that kind of crash recovery is exactly what a failover requires.
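
If you do go the DRBD route, a quick sketch for finding MyISAM tables that would need converting to InnoDB (the schema filter and the table name below are only illustrative):

    -- list remaining MyISAM tables outside the system schemas
    SELECT table_schema, table_name
    FROM information_schema.tables
    WHERE engine = 'MyISAM'
      AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');

    -- convert a table to InnoDB (placeholder name)
    ALTER TABLE mydb.mytable ENGINE=InnoDB;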