MySQL – STONITH with a DRBD/Pacemaker/Corosync 2-node cluster

drbd, MySQL, pacemaker

So I am seeing a lot of conflicting viewpoints out there on using STONITH with a 2-node DRBD/Pacemaker/Corosync cluster for replicating MySQL data. The example I could find on the Pacemaker website seems to turn it off, but a lot of other places say you should keep it on. My setup will be 2 nodes with 2 interfaces, one physically connected to the other machine, the other hooked up to a switch. In that case, if I have redundant communications, is STONITH necessary? If a server loses both network connections it won't be receiving any MySQL data anyway, and when it comes back up I plan to set the stickiness to infinite so it shouldn't try to become the master. In this case is STONITH necessary or even advisable?

Best Answer

The best thing to do is test what actually happens under different failure modes, to make sure there is no single failure that could cause both MySQL servers to try to become masters.

Test disabling the network connection on one server. See what happens on both servers, and watch what happens when you bring it back up.

Do the same for each redundant connection. Then do the same while disabling ALL network connections at once.
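To make sure no combination gets skipped, the test plan above can be written out as a checklist. Here is a minimal sketch in Python; the node names (`node-a`, `node-b`) and link names (`direct`, `switch`) are placeholders I made up to match the setup in the question, so substitute your own.

```python
from itertools import combinations

# Hypothetical names for illustration only; replace with your hosts and links.
NODES = ["node-a", "node-b"]
LINKS = ["direct", "switch"]


def failure_scenarios(nodes, links):
    """Return every (node, links_to_disable) pair, from one link down to all links down."""
    scenarios = []
    for node in nodes:
        for size in range(1, len(links) + 1):
            for down in combinations(links, size):
                scenarios.append((node, down))
    return scenarios


if __name__ == "__main__":
    # Print a checklist to work through by hand on the test cluster.
    for node, down in failure_scenarios(NODES, LINKS):
        print(f"[ ] on {node}: disable {', '.join(down)}; "
              "note the state of both nodes; restore; note the state again")
```

This only generates the list of cases. Actually disabling the interfaces and checking the cluster state is still done by hand (or with whatever tooling you trust) on the test cluster.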

One reason for not doing STONITH on a two-node cluster is that it is fairly easy to end up with both nodes trying to kill each other, and actually succeeding. You need to test your setup to make sure that they don't either both shut down, or both keep running as masters and get your database out of sync.
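The two bad outcomes described above (both nodes down, or both nodes acting as master) can be checked mechanically after each test. The sketch below assumes you have already collected the role of each node yourself and recorded it as `"Primary"`, `"Secondary"`, or some other label such as `"Stopped"`; the function name and the result labels are my own, not part of any cluster tool.

```python
def classify_outcome(roles):
    """Classify observed roles, e.g. {"node-a": "Primary", "node-b": "Secondary"}.

    Returns "split-brain" if more than one node is Primary,
    "no-primary" if none is, and "ok" if exactly one is.
    """
    primaries = [node for node, role in roles.items() if role == "Primary"]
    if len(primaries) > 1:
        return "split-brain"
    if not primaries:
        return "no-primary"
    return "ok"


if __name__ == "__main__":
    # Example observations recorded by hand after a test run.
    print(classify_outcome({"node-a": "Primary", "node-b": "Secondary"}))
    print(classify_outcome({"node-a": "Primary", "node-b": "Primary"}))
    print(classify_outcome({"node-a": "Stopped", "node-b": "Stopped"}))
```

Anything other than "ok" after a single failure is a sign the configuration needs more work before it goes into production.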

One other thing I recommend while you are testing it, before it goes into production: intentionally break it. Do something that will cause MySQL and DRBD to get out of sync, and learn how to fix it. Write down what you needed to do to fix it, because it's much better to know how to do that BEFORE you really need to.