Highly Available Web Application (LAMP)

apache-2.2 high-availability lamp

I work for a small company that provides a web application for thousands of users. Earlier this year they had one server, hosted with one company. We recently acquired another server in a different location, with the hope of one day making it a redundant failover machine. I understand what to do about MySQL replication: I plan on using a master-master replication setup, with rsync to sync the scripts and files. However, I am at a standstill on how to configure the failover. Ideally I would like both machines to accept requests, as with round-robin DNS, but if one machine goes down I do not want requests to go to that machine. All of the solutions I have come across assume the highly available servers are in the same location; these servers are in two completely different locations with different public IP addresses. Any help would be great. Thanks

Best Answer

Typically, heartbeat (pacemaker) or MMM is used to manage an IP resource that fails over dynamically. For that to work effectively, both servers need to share the same network segment.

If the servers are not in the same physical location, monitoring becomes less reliable: even two independent Internet links between the sites are more fallible than a setup where one of the links is a serial cable a few feet long.

You will need to weigh the risks and prioritize based on your needs. What's your top priority: availability or data integrity? If data integrity is not a priority, you could potentially fail over automatically, but you still risk a network partition, where each site believes the other is down and both accept writes. The CAP theorem explores this trade-off in greater detail.

It's generally not advised to write to both master servers at the same time, as auto-increment ids can conflict. You can configure an auto-increment offset on each server to avoid that, but this is something that needs to be considered with your entire architecture in context.
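To illustrate the offset idea: MySQL has the `auto_increment_increment` and `auto_increment_offset` server variables for this. The Python below is only a simplified model of how two masters with different offsets would hand out ids on an empty table. The function name and the values (increment 2, offsets 1 and 2) are assumptions for illustration, not a recommendation for your setup.

```python
from itertools import islice


def auto_increment_ids(offset, increment):
    """Simplified model: yield offset, offset + increment, offset + 2*increment, ..."""
    value = offset
    while True:
        yield value
        value += increment


# Hypothetical settings: both masters step by 2, but start at different offsets.
master_a = list(islice(auto_increment_ids(offset=1, increment=2), 5))
master_b = list(islice(auto_increment_ids(offset=2, increment=2), 5))

print("master A ids:", master_a)  # odd ids only
print("master B ids:", master_b)  # even ids only
print("overlap:", set(master_a) & set(master_b))
```

Keep in mind this only avoids collisions on auto-increment keys. Two masters can still conflict on other unique keys or on concurrent updates to the same row, which is why the rest of the architecture matters.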

Based on what you've described, I'd probably lean towards data integrity. I'd set up dual master, but only write to one master IP from your application. In case of failure on your primary, I would make the manual failover procedure repointing the web application to the secondary db.

If you insist on automatic failover, you could write a script that considers more than two failure points, and minimize the data risk with that additional logic. This architecture is substantially more complicated, however, and you would have to design some of it yourself.
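As a rough sketch of what "more than two failure points" could look like: only fail over when several independent vantage points agree that the primary is unreachable. Everything here is hypothetical (the function names, the vantage point names, the quorum of 2), and it deliberately leaves out the hard parts, such as fencing the old primary and actually repointing the application.

```python
import socket


def tcp_check(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_fail_over(primary_reachable_from, quorum):
    """Decide on failover from several observers.

    primary_reachable_from maps a vantage point name to True/False
    (did that observer reach the primary?). Fail over only if at least
    `quorum` observers report the primary as unreachable.
    """
    down_votes = sum(1 for ok in primary_reachable_from.values() if not ok)
    return down_votes >= quorum


# Hypothetical observations from three places; in practice each value
# would come from running something like tcp_check() at that location.
observations = {
    "secondary-site": False,
    "external-monitor": False,
    "office": True,
}
print(should_fail_over(observations, quorum=2))
```

A single observer that loses its own link looks identical to the primary going down, which is why one check from the secondary alone is not enough to trigger a failover safely.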

There are a variety of technologies available, from MySQL clustering (NDB) to Google's patches, but nothing completely escapes the trade-offs described by the CAP theorem.