High availability load balancing for cold standby with automatic service startup

amazon ec2, amazon-web-services, haproxy, high-availability, keepalived

I'm trying to set up a high availability cluster in AWS for an application that I'm running. The application runs on an EC2 instance that sits in front of an Oracle RDS database. If I use traditional load balancing and have two instances running at the same time, the database becomes corrupt, because both instances make changes to the database without being aware of each other. So I need to do active/passive load balancing.

The problem that I have is that if the passive instance is running the application, even with no traffic coming into it, it nevertheless makes changes to the database around 0.1% of the time. This means that I need to ensure that the application service is not running on the backup node when the primary node is healthy.

My ideal scenario would be to have something that runs active/passive failover which can do the following:

  1. Health-check the primary node
  2. If the primary node is healthy, forward traffic to the primary node
  3. If the primary node is not healthy, run a script which starts up the service and then forward traffic to the secondary
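
To make the three steps above concrete, here is a minimal Python sketch of the decision logic a small monitor process could run. It is an illustration under stated assumptions, not a replacement for keepalived or a real cluster manager: the health-check URL, the standby host name, the `myapp.service` unit, and the threshold of three consecutive failures are all hypothetical. It also does nothing to stop the primary, so it assumes a failed primary is really down.

```python
import subprocess
import urllib.error
import urllib.request


def primary_is_healthy(url, timeout=2.0):
    """Step 1: return True if the primary answers the health URL with a 2xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or HTTP error status.
        return False


def start_standby_service():
    """Step 3: start the application on the standby (hypothetical host and unit)."""
    subprocess.run(
        ["ssh", "standby.example.internal",
         "sudo", "systemctl", "start", "myapp.service"],
        check=True,
    )


class FailoverController:
    """Tracks which node should receive traffic.

    check:           callable returning True while the primary is healthy
    start_standby:   callable that starts the service on the standby
    failures_needed: consecutive failed checks before failing over
    """

    def __init__(self, check, start_standby, failures_needed=3):
        self.check = check
        self.start_standby = start_standby
        self.failures_needed = failures_needed
        self.failures = 0
        self.active = "primary"

    def tick(self):
        """Run one health-check cycle and return the node that should get traffic."""
        if self.active == "secondary":
            # No automatic failback: two live instances are what corrupts the DB.
            return self.active
        if self.check():
            self.failures = 0  # Step 2: primary healthy, keep routing to it.
        else:
            self.failures += 1
            if self.failures >= self.failures_needed:
                self.start_standby()  # Start the service first, then switch.
                self.active = "secondary"
        return self.active


# Example wiring (not executed here; the URL is a placeholder):
# controller = FailoverController(
#     check=lambda: primary_is_healthy("http://primary.example.internal/health"),
#     start_standby=start_standby_service,
# )
# Call controller.tick() every few seconds and point traffic at the result.
```

The sketch deliberately stays on the secondary once it has failed over, since switching back automatically would risk both instances running against the database at once.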

I've been looking into HAProxy, but I haven't seen a way to get HAProxy to run an arbitrary script on the backup server in the event of a failure in the primary node.

I've seen some discussion on using keepalived. Is what I'm suggesting possible in keepalived? Is there something else that can do this?

Best Answer

You can run active-passive failover with Route 53, as opposed to load balancing with an ALB, which distributes traffic across all of its targets and is not what you want.

With active-passive failover you can have a primary active service with one or more secondaries on cold standby. If a health check on the primary fails, Route 53 redirects traffic to the secondary, entirely through DNS. This won't necessarily give you the robustness of HAProxy, but it will essentially achieve what you're after, depending on how critical the service is and how seamless the failover needs to be. A faster health-check interval may bring the cutover delay down to roughly 60 seconds.
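
As a rough way to reason about that cutover delay, the Python sketch below adds the time Route 53 needs to mark the primary unhealthy (check interval times failure threshold) to the record TTL that resolvers may cache. The function name is made up for illustration, and the result is only an estimate: real delays also depend on how resolvers and clients honour the TTL.

```python
def estimated_cutover_seconds(request_interval, failure_threshold, record_ttl):
    """Rough estimate of DNS failover delay in seconds.

    request_interval:  seconds between health checks (Route 53 offers 10 or 30)
    failure_threshold: consecutive failures before the endpoint is unhealthy
    record_ttl:        TTL of the failover record, i.e. how long an old
                       answer may stay cached by resolvers
    """
    detection = request_interval * failure_threshold
    return detection + record_ttl


# Fast checks (10 s), threshold of 3, and a 30 s TTL:
print(estimated_cutover_seconds(10, 3, 30))  # 60

# Standard checks (30 s), threshold of 3, and a 60 s TTL:
print(estimated_cutover_seconds(30, 3, 60))  # 150
```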

This guide by AWS covers Route53 DNS failover:

Active-Active and Active-Passive Failover
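
For orientation, here is a hedged Python sketch of what the pair of failover records might look like as a Route 53 change batch. The domain name, IP addresses, and health check ID are placeholders, and the helper names are invented for this example. The dictionary follows the shape that boto3's `change_resource_record_sets` accepts as `ChangeBatch`, but the sketch only builds and prints it rather than calling AWS.

```python
import json


def failover_record(name, ip, role, set_id, ttl=60, health_check_id=None):
    """Build one UPSERT change for a failover A record.

    role must be "PRIMARY" or "SECONDARY"; health_check_id is normally
    attached to the primary so Route 53 knows when to fail over.
    """
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": set_id,
        "Failover": role,
        "TTL": ttl,
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


def failover_change_batch(name, primary_ip, secondary_ip, health_check_id):
    """Build the change batch holding both the primary and secondary record."""
    return {
        "Comment": "Active-passive failover records",
        "Changes": [
            failover_record(name, primary_ip, "PRIMARY", "primary",
                            health_check_id=health_check_id),
            failover_record(name, secondary_ip, "SECONDARY", "secondary"),
        ],
    }


# Placeholder values only; none of these come from a real setup.
batch = failover_change_batch(
    "app.example.com.", "203.0.113.10", "203.0.113.20",
    "00000000-0000-0000-0000-000000000000",
)
print(json.dumps(batch, indent=2))

# With boto3 installed and credentials configured, this could be submitted as:
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="<your hosted zone id>", ChangeBatch=batch)
```

Note that these records only change which address DNS hands out. Starting the application on the standby still has to be handled separately.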
