Amazon Web Services Elastic Load Balancing No Downtime

amazon-ec2 amazon-elb amazon-web-services

I'm trying to figure out how Amazon Web Services Elastic Load Balancing can provide zero downtime.

Elastic Load Balancing pings a path on your server every so often (normally every couple of seconds). If it doesn't receive a response within a set period of time (normally a second or two), it takes the server out of service and doesn't send any more traffic to it until it comes back online.
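To make the timing concrete, here is a minimal sketch of that health check behaviour in Python. It is a simplified model, not ELB's actual implementation: the class name `HealthCheckModel`, the idea of an "unhealthy threshold" of consecutive failures, and the assumption that checks start at a fixed interval are all illustrative, and recovery back into service is not modelled.

```python
from dataclasses import dataclass


@dataclass
class HealthCheckModel:
    """Simplified model of a load balancer health check (illustrative only)."""
    interval_s: float          # seconds between the start of consecutive checks
    timeout_s: float           # seconds to wait before a check counts as failed
    unhealthy_threshold: int   # consecutive failures before the server is removed
    consecutive_failures: int = 0
    in_service: bool = True

    def record(self, check_passed: bool) -> bool:
        """Record one check result and return whether the server is in service."""
        if check_passed:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.unhealthy_threshold:
                self.in_service = False
        return self.in_service

    def worst_case_detection_s(self) -> float:
        """Longest time a hung server can keep receiving traffic in this model.

        Assumes the server hangs just after passing a check, so the next
        `unhealthy_threshold` checks each run into the full timeout.
        """
        return self.unhealthy_threshold * self.interval_s + self.timeout_s
```

With a 2 second interval, a 1 second timeout and an assumed threshold of 2 failures, `HealthCheckModel(2.0, 1.0, 2).worst_case_detection_s()` gives 5.0 seconds, which is the window during which traffic can still reach a server that has stopped responding.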

What confuses me is that even though a failing server is eventually taken offline, it takes a few seconds for Elastic Load Balancing to ping it and actually remove it. I'm assuming there is a way to eliminate this gap, so that traffic only goes to servers that are truly active and Elastic Load Balancing never sends traffic to a server that is having issues. How can I achieve this and get zero downtime for my application?

Best Answer

There is conflicting information about this online. A minority of resources say ELB retries a request if no response arrives from the server within the default 60-second timeout; others say ELB doesn't retry requests. The AWS documentation I've seen doesn't say what happens when ELB times out, which is a fairly significant omission. Based on what I've read, I tend to think that if your back-end server times out, the client is sent an error code, probably 504 Gateway Timeout. You should test this, as my advice below is based on this assumption. If ELB does retry, then my advice below is incorrect.
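One way to run the test suggested above is to put a deliberately slow server behind the load balancer and count how often each request reaches it. The following is only a sketch using the Python standard library. The names `SlowHandler` and `make_server`, port 8080, the `/health` path and the 75 second delay are all assumptions for illustration; point the load balancer's health check at whatever path you configure so the server stays in service during the test.

```python
import threading
import time
from collections import Counter
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

hits = Counter()              # arrivals per request path
hits_lock = threading.Lock()


class SlowHandler(BaseHTTPRequestHandler):
    # Assumption: 75 s is longer than the 60-second timeout mentioned above.
    sleep_seconds = 75.0

    def do_GET(self):
        if self.path == "/health":
            # Answer health checks immediately so the server stays in service.
            self._reply(b"ok\n")
            return
        with hits_lock:
            hits[self.path] += 1
            count = hits[self.path]
        print(f"{time.strftime('%H:%M:%S')} {self.path} arrival #{count}", flush=True)
        time.sleep(self.sleep_seconds)  # outlast the load balancer's timeout
        self._reply(b"done\n")

    def _reply(self, body: bytes) -> None:
        try:
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except (BrokenPipeError, ConnectionResetError):
            pass  # the other side gave up and closed the connection


def make_server(port: int = 8080, host: str = "0.0.0.0") -> ThreadingHTTPServer:
    return ThreadingHTTPServer((host, port), SlowHandler)


# To run on the instance: make_server().serve_forever()
```

Request a unique path through the load balancer (for example `/probe-1`) and watch the log. If the same path is logged a second time while the client is still waiting, the load balancer retried; if the path appears once and the client receives an error status, it did not.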

Assuming ELB doesn't retry, I don't believe what you want is possible using ELB for a standard web application. Looking at the bigger picture, you can't guarantee 100% availability; it's virtually impossible. You need to set a realistic availability target, then architect your system to achieve it. For example, you might have two regions active, with Route 53 doing geographic load balancing with failover. Even then you won't get 100%, because such a setup health-checks instances and sends requests to those thought to be healthy; it doesn't retry requests that fail.

As far as I can tell, ELB won't retry a request if a server is down or times out. You would have to put in your own retry logic or run your own load balancer, which itself could fail. Hardware outside of AWS might work but isn't a good idea, and running your own load balancer inside AWS is a bad idea because you're unlikely to be able to build one as reliable as ELB.
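If you control the client (an API consumer or a mobile app, say, rather than a browser), "your own logic" could be as simple as retrying failed requests on the client side. Below is a rough sketch using only the Python standard library. The function name `fetch_with_retry`, the set of retryable status codes and the backoff values are my assumptions, not anything ELB prescribes, and retrying like this is only safe for idempotent requests.

```python
import time
import urllib.error
import urllib.request

# Assumption: gateway-style errors are the ones worth retrying.
RETRYABLE_STATUS = {502, 503, 504}


def fetch_with_retry(url, attempts=3, backoff_s=0.5, timeout_s=5.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """GET `url`, retrying on connection errors and gateway-style HTTP errors.

    `opener` and `sleep` are injectable so the logic can be tested offline.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=timeout_s) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # HTTPError must be caught before OSError, which it subclasses.
            if err.code not in RETRYABLE_STATUS:
                raise  # e.g. a 404 will not improve on retry
            last_error = err
        except OSError as err:
            # Covers URLError, socket timeouts and connection resets.
            last_error = err
        if attempt < attempts - 1:
            sleep(backoff_s * 2 ** attempt)  # exponential backoff
    raise last_error
```

A retry only helps if the load balancer sends the next attempt to a healthy server, so this narrows the gap rather than eliminating it.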

I suggest you concentrate on making your web / application servers stable, scalable, and stateless so they can be scaled up and down as required.