F5 LTM: The time between a node going down and a health check failing

availability, f5-big-ip, high-availability

Suppose the following scenario plays out: The node to receive traffic next becomes unresponsive before a health check can be initiate when it receives a new client request. How will the F5 LTM load balancer react to that?

The reason I ask is that we want to be able to reboot the backend nodes as needed without dropping any connections. Will LTM simply attempt to connect to the pool member, fail, and then move on to the next node for the same HTTP request? I suppose we could run an iControl REST call prior to the reboot, but I'm not keen on over-engineering this either.

Best Answer

Your health check timeout should be configured as 3n + 1, where n is your polling interval. With an interval of 5 seconds, the timeout would be 16 seconds, so there is the potential for up to 16 seconds during which traffic is still passed to the unresponsive node. The pool setting "Action On Service Down" determines how the BIG-IP handles existing connections when a member is marked down; you can read about that here: https://devcentral.f5.com/articles/ltm-action-on-service-down
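
For reference, here is a minimal sketch of applying that timing and the service-down action through iControl REST from Python with the requests library. The management address, credentials, monitor name, pool name, and the choice of "reselect" are all placeholders for your own environment, not values from this thread:

```python
import requests

BIGIP = "https://bigip.example.com"   # hypothetical management address
AUTH = ("admin", "secret")            # hypothetical credentials

INTERVAL = 5                          # health check polling interval, seconds
TIMEOUT = 3 * INTERVAL + 1            # F5's 3n+1 guideline -> 16 seconds

s = requests.Session()
s.auth = AUTH
s.verify = False                      # lab only; use a proper CA bundle in production

# Apply the 3n+1 timing to an HTTP monitor (monitor name is hypothetical).
s.patch(f"{BIGIP}/mgmt/tm/ltm/monitor/http/~Common~web_http_monitor",
        json={"interval": INTERVAL, "timeout": TIMEOUT})

# Control what happens to existing connections when the monitor marks a
# member down (pool name and the "reselect" choice are assumptions).
s.patch(f"{BIGIP}/mgmt/tm/ltm/pool/~Common~web_pool",
        json={"serviceDownAction": "reselect"})
```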

My advice is to disable the pool members, allow current connections to bleed off, and then force them offline for maintenance before actually downing your servers. This can be done in the GUI or via iControl SOAP/REST as you mentioned; it is standard operating procedure for many customers, so I don't think that's over-engineering at all.
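
As a rough illustration of that sequence over iControl REST, here is a sketch in Python with requests. The address, credentials, pool and member names, and the fixed five-minute bleed-off wait are assumptions; in practice you would poll the member's connection statistics rather than sleeping for an arbitrary period:

```python
import time
import requests

BIGIP = "https://bigip.example.com"       # hypothetical management address
AUTH = ("admin", "secret")                # hypothetical credentials
POOL = "~Common~web_pool"                 # hypothetical pool name
MEMBER = "~Common~10.0.0.11:80"           # hypothetical pool member

member_url = f"{BIGIP}/mgmt/tm/ltm/pool/{POOL}/members/{MEMBER}"

s = requests.Session()
s.auth = AUTH
s.verify = False                          # lab only; use a proper CA bundle in production

# 1. Disable the member: existing connections continue, no new ones are sent.
s.patch(member_url, json={"session": "user-disabled"})

# 2. Allow active connections to bleed off (poll member stats instead of a
#    fixed sleep if you want to be precise).
time.sleep(300)

# 3. Force the member offline before rebooting the server behind it.
s.patch(member_url, json={"session": "user-disabled", "state": "user-down"})

# After maintenance, bring the member back into service:
# s.patch(member_url, json={"session": "user-enabled", "state": "user-up"})
```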
