Apache mod_proxy_balancer health checking

apache-2.2 mod-proxy

I have 3 Apache web servers set up behind another Apache server running mod_proxy_balancer.

Earlier today, one of the web servers was rebooted. The reboot took about 16 minutes.

During that time, the following line appeared every 30 seconds in the error log on my Apache mod_proxy_balancer server:

[Tue Sep 30 07:04:42 2014] [error] ap_proxy_connect_backend disabling worker for (s1-sc1-c-use)

This was logged 32 times during the 16-minute outage.

I am trying to figure out what is going on here. My concern is that, because of deficiencies in my balancer config, the balancer is repeatedly trying to send requests to the rebooting server (and therefore returning errors to users).

Why is Apache repeatedly telling me that it is "disabling worker"? Does the balancer module intermittently send user requests to the failing node to determine whether it is back up, or does it have its own internal health-checking mechanism that is invisible to the user?

Best Answer

OK. I can explain this.

Apache mod_proxy_balancer doesn't have its own independent health-check mechanism. The state of balancer members (workers) is determined by the outcome of actual forwarded user requests.

The sequence is as follows:

  1. Apache httpd sends a request to the worker
  2. The worker doesn't respond, or responds with an HTTP status that triggers failover, and the member is put into the ERR state
  3. Apache httpd starts the retry timer (default 60 seconds) and doesn't send the worker any more requests until the timer expires
  4. When the retry timer expires, go back to step 1
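
The sequence above can be sketched as a small simulation. This is only a toy model of the behaviour described here, not Apache's actual code: the `Worker` class, its field names, and the `simulate` helper are all made up for illustration, and the comparison used to decide when the timer has expired is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Toy model of one balancer member (hypothetical, not Apache's struct)."""
    name: str
    retry: float = 60.0      # seconds to wait after a failure (default per the answer)
    in_error: bool = False   # True while the member is in the ERR state
    error_time: float = 0.0  # time of the most recent failure

    def usable(self, now: float) -> bool:
        # A member in ERR becomes eligible again once the retry timer has run out.
        return not self.in_error or now - self.error_time >= self.retry

    def handle(self, now: float, backend_up: bool) -> str:
        if not self.usable(now):
            return "skipped"       # step 3: no requests while the timer runs
        if backend_up:
            self.in_error = False  # a successful real request clears ERR
            return "ok"
        self.in_error = True       # step 2: the forwarded request failed
        self.error_time = now      # step 3: (re)start the retry timer
        return "disabled"


def simulate(outage_seconds, request_interval, retry=60.0):
    """Return the times at which a real request was sent to the dead backend."""
    worker = Worker("s1", retry=retry)
    disabled_at = []
    t = 0
    while t < outage_seconds:
        if worker.handle(t, backend_up=False) == "disabled":
            disabled_at.append(t)
        t += request_interval
    return disabled_at


if __name__ == "__main__":
    print(simulate(outage_seconds=300, request_interval=10))
```

With one request every 10 seconds during a 5-minute outage, this model forwards a real request to the dead backend at 0, 60, 120, 180 and 240 seconds. Each of those is a user request that fails on that member, which is the "disabling worker" line in the log.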

My retry value is the default of 60 seconds.

The reason I am seeing more log entries than one per retry interval is that my Apache httpd proxy is configured with multiple balancers, each with its own independent retry timer.

As a result, depending on application activity, the retry timers are started and tested at arbitrary times relative to one another, which explains the non-uniform distribution of worker status updates in the log.
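
To illustrate how independent timers interleave, here is a minimal sketch that assumes two balancers pointing at the same dead backend, each receiving a request every 10 seconds, with the second balancer's first request arriving 30 seconds after the first's. Those traffic figures and the `disable_times` helper are assumptions chosen for the example; the real request pattern is unknown and would not be this regular.

```python
def disable_times(first_request, request_interval, outage, retry=60):
    """Times at which one balancer would log 'disabling worker' in this toy model."""
    times = []
    next_allowed = first_request  # no failure recorded yet, so the first request goes through
    t = first_request
    while t < outage:
        if t >= next_allowed:
            times.append(t)           # request forwarded, fails, worker disabled
            next_allowed = t + retry  # this balancer's own retry timer starts
        t += request_interval
    return times


outage = 16 * 60                            # 16-minute reboot, in seconds
balancer_a = disable_times(0, 10, outage)   # hypothetical first balancer
balancer_b = disable_times(30, 10, outage)  # hypothetical second balancer, offset by 30 s
merged = sorted(balancer_a + balancer_b)
gaps = [later - earlier for earlier, later in zip(merged, merged[1:])]

if __name__ == "__main__":
    print(len(merged), set(gaps))
```

Under these assumptions each balancer logs 16 entries, one per 60-second retry, and the merged log has 32 entries spaced 30 seconds apart, which is consistent with the figures in the question. With less regular traffic the same mechanism gives unevenly spaced entries.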