Docker – HAProxy Does Not Recover After Failed Check

Tags: docker, haproxy, proxy

All,

I'm having an issue with HAProxy not recovering after a backend server is restarted. I'm not using the proxy for load balancing, but to route different URL paths to different servers (web and web services) to avoid cross-domain issues. The proxy works fine until the first health check fails; after that, HAProxy never resumes forwarding to the server, even once it is back up. The proxy runs inside a Docker container (https://registry.hub.docker.com/u/dockerfile/haproxy/dockerfile/) and should be running HAProxy 1.5.3.

haproxy.cfg

global
    log         127.0.0.1 local0
    log         127.0.0.1 local1 notice
    user        haproxy
    group       haproxy

defaults
    mode        http
    log         global
    option      dontlognull
    option      httpclose
    option      httplog
    option      forwardfor
    option      persist
    option      redispatch
    option      http-server-close
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000
    maxconn     60000
    retries     3
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http
    stats enable
    stats uri /haproxy-stats

frontend http-in
    bind *:80

    default_backend       server1
    acl url_server1       path_beg -i     /server1
    acl url_server2       path_beg -i     /services/server2
    use_backend server1   if url_server1
    use_backend server2   if url_server2

backend server1
    balance roundrobin
    option httpchk GET /server1/content/
    server server1 server1:8080 check inter 10s rise 1 fall 5

backend server2
    balance roundrobin
    option httpchk GET /services
    server server2 server2:8181 check inter 5s rise 1

In the logs for HAProxy, I see the following error messages:

[WARNING] 040/210248 (1) : Server server1/server1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 040/210248 (1) : backend 'server1' has no server available!

In the browser I see:

503 Service Unavailable
No server is available to handle this request.

Once I restart the server, I never see a log message indicating that HAProxy has determined it has come back up and I still receive the 503 error in the browser, even though I can hit server1:8080 and see that the site is back up.

Best Answer

I ran into the same issue. My understanding is that HAProxy considers the whole backend down if all the servers in it are down, and that once a backend is marked as down it does not come back up. This is not documented; I came to this conclusion from my own experience.

Since you have only one server, it is very likely being marked as down during the restart, which takes the whole backend down with it. Marking a failed server as down would be the correct behavior for a load balancer if you had multiple servers.

My solution was to increase the interval between checks and the fall parameter, so that the server can restart without being marked as down:

  • The rise parameter sets the number of consecutive checks a server must pass to be declared operational. Default is 2.
  • The fall parameter sets the number of consecutive checks a server must fail to be declared dead. Default is 3.
  • The inter parameter sets the interval between two consecutive checks. Default is 2000 milliseconds.

For instance:

server foo 1.2.3.4:80 check inter 5000 fall 5 rise 1
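
To get a feel for what these values mean in wall-clock time, here is a rough Python sketch. The function name check_windows is made up for illustration and is not part of HAProxy. It assumes checks run at a steady inter interval and ignores check timeouts and any fastinter/downinter overrides, so treat the results as estimates only.

```python
def check_windows(inter_ms: int, fall: int, rise: int) -> tuple:
    """Estimate (seconds until marked DOWN, seconds until marked UP).

    Assumption: checks fire every `inter_ms` milliseconds, so a state
    change needs roughly `fall` (or `rise`) consecutive intervals.
    Real timing also depends on where in the interval the outage starts.
    """
    if inter_ms <= 0 or fall < 1 or rise < 1:
        raise ValueError("inter_ms must be > 0 and fall/rise must be >= 1")
    seconds_to_down = fall * inter_ms / 1000
    seconds_to_up = rise * inter_ms / 1000
    return seconds_to_down, seconds_to_up


if __name__ == "__main__":
    # HAProxy defaults (inter 2000, fall 3, rise 2) versus the example line above.
    settings = {
        "defaults": (2000, 3, 2),
        "inter 5000 fall 5 rise 1": (5000, 5, 1),
    }
    for label, (inter_ms, fall, rise) in settings.items():
        down_s, up_s = check_windows(inter_ms, fall, rise)
        print(f"{label}: down after ~{down_s:.0f}s of failures, "
              f"up after ~{up_s:.0f}s of successes")
```

With the defaults, a restart that takes longer than about 6 seconds is enough to get the server marked as down. With inter 5000 fall 5, the server has roughly 25 seconds to come back before that happens.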