HAProxy responding with NOSRV while backend is up

503-errorhaproxy

I have a strange situation where requests to my HAProxy are returning with a 503. HAProxy logs show it returning a NOSRV error:

Mar 26 19:47:01 localhost haproxy[23910]: 10.0.0.30:34261 
  [26/Mar/2013:19:46:48.579] fe v2/<NOSRV> 12801/-1/-1/-1/12801 503 
  212 - - SC-- 0/0/0/0/0 0/0 "GET /path/v2/ HTTP/1.1"

During this time, the backend server was confirmed up and was receiving traffic from an internal load balancer. This thing happened spontaneously without any configuration or other changes in HAProxy. Restarting the HAProxy fixed this.

Does anyone know if this is a known issue? Thanks for your help/insight.

Thanks.

My configuration looks like this:

global
    maxconn     1000 # Total Max Connections. This is dependent on ulimit
    daemon
    nbproc      1 # Number of processing cores. Dual Dual-core Opteron is 4 cores for example.
    log         127.0.0.1 local1
defaults
        mode        http
        clitimeout  60000
        timeout server 300000
        contimeout  4000
        option      httpclose # Disable Keepalive

backend v2
        server v2Elb internal-xxx.us-west-1.elb.amazonaws.com:80 weight 1 maxconn 512 check
backend v2e
        server v2eElb 10.0.1.28:80 weight 1 maxconn 512 check
frontend fe
        bind :80
        option httpchk
        option forwardfor # This sets X-Forwarded-For
        option httplog
        log global
        acl v2e path_beg /path/v2e
        acl v2 path_beg /path/v2
        redirect location https://my.domain.com/path/v2/ if !v2e !v2
        use_backend v2e if v2e
        use_backend v2 if v2

Best Answer

I notice from the given configuration that you are running infront of an AWS ELB load balancer v2 and I am guessing that v2e points directly to an app server (which would otherwise be behind ELB)?

If so this will suggest to me that, along with the 503 error, the connection between your HAProxy instance and ELB is hitting a timeout, either the 4 second contimeout timeout or the 300 second server timeout. The more likely one is the 4 second contimeout and the sporadicalness of the error further confirms that it is likely to be a network issue between HAProxy and ELB.

I would attempt increasing the contimeout value as well as monitoring latency between HAProxy and ELB.