Your timeouts are too low. Increase them.
timeout connect 30s
timeout client 30s
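For reference, a minimal sketch of where those directives live (section layout is illustrative, and timeout server is added here on the assumption the same floor applies to the server side):

```
defaults
    mode http
    timeout connect 30s   # max time to establish the TCP connection to a server
    timeout client  30s   # max client-side inactivity
    timeout server  30s   # max server-side inactivity
```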
The absolute minimum is 5 seconds for traffic between two servers in the same rack. A TCP connection takes 3 seconds to open if there is any packet loss, which inevitably happens from time to time.
The minimum timeout is 15 seconds to support international traffic, such as a client in Australia connecting to a server in North America. Latency is high and bandwidth low in some parts of the world, much worse than one would expect. Reasonable timeouts are a prerequisite to doing business worldwide.
The minimum timeout is 30 seconds to support mobile connections and poor-reception WiFi: unreliable connectivity that can and does experience short periods of blackout.
Keep in mind: timeouts are meant to handle the worst-case connectivity scenario, and they should only catch truly failed connections. They could be set somewhat shorter, but that achieves nothing except generating errors on clients and servers.
Consider that a periodic request made every 5 seconds, something as simple as a health check or a polling API, adds up to 17280 requests per day. Thus a good timeout setting should cause fewer than 0.01% false positives, or it is creating errors every day for no reason.
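A quick sanity check of that arithmetic (figures taken from the text):

```shell
# Figures from the text: one request every 5 seconds, all day long.
seconds_per_day=86400
interval=5
requests_per_day=$((seconds_per_day / interval))
echo "requests/day: $requests_per_day"

# Even at a 0.01% false-positive rate, that is still errors every day:
awk -v n="$requests_per_day" 'BEGIN { printf "false timeouts/day: %.1f\n", n * 0.0001 }'
```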
88500 Sessions and 4500 errors in the last 20 minutes.
That's a 5% error rate, which is very high.
Considering that the average web page takes more than 20 sub-requests to load, it means that virtually every single page on your site is partially failing to load.
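To put those numbers together (a back-of-the-envelope check, assuming failures are independent across sub-requests):

```shell
# Figures from the question: 4500 errors out of 88500 sessions.
awk 'BEGIN { printf "error rate: %.1f%%\n", 4500 / 88500 * 100 }'

# If a page needs 20 sub-requests and each fails independently at that
# rate, the chance the whole page loads cleanly is only about a third:
awk 'BEGIN { printf "clean page loads: %.0f%%\n", (1 - 4500/88500)^20 * 100 }'
```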
With all HAProxy versions prior to 1.5-dev22, when used in mode http, it worked in the tunnel "sub-mode" if no other "sub-mode" was specified. I realize there's not actually such a thing as a "sub-mode" in HAProxy, but I'm not sure what else to call it; the docs just use the word "mode", which I find even more confusing.
In any event, in the tunnel "sub-mode" only the first request and response are processed, and everything else is forwarded with no analysis at all. This mode should not be used as it creates lots of trouble with logging and HTTP processing.
As of 1.5-dev22, the default "sub-mode" was changed from tunnel to keep-alive, meaning that all requests and responses are processed, and connections remain open but idle between responses and new requests.
This can be changed with the option http-keep-alive, option http-tunnel, option httpclose, option http-server-close, and option forceclose keywords in frontends and backends, with the effective mode (or "sub-mode", if you will) outlined in the docs. Under section 4, there's a table that shows the effective "sub-mode" based on which options are set in the frontend and backend used for a particular connection.
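As a hedged illustration (section names and addresses are placeholders): per that table, a server-close frontend paired with a keep-alive backend resolves to server close, since SCL is stronger than KAL:

```
frontend fe_web
    bind :80
    option http-server-close     # frontend mode: SCL
    default_backend be_app

backend be_app
    option http-keep-alive       # backend mode: KAL
    server app1 192.0.2.10:8080  # effective mode: SCL
```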
For completeness, here's the relevant section of the docs, including the table and its various "sub-modes", as it exists at the time of this writing (1.5.14):
In HTTP mode, the processing applied to requests and responses flowing over
a connection depends on the combination of the frontend's HTTP options and
the backend's. HAProxy supports 5 connection modes :
- KAL : keep alive ("option http-keep-alive") which is the default mode : all
requests and responses are processed, and connections remain open but idle
between responses and new requests.
- TUN: tunnel ("option http-tunnel") : this was the default mode for versions
1.0 to 1.5-dev21 : only the first request and response are processed, and
everything else is forwarded with no analysis at all. This mode should not
be used as it creates lots of trouble with logging and HTTP processing.
- PCL: passive close ("option httpclose") : exactly the same as tunnel mode,
but with "Connection: close" appended in both directions to try to make
both ends close after the first request/response exchange.
- SCL: server close ("option http-server-close") : the server-facing
connection is closed after the end of the response is received, but the
client-facing connection remains open.
- FCL: forced close ("option forceclose") : the connection is actively closed
after the end of the response.
The effective mode that will be applied to a connection passing through a
frontend and a backend can be determined by both proxy modes according to the
following matrix, but in short, the modes are symmetric, keep-alive is the
weakest option and force close is the strongest.
                    Backend mode

                  | KAL | TUN | PCL | SCL | FCL
              ----+-----+-----+-----+-----+----
              KAL | KAL | TUN | PCL | SCL | FCL
              ----+-----+-----+-----+-----+----
              TUN | TUN | TUN | PCL | SCL | FCL
     Frontend ----+-----+-----+-----+-----+----
     mode     PCL | PCL | PCL | PCL | FCL | FCL
              ----+-----+-----+-----+-----+----
              SCL | SCL | SCL | FCL | SCL | FCL
              ----+-----+-----+-----+-----+----
              FCL | FCL | FCL | FCL | FCL | FCL
Best Answer
Have you looked at dmesg? A common problem with proxy servers is hitting the Linux connection-tracking table limit, since each proxied request uses two connections (one client-side, one server-side). If this is the case you will see

ip_conntrack: table full, dropping packet

in dmesg. You can see the current count and raise the limit via sysctl or /proc.

You can also bypass connection tracking with the NOTRACK target in iptables. Keep in mind, though, that disabling tracking is a security risk; don't do it unless you are already behind a stateful firewall.
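A hedged sketch of those commands (requires root; the exact sysctl names vary by kernel version — older kernels use ip_conntrack_*, newer ones nf_conntrack_*, and the limit value 131072 is just an example):

```shell
# Current number of tracked connections vs. the configured limit:
sysctl net.netfilter.nf_conntrack_count
sysctl net.netfilter.nf_conntrack_max

# Raise the limit (also settable via /proc):
sysctl -w net.netfilter.nf_conntrack_max=131072
echo 131072 > /proc/sys/net/netfilter/nf_conntrack_max

# Or bypass tracking for the proxy's port with the raw table's NOTRACK
# target -- only do this behind a stateful firewall:
iptables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
iptables -t raw -A OUTPUT     -p tcp --sport 80 -j NOTRACK
```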
Can you post the errors you are seeing?