HAProxy BADREQ errors


I am seeing errors similar to the following in my haproxy logs:

Jul 18 17:05:30 localhost haproxy[8247]: 188.223.50.7:51940 [18/Jul/2011:17:05:24.339] http_proxy_ads http_proxy_ads/<NOSRV> -1/-1/-1/-1/6001 408 212 - - cR-- 100/89/0/0/0 0/0 "<BADREQ>" 
Jul 18 17:05:30 localhost haproxy[8247]: 188.223.50.7:51943 [18/Jul/2011:17:05:24.341] http_proxy_ads http_proxy_ads/<NOSRV> -1/-1/-1/-1/6000 408 212 - - cR-- 99/88/0/0/0 0/0 "<BADREQ>"

etc…

So far I have tried to increase the client timeout (to 6 seconds from 3), and increase the http request buffer from 16k to 32k. The errors still appear.

Can anyone give me guidance on what to look for here?

Best Answer

A preconnect from a browser can also produce BADREQ entries when the browser does not use all of the connections it opened, for example when a user downloads only a single file.

That means there are two possible causes for BADREQ with cR-- or CR-- (verified with HAProxy v1.5-dev24):

Unused connection: for HTTP(S), a client connected over TCP but sent no HTTP request header before timeout http-request expired (cR--), or the client closed the connection again before sending one (CR--). Cause: an unused connection from a preconnect by a normal client or load balancer, or from a scan.
Bad request: a client sent a malformed request. These errors should be visible via the stats socket (see previous answer from womble).
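
To see which of the two unused-connection variants dominates in a log, the BADREQ lines can be counted by termination state. The following is a minimal sketch, assuming a POSIX shell with awk and log lines laid out like the ones in the question (syslog prefix plus HAProxy's HTTP log format), so that the termination state is the 15th whitespace-separated field. The function name classify_badreq is made up for illustration. Per the termination-state table in the HAProxy documentation, cR means timeout http-request expired and CR means the client aborted before sending a full request.

```shell
# Count BADREQ log lines by termination state.
# Assumption: the termination state (e.g. cR--) is field 15, as in the
# sample log lines from the question; adjust if your log prefix differs.
classify_badreq() {
  awk '/<BADREQ>/ {
    state = $15
    if (state ~ /^cR/)      count["timeout"]++       # timeout http-request expired
    else if (state ~ /^CR/) count["client_abort"]++  # client closed before sending a request
    else                    count["other"]++
  }
  END {
    printf "timeout=%d client_abort=%d other=%d\n",
           count["timeout"], count["client_abort"], count["other"]
  }' "$@"
}
```

The function reads from the files given as arguments, or from standard input if none are given, so it can be fed a log file directly or the output of grep.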

Most modern browsers, such as Firefox and Chrome, preconnect. I observed that Firefox and Chrome always opened at least two connections, even when the browser made only a single request such as downloading one file (for example, only downloading http://cdn.sstatic.net/serverfault/img/favicon.ico).

Increasing the value of timeout http-request in your HAProxy configuration can reduce such log entries for unused connections, simply because a higher value gives the client more time to actually use the connection. However, you also risk that your server can no longer handle all the open (idle) connections. If another load balancer such as Amazon ELB sits in front of HAProxy, check that this timeout in HAProxy is consistent with the load balancer's settings, because it may use preconnect too.

For unused connections you can use option dontlognull in HAProxy to suppress these log entries. Quoting the HAProxy documentation for this option:

It is generally recommended not to use this option in uncontrolled environments (eg: internet), otherwise scans and other malicious activities would not be logged.
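
The two settings discussed in this answer might look as follows in a configuration file. This is only a sketch: the defaults section placement and the 10s value are assumptions for illustration, not values taken from the original setup.

```
defaults
    mode http
    # Time allowed for a client to send its complete request headers.
    # 10s is an illustrative value, not a recommendation.
    timeout http-request 10s
    # Do not log connections on which no data was exchanged
    # (such as unused preconnects). Scans are then not logged either.
    option dontlognull
```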

Related Solutions

Mysterious HAProxy request errors

Have you looked at dmesg? A common problem with proxy servers is hitting the Linux connection tracking limit, since each request uses two connections. If this is the case you will see ip_conntrack: table full, dropping packet. in dmesg (nf_conntrack: table full, dropping packet on newer kernels). You can see the current count and raise the limit via sysctl or proc:

[kbrandt@lb01: ~] cat /proc/sys/net/netfilter/nf_conntrack_max
131072
[kbrandt@lb01: ~] cat /proc/sys/net/netfilter/nf_conntrack_count
185
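
To judge how close the table is to its limit, the two values can be turned into a percentage. A small sketch, assuming a POSIX shell with awk; conntrack_usage is a hypothetical helper name, not an existing tool:

```shell
# Print connection tracking usage as a percentage.
# $1 = current entry count, $2 = configured maximum.
conntrack_usage() {
  awk -v c="$1" -v m="$2" 'BEGIN { printf "%.2f\n", 100 * c / m }'
}
```

On a live host it could be called as conntrack_usage "$(cat /proc/sys/net/netfilter/nf_conntrack_count)" "$(cat /proc/sys/net/netfilter/nf_conntrack_max)". If usage is close to 100, the limit can be raised as root with sysctl -w net.netfilter.nf_conntrack_max=262144, where 262144 is only an illustrative value.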

You can also bypass connection tracking with the NOTRACK target, e.g.:

sudo /sbin/iptables -t raw -A PREROUTING -p tcp --sport 80 -j NOTRACK

Keep in mind, though, that disabling tracking is a security risk; you don't want to do it unless you are already behind a stateful firewall.

Can you post the errors you are seeing?

HAProxy: Display a “BADREQ” | BADREQ’s by the thousands

Your timeouts are too low. Increase them.

timeout connect 30s
timeout client  30s

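The absolute minimum is 5 seconds for traffic between two servers in the same rack. If the initial SYN packet is lost, a TCP connection takes about 3 seconds to open, because the SYN is only retransmitted after the initial retransmission timeout (traditionally 3 seconds; more recent kernels use 1 second). Packet loss invariably happens from time to time.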

The minimum timeout is 15 seconds to support international traffic, like a client from Australia connecting to a server in North America. Latency is quite high and bandwidth quite low in some parts of the world, much worse than one would expect. Reasonable timeouts are a prerequisite for doing business worldwide.

The minimum timeout is 30 seconds to support mobile connections and Wi-Fi with poor reception. Such connectivity is unreliable and can and does experience short blackouts.

Keep in mind that timeouts are meant to handle the worst-case connectivity scenario and should only catch truly failed connections. They could be set somewhat shorter, but the only effect would be to generate errors on clients and servers.

Consider that a periodic request made every 5 seconds, something as simple as a healthcheck or a polling API, is actually as much as 17280 requests per day. Thus a good timeout setting should cause less than 0.01% of false positives or it's creating errors every day for no reason.
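
The arithmetic behind that figure can be checked with a few lines of shell. This is a sketch, assuming a POSIX shell with awk; the function names requests_per_day and false_positives_per_day are made up for illustration:

```shell
# Requests per day for one request every $1 seconds (86400 seconds per day).
requests_per_day() {
  echo $(( 86400 / $1 ))
}

# Expected spurious timeouts per day for $1 requests at a
# false-positive rate of $2 percent.
false_positives_per_day() {
  awk -v n="$1" -v r="$2" 'BEGIN { printf "%.2f\n", n * r / 100 }'
}
```

requests_per_day 5 gives 17280, and false_positives_per_day 17280 0.01 gives about 1.7, so even a 0.01% false-positive rate already means more than one spurious error per day for a single 5-second poller.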

88500 Sessions and 4500 errors in the last 20 minutes.

That's an error rate of about 5%, which is very high.

Considering that the average web page takes more than 20 subrequests to load, most page loads on your site are probably failing at least partially.
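
To put a rough number on this, the error rate and the chance that a page load hits at least one failed request can be estimated as follows. This is a back-of-the-envelope sketch, assuming a POSIX shell with awk and assuming that failures are independent across requests, which real traffic will not strictly satisfy. The function names are hypothetical.

```shell
# Error rate in percent. $1 = errors, $2 = sessions.
error_rate_pct() {
  awk -v e="$1" -v s="$2" 'BEGIN { printf "%.1f\n", 100 * e / s }'
}

# Percent chance that at least one of $3 requests fails, given
# $1 errors out of $2 sessions and independent failures (assumption).
page_failure_pct() {
  awk -v e="$1" -v s="$2" -v n="$3" \
    'BEGIN { p = e / s; printf "%.0f\n", 100 * (1 - (1 - p) ^ n) }'
}
```

With the figures quoted above, error_rate_pct 4500 88500 gives 5.1, and page_failure_pct 4500 88500 20 gives 65, meaning that under these assumptions roughly two out of three page loads would see at least one failed request.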
