HAProxy causing delay

haproxy networking ubuntu-12.04

I am trying to configure HAProxy to load-balance a custom web server I wrote. Right now I am noticing that the delay added by HAProxy grows as the size of the response increases. I ran four tests; here are the results:

Response 15kb through HAProxy:
Avg. response time: .34 secs
Transaction rate: 763 trans/sec
Throughput: 11.08 MB/sec

Response 2kb through HAProxy:
Avg. response time: .08 secs
Transaction rate: 1171 trans/sec
Throughput: 2.51 MB/sec

Response 15kb directly to server:
Avg. response time: .11 secs
Transaction rate: 1046 trans/sec
Throughput: 15.20 MB/sec

Response 2kb directly to server:
Avg. response time: .05 secs
Transaction rate: 1158 trans/sec
Throughput: 2.48 MB/sec

All transactions are HTTP requests. As you can see, the gap between the proxied and direct response times is much larger for the 15kb response than for the 2kb one. I understand that HAProxy will add a slight delay. Not sure if it matters, but the test was run using siege, and during the test there was only one server behind HAProxy (the same one used in the direct-to-server tests). Here is my haproxy.config file:

global  
       log 127.0.0.1   local0
       log 127.0.0.1   local1 notice
       maxconn 10000
       user haproxy
       group haproxy
       daemon
       #debug

defaults
       log     global
       mode    http
       option  httplog
       option  dontlognull
       retries 3
       option redispatch
       option httpclose
       maxconn 10000
       contimeout      10000
       clitimeout      50000
       srvtimeout      50000
       balance roundrobin
       stats enable
       stats uri /stats

listen  lb1  10.1.10.26:80
        maxconn 10000
        server  app1 10.1.10.200:8080 maxconn 5000

I couldn't find many options for this file that would help with my problem. I have heard suggestions that I may have to adjust some of my sysctl settings. However, I could not find much information on this: most sysctl documentation is for Linux 2.4 and 2.6, whereas I am running 3.2 (Ubuntu Server 12.04), which seems to auto-tune, so I have no clue what I should or shouldn't change. Most of the settings changes I tried had either no effect or a negative effect on performance.

Note that this is a very preliminary test. My hope is that at deployment time HAProxy will be able to balance 10k-20k requests/sec across many servers, so if anyone could provide information to help me reach that goal, it would be much appreciated.

Thank you very much for any information you can provide. If you need any more information from me, please let me know and I will get you anything I can.

[Edit] As requested, the output of haproxy -vv:

HA-Proxy version 1.4.18 2011/09/16
Copyright 2000-2011 Willy Tarreau <w@1wt.eu>

Build options :
  TARGET  = linux26
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing
  OPTIONS = USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes

Available polling systems :
   sepoll : pref=400,  test result OK
   epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
   select : pref=150,  test result OK
Total: 4 (4 usable), will use sepoll.

Best Answer

I'm thinking about several points :

1) are you running in a virtualized environment ?

2) do you have nf_conntrack loaded on the machine ?

3) did you at any point saturate the CPU on any of the machines involved ?

4) please use "option http-server-close" instead of "option httpclose", as the latter lets both sides close by themselves, resulting in longer connections.

5) what do you see in haproxy's logs ? The time will be split in multiple fields that let you analyse where it is spent.

6) if you see much smaller times in haproxy's logs than what you have on your test machine, it means the delay is caused by SYN packets waiting in the system's backlog (or worse, being retransmitted), which could be caused by a lack of system tuning.

7) (less important), when reporting an issue with an old version, you should first update it to the latest fixes (1.4.22) to see if the issue is still there. I don't think what you observe matches any known issue but still that's the general idea.
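
To make point 4 concrete, here is a minimal sketch of the defaults section from the question with only the close option swapped. Every other line is copied unchanged from the posted config. Whether this alone removes the delay is an assumption that would need to be checked by re-running the siege test.

```
defaults
       log     global
       mode    http
       option  httplog
       option  dontlognull
       retries 3
       option redispatch
       # Point 4: replaces "option httpclose", so that haproxy closes the
       # server-side connection itself instead of waiting for both sides.
       option http-server-close
       maxconn 10000
       contimeout      10000
       clitimeout      50000
       srvtimeout      50000
       balance roundrobin
       stats enable
       stats uri /stats
```

Since "option httplog" is already enabled, each log line should carry the timers mentioned in point 5 as a slash-separated group (Tq/Tw/Tc/Tr/Tt in the 1.4 log format, if I recall it correctly). Comparing the total time there with what siege reports is the check described in point 6.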