I have three back-end servers load balanced with HAProxy.
When all three are up, I see each server's id (1, 2, 3) consecutively in the HTTPS responses, i.e. 1,2,3,1,2,3, as expected.
However, if I take one back-end server down, I then see only one server id repeatedly.
E.g. if I take server 1 down, I then see only server id 2 (2,2,2,2…) when I should see server ids 2 and 3 (2,3,2,3,2,3…).
I have tried both leastconn and round robin balancing, and both exhibit the same behaviour.
Why does HAProxy behave this way? How do I alter my HAProxy config to load balance across the remaining live back-end servers?
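For reference, the rotation I expect from round robin over the live set can be sketched in plain shell (an illustration of the expected behaviour only, not HAProxy code):

```shell
# Round-robin over live servers only: with server 1 down, the rotation
# should become 2,3,2,3,... rather than sticking to a single server.
live="2 3"          # server 1 is down
seq=""
i=0
for n in 1 2 3 4 5 6; do
  set -- $live            # positional params = live servers
  shift $(( i % $# ))     # rotate through them
  seq="$seq$1,"
  i=$(( i + 1 ))
done
echo "$seq"               # prints 2,3,2,3,2,3,
```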
The HAProxy version is 1.6.9, running on Ubuntu 14.04.
The service behind it is a RESTful API that returns JSON over HTTPS, so no session persistence is required or wanted.
The HAProxy configuration is below.
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    maxconn 3072

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    ssl-default-bind-ciphers kEECDH+aRSA+AES:kRSA+AES:+AES256:RC4-SHA:!kEDH$
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option dontlog-normal
    option redispatch
    retries 3
    maxconn 3072
    timeout connect 5000
    timeout client 10000
    timeout server 10000

frontend apis-frontend
    bind x.x.x.x:443 ssl verify none crt /etc/haproxy/xxx
    mode http
    option forwardfor
    default_backend apis

backend apis
    balance leastconn
    mode http
    option httpchk GET /
    server 1 x.x.x.x:443 ssl verify none check
    server 2 x.x.x.x:443 ssl verify none check
    server 3 x.x.x.x:443 ssl verify none check

listen stats
    bind *:8181
    mode http
    stats enable
    stats uri /
    stats realm Haproxy\ Statistics
    stats auth xx
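The per-server status can also be pulled at any time from the admin socket declared in the global section. A sketch, assuming socat is installed (it is not part of the config above):

```shell
# Dump the full stats CSV over HAProxy's admin socket (requires socat).
echo "show stat" | socat stdio /run/haproxy/admin.sock

# Or just the name and status of each server in the "apis" backend
# (svname is CSV field 2, status is field 18):
echo "show stat" | socat stdio /run/haproxy/admin.sock | \
  awk -F, '$1 == "apis" { print $2, $18 }'
```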
Edit: added stats.
pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,tracked,type,rate,rate_lim,rate_max,check_status,check_code,check_duration,hrsp_1xx,hrsp_2xx,hrsp_3xx,hrsp_4xx,hrsp_5xx,hrsp_other,hanafail,req_rate,req_rate_max,req_tot,cli_abrt,srv_abrt,comp_in,comp_out,comp_byp,comp_rsp,lastsess,last_chk,last_agt,qtime,ctime,rtime,ttime,
apis-frontend,FRONTEND,,,0,2,3072,24,9118,10021,0,0,19,,,,,OPEN,,,,,,,,,1,2,0,,,,0,0,0,4,,,,0,24,0,19,0,0,,0,6,43,,,0,0,0,0,,,,,,,,
apis,mt-wol-vlx-vps-01,0,0,0,1,,12,4863,3240,,0,,0,0,0,0,UP,1,1,0,1,0,12097,0,,1,3,1,,12,,2,0,,2,L7OK,200,36,0,12,0,0,0,0,0,,,,0,0,,,,,39,OK,,0,1,1,29,
apis,mt-lon-vlx-vps-01,0,0,0,1,,12,4255,3228,,0,,0,0,0,0,UP,1,1,0,0,0,12097,0,,1,3,2,,12,,2,0,,2,L7OK,200,23,0,12,0,0,0,0,0,,,,0,0,,,,,39,OK,,0,1,1,5,
apis,mt-cov-uks-vps-01,0,0,0,0,,0,0,0,,0,,0,0,0,0,DOWN,1,1,0,1,1,12094,12094,,1,3,3,,0,,2,0,,0,L4TOUT,,2001,0,0,0,0,0,0,0,,,,0,0,,,,,-1,,,0,0,0,0,
apis,BACKEND,0,0,0,1,308,24,9118,6468,0,0,,0,0,0,0,UP,2,2,0,,0,12097,0,,1,3,0,,24,,1,0,,3,,,,0,24,0,0,0,0,,,,,0,0,0,0,0,0,39,,,0,1,1,33,
stats,FRONTEND,,,1,4,3072,10,3811,169181,0,0,5,,,,,OPEN,,,,,,,,,1,4,0,,,,0,1,0,4,,,,0,9,0,5,0,0,,1,3,15,,,0,0,0,0,,,,,,,,
stats,BACKEND,0,0,0,0,308,0,3811,169181,0,0,,0,0,0,0,UP,0,0,0,,0,12097,0,,1,4,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,,0,0,0,0,0,0,0,,,0,0,0,18,
Edit: load-balancing behaviour in different browsers
After a test request from a different IP, I noticed the second IP load balanced as expected. I then tried from the first IP but using different browsers, Edge and Firefox. Both load balanced as expected.
However, the original browser, Chrome, still sticks to one back-end server, even after closing all Chrome windows and restarting Chrome.
Best Answer
After much further testing, the behaviour described appears only when using Google's Chrome browser. Why Chrome behaves this way is undetermined.
The observed behaviour was not an HAProxy issue.
This was tested multiple times with other browsers and other clients, with the final proof being live service over several hours.
The HAProxy config above has now been running for several hours and is fairly distributing traffic amongst the three servers. This is evidenced by our DB records, which record the server id. Over a period of several hours, each server was assigned the same number of requests, within ±50 out of roughly 4,500 requests.
More importantly, given the original question, it also continues to distribute fairly amongst the remaining two servers in the event of a failure of one of them. Again this is evidenced by our DB records. When measured over 15 minutes, a clear pattern of equally balanced server ids can be seen.
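A similar spot-check can be done from the command line, taking browser connection reuse out of the picture entirely, since each curl invocation opens a fresh TLS connection. This is only a sketch: the URL and the .serverId filter are placeholders for the real endpoint and JSON field, and it assumes jq is installed:

```shell
# Fire 30 requests, one connection each, and tally the returned server ids.
# Replace the URL and the jq filter with the real endpoint and JSON field.
for i in $(seq 1 30); do
  curl -sk https://x.x.x.x/ | jq -r '.serverId'
done | sort | uniq -c
```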
A poignant reminder to design testing to pinpoint where an issue lies before assuming it lies with the part(s) of the solution you don't know very well yet.
Thank you for the many replies it took for that realisation to dawn.