To your first question: yes, with the normal HAProxy configuration all traffic flows through the load balancer, both inbound to your servers and outbound from the servers back to the clients. This is true of more or less all load balancers, since they're generally implemented either as HTTP proxies or as IP-level NAT/routing boxes. The exception is when "direct server return" (DSR) is used; see this inlab.com explanation of what DSR is.
My load balancers and backends are located in different parts of the US
Ehh, why? If you're using geo-load-balancing or multicast routing, then I wouldn't expect you to be asking these questions. In the normal use case you really should have your servers in the same rack, on a fast, collision-free, low-latency LAN. That makes life easier for your server software and gives you more performance from your servers, as well as more consistent and dependable performance characteristics...
The canonical setup for the software you're using would be something like this:
nginx (for HTTP compression) --> Varnish cache (for caching) --> HTTP-level load balancer (HAProxy, or nginx, or the Varnish built-in director) --> webservers.
Optionally, if your load is high, you could have multiple nginx or Varnish servers at the very front; but that's for sites handling thousands of requests per second.
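To make the load-balancing hop in that chain concrete, here is a minimal haproxy.cfg sketch; the server names, addresses, and the /healthcheck endpoint are all hypothetical placeholders:

    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend http_in
        bind *:80
        default_backend webservers

    backend webservers
        balance roundrobin                 # spread requests evenly
        option httpchk GET /healthcheck    # assumes the app exposes /healthcheck
        server web1 10.0.0.11:8080 check   # 'check' enables health checking
        server web2 10.0.0.12:8080 check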
To your second question: when you ask "more efficient", I'm in doubt about what you mean. More efficient as in less traffic between the servers? Marginally, since the Varnish cache stops some traffic from going further back. More efficient with regard to CPU use? You can just shuffle the services around to less loaded physical servers, as long as you keep the logical structure the same.
A/B Testing - How do you test two "versions" of each page and compare? I mean, how does Varnish know which page to serve up? And how, if at all, do you save separate versions of each page?
You have several choices:
- Simply expose them at different URLs.
- Bypass the cache for the specific URL. You could do this by returning pass in vcl_recv. Something like this:
sub vcl_recv {
    # "/path/to/document" is whatever URL prefix should skip the cache
    if (req.url ~ "^/path/to/document") {
        return (pass);
    }
}
- Explicitly purge the cache when you expose a new version; see the sketch after this list.
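For that last option, here is a minimal sketch of a PURGE handler in Varnish 4 VCL, meant to be merged into an existing VCL file. Varnish has no PURGE handling out of the box, so both the purgers ACL and the use of an HTTP PURGE method are conventions you wire up yourself:

acl purgers {
    "127.0.0.1";   # hosts allowed to purge; add your deploy box here
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            return (synth(405, "Purging not allowed"));
        }
        # drop the cached object for this URL/host combination
        return (purge);
    }
}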
Feature rollout - how would you set up a simple feature rollout mechanism? Let's say I want to open a new feature/page to just 10% of the traffic, and then later increase that to 20%?
I'm not sure there's a "simple" way to do this. Since you can put arbitrary C code in your .vcl files, you could probably add some logic to pick a random number and then select the appropriate backend path based on the result.
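These days you don't even need inline C for that: the bundled std vmod has a random function. A minimal sketch, assuming Varnish 4+ and two hypothetical backends, default_backend and new_feature:

vcl 4.0;
import std;

backend default_backend { .host = "10.0.0.11"; .port = "8080"; }
backend new_feature     { .host = "10.0.0.12"; .port = "8080"; }

sub vcl_recv {
    # std.random(lo, hi) returns a real number between lo and hi;
    # raise 10.0 to 20.0 later to widen the rollout
    if (std.random(0, 100) < 10.0) {
        set req.backend_hint = new_feature;
    } else {
        set req.backend_hint = default_backend;
    }
}

Note that a per-request coin flip is not sticky: the same visitor can see different versions on consecutive requests. Hashing something stable instead, like client.ip or a cookie, would pin each user to one variant.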
How do you handle code deployments? Do you purge your entire varnish
cache every deployment? (We have deployments on a daily basis). Or
do you just let it slowly expire (using TTL)?
For major changes we just purge the cache, and for smaller changes we
just let things expire.
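If you do want to flush everything on a deployment, it's a one-liner from the shell, assuming Varnish 3+ ban syntax and that varnishadm can reach the running instance:

# invalidate every cached object by banning all URLs
varnishadm "ban req.url ~ ."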
Best Answer
Simply put:
HAProxy is the best open-source load balancer on the market.
Varnish is the best open-source static-file cacher on the market.
nginx is the best open-source webserver on the market.
(Of course, this is my opinion, and that of many other people.)
But generally, not all queries go through the entire stack. Everything goes through HAProxy and nginx (or multiple nginx instances); the only difference is that you "bolt on" Varnish for static requests. Overall, this model fits a scalable, growing architecture (take HAProxy out if you don't have multiple servers).
Hope this helps :D
Note: I'll actually also introduce Pound for SSL queries :D You can have a server dedicated to decrypting SSL requests and passing standard HTTP requests on to the backend stack. It makes the whole stack simpler and faster.
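If you go that route, here's a minimal sketch of a Pound configuration for an SSL-terminating front end; the certificate path and backend address are hypothetical placeholders:

# hypothetical pound.cfg: terminate SSL, forward plain HTTP to the stack
ListenHTTPS
    Address 0.0.0.0
    Port    443
    Cert    "/etc/ssl/private/example.pem"   # combined key + certificate file
    Service
        BackEnd
            Address 127.0.0.1   # the HAProxy / nginx tier, listening on plain HTTP
            Port    80
        End
    End
End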