To your first question: yes, with the normal HAProxy configuration all traffic flows through the load balancer, both inbound to your servers and outbound from the servers back to the clients. This is true of more or less all load balancers, since they're generally implemented either as HTTP proxies or as IP-level NAT/routing boxes. The exception is when "direct server return" (DSR) is used; see this inlab.com explanation of what DSR is.
My load balancers and backends are located in different parts of the US
Ehh, why? If you're using geo-load-balancing or multicast routing, then I wouldn't expect you to be asking these questions. In the normal use case you really should have your servers in the same rack, on a fast, collision-free, low-latency LAN. That makes life easier for your server software and gives you more performance from your servers, as well as more consistent and dependable performance characteristics...
The canonical setup for the software you're using would be something like this:
nginx (for HTTP compression) --> Varnish cache (for caching) --> HTTP-level load balancer (HAProxy, or nginx, or the Varnish built-in director) --> webservers.
Optionally, if your load is high, you could have multiple nginx or Varnish servers at the very front; but that's for sites handling thousands of requests per second.
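To make the load-balancing hop in that chain concrete, here is a minimal haproxy.cfg sketch; the server names, addresses, and the /healthcheck endpoint are all hypothetical placeholders:

    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend http_in
        bind *:80
        default_backend webservers

    backend webservers
        balance roundrobin                 # spread requests evenly
        option httpchk GET /healthcheck    # assumes the app exposes /healthcheck
        server web1 10.0.0.11:8080 check   # 'check' enables health checking
        server web2 10.0.0.12:8080 check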
To your second question: when you ask "more efficient", I'm in doubt about what you mean. More efficient as in less traffic between the servers? Marginally, since the Varnish cache stops some traffic from going further back. More efficient with regard to CPU use? You can just shuffle the services around to less loaded physical servers, as long as you keep the logical structure the same.
A/B Testing - How do you test two "versions" of each page and compare? I mean, how does Varnish know which page to serve up? And how, if at all, do you save separate versions of each page?
You have several choices:
- Simply expose them at different URLs.
- Bypass the cache for the specific URL. You could do this by returning pass in vcl_recv. Something like this:
sub vcl_recv {
    # "/path/to/document" is whatever URL prefix should skip the cache
    if (req.url ~ "^/path/to/document") {
        return (pass);
    }
}
- Explicitly purge the cache when you expose a new version; see the sketch after this list.
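For that last option, here is a minimal sketch of a PURGE handler in Varnish 4 VCL, meant to be merged into an existing VCL file. Varnish has no PURGE handling out of the box, so both the purgers ACL and the use of an HTTP PURGE method are conventions you wire up yourself:

acl purgers {
    "127.0.0.1";   # hosts allowed to purge; add your deploy box here
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purgers) {
            return (synth(405, "Purging not allowed"));
        }
        # drop the cached object for this URL/host combination
        return (purge);
    }
}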
Feature rollout - how would you set up a simple feature rollout mechanism? Let's say I want to open a new feature/page to just 10% of the traffic, and then later increase that to 20%?
I'm not sure there's a "simple" way to do this. Since you can put arbitrary C code in your .vcl files, you could probably add some logic to pick a random number and then select the appropriate backend path based on the result.
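These days you don't even need inline C for that: the bundled std vmod has a random function. A minimal sketch, assuming Varnish 4+ and two hypothetical backends, default_backend and new_feature:

vcl 4.0;
import std;

backend default_backend { .host = "10.0.0.11"; .port = "8080"; }
backend new_feature     { .host = "10.0.0.12"; .port = "8080"; }

sub vcl_recv {
    # std.random(lo, hi) returns a real number between lo and hi;
    # raise 10.0 to 20.0 later to widen the rollout
    if (std.random(0, 100) < 10.0) {
        set req.backend_hint = new_feature;
    } else {
        set req.backend_hint = default_backend;
    }
}

Note that a per-request coin flip is not sticky: the same visitor can see different versions on consecutive requests. Hashing something stable instead, like client.ip or a cookie, would pin each user to one variant.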
How do you handle code deployments? Do you purge your entire varnish
cache every deployment? (We have deployments on a daily basis). Or
do you just let it slowly expire (using TTL)?
For major changes we just purge the cache, and for smaller changes we
just let things expire.
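If you do want to flush everything on a deployment, it's a one-liner from the shell, assuming Varnish 3+ ban syntax and that varnishadm can reach the running instance:

# invalidate every cached object by banning all URLs
varnishadm "ban req.url ~ ."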
Best Answer
Simply put:
HAProxy is the best open-source load balancer on the market.
Varnish is the best open-source static-file cacher on the market.
nginx is the best open-source webserver on the market.
(Of course, this is my opinion, and that of many other people.)
But generally, not all queries go through the entire stack. Everything goes through HAProxy and nginx (or multiple nginx instances); the only difference is that you "bolt on" Varnish for static requests. Overall, this model fits a scalable, growing architecture (take HAProxy out if you don't have multiple servers).
Hope this helps :D
Note: I'll actually also introduce Pound for SSL queries :D You can have a server dedicated to decrypting SSL requests and passing standard HTTP requests on to the backend stack. It makes the whole stack simpler and faster.
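If you go that route, here's a minimal sketch of a Pound configuration for an SSL-terminating front end; the certificate path and backend address are hypothetical placeholders:

# hypothetical pound.cfg: terminate SSL, forward plain HTTP to the stack
ListenHTTPS
    Address 0.0.0.0
    Port    443
    Cert    "/etc/ssl/private/example.pem"   # combined key + certificate file
    Service
        BackEnd
            Address 127.0.0.1   # the HAProxy / nginx tier, listening on plain HTTP
            Port    80
        End
    End
End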