NGINX – Determine NGINX Reverse-Proxy Load Limits

linux nginx reverse-proxy

I have an nginx server (CentOS 5.3, Linux) that I'm using as a reverse-proxy load balancer in front of 8 Ruby on Rails application servers. As our load increases, I'm beginning to wonder at what point the nginx server will become a bottleneck. The CPUs are hardly used, but that's to be expected. The memory seems fine, and there is no IO to speak of.

So is my only limitation bandwidth on the NICs? Currently, according to some Cacti graphs, the server is hitting around 700 Kbps (5-minute average) on each NIC during high load. I would think this is still pretty low.

Or, will the limit be in sockets or some other resource in the operating system?

Thanks for any thoughts and insights.

Edit:
racyclist:

Thank you for your insights. I have done a little more digging. I have 1 worker allowing 1024 worker_connections. Let's assume that 95% of the requests are for small amounts of data. Any recommendations on what a system with 512MB of RAM should be able to handle, connection-wise?

Also, what's a good way to count connections? Would something like this be accurate?:

netstat -np | grep ESTABLISHED | grep nginx | wc -l

End Edit

Aaron

Best Answer

Currently you have pretty low load, judging by the bandwidth utilization. There are a lot of possible bottlenecks; to name a few:

Network related

As the number of connections grows, you can hit the worker_connections limit of an Nginx worker process. racyclist's description is pretty good; I'll just add a few cents to it. Actually, the more workers you have, the more likely you are to hit the worker_connections limit of one particular worker. The reason is that the Nginx master process cannot guarantee an even distribution of connections between the workers -- some of them process requests faster than others, so an individual worker's limit can eventually be exceeded.

My advice is to use as few workers as possible with a large number of worker_connections. However, you will have to increase the number of workers if you have IO (see below). Use nginx's status module to watch the number of sockets it uses.
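A minimal sketch of that setup (assuming nginx was built with the stub_status module; the numbers are illustrative, not tuned recommendations):

worker_processes 1;

events {
    # maximum simultaneous connections per worker, including connections to the backends
    worker_connections 4096;
}

http {
    server {
        listen 80;

        # status page for watching active connections
        location /nginx_status {
            stub_status on;
            allow 127.0.0.1;
            deny all;
        }
    }
}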

You will likely hit the OS (Linux or FreeBSD) limit on the number of open file descriptors per process. Nginx uses descriptors not only for incoming requests, but for outgoing connections to backends as well. By default this limit is set to a very low value (e.g. 1024), and Nginx will complain about it in its error.log.
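You can check the limit nginx inherits and raise it just for its workers, roughly like this (a sketch; 8192 is only an illustrative value):

# shell: show the descriptor limit processes started from this shell inherit
ulimit -n

# nginx.conf (top level): let the workers raise their own descriptor limit
worker_rlimit_nofile 8192;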

If you are using iptables and its conntrack module (Linux), you may exceed the size of the conntrack table as well. Watch dmesg or /var/log/messages for "table full" messages, and increase the limit as necessary.
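For example (the sysctl names vary with kernel version; the older kernels in CentOS 5 use the ip_conntrack names shown here, newer kernels use net.netfilter.nf_conntrack_*):

# current usage versus the limit
sysctl net.ipv4.netfilter.ip_conntrack_count
sysctl net.ipv4.netfilter.ip_conntrack_max

# raise the limit at runtime (make it permanent in /etc/sysctl.conf)
sysctl -w net.ipv4.netfilter.ip_conntrack_max=131072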

Some very well-optimized applications do saturate 100% of the available bandwidth, but my bet is that you will run into the problem(s) above before that happens.

IO related

In fact, an Nginx worker blocks on disk IO. So if your site is serving static content, you will need to increase the number of Nginx workers to compensate for that blocking. It's hard to give recipes here, as they vary a lot depending on the number and size of files, the type of load, available memory, etc.
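As an illustration only (the worker count is a guess, not a recommendation), a static-heavy configuration might look like:

# more workers, so one worker blocked on disk IO does not stall everything
worker_processes 4;

http {
    # let the kernel copy file data straight to the socket, skipping userspace
    sendfile on;
}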

If you are proxying connections to some backend through Nginx, take into account that it creates temporary files to store the backend's response, and under high traffic this can put a substantial load on the filesystem. Watch for messages in Nginx's error.log and tune proxy_buffers (or fastcgi_buffers) accordingly.
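A sketch of that tuning (the upstream name rails_backends is made up for the example; buffer sizes must be matched to your typical response size):

location / {
    proxy_pass http://rails_backends;

    # keep typical backend responses in memory instead of spilling them to temp files
    proxy_buffer_size 8k;
    proxy_buffers 32 8k;
}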

If you have some background IO (e.g. MySQL), it will affect static file serving as well. Watch the IO wait percentage.
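For example, with the standard tools (iostat comes from the sysstat package):

# the "wa" column is the IO wait percentage
vmstat 5

# per-device utilization and wait times
iostat -x 5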