NGINX – Determine NGINX Reverse-Proxy Load Limits

linux nginx reverse-proxy

I have an nginx server (CentOS 5.3, Linux) that I'm using as a reverse-proxy load balancer in front of 8 Ruby on Rails application servers. As our load increases, I'm beginning to wonder at what point the nginx server will become a bottleneck. The CPUs are hardly used, but that's to be expected. The memory seems fine, and there is no IO to speak of.

So is my only limitation bandwidth on the NICs? Currently, according to some Cacti graphs, the server is hitting around 700 Kbps (5-minute average) on each NIC during high load. I would think this is still pretty low.

Or, will the limit be in sockets or some other resource in the operating system?

Thanks for any thoughts and insights.

Edit:
racyclist:

Thank you for your insights. I have done a little more digging. I have 1 worker allowing 1024 worker_connections. Let's assume that 95% of the requests are for small amounts of data. Any recommendations on what a system with 512MB of RAM should be able to handle, connection-wise?

Also, what's a good way to count connections? Would something like this be accurate?:

netstat -np | grep ESTABLISHED | grep nginx | wc -l

End Edit

Aaron

Best Answer

Currently you have pretty low load, judging by the bandwidth utilization. There are a lot of possible bottlenecks; to name a few:

Network related

As the number of connections grows, you can hit the worker_connections limit of an Nginx worker process. racyclist's description is pretty good; I'll just add a few cents to it. Actually, the more workers you have, the more likely you are to hit the worker_connections limit of one particular worker. The reason is that the Nginx master process cannot guarantee an even distribution of connections between the workers -- some of them process requests faster than others, so an individual worker's limit can eventually be exceeded.

My advice is to use as few workers as possible with a large number of worker_connections. However, you will have to increase the number of workers if you have IO (see below). Use nginx's status module to watch the number of sockets it uses.
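A minimal sketch of that setup (assuming nginx was built with the stub_status module; the numbers are illustrative, not tuned recommendations):

worker_processes 1;

events {
    # maximum simultaneous connections per worker, including connections to the backends
    worker_connections 4096;
}

http {
    server {
        listen 80;

        # status page for watching active connections
        location /nginx_status {
            stub_status on;
            allow 127.0.0.1;
            deny all;
        }
    }
}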

You will likely hit the OS (Linux or FreeBSD) limit on the number of open file descriptors per process. Nginx uses descriptors not only for incoming requests, but for outgoing connections to backends as well. By default this limit is set to a very low value (e.g. 1024), and Nginx will complain about it in its error.log.
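You can check the limit nginx inherits and raise it just for its workers, roughly like this (a sketch; 8192 is only an illustrative value):

# shell: show the descriptor limit processes started from this shell inherit
ulimit -n

# nginx.conf (top level): let the workers raise their own descriptor limit
worker_rlimit_nofile 8192;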

If you are using iptables and its conntrack module (Linux), you may exceed the size of the conntrack table as well. Watch dmesg or /var/log/messages for "table full" messages, and increase the limit as necessary.
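For example (the sysctl names vary with kernel version; the older kernels in CentOS 5 use the ip_conntrack names shown here, newer kernels use net.netfilter.nf_conntrack_*):

# current usage versus the limit
sysctl net.ipv4.netfilter.ip_conntrack_count
sysctl net.ipv4.netfilter.ip_conntrack_max

# raise the limit at runtime (make it permanent in /etc/sysctl.conf)
sysctl -w net.ipv4.netfilter.ip_conntrack_max=131072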

Some very well-optimized applications do saturate 100% of the available bandwidth, but my bet is that you will run into the problem(s) above before that happens.

IO related

In fact, an Nginx worker blocks on disk IO. So if your site is serving static content, you will need to increase the number of Nginx workers to compensate for that blocking. It's hard to give recipes here, as they vary a lot depending on the number and size of files, the type of load, available memory, etc.
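As an illustration only (the worker count is a guess, not a recommendation), a static-heavy configuration might look like:

# more workers, so one worker blocked on disk IO does not stall everything
worker_processes 4;

http {
    # let the kernel copy file data straight to the socket, skipping userspace
    sendfile on;
}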

If you are proxying connections to some backend through Nginx, take into account that it creates temporary files to store the backend's response, and under high traffic this can put a substantial load on the filesystem. Watch for messages in Nginx's error.log and tune proxy_buffers (or fastcgi_buffers) accordingly.
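A sketch of that tuning (the upstream name rails_backends is made up for the example; buffer sizes must be matched to your typical response size):

location / {
    proxy_pass http://rails_backends;

    # keep typical backend responses in memory instead of spilling them to temp files
    proxy_buffer_size 8k;
    proxy_buffers 32 8k;
}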

If you have some background IO (e.g. MySQL), it will affect static file serving as well. Watch the IO wait percentage.
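For example, with the standard tools (iostat comes from the sysstat package):

# the "wa" column is the IO wait percentage
vmstat 5

# per-device utilization and wait times
iostat -x 5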