Nginx – Tips for maximizing Nginx requests/sec

high-volume · nginx · redis · scaling

I'm building an analytics package, and project requirements state that I need to support 1 billion hits per day. Yep, "billion". In other words, no less than 12,000 hits per second sustained, and preferably some room to burst. I know I'll need multiple servers for this, but I'm trying to get maximum performance out of each node before "throwing more hardware at it".

Right now, I have the hit-tracking portion completed and well optimized. I pretty much just save the requests straight into Redis (for later processing with Hadoop). The application is Python/Django, with gunicorn as the application server.
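The post doesn't include the tracking code itself, but the pattern described — accept the request, stash its details, answer immediately — can be sketched as a minimal WSGI app. All names here are hypothetical, and the in-memory deque stands in for the Redis list (a real deployment would do something like redis-py's `rpush` instead) so the sketch runs standalone:

```python
# Sketch of the hit-tracking pattern described above (hypothetical names;
# the poster's actual view isn't shown). HIT_QUEUE stands in for a Redis
# list so this runs without a Redis server.
import json
import time
from collections import deque

HIT_QUEUE = deque()  # real version: redis_client.rpush("hits", payload)

# A tiny 1x1 GIF payload, so the tracker can be embedded as an <img> tag.
PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00"
         b"!\xf9\x04\x01\x00\x00\x00\x00"
         b",\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02D\x01\x00;")

def tracking_pixel(environ, start_response):
    """WSGI app: record the request details, return the pixel immediately."""
    hit = json.dumps({
        "ts": time.time(),
        "path": environ.get("PATH_INFO", ""),
        "qs": environ.get("QUERY_STRING", ""),
        "ua": environ.get("HTTP_USER_AGENT", ""),
        "ip": environ.get("REMOTE_ADDR", ""),
    })
    HIT_QUEUE.append(hit)  # the only "work" per request: one queue push
    start_response("200 OK", [("Content-Type", "image/gif"),
                              ("Content-Length", str(len(PIXEL)))])
    return [PIXEL]
```

The point of the design is that the request path does no disk I/O and no aggregation — just one append per hit — which matches the "only 2x slower than a static file" benchmark below.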

My 2GB Ubuntu 10.04 Rackspace server (not a production machine) can serve about 1,200 static files per second (benchmarked with ApacheBench against a single static asset). For comparison, if I swap the static file URL for my tracking URL, I still get about 600 requests per second — I take that to mean my tracker is well optimized, since it's only a factor of 2 slower than serving the same static asset repeatedly.

However, when I benchmark with millions of hits, I notice a few things —

  1. No disk usage — this is expected, because I've turned off all Nginx logs, and my custom code doesn't do anything but save the request details into Redis.
  2. Non-constant memory usage — presumably due to Redis's memory management, my memory usage gradually climbs and then drops back down, but it has never once been my bottleneck.
  3. System load hovers around 2-4, the system is still responsive during even my heaviest benchmarks, and I can still manually view http://mysite.com/tracking/pixel with little visible delay while my (other) server performs 600 requests per second.
  4. If I run a short test, say 50,000 hits (takes about 2m), I get a steady, reliable 600 requests per second. If I run a longer test (tried up to 3.5m so far), my r/s degrades to about 250.

My questions —

a. Does it look like I'm maxing out this server yet? Is 1,200 static files/s in line with the nginx performance others have seen?

b. Are there common nginx tunings for such high-volume applications? I have nginx worker_processes set to 64 and gunicorn workers set to 8, but tweaking these values doesn't seem to help or hurt much.

c. Are there any linux-level settings that could be limiting my incoming connections?

d. What could cause my performance to degrade to 250 r/s on long-running tests? Again, memory is not maxing out during these tests, and disk use is nil.

Thanks in advance, all 🙂

EDIT
Here is my nginx config — http://pastie.org/1450749 — it's mostly vanilla, with obvious fat trimmed out.

Best Answer

You're abusing nginx's worker_processes. There is absolutely no need to run that many workers. Run as many workers as you have CPUs and call it a day. If you're running gunicorn on the same server, you should probably limit nginx to two workers. Otherwise, you're just going to thrash the CPUs with all the context switching required to schedule all of those processes.
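As an nginx.conf fragment, the advice above might look like this — values are illustrative for a 4-core box shared with gunicorn, not taken from the poster's actual config:

```nginx
# Illustrative fragment, not the poster's config.
worker_processes  2;           # roughly the CPUs left for nginx, not 64

events {
    worker_connections  4096;  # each worker multiplexes many connections
}

http {
    access_log  off;           # the poster already disables logging
    keepalive_timeout  5;
}
```

A small number of workers is enough because each nginx worker is event-driven and handles thousands of concurrent connections on its own; extra workers beyond the CPU count only add scheduling overhead.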