Nginx – Tuning a high-traffic nginx and WordPress server

Tags: load-testing, nginx, PHP, php-fpm, WordPress

I have been conducting load tests (via blitz.io) as I attempt to tune server performance on a pool of servers running PHP 5.5, WordPress 3.9.1, and nginx 1.6.2.

My confusion arises when I overload a single server with too much traffic. I fully realize that a server has finite resources and that at some point it will have to begin rejecting connections and/or returning 502 (or similar) responses. What confuses me, though, is why my server appears to return 502s so early in a load test.

I have attempted to tune nginx to accept a large number of connections:

nginx.conf

worker_processes auto;
worker_rlimit_nofile 100000;

events {
    worker_connections 1024;
    use epoll;
    multi_accept on;
}

site.conf

location ~ \.php$ {
    try_files $uri =404;
    include /etc/nginx/fastcgi_params;
    fastcgi_pass unix:/var/run/php5-fpm.sock;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_read_timeout 60s;
    fastcgi_send_timeout 60s;
    fastcgi_next_upstream_timeout 0;
    fastcgi_connect_timeout 60s;
}

php www.conf

pm = static
pm.max_children = 8

I expect the load test to saturate the PHP workers rather quickly. But I also expect nginx to continue accepting connections and, after the fastcgi timeouts are hit, to begin returning some sort of HTTP error code.

What I'm actually seeing is nginx returning 502s almost immediately after the test is launched.

nginx error.log

2014/11/01 20:35:24 [error] 16688#0: *25837 connect() to unix:/var/run/php5-fpm.sock failed 
(11: Resource temporarily unavailable) while connecting to upstream, client: OBFUSCATED, 
server: OBFUSCATED, request: "GET /?bust=1 HTTP/1.1", upstream: 
"fastcgi://unix:/var/run/php5-fpm.sock:", host: "OBFUSCATED"

What am I missing? Why aren't the pending requests being queued up, and then either completing or timing out later in the process?

Best Answer

This means the PHP side crashed and is no longer listening on the Unix socket.

So nginx won't queue anything, since it simply can't contact the proxied server to send the request to; at that point, you can easily imagine that requests get "processed" very fast on nginx's side.

If your PHP server hadn't crashed, requests would indeed wait, governed by the fastcgi_connect_timeout and fastcgi_read_timeout values, until some event showed up. If those timeouts were reached, you would see 504 error codes instead.

Incidentally, your worker_connections value looks a bit low compared to your worker_rlimit_nofile.
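
Something closer in scale would look like the sketch below; the 8192 figure is illustrative, not a recommendation for your exact hardware:

```nginx
# nginx.conf (sketch) -- worker_connections sized closer to the
# per-worker file-descriptor limit; 8192 is an illustrative value.
worker_rlimit_nofile 100000;

events {
    worker_connections 8192;
    use epoll;
    multi_accept on;
}
```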

It may also be time to start using an upstream block, with health checks, to decide how nginx should behave when a target server seems down. With this you can manage how many failures, within what delay, will mark a server as down. Once considered down, requests won't reach it until the health-check condition to mark it up again passes.
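
In stock (open-source) nginx this is done with passive health checking via max_fails and fail_timeout on an upstream server; the values and the upstream name below are illustrative, and active health checks would require a third-party module or the commercial version:

```nginx
# Sketch: passive health checking with an upstream block.
upstream php_backend {
    # After 3 failed attempts within 30s, the server is marked down
    # for 30s and requests to it fail fast instead of piling up.
    server unix:/var/run/php5-fpm.sock max_fails=3 fail_timeout=30s;
}

# Then in site.conf, point fastcgi_pass at the named upstream:
#     fastcgi_pass php_backend;
```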
