Linux – php5-fpm maxes out all cores (nginx+php5fpm+wordpress)

linuxnginxphp-fpmvpsWordpress

We have a 16 gig linode running with 8 cpu cores and I'm trying to debug an issue that when I try a load test that the server starts to freak out around 250 requests a second on a pretty basic wordpress site. Im only hitting the front page of the site fyi.

When I log in to the server and view the server stats on Htop I see all cores pegged and a crap load of php5-fpm processes. After the test is done, these processes still exists and I eventually have to restart the php-fpm in order to bring the server back to live.

I saw this in the log today when doing the load test.

[23-Dec-2013 12:19:03] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 27 total children
[23-Dec-2013 12:19:04] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 0 idle, and 35 total children
[23-Dec-2013 12:19:05] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 45 total children
[23-Dec-2013 12:19:06] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 55 total children
[23-Dec-2013 12:19:07] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 65 total children
[23-Dec-2013 12:19:08] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 75 total children
[23-Dec-2013 12:19:19] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 5 idle, and 93 total children

So obviously something is fubar, I'm just no expert on tweaking these settings for the current system we have.

Here is our www.conf

; Start a new pool named 'www'.
; the variable $pool can we used in any directive and will be replaced by the
; pool name ('www' here)
[www]

; Unix user/group of processes
; Note: The user is mandatory. If the group is not set, the default user's group
;       will be used.
user = www-data
group = www-data

; The address on which to accept FastCGI requests.
; Valid syntaxes are:
;   'ip.add.re.ss:port'    - to listen on a TCP socket to a specific address on
;                            a specific port;
;   'port'                 - to listen on a TCP socket to all addresses on a
;                            specific port;
;   '/path/to/unix/socket' - to listen on a unix socket.
; Note: This value is mandatory.
listen = /var/run/php5-fpm.sock

; Set listen(2) backlog. A value of '-1' means unlimited.
; Default Value: 128 (-1 on FreeBSD and OpenBSD)
listen.backlog = 65536

pm = dynamic

; The number of child processes to be created when pm is set to 'static' and the
; maximum number of child processes when pm is set to 'dynamic' or 'ondemand'.
; This value sets the limit on the number of simultaneous requests that will be
; served. Equivalent to the ApacheMaxClients directive with mpm_prefork.
; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP
; CGI. The below defaults are based on a server without much resources. Don't
; forget to tweak pm.* to fit your needs.
; Note: Used when pm is set to 'static', 'dynamic' or 'ondemand'
; Note: This value is mandatory.
pm.max_children = 150
; The number of child processes created on startup.
; Note: Used only when pm is set to 'dynamic'
; Default Value: min_spare_servers + (max_spare_servers - min_spare_servers) / 2
pm.start_servers = 20
; The desired minimum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.min_spare_servers = 10
; The desired maximum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.max_spare_servers = 35
; The number of seconds after which an idle process will be killed.
; Note: Used only when pm is set to 'ondemand'
; Default Value: 10s
;pm.process_idle_timeout = 10s;

; The number of requests each child process should execute before respawning.
; This can be useful to work around memory leaks in 3rd party libraries. For
; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS.
; Default Value: 0
pm.max_requests = 500

pm.status_path = /status-www

; The log file for slow requests
; Default Value: not set
; Note: slowlog is mandatory if request_slowlog_timeout is set
;slowlog = log/$pool.log.slow
slowlog = /var/log/php-fpm-slow.log

; The timeout for serving a single request after which a PHP backtrace will be
; dumped to the 'slowlog' file. A value of '0s' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
request_slowlog_timeout = 115s

; The timeout for serving a single request after which the worker process will
; be killed. This option should be used when the 'max_execution_time' ini option
; does not stop script execution for some reason. A value of '0' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
request_terminate_timeout = 120s

; Set open file descriptor rlimit.
; Default Value: system defined value
rlimit_files = 65536

; Set max core size rlimit.
; Possible Values: 'unlimited' or an integer greater or equal to 0
; Default Value: system defined value
rlimit_core = 0

Any advice or something to point out with the www.conf ?

Best Answer

First, you really should ask yourself the question: Is 250 req / second something your site really should be able to handle? That is a LOT of requests for a page. Quite frankly, I think 250req/s maxing your CPU is normal given what you have -- a virtualized system.

pm.max_children = 150

You probably kept on raising this because you keep on getting error saying you're maxed out on children. But you're actually shooting yourself in the foot when you raise this beyond what your system can handle. It's better to NOT accept a next request it can't handle than to accept. Increasing this not only thins out your system, you create more resource requirements as well as more overhead. And when that explodes to a size your system can no longer handle, it creates signs like freezing. I don't know exactly what your VPS is giving you (and you'll never know since it's not static as it depends on other users in the node), but I would personally say that anything above 20 for your specified system is a not realistic setting.

You should also adjust start/min/max spare accordingly as having then greater than max children makes no sense. Or you can just not worry about it and set to pm = static.