Nginx – PHP-FPM using 40% CPU for a Single Request

Tags: amazon ec2, nginx, PHP, php-fpm, WordPress

(I googled and searched this forum for hours, found some topics, but none of them worked for me)

I'm using WordPress with: Varnish + Nginx + PHP-FPM + APC + W3 Total Cache + PageSpeed.

Because I'm using Varnish, the first call to www.mysite.com uses just 10% of CPU, and the second call is served from the cache. The problem appears when I pass a request parameter in the URL.


For just 1 request (www.mysite.com?1=1) it shows in top:

PID  USER      PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
7609 nginx     20   0  438m  41m  28m S 11.6  7.0   0:00.35 php-fpm
7606 nginx     20   0  437m  39m  26m S 10.3  6.7   0:00.31 php-fpm

After the page is fully loaded, the processes above are still active. After 2 seconds they are replaced by another 2 php-fpm processes (below), which stay active for 3 seconds.

PID  USER      PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
7665 nginx     20   0  444m  47m  28m S 20.9  7.9   0:00.69 php-fpm
7668 nginx     20   0  444m  46m  28m R 20.9  7.9   0:00.63 php-fpm

40% CPU usage for just 1 uncached request!

Strange things:

  • CPU usage is higher after the page has loaded
  • When I purge the caches (W3 and Varnish), it takes just 10% of CPU to load an uncached page
  • This high CPU usage only happens when passing a request parameter or in the WordPress admin

When I make 10 requests (pressing F5 10 times), the server stops serving and the php-fpm log shows:

WARNING: [pool www] server reached max_children setting (10), consider raising it

I raised that value to 20, same problem.

I'm using pm=ondemand (pm.max_children=10 and pm.max_requests=500).

Initially I was using pm=dynamic (pm.max_children=10, pm.start_servers=1, pm.min_spare_servers=1, pm.max_spare_servers=2, pm.max_requests=500) and the same problem occurred.
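
As a rough sanity check on these pool sizes, the sketch below estimates how many PHP-FPM children could fit in RAM. The RES and SHR figures are copied from the top output above. Treating RES minus SHR as each child's private memory, and reserving 300 MB for the OS, Varnish, Nginx and MySQL, are assumptions made for illustration, not measurements.

```python
# (RES, SHR) in MB for each php-fpm child, copied from the `top` output above.
TOP_SAMPLES_MB = [(41, 28), (39, 26), (47, 28), (46, 28)]

TOTAL_RAM_MB = 613   # Amazon Micro instance, as stated in the post
RESERVED_MB = 300    # ASSUMPTION: OS + Varnish + Nginx + MySQL; measure on the server


def private_mb(res, shr):
    """Approximate a child's private memory as RES minus SHR (an assumption)."""
    return res - shr


def estimate_max_children(total_mb, reserved_mb, samples):
    """Return how many children fit if each needs the worst-case private memory."""
    worst_private = max(private_mb(res, shr) for res, shr in samples)
    worst_shared = max(shr for _, shr in samples)  # shared pages are counted once
    available = total_mb - reserved_mb - worst_shared
    return max(available // worst_private, 0)


if __name__ == "__main__":
    ceiling = estimate_max_children(TOTAL_RAM_MB, RESERVED_MB, TOP_SAMPLES_MB)
    print(f"Estimated pm.max_children ceiling: {ceiling}")
```

With these assumed numbers the ceiling comes out at 15, so a pm.max_children of 20 could push the instance toward swapping instead of relieving the bottleneck. The reserve figure dominates the result and should be replaced with a measured value.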

Could anyone help, please? Any help would be appreciated!

PS:

  • APC is ON (98% Hits, 2% Misses)
  • Server is Amazon Micro (613MB RAM)
  • PHP 5.3.26 (fpm-fcgi)
  • Linux version 3.4.48-45.46.amzn1.x86_64 Red Hat 4.6.3-2 (I think it's based on CentOS 5)

Best Answer

It's hard to debug where the problem is coming from.

I'd say slim your setup down.

You are using: Varnish + Nginx + PHP-FPM + APC + W3 Total Cache + PageSpeed

Why do you need Varnish? nginx can also cache pages itself. Take a look at fastcgi_cache.

PHP-FPM and APC should be fine; just consider giving APC enough memory so that all files can be cached without memory problems and fragmentation.

Why do you need W3 Total Cache? Depending on the configuration options this can hog a lot of CPU e.g. for minifying code or caching pages or database calls to disk...

The same with mod_pagespeed - It's a wrapper that processes your output files and also adds complexity that uses CPU cycles.

So - If you want a faster website I'd say untangle that mess and simplify it:

  • Get rid of Varnish if you don't have a strong use case for it. nginx can do caching just fine: configure it to use fastcgi_cache, and use a socket to talk to PHP-FPM.

  • Get rid of W3TC: Use memcached and the memcache object caching plugin. This is your DB cache and object cache. For caching complete pages, just use nginx, or Varnish if you must. You can skip configuring full-page caching in nginx or Varnish if you use batcache to cache whole pages in memcached. Also try to use sockets for memcached.

  • Get rid of mod_pagespeed. Read into what optimisation it does for you and try to apply these on your blog theme or images by hand. If you are using gzip in nginx most of the stuff shouldn't be important anyway.

  • Enable the MySQL query cache and look for performance optimized MySQL settings. If you have a lot of writes (e.g. lots of comments) consider using InnoDB.

  • Use PHP 5.4 or even PHP 5.5 - Lots of performance and memory improvements went into these releases that should give you some speedup and memory savings.

More advanced approaches:

Take a look at the xdebug profiler. This should give you a rundown of which functions consume a lot of CPU. The page gives some details on how to look at the generated data using kcachegrind.

You could try to look at the number of syscalls using strace on the process tree. You'll need the -f flag for this, and printing only statistics with -c should probably be enough to learn about a possible problem.
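
The strace suggestion could be sketched roughly as follows. The helper names, the output file name and the 10-second window are hypothetical choices for illustration. The flags used are standard strace options: -f follows forks, -c prints only a count summary, -o writes to a file, and -p attaches to a running process. Attaching usually requires root.

```python
import signal
import subprocess
import time


def build_strace_command(pids, output_path="strace-summary.txt"):
    """Build an strace call that follows forks (-f) and prints only counts (-c)."""
    cmd = ["strace", "-f", "-c", "-o", output_path]
    for pid in pids:
        cmd += ["-p", str(pid)]  # one -p per process to attach to
    return cmd


def collect_syscall_summary(pids, seconds=10, output_path="strace-summary.txt"):
    """Attach for a fixed time, then interrupt strace so it writes its summary."""
    proc = subprocess.Popen(build_strace_command(pids, output_path))
    time.sleep(seconds)
    if proc.poll() is None:              # strace is still attached
        proc.send_signal(signal.SIGINT)  # same effect as pressing Ctrl-C
    proc.wait()
    with open(output_path) as fh:
        return fh.read()


# Example call (PIDs from the question's top output; use live ones instead):
# print(collect_syscall_summary([7609, 7606]))
```

Reload the slow URL while the trace is running so the summary covers the uncached request. The summary table is only written when strace is interrupted cleanly, which is why the sketch sends SIGINT and does not kill the process.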

I'd say apply the KISS principle and only make use of performance or tuning stuff if you have a clear use case for it and the tools show an improvement using profiling.