Linux – Apache unresponsive / stalling occasionally

Tags: apache-2.4, linux, mysql, php

I'm running Apache HTTP Server 2.4 with PHP 7.0 and MySQL 5.5 on Debian GNU/Linux 8 (jessie). Occasionally, Apache becomes completely unresponsive for about 30 seconds or more. During this time, requests seem to queue up, and when Apache finally starts working again, all the requests that have piled up have to be processed at once, which causes problems of its own.

The reason for Apache to become unresponsive is unclear, because:

  • CPU load drops to almost nothing; neither Apache nor MySQL nor anything else uses a notable amount of CPU
  • There is no error in the Apache error_log
  • There is no blocking query in MySQL – nothing is shown when I run "SHOW PROCESSLIST"
  • Once a second, an "internal dummy connection" is visible in access_log
  • The overall load on the server does not have to be high for this to occur; it can happen even when the load is below average and few users are logged in to our system
  • Even a PHP script containing only echo "Hello World!"; is not executed
  • No MySQL error is thrown in PHP, and I can execute MySQL statements from the MySQL console without trouble
  • RAM seems to be OK – the swap partition is not used much. This is what top says during the stall:

    KiB Mem:   6129344 total,  5975748 used,   153596 free,       24 buffers
    KiB Swap:  1952764 total,   199428 used,  1753336 free.  4397256 cached Mem
    

I tried to analyze the problem using strace. To be precise, when I notice that the server has become unresponsive, I run the following in the shell:

ps auxw | grep apache | awk '{print" -p " $2}' | xargs sudo strace

What I observed is that, during a stall, lines like the following appear quite often in the strace output, whereas they do not appear when the problem is absent:

[pid 13521] fcntl(57, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1}) = -1 EAGAIN (Resource temporarily unavailable)

Normally, when there is no problem, I can see lines like the following:

[pid  3414] fcntl(55, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1}) = 0

Does anyone know what this means? It seems to me there is a locking conflict of some kind …

Just for completeness, here is my Apache configuration:

 LogFormat "%h PID %P %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\" %V" common
ServerTokens ProductOnly
ServerSignature Off
TraceEnable off

<IfModule mod_ssl.c>
    SSLHonorCipherOrder On
    SSLProtocol ALL -SSLv2 -SSLv3
    SSLCipherSuite EECDH+AES:AES256-SHA:AES128-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH:!EXP:!SRP:!DSS:!LOW;
    SSLVerifyClient none
    SSLVerifyDepth 1
    SSLInsecureRenegotiation Off
</IfModule>

ScriptAlias /cgi-bin52/ /usr/share/phpcgi/php52/
ScriptAlias /cgi-bin53/ /usr/share/phpcgi/php53/
ScriptAlias /cgi-bin54/ /usr/share/phpcgi/php54/
ScriptAlias /cgi-bin55/ /usr/share/phpcgi/php55/
ScriptAlias /cgi-bin56/ /usr/share/phpcgi/php56/
ScriptAlias /cgi-bin70/ /usr/share/phpcgi/php70/

Mutex flock

LoadModule deflate_module /usr/lib/apache2/modules/mod_deflate.so
LoadModule status_module /usr/lib/apache2/modules/mod_status.so

AcceptFilter http none
AcceptFilter https none

ExtendedStatus on
TimeOut 60
KeepAlive Off
MaxKeepAliveRequests 50
KeepAliveTimeout 2
Options Indexes MultiViews FollowSymLinks
MaxRequestWorkers 256
MaxRequestsPerChild 300

As you can see, there is already an entry related to locking behavior: Mutex flock. It was preconfigured by my web host, for stability reasons according to them. Also, judging by https://httpd.apache.org/docs/2.4/mod/core.html#mutex , this seems to be one of the few mechanisms that has no known issues.

I added the AcceptFilter entries while trying to find a solution, but without success.

Can anybody explain what the fcntl line I captured with strace means, or suggest another way to analyze the problem?

Best Answer

The strace output shows which file descriptor number the lock attempt fails on: it is the first argument of fcntl (57 in your EAGAIN line). If you run ls -l /proc/$pid/fd you will see all open files of that process; the symlink named after the descriptor number points to the file in question.
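
The lookup described above can be sketched in shell. This is a minimal sketch, not a polished tool: the helper names blocked_fds and fd_target and the file /tmp/apache.strace are made up for illustration, and it assumes the "[pid N] fcntl(...)" line format shown in the question, which is what strace prints on stderr when attached to several processes.

```shell
# Print "PID FD" for every fcntl() lock attempt in strace output that failed with EAGAIN.
# blocked_fds is a hypothetical helper name; it reads strace lines on stdin.
blocked_fds() {
    sed -n 's/^\[pid *\([0-9][0-9]*\)] fcntl(\([0-9][0-9]*\),.*EAGAIN.*/\1 \2/p' | sort -u
}

# Resolve one PID/FD pair to the file the descriptor points at.
# Reading /proc/PID/fd of another user's process needs root.
fd_target() {
    readlink "/proc/$1/fd/$2"
}

# Usage sketch (assumes strace's stderr was redirected to /tmp/apache.strace
# and that the traced processes are still alive):
#   blocked_fds < /tmp/apache.strace | while read -r pid fd; do
#       printf '%s fd %s -> %s\n' "$pid" "$fd" "$(sudo readlink "/proc/$pid/fd/$fd")"
#   done
```

For the EAGAIN line in the question, blocked_fds yields the pair 13521 57, so the file to inspect is whatever /proc/13521/fd/57 points to. The lookup has to happen while that process is still running, because the /proc entry disappears when the process exits.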

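I have seen such issues before; usually it is the PHP session file that is locked. With file-based sessions, PHP keeps the session file locked until the script ends or calls session_write_close(), so other requests using the same session have to wait. If it is the session file, make sure your developers understand what session_write_close() does in PHP, and what omitting it will do to your performance under load.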
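
To illustrate why a lock that is held for a long time stalls other requests, here is a small shell sketch. It does not touch real PHP sessions: it assumes file-based sessions protected by an exclusive lock and imitates that with flock(1) from util-linux on a temporary file. The function name session_lock_demo is made up for this example.

```shell
# Sketch: two "requests" competing for one lock file, using flock(1) from util-linux.
# session_lock_demo and the temp file are illustrative; no real PHP session is involved.
session_lock_demo() {
    lockfile=$(mktemp)

    # Request 1: takes an exclusive lock and holds it for 3 seconds,
    # like a script that keeps its session open while doing slow work.
    flock -x "$lockfile" sleep 3 &
    holder=$!
    sleep 1   # give request 1 time to acquire the lock

    # Request 2: non-blocking attempt on the same file while request 1 still runs.
    if flock -n -x "$lockfile" true; then
        echo "during request 1: lock free"
    else
        echo "during request 1: lock busy"
    fi

    # When request 1 ends, the lock is released. In PHP, calling
    # session_write_close() early releases it before the script ends.
    wait "$holder"
    if flock -n -x "$lockfile" true; then
        echo "after request 1: lock free"
    else
        echo "after request 1: lock busy"
    fi

    rm -f "$lockfile"
}
```

Calling session_lock_demo should report the lock as busy while the first holder is running and as free once it has finished. A real second request would usually block on the lock rather than give up, which is what shows up as a stall.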
