Linux – getting a Sudden Drop in throughput and sluggishness with no CPU increase

apache-2.2 linux performance-tuning

Occasionally, at random times of the day, I get a 10-minute period of extreme sluggishness where my requests take 50-1000 times longer than they normally do. Note: I am on Apache/2.2.16 (Debian), running PHP 5.3.3.

New Relic shows that the time is not spent in the database; it is supposedly spent while PHP is executing, before the first line of code (according to some traces). At the same time, I see a huge drop in throughput, to nearly 1/3 of the normal amount.

When I look at the graphs, I can see that CPU, memory, disk I/O, and CPU I/O wait are all at steady levels: no spikes at all. I don't see any error messages in the error log for PHP or the web server during that time. The server has more than enough memory; according to New Relic it's only using about 25%. Total memory is 3.3 GB.

Note: The load average is about 0.25 on two cores, so load is fairly low. I typically get about 1000-1500 requests per minute. Response times are usually 15 ms to 150 ms.

Here are some of my Apache configs:

<IfModule mpm_worker_module>
    StartServers          2
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadLimit          64
    ThreadsPerChild      25
    MaxClients          550
    MaxRequestsPerChild   0
</IfModule>

 <IfModule mpm_event_module>
     StartServers          2
     MaxClients          550
     MinSpareThreads      25
     MaxSpareThreads      75
     ThreadLimit          64
     ThreadsPerChild      25
     MaxRequestsPerChild   0
 </IfModule>

MaxClients is set that high because our average memory per process is very low: about 1-4 MB.

The only explanation I can think of is that my host is dropping connectivity or having some other connectivity issue, which wouldn't surprise me, since this host (rimuhosting) has been less than reliable.

Is there any other possible explanation?

Best Answer

Yes, there are several things to check when troubleshooting this kind of performance problem, and most of them can be tuned in the /etc/sysctl.conf file.

Apache and PHP are susceptible to a number of resource-depletion denial-of-service attacks, notably Slowloris, fs.file-max depletion, socket depletion, and ephemeral port depletion.
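To check the file-handle side of this, a minimal sketch that reads /proc/sys/fs/file-nr, which on Linux holds three numbers: allocated handles, allocated-but-unused handles, and the system-wide maximum (fs.file-max). The function name file_handle_usage is made up for illustration.

```shell
# Hypothetical helper: percentage of the system-wide file-handle limit in use.
# $1: allocated handles, $2: allocated-but-unused handles, $3: maximum (fs.file-max)
file_handle_usage() {
    echo $(( ($1 - $2) * 100 / $3 ))
}

# On Linux, read the live values; skip quietly where /proc is unavailable.
if [ -r /proc/sys/fs/file-nr ]; then
    read -r allocated unused max < /proc/sys/fs/file-nr
    echo "file handles in use: $(file_handle_usage "$allocated" "$unused" "$max")% of $max"
fi
```

If the percentage climbs toward 100 during a slow period, file-handle depletion becomes a plausible suspect; if it stays low, it can be ruled out.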

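Check whether changing the ephemeral port range, e.g. sysctl -w net.ipv4.ip_local_port_range="1024 8048", has any effect during your rush-hour period. That setting tells the OS which local ports to use for connections the server itself opens (for example from PHP to a database or another backend), and if your server is getting hammered at some point, you could be running into socket depletion. Note that 1024 to 8048 is narrower than the usual default (typically 32768 to 61000), so compare against your current value first; a wider range such as "1024 65535" gives the OS more ports to work with.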
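To see how many ports a given range actually provides, here is a small sketch. The helper name port_range_size is hypothetical; the fallback values are simply the range quoted above.

```shell
# Hypothetical helper: number of ports in an inclusive range "low high".
port_range_size() {
    echo $(( $2 - $1 + 1 ))
}

# Read the live range on Linux; fall back to the range from the answer
# if /proc is not available.
if [ -r /proc/sys/net/ipv4/ip_local_port_range ]; then
    read -r low high < /proc/sys/net/ipv4/ip_local_port_range
else
    low=1024
    high=8048
fi
echo "ephemeral ports available: $(port_range_size "$low" "$high")"
```

Comparing that number with the TIME_WAIT and established socket counts during a slow period shows how close the server is to running out of ports.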

Also, run netstat -na | egrep -c TIME_WAIT and netstat -na | egrep -c ESTABLISHED to watch for socket usage patterns.
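Instead of one grep per state, the counts can be gathered in a single pass. This is a sketch that assumes the usual netstat -nat column layout, where the state is the last field of each tcp line; count_tcp_states is a made-up name.

```shell
# Hypothetical helper: read netstat-style lines on stdin and print
# "STATE count" for every TCP state seen, sorted by state name.
count_tcp_states() {
    awk '$1 ~ /^tcp/ { n[$NF]++ } END { for (s in n) print s, n[s] }' | sort
}

# Live usage; prints nothing if netstat is not installed.
netstat -nat 2>/dev/null | count_tcp_states
```

A sudden jump in TIME_WAIT or SYN_RECV during the sluggish window, compared with a normal period, would point toward socket or port exhaustion.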

Edit: better than those count commands: watch -n1 'cat /proc/net/sockstat'
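Since the slow periods come at random times, it can help to log these numbers with a timestamp rather than watch them live, so they can be lined up with the monitoring graphs afterwards. A minimal sketch, assuming the usual "TCP: inuse N orphan N tw N alloc N mem N" line in /proc/net/sockstat; sockstat_tcp is a hypothetical helper name.

```shell
# Hypothetical helper: print the in-use and TIME_WAIT TCP socket counts
# from a sockstat-format file (defaults to /proc/net/sockstat).
sockstat_tcp() {
    awk '$1 == "TCP:" {
        # Fields after "TCP:" come in name/value pairs.
        for (i = 2; i < NF; i += 2) v[$i] = $(i + 1)
        print "inuse=" v["inuse"], "tw=" v["tw"]
    }' "${1:-/proc/net/sockstat}"
}

# Emit one timestamped sample; run this from cron or a sleep loop and
# append the output to a log file.
if [ -r /proc/net/sockstat ]; then
    echo "$(date '+%F %T') $(sockstat_tcp)"
fi
```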
