On my dedicated CentOS 5.4 server, I configured apache with about a dozen virtual hosts. I tested a few of 'em and each loaded within about a second; fairly quick. Load average was less than 1. No problems. I'm running static HTML sites and one WordPress blog with MySQL 5.0… these are not high-bandwidth sites; nothing that would stress this server.

Next morning, I get in to work, load up the main site, and it takes 10 to 20 seconds to load. I check the load average on the server and it's hovering around 3, sometimes up to 5, once saw it at 8, never below 2. At this point I gracefully bounce apache:

# apachectl -k graceful

Takes about half a minute, then all is well again. All virtual hosts load fast, less than a second. Load average quickly sinks below 1.

When checking /server-status, not a lot is going on; when checking net traffic (vnstat -l or vnstat -h), not a lot of bandwidth is being used. Both look comparable at the beginning and at the end of the day. Yet when I check in the morning, apache is much, much slower than it is for the rest of the day. What is happening overnight to make apache slow down so much and consume so many more system resources?

# httpd -V
Server version: Apache/2.2.3
# uname -a
Linux 2.6.18-92.el5 #1 SMP Tue Jun 10 18:51:06 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
# free
             total       used       free     shared    buffers     cached
Mem:       1025576    1017292       8284          0       8208      43160
-/+ buffers/cache:     965924      59652
Swap:      2096472     361012    1735460

I suppose I could set up a cronjob which gracefully bounced apache daily, but that seems like a quick-and-dirty solution. I'd rather find the cause and fix that.

UPDATE 2009-10-28 14:38; samples taken every 10 seconds over five minutes with average:

$ sar -W 10 30 && date
Linux 2.6.18-92.el5 (   10/28/2009

02:32:36 PM  pswpin/s pswpout/s
02:32:46 PM     10.31     30.43
02:32:56 PM      2.30     32.93
02:33:06 PM     21.56      0.00
02:33:16 PM      1.80      0.00
02:33:26 PM      5.69     26.67
02:33:36 PM      0.10      0.00
02:33:46 PM     25.70      7.60
02:33:56 PM     10.61      7.11
02:34:06 PM      4.10      2.60
02:34:16 PM      0.70      0.00
02:34:26 PM      0.00      0.00
02:34:36 PM      0.00      0.00
02:34:46 PM      3.80      0.00
02:34:56 PM      0.00      0.00
02:35:06 PM      0.00     11.01
02:35:16 PM      7.70     30.30
02:35:26 PM     20.32      0.00
02:35:36 PM      1.60      0.00
02:35:46 PM     11.60      0.00
02:35:56 PM      2.50      0.00
02:36:06 PM      0.00      0.00
02:36:16 PM      3.60      0.00
02:36:26 PM      0.00      0.00
02:36:36 PM      0.00      0.00
02:36:46 PM      0.00      0.00
02:36:56 PM    445.20     56.60
02:37:06 PM      0.00      0.00
02:37:16 PM      0.00      0.00
02:37:26 PM      0.00      0.00
02:37:36 PM      0.00      0.00
Average:        19.31      6.84
Wed Oct 28 14:37:36 PDT 2009

Curiously, apache is not slow this morning. I made some tweaks to the number of servers started, num spare servers, max number of servers, etc, yesterday. Let me get the old values and compare…

Original values from /etc/httpd/conf/httpd.conf:

StartServers      20
MinSpareServers   20
MaxSpareServers  120
ServerLimit      256
MaxClients       256
MaxRequestsPerChild  4000

New values which, from all appearances, seem to work just fine:

StartServers     30
MinSpareServers  30
MaxSpareServers  40
ServerLimit      50
MaxClients       50
MaxRequestsPerChild  4000

I'll probably continue to tweak these settings a little, but they do seem to work well now.

Sar command again this morning:

$ sar -W 10 30 && date
Linux 2.6.18-92.el5 (   10/29/2009

09:31:09 AM  pswpin/s pswpout/s
09:31:19 AM      5.80     54.40
09:31:29 AM     62.10      0.00
09:31:39 AM      0.00      0.00
09:31:49 AM      0.00      0.00
09:31:59 AM      0.00      0.00
09:32:09 AM      3.30      0.00
09:32:19 AM      2.70      0.00
09:32:29 AM      0.00      0.00
09:32:39 AM      0.00      0.00
09:32:49 AM      0.00      0.00
09:32:59 AM      3.10      0.00
09:33:09 AM      5.80      0.00
09:33:19 AM      0.00      0.00
09:33:29 AM      0.00      0.00
09:33:39 AM      0.00      0.00
09:33:49 AM      0.00      0.00
09:33:59 AM      0.00      0.00
09:34:09 AM      0.00      0.00
09:34:19 AM      0.00      0.00
09:34:29 AM      0.00      0.00
09:34:39 AM      4.00      0.00
09:34:49 AM      0.10      0.00
09:34:59 AM      0.00      0.00
09:35:09 AM      4.80      0.00
09:35:19 AM      0.00      0.00
09:35:29 AM    291.29      0.00
09:35:39 AM      0.00      0.00
09:35:49 AM      0.80      0.00
09:35:59 AM      0.00      0.00
09:36:09 AM      0.00      0.00
Average:        12.78      1.81
Thu Oct 29 09:36:09 PDT 2009

The average is actually lower! And the server got more traffic than yesterday. Womble, it seems you were right! And now all is well in the universe again.

John Gardeniers, good idea! It's got the -o [filename] switch just for that. Thanks for the tip!

Jeremy Visser, dstat is a really sweet tool! Thanks for the tip! It was not installed, had to yum install dstat.

Based on your free output, I strongly suspect that your Apache processes are heavily buried in swap. The output of sar -W 1 0 will confirm (or refute) this hypothesis (run it when the machine is running slow).
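If you'd rather not eyeball a long sar listing, something like the following can count how many samples show real swap activity. This is only a sketch: count_swap_spikes is a made-up helper name, the threshold of 20 pages/s is an arbitrary assumption, and it assumes sar prints a 12-hour clock with an AM/PM column as in the output above.

```shell
# Hypothetical helper: read `sar -W` output on stdin and count the samples
# where pswpin/s or pswpout/s exceeds the given threshold (pages per second).
# Assumes the 12-hour time format, so the columns are: time, AM/PM, in, out.
count_swap_spikes() {
    threshold=$1
    awk -v t="$threshold" '
        # Only data rows: second field is AM/PM and third field is numeric.
        # This skips the banner, the column header and the "Average:" line.
        $2 ~ /^(AM|PM)$/ && $3 ~ /^[0-9.]+$/ {
            if ($3 + 0 > t || $4 + 0 > t) spikes++
        }
        END { print spikes + 0 }
    '
}

# Example usage (threshold is an assumed value, pick your own):
#   sar -W 10 30 | count_swap_spikes 20
```

A handful of spikes on an otherwise idle box is normal; a count close to the number of samples suggests the machine is swapping more or less continuously.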

If the Apache processes aren't all actually serving requests (as shown by mod_status) you should tune the number of "spare" children (with MaxSpareServers) so that they get reaped quicker (and hence don't lie around consuming RAM). If you really do need the number of children you're running to service the request load, you'll need more RAM (I'd go with another 1GB straight up; RAM is cheap, diagnosis time isn't).
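To get a rough ceiling for MaxClients, you can divide the RAM you're willing to give Apache by the average resident size of one child. A minimal sketch follows; both function names are hypothetical, the process name httpd is assumed (it's apache2 on some distributions), and the 600000 kB budget in the usage comment is an illustrative figure rather than a measurement from this server. RSS counts shared pages once per process, so the estimate errs on the low side.

```shell
# Hypothetical helper: average resident set size (kB) of running httpd
# children, as reported by ps. Prints 0 if no httpd process is found.
avg_httpd_rss_kb() {
    ps -C httpd -o rss= |
        awk '{ sum += $1; n++ } END { if (n) printf "%d\n", sum / n; else print 0 }'
}

# Hypothetical helper: how many children of the given average size (kB)
# fit into the given memory budget (kB). Integer division, rounds down.
estimate_max_clients() {
    budget_kb=$1
    avg_rss_kb=$2
    if [ "$avg_rss_kb" -le 0 ]; then
        # No usable measurement, so there is nothing sensible to compute.
        echo 0
        return 1
    fi
    echo $(( budget_kb / avg_rss_kb ))
}

# Example usage (the budget is an assumed value):
#   estimate_max_clients 600000 "$(avg_httpd_rss_kb)"
```

Leave headroom for MySQL and the rest of the system when picking the budget, and treat the result as a starting point to refine while watching sar -W.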