Greetings experts,
On my dedicated CentOS 5.4 server, I configure apache with about a dozen virtual hosts. I test a few of 'em, each loads within about a second; fairly quick. Load average is less than 1. No problems. I'm running static HTML sites, one WordPress blog with MySQL 5.0… these are not high-bandwidth sites; nothing that would stress this server.
Next morning, I get in to work, load up the main site, and it takes 10 to 20 seconds to load. I check the load average on the server and it's hovering around 3, sometimes up to 5, once saw it at 8, never below 2. At this point I gracefully bounce apache:
# apachectl -k graceful
Takes about half a minute, then all is well again. All virtual hosts load fast, less than a second. Load average quickly sinks below 1.
When checking /server-status, not a lot is going on; when checking net traffic (vnstat -l or vnstat -h), not a lot of bandwidth is being used. Both are comparable at the beginning of the day and at the end. Yet when I check in the morning, apache is much, much slower than it is for pretty much the rest of the day. What is happening overnight to make apache slow down so much and consume so many more system resources?
# httpd -V
Server version: Apache/2.2.3
# uname -a
Linux myserver.com 2.6.18-92.el5 #1 SMP Tue Jun 10 18:51:06 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
# free
             total       used       free     shared    buffers     cached
Mem:       1025576    1017292       8284          0       8208      43160
-/+ buffers/cache:     965924      59652
Swap:      2096472     361012    1735460
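For anyone reading the numbers above: the "-/+ buffers/cache" row is derived from the "Mem:" row, and it is the more useful one for judging memory pressure. A minimal sketch of that arithmetic in Python, using only the figures from this free output (the variable names are my own):

```python
# Figures copied from the `free` output above (all in KB).
mem_total, mem_used, mem_free = 1025576, 1017292, 8284
buffers, cached = 8208, 43160
swap_total, swap_used = 2096472, 361012

# Buffers and page cache can be reclaimed by the kernel on demand,
# so they are subtracted from "used" and added to "free".
app_used = mem_used - buffers - cached
reclaimable_free = mem_free + buffers + cached

print(f"used by applications:     {app_used} KB ({app_used / mem_total:.0%} of RAM)")
print(f"free incl. buffers/cache: {reclaimable_free} KB")
print(f"swap in use:              {swap_used} KB of {swap_total} KB")
```

The two derived values match the second row of the output (965924 and 59652), so nearly all physical RAM is held by processes rather than by cache.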
I suppose I could set up a cronjob which gracefully bounced apache daily, but that seems like a quick-and-dirty solution. I'd rather find the cause and fix that.
UPDATE 2009-10-28 14:38; samples taken every 10 seconds over five minutes with average:
$ sar -W 10 30 && date
Linux 2.6.18-92.el5 (myserver.com) 10/28/2009
02:32:36 PM pswpin/s pswpout/s
02:32:46 PM 10.31 30.43
02:32:56 PM 2.30 32.93
02:33:06 PM 21.56 0.00
02:33:16 PM 1.80 0.00
02:33:26 PM 5.69 26.67
02:33:36 PM 0.10 0.00
02:33:46 PM 25.70 7.60
02:33:56 PM 10.61 7.11
02:34:06 PM 4.10 2.60
02:34:16 PM 0.70 0.00
02:34:26 PM 0.00 0.00
02:34:36 PM 0.00 0.00
02:34:46 PM 3.80 0.00
02:34:56 PM 0.00 0.00
02:35:06 PM 0.00 11.01
02:35:16 PM 7.70 30.30
02:35:26 PM 20.32 0.00
02:35:36 PM 1.60 0.00
02:35:46 PM 11.60 0.00
02:35:56 PM 2.50 0.00
02:36:06 PM 0.00 0.00
02:36:16 PM 3.60 0.00
02:36:26 PM 0.00 0.00
02:36:36 PM 0.00 0.00
02:36:46 PM 0.00 0.00
02:36:56 PM 445.20 56.60
02:37:06 PM 0.00 0.00
02:37:16 PM 0.00 0.00
02:37:26 PM 0.00 0.00
02:37:36 PM 0.00 0.00
Average: 19.31 6.84
Wed Oct 28 14:37:36 PDT 2009
Curiously, apache is not slow this morning. I made some tweaks to the number of servers started, num spare servers, max number of servers, etc, yesterday. Let me get the old values and compare…
Original values from /etc/httpd/conf/httpd.conf:
StartServers 20
MinSpareServers 20
MaxSpareServers 120
ServerLimit 256
MaxClients 256
MaxRequestsPerChild 4000
New values which, from all appearances, seem to work just fine:
StartServers 30
MinSpareServers 30
MaxSpareServers 40
ServerLimit 50
MaxClients 50
MaxRequestsPerChild 4000
I'll probably continue to tweak these settings a little, but they do seem to work well now.
Sar command again this morning:
$ sar -W 10 30 && date
Linux 2.6.18-92.el5 (myserver.com) 10/29/2009
09:31:09 AM pswpin/s pswpout/s
09:31:19 AM 5.80 54.40
09:31:29 AM 62.10 0.00
09:31:39 AM 0.00 0.00
09:31:49 AM 0.00 0.00
09:31:59 AM 0.00 0.00
09:32:09 AM 3.30 0.00
09:32:19 AM 2.70 0.00
09:32:29 AM 0.00 0.00
09:32:39 AM 0.00 0.00
09:32:49 AM 0.00 0.00
09:32:59 AM 3.10 0.00
09:33:09 AM 5.80 0.00
09:33:19 AM 0.00 0.00
09:33:29 AM 0.00 0.00
09:33:39 AM 0.00 0.00
09:33:49 AM 0.00 0.00
09:33:59 AM 0.00 0.00
09:34:09 AM 0.00 0.00
09:34:19 AM 0.00 0.00
09:34:29 AM 0.00 0.00
09:34:39 AM 4.00 0.00
09:34:49 AM 0.10 0.00
09:34:59 AM 0.00 0.00
09:35:09 AM 4.80 0.00
09:35:19 AM 0.00 0.00
09:35:29 AM 291.29 0.00
09:35:39 AM 0.00 0.00
09:35:49 AM 0.80 0.00
09:35:59 AM 0.00 0.00
09:36:09 AM 0.00 0.00
Average: 12.78 1.81
Thu Oct 29 09:36:09 PDT 2009
The average is actually lower! And the server got more traffic than yesterday. Womble, it seems you were right! And now all is well in the universe again.
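For comparing two sar -W runs like the ones above without eyeballing them, here is a rough Python sketch. It assumes the 12-hour timestamp layout shown in my output (time, AM/PM, pswpin/s, pswpout/s); the function name and the returned keys are made up for illustration, and other sar locales or versions may need a different parser.

```python
def summarize_sar_swap(text):
    """Summarize `sar -W` text: mean swap rates and count of swap-out intervals."""
    swap_in, swap_out = [], []
    for line in text.splitlines():
        parts = line.split()
        # Sample rows look like: HH:MM:SS AM|PM <pswpin/s> <pswpout/s>
        if len(parts) != 4 or parts[1] not in ("AM", "PM"):
            continue  # skips the "Linux ..." banner and the "Average:" row
        try:
            swap_in.append(float(parts[2]))
            swap_out.append(float(parts[3]))
        except ValueError:
            continue  # skips the column-header row
    if not swap_in:
        raise ValueError("no sample rows found")
    n = len(swap_in)
    return {
        "samples": n,
        "avg_in": sum(swap_in) / n,
        "avg_out": sum(swap_out) / n,
        "intervals_swapping_out": sum(1 for v in swap_out if v > 0),
    }


# First three samples of the 10/29 run above, as a usage example.
sample = """\
09:31:09 AM pswpin/s pswpout/s
09:31:19 AM 5.80 54.40
09:31:29 AM 62.10 0.00
09:31:39 AM 0.00 0.00
Average: 12.78 1.81
"""
print(summarize_sar_swap(sample))
```

Counting the intervals with non-zero pswpout/s is a deliberate choice: a single large spike can dominate the average, while the count shows how often the box was actually pushing pages out.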
John Gardeniers, good idea! It's got the -o [filename] switch just for that. Thanks for the tip!
Jeremy Visser, dstat is a really sweet tool! Thanks for the tip! It was not installed, had to yum install dstat.
Best Answer
Based on your free output, I strongly suspect that your Apache processes are heavily buried in swap. The output of sar -W 1 0 will confirm (or refute) this hypothesis (run it when the machine is running slow).
If the Apache processes aren't all actually serving requests (as shown by mod_status), you should tune the number of "spare" children (with MaxSpareServers) so that they get reaped quicker (and hence don't lie around consuming RAM). If you really do need the number of children you're running to service the request load, you'll need more RAM (I'd go with another 1GB straight up; RAM is cheap, diagnosis time isn't).
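As a rough way to pick a ceiling for MaxClients that keeps the children out of swap, something like the following Python sketch may help. Only the total RAM figure comes from the question; the reserved amount and the per-child size are placeholder assumptions, and you would want to measure the real per-child resident size on your own box (for example with ps -C httpd -o rss) before trusting the result.

```python
def estimate_max_clients(total_ram_kb, reserved_kb, per_child_kb):
    """Largest number of Apache children that fits in RAM after the reserve."""
    if per_child_kb <= 0:
        raise ValueError("per_child_kb must be positive")
    usable_kb = total_ram_kb - reserved_kb
    return max(usable_kb // per_child_kb, 0)


total_ram_kb = 1025576      # "Mem: total" from the free output in the question
reserved_kb = 300 * 1024    # ASSUMPTION: room for MySQL, the OS and page cache
per_child_kb = 20 * 1024    # ASSUMPTION: average resident size of one httpd child

print(estimate_max_clients(total_ram_kb, reserved_kb, per_child_kb))
```

With these placeholder numbers the estimate lands well below the original MaxClients of 256, which is the general point: on a 1GB machine, a limit that high lets Apache grow far past physical memory.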