Apache threads are stacking up on one of my web servers (300-500 simultaneous requests, some taking 3-8s to process!), but CPU usage is very low (~10%). Page load time is slowing way down as a result. I have plenty of idle CPU power. How can I use more of it to handle these threads faster?
Here's the top of top…
Tasks: 469 total, 1 running, 468 sleeping, 0 stopped, 0 zombie
Cpu(s): 8.1% us, 1.7% sy, 0.0% ni, 90.3% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 9181012k total, 7998772k used, 1182240k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11351 apache 15 0 364m 30m 17m S 11.9 0.3 0:00.73 httpd
7527 apache 15 0 365m 36m 23m S 8.6 0.4 0:01.76 httpd
7607 apache 16 0 364m 35m 22m S 2.3 0.4 0:01.47 httpd
11498 apache 17 0 359m 19m 11m S 2.3 0.2 0:00.07 httpd
11497 apache 16 0 362m 23m 13m S 1.7 0.3 0:00.05 httpd
1840 apache 15 0 366m 44m 29m S 1.3 0.5 0:03.74 httpd
5358 apache 15 0 364m 36m 24m S 1.3 0.4 0:02.58 httpd
8090 apache 15 0 365m 31m 17m S 1.3 0.3 0:01.10 httpd
11346 apache 15 0 361m 28m 18m S 1.3 0.3 0:00.12 httpd
4051 apache 16 0 365m 40m 27m S 1.0 0.5 0:01.72 httpd
32575 apache 16 0 365m 42m 28m S 0.7 0.5 0:03.62 httpd
5145 apache 16 0 365m 37m 24m S 0.7 0.4 0:02.23 httpd
8173 apache 16 0 363m 35m 23m S 0.7 0.4 0:00.29 httpd
5466 apache 15 0 365m 31m 18m S 0.3 0.4 0:01.18 httpd
7420 apache 16 0 364m 36m 23m S 0.3 0.4 0:01.24 httpd
11485 apache 16 0 362m 23m 12m S 0.3 0.3 0:00.04 httpd
1 root 15 0 10272 612 584 S 0.0 0.0 0:02.78 init
30129 root 16 -4 12536 400 396 S 0.0 0.0 0:00.00 udevd
30402 root 16 0 5840 580 480 S 0.0 0.0 0:14.01 syslogd
30414 rpc 18 0 7992 408 404 S 0.0 0.0 0:00.00 portmap
30439 root 18 0 10088 548 544 S 0.0 0.0 0:00.00 rpc.statd
30478 memcache 15 0 141m 5364 516 S 0.0 0.1 1:16.34 memcached
30496 root 16 0 60604 744 636 S 0.0 0.0 0:07.31 sshd
30507 root 15 0 21572 796 688 S 0.0 0.0 0:04.56 xinetd
31817 root 15 0 166m 932 860 S 0.0 0.0 0:00.03 httpsd
31820 psaadm 15 0 175m 7992 4596 S 0.0 0.1 0:02.31 httpsd
31924 root 15 0 19704 924 552 S 0.0 0.0 0:02.50 crond
13316 root 16 0 98528 3628 2796 S 0.0 0.0 0:00.01 sshd
1655 root 19 0 8600 1180 972 S 0.0 0.0 0:00.00 mysqld_safe
1695 mysql 16 0 4268m 464m 4684 S 0.0 5.2 10:05.19 mysqld
32564 root 16 0 98528 3612 2780 S 0.0 0.0 0:00.00 sshd
28489 root 15 0 98528 3628 2796 S 0.0 0.0 0:00.00 sshd
32152 root 16 0 98528 3612 2780 S 0.0 0.0 0:00.01 sshd
5781 root 15 0 98528 3628 2796 S 0.0 0.0 0:00.01 sshd
7801 root 17 0 356m 11m 5472 S 0.0 0.1 0:05.74 httpd
7804 apache 16 0 366m 36m 21m S 0.0 0.4 0:00.35 httpd
7805 apache 16 0 370m 31m 13m S 0.0 0.4 0:00.11 httpd
8172 apache 15 0 366m 34m 19m S 0.0 0.4 0:00.62 httpd
9430 apache 16 0 365m 45m 32m S 0.0 0.5 0:02.67 httpd
11393 apache 16 0 363m 37m 25m S 0.0 0.4 0:00.75 httpd
11551 apache 17 0 360m 31m 22m S 0.0 0.4 0:00.24 httpd
32345 apache 16 0 364m 39m 27m S 0.0 0.4 0:02.86 httpd
32472 apache 16 0 394m 70m 27m S 0.0 0.8 0:03.77 httpd
32488 apache 16 0 364m 42m 29m S 0.0 0.5 0:02.38 httpd
32501 apache 16 0 365m 41m 28m S 0.0 0.5 0:01.71 httpd
32644 apache 16 0 365m 36m 23m S 0.0 0.4 0:01.79 httpd
32765 apache 15 0 364m 39m 26m S 0.0 0.4 0:02.65 httpd
1334 apache 16 0 368m 42m 26m S 0.0 0.5 0:02.77 httpd
1339 apache 15 0 362m 39m 29m S 0.0 0.4 0:01.84 httpd
1351 apache 15 0 364m 43m 30m S 0.0 0.5 0:02.59 httpd
1553 apache 16 0 363m 41m 29m S 0.0 0.5 0:02.74 httpd
1555 apache 16 0 365m 37m 24m S 0.0 0.4 0:01.59 httpd
1564 apache 15 0 365m 40m 27m S 0.0 0.5 0:01.88 httpd
1569 apache 16 0 364m 35m 22m S 0.0 0.4 0:00.63 httpd
1573 apache 15 0 367m 39m 24m S 0.0 0.4 0:01.66 httpd
1575 apache 16 0 363m 36m 24m S 0.0 0.4 0:01.88 httpd
1583 apache 16 0 364m 34m 21m S 0.0 0.4 0:01.92 httpd
1594 apache 15 0 367m 44m 29m S 0.0 0.5 0:03.71 httpd
1689 apache 15 0 365m 38m 24m S 0.0 0.4 0:01.47 httpd
1690 apache 15 0 365m 39m 26m S 0.0 0.4 0:01.18 httpd
1710 apache 16 0 363m 34m 23m S 0.0 0.4 0:00.99 httpd
1725 apache 15 0 364m 39m 26m S 0.0 0.4 0:01.80 httpd
1726 apache 16 0 365m 40m 26m S 0.0 0.5 0:00.90 httpd
1737 apache 16 0 364m 30m 17m S 0.0 0.3 0:00.46 httpd
1919 apache 15 0 363m 34m 22m S 0.0 0.4 0:00.83 httpd
1930 apache 16 0 364m 33m 21m S 0.0 0.4 0:00.50 httpd
1934 apache 15 0 364m 40m 27m S 0.0 0.5 0:02.20 httpd
And critical httpd.conf settings:
Timeout 120
KeepAlive On
MaxKeepAliveRequests 200
KeepAliveTimeout 3
<IfModule prefork.c>
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 256
MaxClients 256
MaxRequestsPerChild 4000
</IfModule>
<IfModule worker.c>
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>
Best Answer
To get Apache to use more of your CPU, you need to remove the bottleneck that is currently preventing it from doing so. Because requests are taking 3-8 seconds to complete, you know there is a bottleneck somewhere; you need to find it.
Things to look at are:
uptime
While requests are taking 3-8 seconds, does the load show as being high (in the double or triple digits)? You can't read a lot into this, because a high load could mean the problem exists elsewhere, but if the load is low while requests are taking 3-8 seconds, it is probably a remote issue; that is, the requests are likely waiting on something outside this machine.
If you don't have "munin" installed, you probably should. If you do, look at the graphs to see how system utilization differs between the times the system is responding slowly and the times it is working well. If you see jumps in the graphs, those might indicate where the bottleneck is. If you see blank areas in all the graphs, that probably means the system is saturated. If only the Apache graphs are blank, it probably means that Apache has reached its max connections, which is most likely a side effect of the performance problems.
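The load check can also be scripted so you can run it while the slow requests are happening. This is only a minimal sketch: the function name `describe_load` and the "one runnable task per CPU" threshold are assumptions made for illustration, not a rule from Apache or from the advice above.

```python
import os


def describe_load(load1: float, cpu_count: int) -> str:
    """Roughly classify a 1-minute load average relative to the CPU count.

    The cutoff of 1.0 per CPU is an illustrative assumption, not a hard rule.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    per_cpu = load1 / cpu_count
    return "high" if per_cpu >= 1.0 else "low"


if __name__ == "__main__":
    # os.getloadavg() returns the same three numbers that `uptime` prints.
    load1, load5, load15 = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"load averages: {load1:.2f} {load5:.2f} {load15:.2f} on {cpus} CPU(s)")
    print("load looks", describe_load(load1, cpus))
```

If this reports a low load while pages are still taking several seconds, that points away from the local CPU and toward whatever the requests are waiting on.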
Also note that if you have multiple CPUs but a single-threaded application like Zope sitting behind Apache, one of your CPUs could be saturated while the others are idle. Pressing "1" in top shows the utilization of each individual core. Look for one that is at 0% idle all the time while the others are much more idle.
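The per-core view that top gives you can be approximated from `/proc/stat` on Linux if you want to log it over time. The following is a rough sketch under assumptions: the helper names, the one-second sampling interval, and the 5% idle threshold are all made up for illustration.

```python
import time


def parse_cpu_times(stat_text):
    """Return {core: (idle_ticks, total_ticks)} from /proc/stat-style text."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep only per-core lines such as 'cpu0'; skip the aggregate 'cpu' line.
        if not fields or not fields[0].startswith("cpu") or not fields[0][3:].isdigit():
            continue
        values = [int(v) for v in fields[1:]]
        # First 8 columns: user nice system idle iowait irq softirq steal.
        # Guest columns are already counted inside user/nice, so they are skipped.
        times[fields[0]] = (values[3], sum(values[:8]))
    return times


def idle_percent(before, after):
    """Compute the idle percentage per core between two samples."""
    result = {}
    for core, (idle1, total1) in after.items():
        if core not in before:
            continue
        idle0, total0 = before[core]
        delta_total = total1 - total0
        if delta_total > 0:
            result[core] = 100.0 * (idle1 - idle0) / delta_total
    return result


def saturated_cores(idle_by_core, threshold=5.0):
    """List cores whose idle percentage is at or below the (assumed) threshold."""
    busy = [core for core, idle in idle_by_core.items() if idle <= threshold]
    return sorted(busy, key=lambda core: int(core[3:]))


if __name__ == "__main__":
    try:
        with open("/proc/stat") as f:
            first = parse_cpu_times(f.read())
        time.sleep(1.0)
        with open("/proc/stat") as f:
            second = parse_cpu_times(f.read())
    except OSError:
        print("/proc/stat is not available; this sketch assumes Linux")
    else:
        idle = idle_percent(first, second)
        for core in sorted(idle, key=lambda c: int(c[3:])):
            print(f"{core}: {idle[core]:.1f}% idle")
        print("possibly saturated:", saturated_cores(idle) or "none")
```

A single one-second sample is noisy, so a core only counts as suspicious if it shows up as saturated across many runs, which matches the "0% idle all the time" test described above.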
Using these techniques I've been able to isolate and resolve most performance problems similar to this.