We have Apache running with the worker MPM and MaxClients set to 6, yet top shows more than 6 Apache processes: 13 are visible in the screen dump below. Can someone explain this? There's also a screen dump from /server-status/, taken around the same time. Under our normal load there seem to be 2-6 requests processing at a time, so I would expect to see about that many apache2 processes in top. The only way I can reconcile this is to assume that at max load there are 3 servers running (ServerLimit 3, so 3 apache2 processes), each with 2 threads (3×2 = 6 more apache2 entries), but even this would result in at most 9 apache2 processes.
Apache is essentially running away and never releasing memory. We serve about 5-6 requests per second (monitored using /server-status/), so I figured setting MaxRequestsPerChild to 1000 (we've had it as low as 500) would cause the processes to recycle and release memory, but this doesn't appear to happen. We're monitoring Apache process memory via New Relic. When we restart Apache it consumes about 550M of memory with our configuration below. Each process eventually swells to VIRT: 300m, RES: 80m, and we seemingly can't control the number of processes running, so Apache grows from 550M to 5G within 12-14 hours and wipes us out.
I've checked the /conf.d/ directory to make sure that we're not overriding any settings in our Apache config. Does anyone have any advice for getting Apache under control? I know we have a fat Python application running under mod_wsgi that probably has memory leaks and certainly could be optimized, but I'm simply looking to control the number of Apache processes that are spawned.
Apache Config:
### Section 1: Global Environment
#
# The directives in this section affect the overall operation of Apache,
# such as the number of concurrent requests it can handle or where it
# can find its configuration files.
#
ServerRoot "/etc/apache2"
ServerName localhost
LockFile ${APACHE_LOCK_DIR}/accept.lock
PidFile ${APACHE_PID_FILE}
Timeout 120
KeepAlive Off
ExtendedStatus On
# worker MPM
# StartServers: initial number of server processes to start
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadLimit: ThreadsPerChild can be changed to this maximum value during a
# graceful restart. ThreadLimit can only be changed by stopping
# and starting Apache.
# ThreadsPerChild: constant number of worker threads in each server process
# MaxClients: maximum number of simultaneous client connections
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_worker_module>
StartServers 1
ThreadsPerChild 2
MinSpareThreads 1
MaxSpareThreads 2
MaxClients 6
ServerLimit 3
MaxRequestsPerChild 1000
</IfModule>
# These need to be set in /etc/apache2/envvars
User ${APACHE_RUN_USER}
Group ${APACHE_RUN_GROUP}
AccessFileName .htaccess
<Files ~ "^\.ht">
Order allow,deny
Deny from all
Satisfy all
</Files>
DefaultType None
HostnameLookups Off
ErrorLog ${APACHE_LOG_DIR}/error.log
LogLevel warn
LogFormat "%v:%p %a %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
LogFormat "%a %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%a %l %u %t \"%r\" %>s %O" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
# Include module configuration:
Include mods-enabled/*.load
Include mods-enabled/*.conf
# Include ports listing
Include ports.conf
# Include generic snippets of statements
Include conf.d/
Top:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24775 www-data 20 0 282m 68m 5160 S 104 0.8 3:04.67 apache2
24782 www-data 20 0 283m 66m 5376 S 57 0.8 3:24.31 apache2
24780 www-data 20 0 280m 65m 4976 S 55 0.8 3:20.74 apache2
24778 www-data 20 0 289m 72m 5540 S 29 0.9 3:09.55 apache2
24773 www-data 20 0 278m 64m 5116 S 26 0.8 2:55.66 apache2
24777 www-data 20 0 282m 65m 4664 S 20 0.8 3:08.39 apache2
13433 memcache 20 0 642m 597m 876 S 16 7.4 11:46.62 memcached
24774 www-data 20 0 288m 71m 4672 S 15 0.9 3:12.58 apache2
24781 www-data 20 0 283m 66m 5160 S 11 0.8 3:16.01 apache2
24779 www-data 20 0 281m 64m 4676 S 8 0.8 3:11.44 apache2
24776 www-data 20 0 284m 74m 4660 S 8 0.9 2:56.38 apache2
27105 www-data 20 0 49520 6180 2636 S 2 0.1 0:00.05 apache2
27100 www-data 20 0 49432 6084 2628 S 1 0.1 0:00.06 apache2
9 root 20 0 0 0 0 S 1 0.0 62:05.25 rcu_sched
27007 www-data 20 0 49568 6292 2684 S 1 0.1 0:00.60 apache2
1 root 20 0 3496 872 428 S 0 0.0 0:04.61 init
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0 0.0 0
/server-status/
Apache Server Status for www.mysite.com
Server Version: Apache/2.2.22 (Ubuntu) mod_ssl/2.2.22 OpenSSL/1.0.1 mod_wsgi/3.3 Python/2.7.3
Server Built: Feb 13 2012 01:37:45
Current Time: Tuesday, 18-Feb-2014 10:53:01 EST
Restart Time: Tuesday, 18-Feb-2014 10:25:32 EST
Parent Server Generation: 0
Server uptime: 27 minutes 28 seconds
Total accesses: 8248 - Total Traffic: 126.6 MB
CPU Usage: u.36 s.15 cu0 cs0 - .0309% CPU load
5 requests/sec - 78.7 kB/second - 15.7 kB/request
2 requests currently being processed, 0 idle workers
................................................................
................................................................
WW..............................................................
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process
Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request
0-0 - 0/0/1569 . 0.02 0 37 0.0 0.00 25.22 67.217.125.252 www.mysite.com GET /imgname.jpg HTTP/1.0
0-0 - 0/0/1502 . 0.03 0 786 0.0 0.00 22.47 65.55.52.119 www.mysite.com GET / HTTP/1.0
1-0 - 0/0/1629 . 0.04 13 260 0.0 0.00 24.85 70.208.67.110 www.mysite.com GET /article/s
1-0 - 0/0/1416 . 0.04 13 469 0.0 0.00 21.42 98.109.237.89 www.mysite.com GET / HTTP/1.0
2-0 27863 0/54/1021 W 0.44 0 0 0.0 0.69 15.95 66.151.5.10 www.mysite.com GET /storm-h
2-0 27863 0/50/1111 W 0.44 0 0 0.0 0.61 16.73 108.88.80.66 www.mysite.com GET /server-status/ HTTP/1.0
UPDATE
There was a multi-step solution to this problem.
1) Identify that mod_wsgi processes were being reported by top as apache2. To correct this, add the display-name=my-mod-wsgi-app parameter to your WSGIDaemonProcess config.
2) We discovered that some horrible part of our Python/Django application causes a mod_wsgi process to bloat to 600M. Running 5 of these would consume 3G of memory on our VPS and make it very sad.
3) We added inactivity-timeout=300 and maximum-requests=200 to our WSGIDaemonProcess config, and mod_wsgi now nicely restarts a process when it has been idle for the timeout period or has handled its maximum number of requests, which keeps our overweight, sloppy Django application running smoothly.
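Taken together, the options from steps 1 and 3 would sit on a single WSGIDaemonProcess line, roughly as sketched here. The group name my-mod-wsgi-app and the WSGIProcessGroup line are assumptions for illustration, not the configuration actually used; only the three option values come from the steps above.

```apache
# Sketch only: the daemon process group name is hypothetical.
# display-name       -> name shown by ps/htop instead of apache2 (step 1)
# inactivity-timeout -> restart a daemon process after 300 idle seconds (step 3)
# maximum-requests   -> restart a daemon process after 200 requests (step 3)
WSGIDaemonProcess my-mod-wsgi-app \
    display-name=my-mod-wsgi-app \
    inactivity-timeout=300 \
    maximum-requests=200

# Assumed: route the application to the daemon process group defined above.
WSGIProcessGroup my-mod-wsgi-app
```

Any processes= and threads= values are left out on purpose, since the original configuration does not state them.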
Thanks to Graham for getting me started in this direction. You can read as I talk my way through this problem over on the mod_wsgi Google group. https://groups.google.com/forum/#!topic/modwsgi/wYScZlqgjgA
Best Answer
The breakdown of processes is:
If you use the display-name option to WSGIDaemonProcess, then some tools, such as the BSD-derived 'ps' command and 'htop', will show the name you specify rather than 'apache2'. This way you can distinguish which are actually the mod_wsgi daemon processes running your web application.
To infer more, you would need to show the mod_wsgi configuration you are using. As it stands, though, even the MPM settings look like a poor configuration: running with such a low number of threads and favouring processes doesn't make a great deal of sense with the Apache worker MPM.
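As a rough illustration of what favouring threads over processes could look like with the worker MPM, here is a sketch. Every number in it is an assumption, chosen only so the directives are consistent with each other (MaxClients no larger than ServerLimit × ThreadsPerChild, and a multiple of ThreadsPerChild); it is not a recommendation tuned for this site.

```apache
<IfModule mpm_worker_module>
    # Assumed values for illustration: few processes, more threads each.
    StartServers          1
    ServerLimit           2
    ThreadsPerChild      25
    # 2 processes x 25 threads = 50 simultaneous connections at most
    MaxClients           50
    MinSpareThreads      25
    # Kept >= MinSpareThreads + ThreadsPerChild, as the worker MPM expects
    MaxSpareThreads      50
    MaxRequestsPerChild   0
</IfModule>
```

When the application runs in mod_wsgi daemon mode, these Apache child processes only proxy requests to the daemon processes, so the application's memory use is governed by the WSGIDaemonProcess settings rather than by these directives.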
Either way, StackOverflow is not a forum and as a result is a really bad place to try and carry out a long discussion to help sort out your configuration. You would be better off using the mod_wsgi mailing list.
I would also suggest you watch/read: