I have already worked through "AWS Elastic Beanstalk – Apache is restarting constantly".
Our Elastic Beanstalk instances are reporting the following message in the error_log:
[Mon Jun 26 22:01:01.878892 2017] [mpm_prefork:notice] [pid 8595] AH00173: SIGHUP received. Attempting to restart
*** Error in (wsgi:wsgi) ': double free or corruption (out): 0x00007f564cced560 ***
Sometimes the error sequence looks more like this:
[Tue Jun 27 00:01:01.215260 2017] [:error] [pid 6429] [remote XX.XXX.XX.195:29773] mod_wsgi (pid=6429): Exception occurred processing WSGI script '/opt/python/current/app/site/settings/wsgi/__init__.py'.
[Tue Jun 27 00:01:01.215320 2017] [:error] [pid 6429] [remote XX.XXX.XX.195:29773] OSError: failed to write data
[Tue Jun 27 00:01:01.222407 2017] [:error] [pid 6430] [remote XX.XXX.XX.60:53313] mod_wsgi (pid=6430): Exception occurred processing WSGI script '/opt/python/current/app/site/settings/wsgi/__init__.py'.
[Tue Jun 27 00:01:01.222460 2017] [:error] [pid 6430] [remote XX.XXX.XX.60:53313] OSError: failed to write data
[Tue Jun 27 00:01:04.554810 2017] [core:warn] [pid 8595] AH00045: child process 7614 still did not exit, sending a SIGTERM
[Tue Jun 27 00:01:04.554850 2017] [core:warn] [pid 8595] AH00045: child process 7615 still did not exit, sending a SIGTERM
[Tue Jun 27 00:01:05.555958 2017] [mpm_prefork:notice] [pid 8595] AH00173: SIGHUP received. Attempting to restart
*** Error in (wsgi:wsgi) ': double free or corruption (out): 0x00007f5640cae900 ***
*** Error in (wsgi:wsgi) ': double free or corruption (out): 0x00007f78649b7970 ***
This goes on almost every hour. The common message is:
[Mon Jun 26 22:01:01.878892 2017] [mpm_prefork:notice] [pid 8595] AH00173: SIGHUP received. Attempting to restart
I looked for the mpm_prefork module conf block… and there is not one, so all the defaults are being used.
I looked for the logrotate command being executed by Elastic Beanstalk:
/var/log/httpd/* {
    size 10M
    missingok
    notifempty
    rotate 5
    sharedscripts
    compress
    dateext
    dateformat -%s
    create
    postrotate
        /sbin/service httpd reload > /dev/null 2>/dev/null || true
    endscript
    olddir /var/log/httpd/rotated
}
Pretty standard stuff. My understanding of reload is that it attempts a graceful restart…
I am able to manually trigger the error message by executing sudo apachectl -k restart, although I cannot find where this would be run during the log rotation.
We have downstream services which appear to be throwing exceptions at the point this server hangs up all its connections.
So my question is: what else could be causing the SIGHUP within mpm_prefork during logrotate? As far as I can tell, this should not be happening outside of an error condition.
Apache/2.4.18 (Amazon) mod_wsgi/3.5 Python/3.4.3
Best Answer
In brief, it looks like the current Elastic Beanstalk logrotate configuration is broken, which causes service downtime in the form of 504 Gateway Timeout responses. Let's take a look.
Reproduction
We create the simplest possible Python WSGI application.
application.py
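A minimal sketch of what such an application.py could look like. It assumes the Elastic Beanstalk Python platform's default of a WSGI callable named application; the response body is arbitrary and chosen only for illustration.

```python
# application.py: a minimal WSGI app (illustrative sketch, not the original listing).
# Assumption: the platform looks for a callable named "application" in this file.


def application(environ, start_response):
    """Answer every request with a short plain-text body."""
    body = b"Hello, world!\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI expects an iterable of byte strings.
    return [body]
```

Anything that returns quickly will do here; the point is only to have an endpoint that can be hit while Apache is being reloaded.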
Zip it to application.zip. Then create an Elastic Beanstalk Python application and environment, and upload the archive. Make sure you use a key pair that you possess. Leave the other settings at their defaults. Wait until it's done (several minutes).
SSH into the underlying EC2 instance (see the instance identifier in EB's log). There you will run httpd's logrotate post-action, /sbin/service httpd reload (see below).
Then, on your machine, run a load generator against the environment (siege, for example). While it runs, repeat the reload command on the instance a couple of times.
You should then see some of the requests fail.
The failures appear at the moment you run reload. Then the service recovers.
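If no load generator is at hand, the client side of the reproduction can be approximated with a few lines of standard-library Python. This is only a sketch: the function name probe, its parameters, and the example URL in the comment are assumptions made for illustration, and it sends requests one at a time rather than concurrently.

```python
import collections
import http.client
import time
import urllib.error
import urllib.request


def probe(url, count=100, delay=0.05, timeout=5.0):
    """Request url count times and tally the outcomes by status or error type."""
    outcomes = collections.Counter()
    for _ in range(count):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                outcomes[response.status] += 1
        except urllib.error.HTTPError as exc:
            # Error statuses such as 503 or 504 are raised as HTTPError.
            outcomes[exc.code] += 1
        except (OSError, http.client.HTTPException) as exc:
            # Refused or reset connections and timeouts end up here.
            outcomes[type(exc).__name__] += 1
        time.sleep(delay)
    return outcomes


# Example call (hypothetical URL):
# print(probe("http://my-env.example.elasticbeanstalk.com/", count=500))
```

Run it while repeating the reload command on the instance. Any key other than 200 in the returned counter marks requests that were cut off or rejected during the restart.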
Note that ELB doesn't seem to have any effect on the problem: the same can be reproduced with two SSH sessions to the underlying EC2 instance, one running reload and the other sending requests locally (the Amazon AMI doesn't have siege).
Cause
Two files are involved: the hourly cron job /etc/cron.hourly/cron.logrotate.elasticbeanstalk.httpd.conf and the logrotate configuration /etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.httpd.conf, which is the one quoted in the question.
Notice postrotate. /sbin/service is just a System V wrapper for scripts in /etc/init.d/ (see its man page). Note that reload is not a standard Apache maintenance command. It's the distro's downstream addition.
Looking at the relevant part of the init script, /etc/init.d/httpd, you can see that reload sends the HUP signal to Apache, which is interpreted as Restart Now: the parent kills off its children as with TERM, but does not exit itself. TERM explains the 504s pretty well. How it probably should have been done is Graceful Restart (the USR1 signal), as it also re-opens logs but doesn't terminate requests being served.
Workaround
It's possible to use .ebextensions to replace /etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.httpd.conf. In the root directory of the application, create .ebextensions/10_logs.config so that it writes the same logrotate configuration to that path (basically replace "reload" with "graceful" in postrotate), and re-deploy your Elastic Beanstalk environment. Note, however, that with back-to-back sub-second graceful restarts I was able to (sporadically) produce 503 Service Unavailable. That should not matter for log rotation, because with evenly spaced graceful restarts there was no error.
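The change itself is a single word in the postrotate line. As a small illustration only (the helper name use_graceful is hypothetical, and nothing like it needs to run on the instance), this Python sketch derives the replacement line from the stock one quoted in the question:

```python
# The postrotate command from the stock configuration quoted in the question.
STOCK_POSTROTATE = "/sbin/service httpd reload > /dev/null 2>/dev/null || true"


def use_graceful(logrotate_conf):
    """Return the logrotate text with httpd's reload swapped for graceful."""
    return logrotate_conf.replace(
        "/sbin/service httpd reload", "/sbin/service httpd graceful"
    )


print(use_graceful(STOCK_POSTROTATE))
# prints: /sbin/service httpd graceful > /dev/null 2>/dev/null || true
```

The rest of the stanza (size, rotate, olddir and so on) stays exactly as it was.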