CentOS – Monit doesn’t pick up httpd process after reset

centos, httpd, monit, monitoring, munin

I've installed Munin and Monit on one of my servers running CentOS 5. Everything is working well, logging and reporting info, except when the httpd process is restarted. I have Monit set to restart httpd if it hits 2.5 GB of memory usage. When this happens, httpd restarts just fine, but Monit doesn't pick up the new process.

I'll get a notice telling me that httpd service does not exist, and then another telling me httpd failed to start, and then a final one saying that the httpd service timed out and won't be monitored anymore.

I'm not sure why I'm getting these reports, because the httpd service IS getting restarted just fine. I've checked the logs and there are no issues there on the restart.

Best Answer

Perhaps have monit run a script that restarts httpd, waits a few seconds, and then restarts monit as well.

It may be that monit is somehow locked on to the particular process IDs associated with the killed httpd processes. Restarting monit after httpd is back up would allow it to detect the new processes correctly.
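A rough sketch of such a script, in case it helps. The init script paths, the 10-second wait, and the function name are my assumptions, not something taken from your setup, so adjust them to match your box:

```shell
#!/bin/sh
# Hypothetical helper: restart httpd, wait a little, then restart monit.
# The paths and the wait time below are assumptions; override them as needed.
HTTPD_INIT="${HTTPD_INIT:-/etc/init.d/httpd}"
MONIT_INIT="${MONIT_INIT:-/etc/init.d/monit}"
WAIT_SECONDS="${WAIT_SECONDS:-10}"

restart_httpd_then_monit() {
    # Stop here if httpd itself fails to restart.
    "$HTTPD_INIT" restart || return 1
    # Give httpd time to fork and write its new pidfile.
    sleep "$WAIT_SECONDS"
    # Restart monit so it starts fresh against the new httpd processes.
    "$MONIT_INIT" restart
}
```

To use it as a standalone script, add a final line that calls `restart_httpd_then_monit`. Monit could then invoke the script through an `exec` action instead of its built-in restart. I haven't tested how monit behaves when a script it launched restarts monit itself, so try it by hand first.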

I'm not sure how much free memory your system has when httpd hits the 2.5 GB usage point, but if that amount gets too low (perhaps during the restart?), Linux will start killing processes to avoid a total crash. I'm guessing that the OOM killer might be killing something essential to monit's functionality.
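One way to check that guess is to search the system log for OOM killer messages around the time of the restart. A minimal sketch, assuming kernel messages go to `/var/log/messages` (the usual place on CentOS); the function name is just for illustration:

```shell
# Print log lines that look like OOM killer activity.
# The default log path is an assumption; pass another file as the first argument.
oom_events() {
    logfile="${1:-/var/log/messages}"
    # Match both the "invoked oom-killer" and "Out of memory" kernel messages.
    grep -iE 'oom-killer|out of memory' "$logfile"
}
```

Running `oom_events` as root and comparing the timestamps with Monit's alerts should show whether anything was killed during the restart. No output means the log contains no matching lines.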

If this is the case, lowering your restart threshold from 2.5 GB to 2.0 GB, or increasing the amount of memory in the box, would be a better solution.
