Ubuntu – Prevent Server Overload from Zombie Processes in PHP Session Removal Cron Job

I recently noticed when I logged in that I had several thousand processes marked "zombie". Upon further investigation, I found the following from ps fax:

  701 ?        Ss     0:28 cron
 3363 ?        S      0:00  \_ CRON
 3364 ?        Ss     0:00      \_ /bin/sh -c   [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete
 3371 ?        S      0:00          \_ find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +24 ! -execdir fuser -s {} ; -delete
 3451 ?        S      0:02              \_ fuser -s ./sess_jns5af2mvm81e2fg1rbuctlt54
 3452 ?        Z      0:00                  \_ [fuser] <defunct>
 3453 ?        Z      0:00                  \_ [fuser] <defunct>
 3454 ?        Z      0:00                  \_ [fuser] <defunct>

... many, many lines omitted ...

13642 ?        Z      0:00                  \_ [fuser] <defunct>

As far as I can tell, this is a script in /etc/cron.d/php5 that is supposed to clean up dead PHP sessions at 9 and 39 minutes past the hour.

Edit: Here's the text of the script. It's installed by default with PHP on Ubuntu.

# /etc/cron.d/php5: crontab fragment for php5
#  This purges session files older than X, where X is defined in seconds
#  as the largest value of session.gc_maxlifetime from all your php.ini
#  files, or 24 minutes if not defined.  See /usr/lib/php5/maxlifetime

# Look for and purge old sessions every 30 minutes
09,39 *     * * *     root   [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete

For some reason (my current guess is a badly behaved web crawler that starts a new session on each request, but I'm still going through the logs), there are sometimes many thousands of abandoned PHP sessions in /var/lib/php5/, and when this script runs it will happily spawn a new fuser process for each one. This quickly hits the process limit and brings things to a crawl.
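
For anyone checking whether they are in the same situation, here is a minimal sketch for counting zombies. It assumes a procps-style ps that accepts -eo stat=, and the function name count_zombies is made up for illustration.

```shell
#!/bin/bash
# Count zombie entries in a list of process states, one state per line,
# as printed by `ps -eo stat=`. A state that starts with Z is a zombie.
count_zombies() {
  # grep -c prints 0 but exits non-zero when nothing matches,
  # so "|| true" keeps the function's exit status clean.
  grep -c '^[[:space:]]*Z' || true
}

# Typical use (assumes procps ps):
#   ps -eo stat= | count_zombies
```

Running find /var/lib/php5 -maxdepth 1 -type f | wc -l alongside it gives a rough idea of how many session files the cron job will have to walk through.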

What can I do, besides just deleting this cron job and cleaning things up manually?

Best Answer

It would probably be best to move the logic from find into a script that loops through all of the files given on the command line, checks whether each one is being accessed, and deletes it if not:

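#!/bin/bash
# Delete each file named on the command line unless some process has it open.

for x; do
  # fuser -s exits non-zero when no process is accessing the file
  if ! /bin/fuser -s "$x" 2>/dev/null; then
    rm -- "$x"
  fi
done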

Then change the cron job to just:

09,39 *     * * *     root   [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) -execdir thatscript.sh {} +

This will have find collect all the session files matching the max age, then run thatscript.sh with all of them at once (due to the + instead of ;). The script is then responsible for making sure the file is not in use and deleting it. This way, find should only have one direct child itself, and bash should not have any problem cleaning up the fuser and rm children.
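
To see the difference the terminator makes, here is a minimal sketch (not part of the original cron job) that counts how many times find launches a command for 50 dummy files. The helper name count_invocations and the temporary directory are made up for the demo. It uses -exec rather than -execdir so that it runs regardless of what is in PATH; for files in a single directory the batching is the same.

```shell
#!/bin/bash
# Hypothetical demo helper: reports how many times find starts a command
# for 50 dummy files, using the terminator given as $1 (";" or "+").
count_invocations() {
  dir=$(mktemp -d) || return 1
  i=1
  while [ "$i" -le 50 ]; do
    : > "$dir/sess_$i"          # create an empty dummy "session" file
    i=$((i + 1))
  done
  # Each invocation of the command prints one line; count the lines.
  if [ "$1" = "+" ]; then
    find "$dir" -type f -exec sh -c 'echo run' sh {} + | grep -c run
  else
    find "$dir" -type f -exec sh -c 'echo run' sh {} \; | grep -c run
  fi
  rm -rf "$dir"
}

count_invocations ';'   # one process per file
count_invocations '+'   # files batched into a single process
```

With ; the count is 50, one process per file; with + it is 1, which is why the rewritten cron job leaves find with a single direct child.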

It isn't obvious from find's documentation whether find splits the list of filenames into multiple executions when the list would exceed the OS limit on command-line length (13000 files may well do so; the limit is imposed by the kernel rather than by the shell, and older Linux kernels capped it at 128 KiB). GNU find builds the command line for + in much the same way xargs does, so it should split the list on its own. If you'd rather not rely on that, change -execdir thatscript.sh {} + to -print0 | xargs -0 thatscript.sh to have xargs divide up the files.
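
As a rough illustration of xargs dividing up a list, the sketch below feeds it ten NUL-terminated names and prints how many arguments each invocation received. The -n 4 cap is only there to make the splitting visible with a tiny input; in the real cron job you would leave it out and let xargs split according to the system limit. The function name batch_sizes is hypothetical.

```shell
#!/bin/bash
# Hypothetical demo: show how xargs splits a NUL-separated list of names
# into several invocations of the same command.
batch_sizes() {
  i=1
  while [ "$i" -le 10 ]; do
    printf 'sess_%s\0' "$i"     # same framing that find -print0 produces
    i=$((i + 1))
  done | xargs -0 -n 4 sh -c 'echo "$#"' sh   # print the argument count per run
}

batch_sizes
```

This prints 4, 4 and 2 on separate lines: three invocations covering all ten names, with nothing dropped.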

Alternatively, if you don't have the drive mounted noatime, change -cmin to -amin and drop the fuser test entirely:

    09,39 *     * * *     root   [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -amin +$(/usr/lib/php5/maxlifetime) -delete

This will remove all the session files last accessed more than [output of the maxlifetime command] minutes ago. As long as you don't have any PHP processes that open a session and then sit around doing nothing for a long time, this shouldn't zap any sessions currently in use. (The default maxlifetime on Debian seems to be 24 minutes, which would be a very long time for a page to load.)
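
Here is a small sketch of what -amin +24 selects, using two dummy files in a temporary directory. It assumes GNU touch (for the -d '30 minutes ago' syntax) and GNU find, as shipped on Ubuntu; the function and file names are made up for the demo, and 24 simply mirrors the default mentioned above.

```shell
#!/bin/bash
# Hypothetical demo: list files whose last access is more than 24 minutes old.
stale_by_atime() {
  dir=$(mktemp -d) || return 1
  : > "$dir/sess_old"
  : > "$dir/sess_new"
  # Backdate only the access time of one file (GNU touch syntax, an assumption).
  touch -a -d '30 minutes ago' "$dir/sess_old"
  # Same selection as the cron job, but printing names instead of deleting.
  find "$dir" -mindepth 1 -maxdepth 1 -type f -amin +24 -exec basename {} \;
  rm -rf "$dir"
}

stale_by_atime
```

Only sess_old is listed; sess_new was accessed just now, so the -delete version would leave it alone.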
