Linux – Working around the stale pidfile problem after hard restart kills the daemon

boot, daemon, linux

I'm using Red Hat Linux (RHEL5) on a VMware VM. I've written a daemon which should stay running all the time and start automatically on boot.

Last night the VM host had an unrecoverable hardware problem and the VM abruptly halted. When it came back, my daemon didn't start because the pidfile still existed.

Apparently this is called the Stale pidfile Syndrome, but I'm not sure what the best long-term approach for mitigating it is. I'm thinking that the startup script in /etc/rc.d* should delete the pidfile before starting the daemon, but the service management script in /etc/init.d should remain the same so that commands like service mydaemon start don't clobber the pidfile.

/etc/rc.d/rc6.d just has a symlink to the script in /etc/init.d/, so how can I change its behavior only on boot? I could add an additional script with higher precedence in the rc.d dirs, but that seems hacky. Someone also suggested adding logic like "if uptime is less than 1 minute, delete the pidfile", but that seems hacky too.

Any thoughts or solutions or best practices?
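
For illustration, one alternative to deleting the pidfile only on boot is to have the start logic itself decide whether an existing pidfile is stale, by checking whether the PID recorded in it still refers to a live process. The following is a minimal sketch in Python, not the daemon's actual code: the function names and the pidfile path in the usage note are hypothetical, and it assumes the pidfile contains just the PID as a decimal number.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        return False
    try:
        # Signal 0 delivers nothing; it only checks that the PID exists.
        os.kill(pid, 0)
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return False  # no such process
        if exc.errno == errno.EPERM:
            return True  # process exists but belongs to another user
        raise
    return True


def clear_stale_pidfile(path):
    """Remove the pidfile at `path` unless it names a live process.

    Returns True if the daemon may start (no pidfile, or a stale or
    unreadable one was removed), False if a live process holds it.
    """
    try:
        with open(path) as f:
            text = f.read().strip()
    except FileNotFoundError:
        return True  # nothing to clean up

    try:
        pid = int(text)
    except ValueError:
        pid = -1  # empty or corrupt contents: treat as stale

    if pid_is_running(pid):
        return False

    os.remove(path)
    return True
```

A start routine could call clear_stale_pidfile("/var/run/mydaemon.pid") (a hypothetical path) and launch the daemon only when it returns True. With that check in place, service mydaemon start would still refuse to clobber the pidfile of a running instance, and no boot-only special case would be needed. One limitation of this sketch: after a reboot an unrelated process may have been given the same PID, so a stricter check would also compare the process name, for example by reading /proc/<pid>/cmdline.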

Best Answer