ps currently shows 17617 zombie processes, all of which have a ppid of 1 (init). init should be reaping these defunct processes, but for some reason it isn't, and their number keeps growing.
Trying to force them to be reaped using preap fails with:
preap: Failed to reap 15977: the only non-defunct ancestor is 'init'
Here's how I counted the processes, by the way:
% ps -e -o pid,s,ppid | awk 'index($2,"Z")>0 {ppid[$3]=ppid[$3]+1} END {for (key in ppid) print key,ppid[key]}'
1 17617
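The same count can be wrapped in a small function so it is easy to re-run while watching the number grow. This is only a sketch: the function name count_zombies_by_ppid is made up for illustration, and it assumes input in the "PID S PPID" column layout that ps -e -o pid,s,ppid prints.

```shell
# Count zombie (state Z) processes per parent PID.
# Reads "PID S PPID" rows on stdin, as printed by: ps -e -o pid,s,ppid
# (count_zombies_by_ppid is a hypothetical helper name.)
count_zombies_by_ppid() {
    awk '$2 ~ /^Z/ { count[$3]++ }
         END { for (p in count) print p, count[p] }'
}

# Usage on a live system:
#   ps -e -o pid,s,ppid | count_zombies_by_ppid
```

Each output line is a parent pid followed by the number of zombies it has not reaped; the header row is skipped because its state column is "S", not "Z".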
I found this troubling log entry:
Jun 20 22:45:34 host genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 1 (init)
OS is Solaris 10 (SunOS host 5.10 Generic_150401-04 i86pc i386 i86pc).
Best Answer
Turns out that init had simply stopped working properly, probably when the system was having I/O problems with swap.
If init exits at any time other than during an OS shutdown, it is simply restarted. So I sent init a SIGSEGV (chosen so that the signal wouldn't mimic whatever init uses to determine that a shutdown is in progress). init was restarted (still as pid 1), and the new init immediately reaped all of the outstanding zombies.
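Roughly, that step looks like the sketch below. The helper names init_zombies and restart_init are made up for illustration, and the 5-second pause is an arbitrary guess. It relies on the Solaris 10 behavior described above (init being restarted after it dies); other systems may ignore or handle the signal differently, so treat it as an assumption rather than a general recipe. It must be run as root.

```shell
# Print how many zombies are parented by pid 1.
# Reads "PID S PPID" rows on stdin (format of: ps -e -o pid,s,ppid).
# (init_zombies is a hypothetical helper name.)
init_zombies() {
    awk '$2 ~ /^Z/ && $3 == 1 { n++ } END { print n + 0 }'
}

# Hypothetical wrapper: record the count, signal init, then re-check.
# Nothing is sent until this function is called explicitly.
restart_init() {
    before=$(ps -e -o pid,s,ppid | init_zombies)
    kill -SEGV 1      # assumption: init dies and is restarted as pid 1
    sleep 5           # arbitrary pause to give the new init time to reap
    after=$(ps -e -o pid,s,ppid | init_zombies)
    echo "zombies under init: before=$before after=$after"
}
```

Calling restart_init prints the zombie count under pid 1 before and after the signal, which makes it easy to confirm whether the new init actually reaped them.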
However, I should probably reboot to clear whatever other problems might exist due to the swap I/O problems.