On my monitoring box, Nagios creates a lot of zombie processes, although they are also removed quickly. I am using active checks to monitor my servers. I captured the defunct processes using the following command:
$ top -d 0.25 -b -n 20 > topout.txt
This collects 20 iterations of top output in batch mode, with a 0.25 s delay between iterations.
I then grepped topout.txt for defunct processes:
$ cat topout.txt | grep defunct
I get the following output.
8957 nagios 20 0 0 0 0 Z 6.0 0.0 0:00.02 nagios <defunct>
8951 nagios 20 0 0 0 0 Z 3.0 0.0 0:00.01 nagios <defunct>
8954 nagios 20 0 0 0 0 Z 3.0 0.0 0:00.01 nagios <defunct>
8945 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
8946 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
8980 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9000 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.00 nagios <defunct>
9024 nagios 20 0 0 0 0 Z 7.0 0.0 0:00.02 nagios <defunct>
9025 nagios 20 0 0 0 0 Z 3.5 0.0 0:00.01 nagios <defunct>
9040 nagios 20 0 0 0 0 Z 3.1 0.0 0:00.01 nagios <defunct>
9086 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9087 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9123 nagios 20 0 0 0 0 Z 6.1 0.0 0:00.02 nagios <defunct>
9126 nagios 20 0 0 0 0 Z 3.0 0.0 0:00.01 nagios <defunct>
9131 nagios 20 0 0 0 0 Z 3.0 0.0 0:00.01 nagios <defunct>
9091 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.05 nagios <defunct>
9111 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9119 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9118 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9151 nagios 20 0 0 0 0 Z 2.9 0.0 0:00.02 nagios <defunct>
9153 nagios 20 0 0 0 0 Z 2.9 0.0 0:00.02 nagios <defunct>
9150 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9164 nagios 20 0 0 0 0 Z 3.5 0.0 0:00.02 nagios <defunct>
9171 nagios 20 0 0 0 0 Z 3.5 0.0 0:00.02 nagios <defunct>
9154 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9156 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9163 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9167 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9178 nagios 20 0 0 0 0 Z 3.8 0.0 0:00.02 nagios <defunct>
9174 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9179 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
9182 nagios 20 0 0 0 0 Z 0.0 0.0 0:00.01 nagios <defunct>
Can somebody help me find out what causes these zombie processes and how I can prevent them?
Best Answer
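A zombie is a child process that has exited but has not yet been reaped by its parent. Here, Nagios hasn't yet run its SIGCHLD signal handler for these children. This could be because it's waiting in the run queue or busy handling another signal. As long as the zombies go away quickly, they are not a cause for concern.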
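To see why a short-lived zombie is normal, here is a minimal Python sketch of the general mechanism. It is not Nagios code, and it assumes a Linux system with /proc mounted. The helper names `child_state` and `demo_zombie` are made up for this illustration. The parent forks a child that exits at once, observes it in state Z, and then reaps it with `os.waitpid`, after which the process table entry disappears.

```python
import os
import time


def child_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state field is the first token after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    """Fork a child, watch it become a zombie, then reap it (hypothetical demo)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running cleanup handlers

    # Parent: the child stays in the process table as a zombie (state Z)
    # until the parent collects its exit status. Poll briefly until we see it.
    deadline = time.monotonic() + 5
    state = child_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = child_state(pid)

    # Reaping: waitpid collects the exit status and frees the table entry.
    reaped_pid, _status = os.waitpid(pid, 0)
    still_there = os.path.exists(f"/proc/{pid}")
    return state, reaped_pid == pid, still_there


if __name__ == "__main__":
    before, reaped, still_there = demo_zombie()
    print("state before wait:", before)
    print("reaped:", reaped, "still in /proc:", still_there)
```

A long-running parent typically does the `waitpid` step from a SIGCHLD handler or its main loop, so the zombie exists only for the short time between the child's exit and that call. Sampling with top every 0.25 s is frequent enough to catch processes in that window.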