SSH orphaned processes

On some VMs, it looks like every ssh session leaves behind an orphaned, defunct sshd process when it exits. I can reproduce it just by logging in to the machine over ssh and then typing exit or ^D; afterwards, ps -elf | grep defunct shows one more defunct sshd process.
Our monitoring uses ssh a lot, so that adds up to hundreds and hundreds of orphaned zombies by morning, on about 20 VMs.

Here is an example of the ps output:

5 Z user  3197     1  0  80   0 -     0 exit   10:00 ?        00:00:00 [sshd] <defunct>
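
To see which parent the zombies are attached to, it can help to group them by parent PID (in the ps -elf line above, the fifth column is the PPID, here 1). The following Python sketch is only an illustration: the function name count_zombies_by_parent is hypothetical, and it assumes a procps-style ps that accepts -eo pid,ppid,stat,comm.

```python
import subprocess
from collections import Counter


def count_zombies_by_parent(ps_output: str) -> Counter:
    """Count defunct processes per (ppid, command).

    Expects the text produced by `ps -eo pid,ppid,stat,comm`.
    """
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)         # command may contain spaces
        if len(fields) < 4:
            continue
        _pid, ppid, stat, comm = fields
        if stat.startswith("Z"):             # "Z" marks a zombie (defunct) process
            counts[(int(ppid), comm.strip())] += 1
    return counts


if __name__ == "__main__":
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for (ppid, comm), n in count_zombies_by_parent(out).most_common():
        print(f"{n:5d} zombie(s) named {comm!r} with parent PID {ppid}")
```

Run on an affected VM, this prints one line per parent/command pair, so a growing count under a single parent PID points at the process that is not reaping its children.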

I tried running strace on the parent (sshd) to see what happens; here is the output when I exit the ssh session:

--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24025, si_status=255, si_utime=0, si_stime=2} ---
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 255}], WNOHANG, NULL) = 24025
wait4(-1, 0x7ffc0d57901c, WNOHANG, NULL) = 0
rt_sigaction(SIGCHLD, NULL, {0x7f164fee7d70, [], SA_RESTORER, 0x7f164db34d40}, 8) = 0
rt_sigreturn() = -1 EINTR (Interrupted system call)
select(7, [3 4], NULL, NULL, NULL
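
For reference, the first wait4 call in the trace returns the PID of the exited child (24025), which is what a successful reap looks like; the second returns 0, meaning other children exist but none has exited yet. The Python sketch below reproduces that sequence in miniature on Linux. It is an illustration only, not sshd's actual code: the helper names proc_state and zombie_then_reap are hypothetical, it reads /proc so it assumes Linux, and it waits on the specific child PID rather than -1 to stay self-contained.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except FileNotFoundError:
        return None
    # The state field is the first token after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def zombie_then_reap(exit_code=255):
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit immediately with the given status

    # Until the parent waits for it, the exited child lingers as a zombie ("Z").
    deadline = time.monotonic() + 5
    while proc_state(pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
    before = proc_state(pid)

    # Non-blocking wait; the trace shows sshd doing the same with wait4(-1, ..., WNOHANG).
    reaped_pid, status = os.waitpid(pid, os.WNOHANG)
    after = proc_state(pid)
    return before, reaped_pid == pid, os.WEXITSTATUS(status), after


if __name__ == "__main__":
    print(zombie_then_reap())
```

The point of the sketch is that a zombie only disappears once its own parent calls wait on it; a zombie that stays around has a parent that never makes that call for it.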

Not that I really know what I'm looking for in there, but I don't see what the problem could be. Any ideas?
I also see quite a lot of defunct nrpe processes, but restarting sshd cleans up both the ssh and the nrpe zombies, for some reason.

I don't know if it's relevant, but I/O is very slow on those machines: a simple dd of a few hundred megabytes sometimes takes hundreds of seconds to complete.

EDIT: As asked, it's Ubuntu Trusty with openssh 6.6p1-2ubuntu2.7.

Best Answer

Disabling UsePrivilegeSeparation in sshd_config seems to do the trick. I'm not really a huge fan of doing that, but it works.
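
For anyone wanting to try the same workaround, the change would look roughly like the fragment below. This is a sketch that assumes the usual /etc/ssh/sshd_config location; the option applies to OpenSSH versions such as the 6.6p1 mentioned in the question, and it was deprecated in OpenSSH 7.5, where privilege separation became mandatory.

```
# /etc/ssh/sshd_config (assumed default path)
# Workaround only: this turns off a security hardening feature.
UsePrivilegeSeparation no
```

Checking the file with sshd -t before restarting the daemon helps avoid locking yourself out with a broken config.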