Since a reboot yesterday, one of our virtual servers (Debian Lenny, virtualized with Xen) is constantly running out of entropy, leading to timeouts and other failures when connecting over SSH or TLS-enabled protocols. Is there any way to check which process (or processes) is eating up all the entropy?
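For reference, the kernel exposes the pool level under /proc, so you can at least confirm the depletion itself (standard Linux interface, not specific to this setup):

```shell
# Current fill level of the kernel entropy pool, in bits
cat /proc/sys/kernel/random/entropy_avail
# Maximum pool size, for comparison
cat /proc/sys/kernel/random/poolsize
```

Wrapping the first command in `watch -n1` shows whether something is steadily draining the pool or it drops in bursts.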
Edit:
What I tried:
- Adding additional entropy sources: time_entropyd, rng-tools feeding urandom back into random, pseudorandom file accesses – netted about 1 MiB additional entropy per second, problems still persisted
- Checking for unusual activity via lsof, netstat and tcpdump – nothing. No noticeable load or anything
- Stopping daemons, restarting permanent sessions, rebooting the entire VM – no change in behaviour
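For what it's worth, the rng-tools step above amounts to pointing rngd at /dev/urandom; on Debian this is typically configured in /etc/default/rng-tools (variable name as on Lenny-era Debian; treat the exact setup as an assumption):

```
# /etc/default/rng-tools -- feed /dev/urandom back into the kernel pool
# (this defeats the blocking guarantee of /dev/random; diagnostic use only)
HRNGDEVICE=/dev/urandom
```

Followed by an /etc/init.d/rng-tools restart to pick up the change.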
What in the end worked:
- Waiting. Since about noon yesterday there have been no connection problems. Entropy is still somewhat low (128 bytes peak), but TLS/SSH sessions no longer have any noticeable delay.
I'm slowly switching our clients back to TLS (all five of them!), but I don't expect any change in behaviour now. All clients are now using TLS again, with no problems. Really, really strange.
Best Answer
With lsof out as a source of diagnostic utility, would setting up something using audit work? There's no way to deplete the entropy pool without opening /dev/random, so if you audit processes opening /dev/random, the culprit (or at least the set of candidates for further examination) should drop out fairly rapidly.
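A minimal sketch of such a watch as an auditd rule (the key name entropy-drain is made up here, purely for filtering):

```
# /etc/audit/audit.rules -- log every open of /dev/random for reading
-w /dev/random -p r -k entropy-drain
```

After loading the rule (auditctl -R, or restarting auditd), ausearch -k entropy-drain should list the PIDs and executables that opened the device.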