Linux – Desperate: statd timed out, lockd cannot monitor / unmonitor

debian-squeeze linux nfs rpc

Since this afternoon something is wrong with the server. On the server side I see messages in dmesg as follows:

statd: server rpc.statd not responding, timed out
lockd: cannot unmonitor <client>
statd: server rpc.statd not responding, timed out
lockd: cannot monitor <client>

On the client side I see in dmesg:

lockd: server <server> not responding, still trying
lockd: server <server> OK

This is paralysing the entire network! I have tried this solution suggested by Xian, but it makes no difference.

Server, Debian Linux, Squeeze 64-bit:

>> uname -a
Linux <server> 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux

Clients, Linux Mint 13, 64-bit:

>> uname -a
Linux <client> 3.2.0-49-generic #75-Ubuntu SMP Tue Jun 18 17:39:32 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

I have not run an update on the server, so I don't know what could have changed. I did upgrade one of our client machines, but can't see why that would mess with the server, since all machines seem affected. Any ideas on how to fix this?

UPDATE 1

The server stalls for a while at

Starting portmap daemon
Starting NFS common utilities: statd idmapd

This takes about 2 minutes until boot continues…

UPDATE 2

It was indeed the upgraded client machine that caused this. It seems to have somehow stalled statd on the server, which in turn caused problems for all the other machines. I rebooted the entire network, leaving that one machine off, and did not encounter any problems. Not really a fix, but I have since downgraded that machine again, and everything seems to be stable.

Best Answer

Here are a couple of suggestions:

I once managed to break the loopback interface (lo), and as a result several services, including NFS, stopped working properly. Check with ifconfig whether your beloved lo interface is still up and running. If it is not, look at /etc/network/interfaces to see what is going on.
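As a rough sketch of that loopback check, the snippet below reads the interface flags from sysfs instead of parsing ifconfig output. The function names flags_has_up and check_iface are made up for this example, and it assumes a Linux system that exposes /sys/class/net/<iface>/flags.

```shell
#!/bin/sh
# Succeed if the hex flags value has the IFF_UP bit (0x1) set.
flags_has_up() {
    [ $(( $1 & 0x1 )) -ne 0 ]
}

# Report whether an interface (default: lo) exists and is administratively up.
# Assumes Linux sysfs; the function name is hypothetical.
check_iface() {
    iface=${1:-lo}
    flags_file=/sys/class/net/$iface/flags
    if [ ! -r "$flags_file" ]; then
        echo "$iface: missing"
        return 1
    fi
    if flags_has_up "$(cat "$flags_file")"; then
        echo "$iface: up"
    else
        echo "$iface: down"
        return 1
    fi
}

check_iface lo || echo "lo is not up: check the loopback stanza in /etc/network/interfaces"
```

On a healthy machine this should print "lo: up". Anything else suggests that the loopback stanza in /etc/network/interfaces deserves a look.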

Also, as some people have already mentioned, run pgrep -l statd and netstat -tlnpu to see whether statd is running and listening.
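A running process is not quite the same as a registered RPC service, so it may also be worth asking the portmapper directly. The sketch below assumes rpcinfo is installed and that statd registers under the service name "status" (RPC program 100024); has_rpc_service is a helper name invented for this example.

```shell
#!/bin/sh
# Read `rpcinfo -p` output on stdin and succeed if the named service
# appears in the fifth (service) column. Hypothetical helper name.
has_rpc_service() {
    awk -v svc="$1" '$5 == svc { found = 1 } END { exit !found }'
}

# Ask the local portmapper, but only if rpcinfo is available.
if command -v rpcinfo >/dev/null 2>&1; then
    if rpcinfo -p localhost 2>/dev/null | has_rpc_service status; then
        echo "statd is registered with the portmapper"
    else
        echo "statd is NOT registered with the portmapper"
    fi
fi

# List matching processes by name; rpc.statd matches the pattern "statd".
pgrep -l statd || echo "no statd process found"
```

If the process exists but the "status" service is not registered, restarting portmap first and then the NFS common services would be the next thing I would try.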

Or perhaps someone has changed something under /etc on the server? If you do not have /etc under version control, check whether any files have been modified recently: find /etc -mtime -14 would show files changed during the last 14 days, for example.
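A slightly more readable variant of that find call, as a sketch: it restricts the search to regular files and hands them to ls so the modification times are visible. The function name recent_files is made up here, and the 14-day window is just the example value from above.

```shell
#!/bin/sh
# List regular files under directory $1 modified within the last $2 days,
# with long-format details. Sorting by time is per ls invocation only,
# since find may split a long file list into several batches.
recent_files() {
    find "$1" -type f -mtime "-$2" -exec ls -lt {} + 2>/dev/null
}

recent_files /etc 14 || echo "some paths could not be read (try running as root)"
```

Run it as root to avoid missing files in directories that ordinary users cannot read.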
