On one of our servers, running CentOS 6 x86_64, we're seeing a lot of unusual activity with rpc.statd. We have rpc.statd configured to run on a static port via /etc/sysconfig/nfs:
MOUNTD_PORT=892
STATD_PORT=662
QUOTAD_PORT=875
And this does result in rpc.statd running and listening on this port as expected:
# ps -fe | grep rpc.statd | grep 662
rpcuser 23129 1 0 Apr30 ? 00:00:00 rpc.statd -p 662
What's odd is that on this system, there are also numerous other rpc.statd instances running with the --no-notify flag:
rpcuser 808 1 0 02:23 ? 00:00:00 rpc.statd --no-notify
rpcuser 2052 1 0 07:17 ? 00:00:00 rpc.statd --no-notify
rpcuser 3558 1 0 Apr30 ? 00:00:00 rpc.statd --no-notify
rpcuser 5787 1 0 Apr30 ? 00:00:00 rpc.statd --no-notify
rpcuser 6499 1 0 Apr30 ? 00:00:00 rpc.statd --no-notify
rpcuser 8834 1 0 03:21 ? 00:00:00 rpc.statd --no-notify
rpcuser 9661 1 0 Apr30 ? 00:00:00 rpc.statd --no-notify
rpcuser 13702 1 0 00:08 ? 00:00:00 rpc.statd --no-notify
rpcuser 14813 1 0 Apr30 ? 00:00:00 rpc.statd --no-notify
rpcuser 15375 1 0 08:39 ? 00:00:00 rpc.statd --no-notify
rpcuser 15376 1 0 04:26 ? 00:00:00 rpc.statd --no-notify
rpcuser 19782 1 0 09:36 ? 00:00:00 rpc.statd --no-notify
rpcuser 20491 1 0 05:36 ? 00:00:00 rpc.statd --no-notify
rpcuser 23136 1 0 Apr30 ? 00:00:00 rpc.statd --no-notify
rpcuser 23320 1 0 Apr30 ? 00:00:00 rpc.statd --no-notify
rpcuser 26145 1 0 10:10 ? 00:00:00 rpc.statd --no-notify
rpcuser 26480 1 0 06:24 ? 00:00:00 rpc.statd --no-notify
rpcuser 26598 1 0 Apr30 ? 00:00:00 rpc.statd --no-notify
rpcuser 26821 1 0 01:15 ? 00:00:00 rpc.statd --no-notify
rpcuser 28255 1 0 Apr30 ? 00:00:00 rpc.statd --no-notify
Also odd is that one of these processes has apparently usurped the original rpc.statd process as far as rpcbind is concerned. Running rpcinfo reports statd on the following ports:
# rpcinfo -p
...
100024 1 udp 34322 status
100024 1 tcp 41686 status
These correspond to PID 26145 (which you can see is one of the rpc.statd instances in the above output from ps).
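A minimal sketch of how that comparison could be scripted, assuming the usual `rpcinfo -p` column layout (program, version, protocol, port, service). The function names are hypothetical, and the expected port is simply the STATD_PORT value from /etc/sysconfig/nfs above:

```shell
#!/bin/sh
# Print the proto/port pairs that rpcbind has on record for the status
# service (RPC program 100024), given `rpcinfo -p` output on stdin.
statd_registered_ports() {
    awk '$1 == 100024 { print $3 "/" $4 }'
}

# Succeed only if at least one status entry is registered and every
# registered status entry uses the expected port (first argument).
statd_on_expected_port() {
    expected=$1
    awk -v want="$expected" '
        $1 == 100024 { seen = 1; if ($4 != want) bad = 1 }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}
```

Something like `rpcinfo -p | statd_on_expected_port 662 || echo "statd is not registered on port 662"` would then flag the situation described here, where rpcbind points at ports 34322 and 41686 instead.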
This wouldn't be a problem if everything were working, but yesterday the system began to experience a problem with NFS mounts. Any attempt to mount a new filesystem would result in:
mount.nfs: mount system call failed
Killing off all the rpc.statd services "resolved" the problem, but we're puzzled as to what's going on here. We've never seen this behavior on our similarly configured CentOS 5 systems.
Best Answer
Well, this appears to be partly our fault and partly a bug in Red Hat's authconfig command. Our Puppet configuration was causing authconfig --updateall to be run every hour. This was unnecessary, but generally it shouldn't be a problem... except that authconfig restarts the rpcbind service.

Restarting rpcbind causes it to forget about all the services that have registered with it. Although authconfig then restarts the NIS-related services, rpc.statd is left running but no longer registered with rpcbind, which makes it effectively invisible from the point of view of applications that attempt to find it via rpcbind.

I've fixed our Puppet configuration so that it is no longer calling authconfig like this, and I've opened bug 818246 with Red Hat.
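For anyone wanting to catch this state before mounts start failing, here is a rough sketch of a check. It assumes the same `rpcinfo -p` column layout as above; the function name and the three state labels are made up for illustration:

```shell
#!/bin/sh
# Classify the statd situation.
#   stdin: output of `rpcinfo -p`
#   $1:    number of rpc.statd processes currently running
# Prints one of: registered, orphaned, stopped.
statd_registration_state() {
    running=$1
    if awk '$1 == 100024 { found = 1 } END { exit (found ? 0 : 1) }'; then
        # rpcbind knows about a status service
        echo registered
    elif [ "$running" -gt 0 ]; then
        # rpc.statd is alive, but rpcbind has forgotten it
        echo orphaned
    else
        echo stopped
    fi
}
```

A possible invocation is `rpcinfo -p | statd_registration_state "$(pgrep -x rpc.statd | wc -l)"`. An "orphaned" result corresponds to the situation after an rpcbind restart: the daemon is still there, but nothing can look it up.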