NFS – network service failing on boot in SLES 10.2, possibly leading to NFS client problems

networking, nfs, sles10

On a box recently upgraded from SLES 9.3 to 10.2, I'm seeing the following issue:

Prior to the upgrade, an NFS mount (defined via YaST, i.e., it appeared in /etc/fstab) worked correctly. Following the upgrade, however, it fails. A network trace shows the client making the initial portmapper RPC to the NFS server over TCP, but then switching to UDP for the subsequent MOUNT call; since the NFS server doesn't allow UDP (with good reason, given the data-corruption risk described in nfs(5)), the mount never goes through.
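(For anyone wanting to reproduce the trace: it's just a plain capture of traffic between the client and the server; the interface name below is a placeholder for whichever NIC faces the NFS server.)

    # show the portmapper (111), MOUNT, and NFS (2049) exchanges with the server
    tcpdump -n -i eth0 host x.x.x.x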

Adding the tcp mount option (whether in /etc/fstab or on the command line) has no effect.

In the course of troubleshooting this, I've found that /var/adm/messages reports the following during boot:

Failed services in runlevel 3: network

(I should note that despite this error message, apparently at least some network services are started, since the box is accessible via SSH.)
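In case it's useful to anyone answering, the places I know to look so far are the boot log (on SLES the boot-time init output normally lands in /var/log/boot.msg) and re-running the init script by hand, roughly:

    grep -B1 -A3 -i network /var/log/boot.msg     # boot-time output of the init scripts
    /etc/init.d/network restart                   # re-run by hand to see which step fails
    /etc/init.d/network status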

My questions, then:

  1. What should I be looking at to determine the cause of the service startup failure?
  2. Would this indeed be likely to cause the problem with NFS described above?
  3. If the answer to (2) is no, then any suggestions on what to look for?

Edit: adding some information in response to the answers below.

It turns out that the network service fails at boot because one of the interfaces (there are two on this box) uses DHCP, which isn't available that early in the boot sequence. I've disabled that interface for now and stopped/restarted the network service and the NFS client services, but I still get the same results.
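For reference, disabling it comes down to changing the start mode in the interface's ifcfg file under /etc/sysconfig/network/; something along these lines (the file name here is a placeholder, SLES usually uses an eth-id-<MAC> suffix):

    # /etc/sysconfig/network/ifcfg-eth1  (placeholder name)
    BOOTPROTO='dhcp'
    STARTMODE='off'      # e.g. changed from 'auto' so the interface is skipped at boot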

There's no firewall on the client side: iptables -L shows that everything is accepted, and there are no entries in /etc/hosts.allow or /etc/hosts.deny.
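To be concrete, the sort of checks I mean are these, and the comments reflect what they show on this client:

    iptables -L -n -v                        # all chains: policy ACCEPT, no DROP/REJECT rules
    cat /etc/hosts.allow /etc/hosts.deny     # no entries for portmap, mountd, or ALL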

On the NFS server side, nothing has changed. The remote NFS server is indeed advertising both TCP and UDP for all of the NFS services (though there is an iptables rule there blocking UDP).

The /etc/fstab entry is pretty basic – what you'd get from setting it up in YaST:

x.x.x.x:/volume      /localdir   nfs     defaults 0 0

rpcinfo -p for the client box shows only portmapper v2 running, advertising both TCP and UDP. For the server, it shows all of the usual services:

   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp   4047  status
    100024    1   tcp   4047  status
    100011    1   udp   4049  rquotad
    100021    1   udp   4045  nlockmgr
    100021    3   udp   4045  nlockmgr
    100021    4   udp   4045  nlockmgr
    100021    1   tcp   4045  nlockmgr
    100021    3   tcp   4045  nlockmgr
    100021    4   tcp   4045  nlockmgr
    100005    1   udp   4046  mountd
    100005    1   tcp   4046  mountd
    100005    2   udp   4046  mountd
    100005    2   tcp   4046  mountd
    100005    3   udp   4046  mountd
    100005    3   tcp   4046  mountd
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
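
One more data point that may be worth gathering: rpcinfo can probe a specific RPC service over a chosen transport, which shows whether mountd is actually reachable from this client on each protocol (program names as registered above):

    rpcinfo -t x.x.x.x mountd     # MOUNT over TCP
    rpcinfo -u x.x.x.x mountd     # MOUNT over UDP (presumably fails, given the server's UDP filtering)
    rpcinfo -t x.x.x.x nfs 3      # NFSv3 over TCP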

The mount call, with the /etc/fstab entry above, is simply:

mount /localdir

although I've also tried it with various options such as tcp, v3, etc.
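The variants I mean are along these lines (exact option spellings vary a bit between mount versions, so treat these as examples):

    mount -o tcp /localdir
    mount -o tcp,nfsvers=3 x.x.x.x:/volume /localdir
    mount -t nfs -o tcp,vers=3 x.x.x.x:/volume /localdir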

Both the /etc/fstab entry (hence the mount) and the rpcinfo -p call are using the IP address, so there are no DNS resolution issues involved.

Best Answer

Check to make sure /etc/hosts.deny does not contain an entry for mountd, and check hosts.allow, for similar reasons. For what it's worth, I usually clear out hosts.deny and use iptables to control access.
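As a sketch, this is the kind of entry that would cause trouble, and a quick way to look for it (mountd is usually the daemon name tcp_wrappers matches for rpc.mountd):

    # in /etc/hosts.deny, either of these would block the MOUNT call:
    #   mountd: ALL
    #   ALL: ALL
    grep -E 'mountd|portmap|ALL' /etc/hosts.allow /etc/hosts.deny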

Use rpcinfo -p nfsserver to ensure that mountd is indeed advertising TCP; rpc.mountd has a -n (--no-tcp) option that disables listening on TCP, which (IIRC, on SuSE) would likely be set in /etc/sysconfig/nfs or thereabouts.
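A quick way to check, on the server, how mountd was actually started (the sysconfig variable name below is a guess; it varies between releases):

    rpcinfo -p nfsserver | grep mountd     # should list both tcp and udp lines
    ps axww | grep rpc.mountd              # look for -n / --no-tcp on the command line
    grep -i mountd /etc/sysconfig/nfs      # e.g. a MOUNTD_OPTIONS-style variable, if present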