DRBD and NFS: is there an efficient way to make failover transparent for NFS?

Tags: drbd, mount, nfs

We are implementing DRBD + Heartbeat with two servers to have a file system with failover.
These servers expose an NFS service to other servers.

Currently DRBD is working just fine, but when we switch from one server to the other during testing, the folders mounted through NFS on the other servers just hang.

Is there a way to make this failover transparent to NFS, or do we necessarily have to re-mount those NFS-mounted folders?

Best Answer

The problem here is that you have made a redundant storage array using DRBD, but you are running two disjoint NFS daemons over the same shared data. NFS is stateful - unless you transfer that state as well, you will have serious problems on failover. Solaris HA setups have daemons that take care of this problem. For a Linux installation, you will have to make sure that your NFS state directory (configurable, typically /var/lib/nfs) is located on the shared disk for both servers.
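A minimal sketch of that relocation, to be run on the currently active node; the mount point /srv/data and the init script name nfs-kernel-server are assumptions, so adjust both for your distribution:

```shell
# Stop the NFS server so the state files in /var/lib/nfs are not in use
/etc/init.d/nfs-kernel-server stop

# Move the NFS state directory onto the DRBD-backed filesystem
# (/srv/data is assumed to be the mount point of the DRBD device)
mv /var/lib/nfs /srv/data/var-lib-nfs

# Replace it with a symlink so the daemon keeps using the same path
ln -s /srv/data/var-lib-nfs /var/lib/nfs

/etc/init.d/nfs-kernel-server start
```

On the passive node, remove its local /var/lib/nfs and create the same symlink, so that whichever node is DRBD primary also holds the NFS state and clients see a consistent server after failover.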

Stick with Heartbeat or Corosync for failure detection and failover - it generally does the Right Thing (tm) when configured with a quorum. Other failover techniques might be too focused on just providing a virtual IP (e.g. VRRP) and would not suit your needs. See http://linux-ha.org for further details and additional components for a cluster setup.
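With classic Heartbeat (v1-style haresources), the failover group could look like the sketch below. The node name, DRBD resource name r0, device /dev/drbd0, mount point /srv/data, and service IP 10.0.0.10 are all assumptions; on a Pacemaker/Corosync setup the equivalent would be an ordered resource group:

```
# /etc/ha.d/haresources (a single line): on takeover, the resources
# are started left to right, and stopped right to left on release
node1 drbddisk::r0 Filesystem::/dev/drbd0::/srv/data::ext4 nfs-kernel-server IPaddr::10.0.0.10/24/eth0
```

The ordering matters: the DRBD resource must be promoted and the filesystem mounted before the NFS daemon starts, and the virtual IP should come up last so clients only reach a node that is already serving.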