NFS – Linux-HA Pacemaker: NFS resource starting “unmanaged”

linux-ha nfs pacemaker

The cluster I've been working on just started acting up out of nowhere. It looks like I'm having an issue with the exportfs resource.

Any ideas on how to troubleshoot this? I can't find anything about a "-2" return code.

============
Last updated: Mon Jan  7 09:18:18 2013
Last change: Fri Jan  4 16:02:13 2013 via crmd on emserver1
Stack: openais
Current DC: emserver1 - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
9 Resources configured.
============

Online: [ emserver1 emserver2 ]

 Master/Slave Set: ms_drbd_nfs [p_drbd_nfs]
     Masters: [ emserver1 ]
     Slaves: [ emserver2 ]
 Clone Set: cl_lsb_nfsserver [p_lsb_nfsserver]
     Started: [ emserver1 emserver2 ]
 Resource Group: g_nfs
     p_fs_nfs   (ocf::heartbeat:Filesystem):    Started emserver1
     p_exportfs_nfs     (ocf::heartbeat:exportfs):      Started emserver1 (unmanaged) FAILED
     p_ip_nfs   (ocf::heartbeat:IPaddr2):       Stopped
 Clone Set: cl_exportfs_root [p_exportfs_root]
     Started: [ emserver1 ]
     Stopped: [ p_exportfs_root:1 ]

Failed actions:
    p_drbd_nfs:1_promote_0 (node=emserver2, call=22, rc=-2, status=Timed Out): unknown exec error
    p_exportfs_root:1_start_0 (node=emserver2, call=10, rc=-2, status=Timed Out): unknown exec error
    p_exportfs_nfs_stop_0 (node=emserver1, call=32, rc=-2, status=Timed Out): unknown exec error
    p_drbd_nfs:0_demote_0 (node=emserver1, call=19, rc=1, status=complete): unknown error

Best Answer

The Ubuntu server package had outdated resource agents. A bug in the exportfs resource agent caused the NFS rmtab to grow to an immense size, which is why the timeouts were occurring.

I upgraded the resource agents from GitHub and removed the 2 GB rmtab. Everything was fine after that.
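
If you want to check whether you are hitting the same thing, here is a rough Python sketch that reports the size of the rmtab. It assumes the file lives at /var/lib/nfs/rmtab (the usual location on Debian/Ubuntu, but check your own system), and the 10 MiB threshold is an arbitrary number I picked for illustration, not something from the resource agent.

```python
import os

# Assumed default location of the NFS rmtab; adjust if your system differs.
RMTAB_PATH = "/var/lib/nfs/rmtab"

# Hypothetical threshold: anything above 10 MiB is treated as suspicious.
SIZE_LIMIT_BYTES = 10 * 1024 * 1024


def rmtab_size(path=RMTAB_PATH):
    """Return the file size in bytes, or 0 if the file does not exist."""
    try:
        return os.path.getsize(path)
    except FileNotFoundError:
        return 0


def rmtab_is_oversized(path=RMTAB_PATH, limit=SIZE_LIMIT_BYTES):
    """Return True when the rmtab is larger than the chosen limit."""
    return rmtab_size(path) > limit


if __name__ == "__main__":
    size = rmtab_size()
    print(f"{RMTAB_PATH}: {size} bytes")
    if rmtab_is_oversized():
        print("rmtab looks unusually large; exportfs operations may time out")
```

Run it on the node where the exportfs resource is failing. A file in the gigabyte range, like the 2 GB one above, would be a strong hint that the outdated agent is the cause.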