I have set up three CentOS 7 KVM VMs on a CentOS 7 host for the purpose of testing various aspects of the clustering configuration before we deploy a production system.
The nodes are called clua, club and cluc. There are a few resources configured (see the sketch after this list):
- A fence_virsh STONITH resource clone set
- dlm, clvmd and GFS2 FileSystem resource clone sets
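For reference, resources like these would typically be created along the following lines; the host address, credentials, volume path and mount point below are placeholders rather than the exact values used in this cluster, and the libvirt domain names are assumed to match the node names:

# fencing via the KVM host (cloned to match the status output further down)
pcs stonith create virsh-fencing fence_virsh ipaddr=192.168.1.1 login=root identity_file=/root/.ssh/id_rsa pcmk_host_list="clua club cluc"
pcs resource clone virsh-fencing
# dlm -> clvmd -> GFS2 filesystem, each as an ordered, interleaved clone
pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s on-fail=fence clone interleave=true ordered=true
pcs resource create clvmd ocf:heartbeat:clvm op monitor interval=30s on-fail=fence clone interleave=true ordered=true
pcs resource create varopt_fs ocf:heartbeat:Filesystem device="/dev/vg_clu/lv_varopt" directory="/var/opt" fstype="gfs2" clone interleave=true
pcs constraint order start dlm-clone then clvmd-clone
pcs constraint order start clvmd-clone then varopt_fs-clone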
I have been testing various failure scenarios. The one that has been causing problems is where I cause the nodes to lose contact with each other by ifdown'ing interfaces on two of the three nodes.
In the test here I have ifdown'ed the interfaces on clua and cluc, leaving club alone. I have confirmed that I cannot ping between the nodes in this state.
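Concretely, on each of those two nodes something along these lines (eth0 here is just a stand-in for the actual cluster interface):

ifdown eth0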
On club, it does more or less what I expect:
root@itkclub ~ # pcs status
Cluster name: tclu
Stack: corosync
Current DC: club (version 1.1.15-11.el7_3.4-e174ec8) - partition WITHOUT quorum
Last updated: Thu Apr 6 16:23:28 2017 Last change: Thu Apr 6 16:18:33 2017 by root via cibadmin on clua
3 nodes and 12 resources configured
Node clua: UNCLEAN (offline)
Node cluc: UNCLEAN (offline)
Online: [ club ]
Full list of resources:
 Clone Set: dlm-clone [dlm]
     dlm        (ocf::pacemaker:controld):     Started clua (UNCLEAN)
     dlm        (ocf::pacemaker:controld):     Started cluc (UNCLEAN)
     Started: [ club ]
 Clone Set: clvmd-clone [clvmd]
     clvmd      (ocf::heartbeat:clvm): Started clua (UNCLEAN)
     clvmd      (ocf::heartbeat:clvm): Started cluc (UNCLEAN)
     Started: [ club ]
 Clone Set: varopt_fs-clone [varopt_fs]
     varopt_fs  (ocf::heartbeat:Filesystem):   Started clua (UNCLEAN)
     varopt_fs  (ocf::heartbeat:Filesystem):   Started cluc (UNCLEAN)
     Started: [ club ]
 Clone Set: virsh-fencing-clone [virsh-fencing]
     virsh-fencing      (stonith:fence_virsh): Started clua (UNCLEAN)
     virsh-fencing      (stonith:fence_virsh): Started cluc (UNCLEAN)
     Started: [ club ]
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
But on both of the other nodes (the ones where I ifdown'ed the interfaces), it does not seem to detect that anything is wrong:
root@itkcluc ~ # pcs status
Cluster name: tclu
Stack: corosync
Current DC: club (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Thu Apr 6 16:26:01 2017 Last change: Thu Apr 6 16:18:33 2017 by root via cibadmin on clua
3 nodes and 12 resources configured
Online: [ clua club cluc ]
Full list of resources:
 Clone Set: dlm-clone [dlm]
     Started: [ clua club cluc ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ clua club cluc ]
 Clone Set: varopt_fs-clone [varopt_fs]
     Started: [ clua club cluc ]
 Clone Set: virsh-fencing-clone [virsh-fencing]
     Started: [ clua club cluc ]
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
root@itkcluc ~ # ping club
PING club (192.168.1.12) 56(84) bytes of data.
^C
--- club ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 2999ms
root@itkcluc ~ # ping clua
PING clua (192.168.1.2) 56(84) bytes of data.
^C
--- clua ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 2999ms
Why has pacemaker on clua and cluc not detected that it cannot talk to any of the other nodes?
Once it gets into a state like this, what is the correct recovery procedure?
Best Answer
None of the remaining nodes has quorum, so no STONITH actions can be taken, and therefore no cluster actions are considered "safe".
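A quick way to confirm what each partition currently believes about quorum (a sketch; corosync 2.x, as shown in your status output, ships this tool):

corosync-quorumtool -s    # run on each node and compare the "Quorate:" line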
What did you set the cluster's no-quorum-policy property to? Is it freeze perchance? You cannot use stop, which is the default, because a node without quorum won't be able to stop its resources, since GFS2 requires quorum to unmount, or otherwise access, its data.
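For example (a sketch of checking the property and, as is usual for a GFS2/clvmd cluster, setting it to freeze):

pcs property show no-quorum-policy
pcs property set no-quorum-policy=freeze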
Also, club is the DC (Designated Controller) in your example; it's keeping track of the cluster's resources. The other nodes would have to achieve quorum before they could elect a new DC.

In a three-node cluster, it is highly unlikely that the NICs of two nodes fail at exactly the same time. However, if you are still concerned for whatever reason, you can add more nodes to the cluster to act solely as quorum nodes (use -inf: location constraints to keep resources off of them) until that risk becomes satisfactorily small.
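For a hypothetical quorum-only node named clud, those constraints could look roughly like this (pcs's "avoids" form creates -INFINITY location constraints):

pcs constraint location dlm-clone avoids clud
pcs constraint location clvmd-clone avoids clud
pcs constraint location varopt_fs-clone avoids clud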
To get out of this I would simply reset all three boxes "manually":
echo b > /proc/sysrq-trigger
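Writing b to /proc/sysrq-trigger reboots the machine immediately, without syncing or unmounting filesystems, which is the point here: the stuck dlm/GFS2 stack cannot be shut down cleanly anyway. Since your daemon status shows corosync and pacemaker as active/disabled, they will not start on boot, so once all three nodes are back up you would bring the cluster back with something like:

pcs cluster start --all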