I have the following setup with Corosync + Pacemaker:
Node1:
eth0: 10.143.0.21/24
eth1: 10.10.10.1/30 (Corosync communication)
eth2: 192.168.5.2/24
Node2:
eth0: 10.143.0.22/24
eth1: 10.10.10.2/30 (Corosync communication)
eth2: 192.168.5.3/24
Floating IPs:
eth0: 10.143.0.23/24
eth2: 192.168.5.1/24
The interface eth1 is used only for Corosync communication.
If I disconnect the network cable from eth0, nothing happens. If I disconnect the cable from eth2, the result is the same. But if I disconnect the cable from eth1 (Corosync communication), the floating IPs move to the other node.
How can I make the resources move to the other node when any interface is disconnected?
Regards
UPDATE
I tested with the following settings
crm configure primitive PING-WAN ocf:pacemaker:ping params host_list="10.143.0.1" multiplier="1000" dampen="1s" op monitor interval="1s"
crm configure primitive Failover-WAN ocf:heartbeat:IPaddr2 params ip=10.143.0.23 nic=eth0 op monitor interval=10s meta is-managed=true
crm configure primitive Failover-LAN ocf:heartbeat:IPaddr2 params ip=192.168.5.1 nic=eth2 op monitor interval=10s meta is-managed=true
crm configure group Cluster Failover-WAN Failover-LAN
crm configure location Best_Connectivity Cluster rule pingd: defined pingd
This works: when I disconnect the network cable from eth0 and the ping to 10.143.0.1 (the gateway) is lost, the resources move to the other node. But my setup has three interfaces, so I added a second ping test:
crm configure primitive PING-LAN ocf:pacemaker:ping params host_list="192.168.5.4" multiplier="1000" dampen="1s" op monitor interval="1s"
But now the resources move to the other node only when the connection to both hosts (10.143.0.1 and 192.168.5.4) is lost.
I have been looking for information, but I cannot get the following scenario to work:
If the node loses connectivity to any one of the hosts in the ping tests, the resources should move to the other node, without having to lose all of the ping tests at the same time.
Best Answer
You need to tell Pacemaker you care about interfaces failing. Look at the ocf:pacemaker:ping resource. You can use that resource agent to ping hosts on the different interfaces' networks, and Pacemaker will react if those pings fail.
If you group the ocf:pacemaker:ping resources with whatever else you're managing in Pacemaker, or use constraints to relate them, they'll all move together.
Also, I would bet that when you unplugged eth1 in your previous tests the IP wasn't "moving", but rather was being started on BOTH cluster nodes at the same time; each cluster node thought that its peer had gone missing. You were essentially testing what would happen if the cluster partitioned.
On that note, you should definitely configure a second redundant ring in your Corosync config as suggested in another answer, but that isn't going to have the effect you were looking for.
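As a rough sketch of the "use constraints" option, assuming the PING-WAN primitive and the Cluster group from the question: the clone name PING-WAN-clone and the constraint name Cluster-needs-ping are made up for illustration, and the rule assumes the ping agent's default attribute name, pingd. Adjust to your own configuration before using it.

```shell
# Run the ping resource on every node so that each node publishes
# its own pingd attribute (clone name is hypothetical).
crm configure clone PING-WAN-clone PING-WAN

# Keep the group off any node where pingd is undefined or zero
# (constraint name is hypothetical).
crm configure location Cluster-needs-ping Cluster rule -inf: not_defined pingd or pingd lte 0
```

With a rule like this, the group is banned from a node that cannot reach any of its ping targets, instead of merely preferring nodes where the attribute is defined.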
UPDATE 0: You should add both IPs to the same ping primitive's host_list rather than adding an additional ping primitive, and set a failure_score on that primitive to whatever is acceptable. See the ocf:pacemaker:ping resource agent description (# crm ra info ocf:pacemaker:ping).
Something like:
# crm configure primitive PING-O-DOOM ocf:pacemaker:ping params host_list="10.143.0.1 192.168.5.4" failure_score="2" op monitor interval="10s"
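To see why failure_score="2" fits a two-host host_list, here is a small model in plain shell. It assumes the agent computes its score as the number of reachable hosts times multiplier (default 1) and treats the resource as failed when the score is below failure_score. The function names ping_score and ping_failed are made up for this illustration and are not part of Pacemaker.

```shell
#!/bin/sh
# Illustrative model only; these functions are not part of Pacemaker.

# ping_score REACHABLE MULTIPLIER
# Prints the connectivity score: reachable hosts times multiplier.
ping_score() {
    echo $(( $1 * $2 ))
}

# ping_failed SCORE FAILURE_SCORE
# Succeeds (exit status 0) when the score is below failure_score,
# i.e. when the resource would be treated as failed.
ping_failed() {
    [ "$1" -lt "$2" ]
}

# Two hosts in host_list, multiplier 1 (assumed default), failure_score 2.
for reachable in 2 1 0; do
    score=$(ping_score "$reachable" 1)
    if ping_failed "$score" 2; then
        echo "reachable=$reachable score=$score -> failed"
    else
        echo "reachable=$reachable score=$score -> ok"
    fi
done
```

Under these assumptions, losing either host drops the score to 1, which is below 2, so the ping resource fails on that node. If you set a multiplier other than 1, scale failure_score to match.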