Iptables – pacemaker virtual ip loadbalancing with clone and clusterip

corosync, iptables, load balancing, pacemaker

I am trying to build a load-balanced gateway for a group of NATed machines.
I have 3 CentOS nodes. Initially only one node was supposed to hold the internal gateway IP, and that works well: traffic flows.

Next, I am trying to load-balance the gateway using the clusterip_hash/clone option. Further down are the resource creation with pcs, the clone command, and my small location constraint (do not move the IP to a machine that has no "internet").

Once I clone the resource, I can see the instances running correctly on two hosts, and each one has an iptables rule added:

Chain INPUT (policy DROP)
target     prot opt source               destination         
CLUSTERIP  all  --  anywhere             gateway              CLUSTERIP hashmode=sourceip-sourceport clustermac=81:48:85:71:7F:47 total_nodes=2 local_node=2 hash_init=0
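For readers unfamiliar with CLUSTERIP, here is a rough sketch of what total_nodes and local_node in that rule mean: every node sees every packet, hashes the flow, and only the node whose number matches the result accepts it. The hash below is cksum, picked purely for illustration (the kernel module uses its own hash function), and the client address and port are made up.

```shell
#!/bin/sh
# Toy model of CLUSTERIP-style flow distribution (sourceip-sourceport mode).
# ASSUMPTION: cksum stands in for the kernel's real hash function.

# owner_node SRC_IP SRC_PORT TOTAL_NODES -> prints the 1-based owning node
owner_node() {
    h=$(printf '%s:%s' "$1" "$2" | cksum | cut -d' ' -f1)
    echo $(( h % $3 + 1 ))
}

# accepts LOCAL_NODE SRC_IP SRC_PORT TOTAL_NODES -> exit 0 if this node owns the flow
accepts() {
    [ "$(owner_node "$2" "$3" "$4")" -eq "$1" ]
}

# Hypothetical flow from a NATed host (address and port are invented).
for node in 1 2; do
    if accepts "$node" 192.0.2.10 40000 2; then
        echo "node $node accepts the flow"
    else
        echo "node $node ignores the flow"
    fi
done
```

Whatever the hash, the point is that exactly one node should claim any given flow, so local_node has to differ on each host while total_nodes stays the same.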

The problem is that as soon as the ARP entry changes from the real physical MAC of either gateway machine to the clustermac shown in iptables, all of the NATed machines lose internet connectivity.
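One thing worth knowing here: the clustermac is a link-layer multicast address by design, and not every switch or host treats a multicast MAC the same way as a unicast one. A minimal sketch for checking whether a given MAC has the multicast bit set (the lowest bit of the first octet); the second address in the loop is an invented unicast example, not from my setup.

```shell
#!/bin/sh
# is_multicast_mac MAC -> exit 0 if the group (multicast) bit is set,
# i.e. the least significant bit of the first octet is 1.
is_multicast_mac() {
    first_octet=${1%%:*}
    [ $(( 0x$first_octet & 1 )) -eq 1 ]
}

# First MAC is the clustermac from the rule above; second is a made-up unicast MAC.
for mac in 81:48:85:71:7F:47 00:16:3e:00:00:01; do
    if is_multicast_mac "$mac"; then
        echo "$mac is multicast"
    else
        echo "$mac is unicast"
    fi
done
```

The MAC a NATed host currently has for the gateway can be read with `ip neigh show` on that host and fed to the function.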

I added iptables logging for dropped packets, but nothing seems to be dropped. At the same time, nothing seems to go through. (The capture below shows a randomly picked NATed host trying to ping Google; if the virtual-ip clone is removed and replaced with a single floating IP, the traffic flows again.)

[root@three ~]# tcpdump -nni enp1s0 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp1s0, link-type EN10MB (Ethernet), capture size 65535 bytes
16:40:36.898612 IP > ICMP echo request, id 18875, seq 188, length 64
16:40:37.906651 IP > ICMP echo request, id 18875, seq 189, length 64
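To make the symptom explicit, a small sketch that counts echo requests and echo replies in saved tcpdump text output. The function name is my own; the input is the two capture lines above, exactly as shown.

```shell
#!/bin/sh
# count_icmp: reads tcpdump text output on stdin and prints how many
# ICMP echo requests and echo replies it contains.
count_icmp() {
    capture=$(cat)
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    requests=$(printf '%s\n' "$capture" | grep -c 'ICMP echo request' || true)
    replies=$(printf '%s\n' "$capture" | grep -c 'ICMP echo reply' || true)
    echo "requests=$requests replies=$replies"
}

count_icmp <<'EOF'
16:40:36.898612 IP > ICMP echo request, id 18875, seq 188, length 64
16:40:37.906651 IP > ICMP echo request, id 18875, seq 189, length 64
EOF
```

For this capture it reports two requests and no replies. Adding `-e` to the tcpdump command line would also print the link-level header, which shows which MAC the requests are addressed to.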

Pacemaker config, done via pcs:

pcs resource create ip_internal_gw ocf:heartbeat:IPaddr2 params ip="" cidr_netmask="24" nic="enp1s0" clusterip_hash="sourceip-sourceport" op start interval="0s" timeout="60s" op monitor interval="5s" timeout="20s" op stop interval="0s" timeout="60s"

pcs resource clone ip_internal_gw meta globally-unique=true master-max="2" master-node-max="2" clone-max="2" clone-node-max="1" notify="true" interleave="true"

pcs constraint location ip_internal_gw rule id=ip_internal_gw_needs_internet score=-INFINITY not_defined pingd or pingd lte 0

[root@three ~]# pcs status
Cluster name: 
Last updated: Wed May 25 16:51:15 2016          Last change: Wed May 25 16:35:53 2016 by root via cibadmin on two.gateway.shire
Stack: corosync
Current DC: two.gateway.shire (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
3 nodes and 5 resources configured

Online: [ one.gateway.shire three.gateway.shire two.gateway.shire ]

Full list of resources:

Clone Set: ping-clone [ping]
    Started: [ one.gateway.shire three.gateway.shire two.gateway.shire ]
Clone Set: ip_internal_gw-clone [ip_internal_gw] (unique)
    ip_internal_gw:0   (ocf::heartbeat:IPaddr2):       Started three.gateway.shire
    ip_internal_gw:1   (ocf::heartbeat:IPaddr2):       Started two.gateway.shire

What is blocking the traffic? I'm sure I'm missing something basic.

Best Answer

It seems that adding the following rules to the mangle table helped to get it running:

iptables -A PREROUTING -t mangle -i eth0 -m cluster --cluster-total-nodes 2 --cluster-local-node 1 --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
iptables -A PREROUTING -t mangle -i enp1s0 -m cluster --cluster-total-nodes 2 --cluster-local-node 1 --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
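Both rules above use --cluster-local-node 1. On a two-node setup each node would presumably need its own number (1 on one host, 2 on the other), with the same total and the same hash seed everywhere. Below is a dry-run sketch that only prints the rules for a given node, so they can be reviewed before being applied. The helper name and the host-to-number mapping are my own assumptions, and the DROP rule for unmarked packets follows the example in the iptables-extensions man page rather than anything stated in this answer.

```shell
#!/bin/sh
# Dry run: prints the rules for one node instead of applying them.
# cluster_rules IFACE TOTAL_NODES LOCAL_NODE [HASH_SEED]
cluster_rules() {
    iface=$1
    total=$2
    node=$3
    seed=${4:-0xdeadbeef}   # seed from the answer; should be identical on all nodes
    # Mark packets whose hash maps to this node.
    echo "iptables -t mangle -A PREROUTING -i $iface -m cluster" \
         "--cluster-total-nodes $total --cluster-local-node $node" \
         "--cluster-hash-seed $seed -j MARK --set-mark 0xffff"
    # ASSUMPTION (man page example, not the answer): drop whatever was not marked.
    echo "iptables -t mangle -A PREROUTING -i $iface -m mark ! --mark 0xffff -j DROP"
}

# Which host gets which number is an assumption for illustration.
cluster_rules enp1s0 2 1   # e.g. on two.gateway.shire
cluster_rules enp1s0 2 2   # e.g. on three.gateway.shire
```

Printing instead of executing keeps the sketch safe to run anywhere; piping the output of one call to `sh` as root on the matching node would apply it.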