VIP (using corosync + pacemaker) not accepting traffic until ifdown / ifup is called

corosync, ifconfig, pacemaker, ubuntu-14.04

I'm running Corosync + Pacemaker on Ubuntu 14.04. I set up two nodes with two VIPs, and when I bring Pacemaker down on one node the VIPs do fail over to the other node, but no traffic actually goes through until I manually run ifdown eth1 and ifup eth1 on that node (the VIPs are on eth1). This happens on both nodes.

Using ifconfig I can see that the VIPs were transferred correctly, but bmon shows no traffic going through until I run ifdown / ifup.
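
One way to check whether any ARP traffic for the VIPs actually hits the wire after a failover (a generic sketch, not part of my configuration) is to watch eth1 on the node that just took them over:

# on the node that has just taken over the VIPs
tcpdump -n -i eth1 arp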

crm configure show:

node $id="6" node1
node $id="3" node2
primitive VIP_3 ocf:heartbeat:IPaddr2 \
    params ip="10.0.1.112" nic="eth1" iflabel="3" \
    op monitor interval="10s" on-fail="restart" \
    op start interval="0" timeout="1min" \
    op stop interval="0" timeout="30s"
primitive VIP_4 ocf:heartbeat:IPaddr2 \
    params ip="10.0.1.111" nic="eth1" iflabel="4" \
    op monitor interval="10s" on-fail="restart" \
    op start interval="0" timeout="1min" \
    op stop interval="0" timeout="30s"
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-42f2063" \
    cluster-infrastructure="corosync" \
    no-quorum-policy="ignore" \
    stonith-enabled="false"

crm_mon -1:

Last updated: Mon Feb 16 16:16:42 2015
Last change: Mon Feb 16 15:43:30 2015 via crmd on node1
Stack: corosync
Current DC: node1 (6) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
2 Resources configured


Online: [ node1 node2 ]

 VIP_4  (ocf::heartbeat:IPaddr2):   Started node1 
 VIP_3  (ocf::heartbeat:IPaddr2):   Started node1 

ifconfig (on node1):

eth0      Link encap:Ethernet  HWaddr 00:0c:29:b2:19:ba  
          inet addr:10.0.0.192  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:feb2:19ba/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:253948 errors:0 dropped:73 overruns:0 frame:0
          TX packets:116222 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:95400133 (95.4 MB)  TX bytes:20760101 (20.7 MB)

eth1      Link encap:Ethernet  HWaddr 00:0c:29:b2:19:c4  
          inet6 addr: fe80::20c:29ff:feb2:19c4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:24763724 errors:0 dropped:19558 overruns:0 frame:0
          TX packets:23253310 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:15916162148 (15.9 GB)  TX bytes:15816322712 (15.8 GB)

eth1:3    Link encap:Ethernet  HWaddr 00:0c:29:b2:19:c4  
          inet addr:10.0.1.112  Bcast:10.0.1.240  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1:4    Link encap:Ethernet  HWaddr 00:0c:29:b2:19:c4  
          inet addr:10.0.1.111  Bcast:10.0.1.239  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:62428 errors:0 dropped:0 overruns:0 frame:0
          TX packets:62428 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:8020634 (8.0 MB)  TX bytes:8020634 (8.0 MB)

/etc/network/interfaces:

auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
    address 10.0.0.192
    netmask 255.255.255.0
    gateway 10.0.0.138
    dns-nameservers 8.8.8.8 8.8.4.4
auto eth1
iface eth1 inet manual
    post-up ip route add 10.0.1.0/24 dev eth1 table 11
    post-up ip rule add from 10.0.1.0/24 table 11
    post-up ip rule add to 10.0.1.0/24 table 11
    pre-down ip rule delete table 11
    pre-down ip rule delete table 11
    pre-down ip route flush table 11
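
The table-11 route and rules can be sanity-checked after an ifup of eth1 with something like:

ip rule show
ip route show table 11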

Any ideas on what I'm doing wrong? I would expect that once Pacemaker brings up the IP addresses, traffic would start flowing through them without having to run ifdown eth1 and ifup eth1.

Thanks!

Best Answer

I'd say this is an ARP cache issue: hosts that previously communicated with the VIP still have an ARP entry mapping the VIP's IP address to the MAC address of the node it used to run on.
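
You can confirm this from a client (or the gateway) that was talking to the VIP before the failover; a stale entry will still show the old node's MAC address (the VIP address below is taken from your config):

# on the client/gateway; compare the MAC shown here with the new node's eth1
ip neigh show | grep 10.0.1.111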

You have three options:

  • Flush the ARP cache on every client that has previously communicated with the VIP.
  • Create a new sub-interface of eth1 and configure the same MAC address on both servers, making sure that only one of them is active at any given time.
  • Broadcast a gratuitous ARP to the affected network so that clients update their ARP tables with the new MAC/IP mapping for the VIP (a rough sketch of the first and third options follows this list).
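
A rough sketch of the first and third options (the arping call assumes the iputils version of arping; addresses and interface names are taken from your config):

# option 1: flush the stale entries on a client or on the gateway
ip neigh flush to 10.0.1.111
ip neigh flush to 10.0.1.112

# option 3: send gratuitous ARPs from the node that now holds the VIPs
# (iputils-arping syntax; -U sends unsolicited/gratuitous ARP)
arping -U -c 3 -I eth1 10.0.1.111
arping -U -c 3 -I eth1 10.0.1.112

If I recall correctly, IPaddr2 already sends gratuitous ARPs when it brings an address up (see its arp_count and arp_interval parameters), so checking the Pacemaker logs on the node for ARP-related errors may also be worthwhile.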

For reference, see how the VRRP protocol handles VIP failover.