IP Failover – Cross Datacenter IP Failover and Migration

Tags: brocade, routing, switching

I am trying to wrap my head around this (very) simple issue of IP failover from one physical location to another. I have a few virtual machines running at site A, where I need to rebuild the NAS. To do so, I want to vMotion/move these VMs to site B but retain the VMs' IP addresses and network configuration. Site A's and site B's routers have a GRE tunnel between them, and I have defined a static route for the next hop of the IP being migrated.
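
For illustration, here is roughly what that setup looks like as Linux `ip` commands; this is only a sketch, since the actual routers are Brocade NetIron with their own CLI, and every address here (the tunnel endpoints, the 172.16.0.0/30 tunnel subnet, and the migrated IP 192.0.2.10) is made up:

    # on router A: GRE tunnel towards router B's WAN address
    ip tunnel add gre1 mode gre local 198.51.100.1 remote 203.0.113.1 ttl 255
    ip link set gre1 up
    ip addr add 172.16.0.1/30 dev gre1
    # host route for the migrated VM's address, next hop across the tunnel
    ip route add 192.0.2.10/32 via 172.16.0.2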

To make this work, I created a virtual interface on site B's router B using the same gateway IP address the VM expects. That works fine, but when I try to reach any machine in the same subnet from the migrated VM, I can't, because that subnet is not physically present on router B; it just falls inside the /24 broadcast domain of the subnet configured on router B.

NSX solves this problem with VXLAN, but what is the alternative for achieving IP failover across multiple datacenters? I run BGP at both locations, on Brocade NetIron routers. I was thinking of creating an MPLS cloud over my existing IP network, with a virtual router interface defined that is reachable from both routers A and B. Or some sort of extended VLAN?

To put it another way: I have 2 sites and 2 routers, and I need an extended LAN between the sites so that I can use one IP address from either location at the same time.

Thanks.

Best Answer

One way of doing this kind of thing with pure routing is to have the "service" IP addresses be secondaries on all the servers, and have routes as appropriate.

Consider this straightforward network, where the clients AC and BC go through a router to reach the servers A1, A2, B1, B2. Everything has its default gateway aimed at its network's .1. RA and RB also have the obvious inter-site routes: RA sends 10.0.2.0/24 and 10.1.2.0/24 to RB; RB sends 10.0.1.0/24 and 10.1.1.0/24 to RA.

       servers                 servers
       A1  A2                  B1  B2
       |.4 |.5                 |.4 |.5
===+===+===+===+===     ===+===+===+===+====
10.1.1.0/24    |.1         |.1   10.1.2.0/24
               RA---------RB
10.0.1.0/24    |.1         |.1   10.0.2.0/24
===+===========+===     ===+===========+====
   |.64                                |.64
   AC                                  BC
  clients                             clients
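
As a concrete sketch of those inter-site routes, here is the Linux `ip` equivalent (the answer is router-agnostic; Brocade NetIron syntax differs, and the 172.16.0.0/30 addressing on the RA-RB link is assumed for illustration):

    # on RA (172.16.0.1): send site B's networks to RB (172.16.0.2)
    ip route add 10.0.2.0/24 via 172.16.0.2
    ip route add 10.1.2.0/24 via 172.16.0.2
    # on RB: send site A's networks to RA
    ip route add 10.0.1.0/24 via 172.16.0.1
    ip route add 10.1.1.0/24 via 172.16.0.1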

Now suppose the servers are also given addresses in 10.0.0.0/24 as aliases on loopback interfaces (such as lo:1): A1 = 10.0.0.4 and B1 also 10.0.0.4; A2 = 10.0.0.5 and B2 also 10.0.0.5.
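
On a Linux server the alias might be added like this (a sketch; the lo:1 label and addresses follow the example above):

    # on A1 and on B1: the shared service address, as a /32 on the loopback
    ip addr add 10.0.0.4/32 dev lo label lo:1
    # on A2 and on B2:
    ip addr add 10.0.0.5/32 dev lo label lo:1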

Then you control which server gets the work by the routing on RA and RB.

If the whole network moves from site A to site B, you change the routes for 10.0.0.0/24; if it's one host at a time, you use host routes, e.g. 10.0.0.4/32.
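
For example, failing one host over from A1 to B1 is just two route changes, sketched here in Linux `ip` syntax with the same assumed 172.16.0.0/30 inter-router link:

    # on RB: deliver the service address to the local server B1
    ip route replace 10.0.0.4/32 via 10.1.2.4
    # on RA: forward the service address across to RB instead of to A1
    ip route replace 10.0.0.4/32 via 172.16.0.2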

This structure is useful in two cases:

  • Where you want clients to use a local server, content-distribution style (like Google's DNS on 8.8.8.8), you can have different routes on RA and RB (10.0.0.4->10.1.1.4 on RA, 10.0.0.4->10.1.2.4 on RB).
  • Where you want a given server live in only one place at a time, you have 10.0.0.4->10.1.1.4 on RA and 10.0.0.4->RA on RB. Both cases are sketched after this list.
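
With the same assumed addressing, the routes for the two cases would look like:

    # case 1, content-distribution style: each router prefers its local server
    ip route add 10.0.0.4/32 via 10.1.1.4      # on RA
    ip route add 10.0.0.4/32 via 10.1.2.4      # on RB

    # case 2, one place at a time: site A active, RB hands traffic to RA
    ip route add 10.0.0.4/32 via 10.1.1.4      # on RA
    ip route add 10.0.0.4/32 via 172.16.0.1    # on RB (RA's side of the link)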

The scheme lends itself naturally to some virtualisation schemes, where the service networks are inside the virtual host.

Does that help at all?

Other methods include:

  • MPLS, as you've said
  • VPN tunnels

EDIT: there's certainly some horrible way to do this with NAT, and there's probably a horrible way to do it with proxy ARP. The pure-routing method has the advantage that you can manage the routes by whatever means you have available, such as the BGP you mentioned, but any routing-update method will work. Additionally, the routing method is (I think) the only one that supports "send me to my most local server" CDN-type functionality, if that's something which might be helpful.
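
For instance, if the routers ran an FRR-style BGP daemon (an assumption for the sketch; the poster's Brocade NetIron boxes have their own configuration syntax, and AS 65001 is made up), the site that currently hosts the VM could originate the host route and the old site withdraw it:

    # on the router at the site that now hosts the VM: originate the /32
    # (the prefix must exist in the local routing table for `network` to take effect)
    vtysh -c 'configure terminal' \
          -c 'router bgp 65001' \
          -c 'network 10.0.0.4/32' \
          -c 'end'
    # on the router at the old site: withdraw it
    vtysh -c 'configure terminal' \
          -c 'router bgp 65001' \
          -c 'no network 10.0.0.4/32' \
          -c 'end'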