Routing – Use BGP as an IGP – Multisite with Redundant Layer 3 MPLS WANs

Tags: bgp, layer3, mpls, mpls-vpn, routing

The requirements are:

  • Hundreds of sites.
  • Each has a router and firewall and a switch stack.
  • Layer 3 MPLS WAN with BGP to provider VRF.

BGP as an IGP?

iBGP

Would you distribute a public ASN across all branch sites and use iBGP to route inter-branch packets? With route reflector(s)? With MED-based routing?

eBGP

Would you assign a private ASN per branch?
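One practical constraint on a per-branch ASN plan is the size of the private ASN pool. A minimal sketch, assuming the private-use ranges reserved by RFC 6996; the helper name `asns_for_sites` and the sequential allocation scheme are hypothetical, purely for illustration:

```python
# Private-use ASN ranges reserved by RFC 6996 (both ends inclusive).
PRIVATE_16BIT = range(64512, 65534 + 1)            # 1023 ASNs
PRIVATE_32BIT = range(4200000000, 4294967294 + 1)  # roughly 95 million ASNs


def asns_for_sites(site_count, pool=PRIVATE_16BIT):
    """Hand out one private ASN per site, in order (hypothetical scheme)."""
    if site_count > len(pool):
        raise ValueError(
            f"{site_count} sites do not fit in a pool of {len(pool)} ASNs"
        )
    return [pool[i] for i in range(site_count)]


if __name__ == "__main__":
    plan = asns_for_sites(300)
    print(plan[0], plan[-1])  # first and last ASN assigned
```

A few hundred sites fit in the 16-bit private range, but a network that grows past 1023 sites would need 4-byte private ASNs or ASN reuse.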

… or Don't Use BGP?

DMVPN

DMVPN (minus encryption?) to hide/abstract away the BGP, allowing use of an IGP for all inter-site routing?

Best Design?

Does anyone think eBGP or iBGP is preferable? I find DMVPN attractive because:

  1. My routing is separated from their routing.
  2. We're able to use a faster converging Interior Gateway Protocol.
  3. We're not using Border Gateway Protocol as an IGP.

However, I lean towards iBGP because it fits the use case and adds no GRE overhead. What is the best way to route traffic given this topology?
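To put a number on the GRE overhead, here is a rough sketch of the arithmetic. It assumes an IPv4 outer header without options, no encryption, and a 1500-byte link MTU; the function name is hypothetical:

```python
IPV4_HEADER = 20      # outer IPv4 header, no options (assumption)
GRE_BASE_HEADER = 4   # basic GRE header
GRE_KEY_FIELD = 4     # optional GRE key field, when a tunnel key is configured


def gre_payload_mtu(link_mtu=1500, with_key=False):
    """Return the largest inner packet that fits without fragmentation."""
    overhead = IPV4_HEADER + GRE_BASE_HEADER
    if with_key:
        overhead += GRE_KEY_FIELD
    return link_mtu - overhead


if __name__ == "__main__":
    print(gre_payload_mtu())               # plain GRE
    print(gre_payload_mtu(with_key=True))  # GRE with a tunnel key
```

Under those assumptions each tunneled packet loses 24 bytes of payload room (28 with a tunnel key), which native BGP routing over the MPLS WAN avoids.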

Best Answer

All I can do is explain how a pretty successful large network does it.

Each of the hundreds to thousands of end-sites on an MPLS VPN is in the same private BGP AS, so site-to-site traffic is switched directly by the carrier MPLS cloud. The data centers each have their own private BGP ASes, so the WAN is a mixture of iBGP and eBGP. Each end-site and data center runs its own separate IGP, into which it injects the default and specific routes learned from the MPLS cloud(s). Although the standard defines only one IGP, each site's IGP is independent of all the other sites'.

Some end-sites have one WAN circuit, and some sites have two WAN circuits. Of the sites with two WAN circuits, some have both circuits on one carrier (required to terminate at separate carrier POPs), and some have one circuit on each of two different carriers. Obviously, the data centers have large-pipe connections to all the carriers, but the end-site circuits are right-sized for the traffic to/from the particular site.

Each end-site gets a default route to the MPLS cloud, and a few specific prefixes from the data centers.
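How a default route and a handful of specifics coexist comes down to longest-prefix matching. A minimal sketch follows; the prefixes and next-hop labels are made up for illustration and are not taken from the network described here:

```python
import ipaddress

# Hypothetical end-site routing table: one default plus a few specifics.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "mpls-cloud"),
    (ipaddress.ip_network("10.10.0.0/16"), "datacenter-a"),
    (ipaddress.ip_network("10.20.0.0/16"), "datacenter-b"),
]


def lookup(destination):
    """Return the next hop of the most specific route covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The default route matches every IPv4 address, so matches is never empty.
    _, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop


if __name__ == "__main__":
    print(lookup("10.10.5.1"))  # covered by a data center specific
    print(lookup("10.99.1.1"))  # falls through to the default
```

Any destination not covered by a specific, including every other end-site, follows the default into the carrier cloud, which keeps the per-site routing table small.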

This design was arrived at after many years of trying various arrangements. Using an IGP across the WAN for hundreds to thousands of sites just proved too problematic (it actually slowed IGP convergence to a crawl), and forcing traffic through a central site, even when the traffic was site-to-site, added too much latency.