How to signal a VPLS multihoming change to an L2 CE device

juniper mpls vpls

We have the following setup:

[Diagram: network setup]

Two MX routers connect to the same L2 site. Loop protection / redundancy is done via VPLS multihoming. On the other end are two switches (EX4200 for example).

When the blue link fails, the two switches and the rest of the L2 infrastructure have to know that traffic must now go through the yellow link (and consequently through the EX switch on the right).

The problem is that the MAC table entries for the yellow link only get filled when traffic arrives from the VPLS through the yellow link. If no traffic is received from a certain MAC address, traffic for that address will still be sent over the blue link, and no one knows that the link is now broken (except perhaps the EX switch on the left, if the link fails physically).
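To make the stale-entry behaviour concrete, here is a minimal sketch of a learning switch in Python. It is a toy model only, not how Junos implements MAC learning; the class name, the port names and the MAC address are all made up for illustration.

```python
class LearningSwitch:
    """Toy L2 switch: learns source MACs per port, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, src_mac, in_port):
        # Learning happens only when a frame actually arrives on a port.
        self.mac_table[src_mac] = in_port

    def egress_ports(self, dst_mac, in_port):
        # Known destination: use the learned port, even if that path is dead upstream.
        if dst_mac in self.mac_table:
            return {self.mac_table[dst_mac]}
        # Unknown destination: flood to every port except the ingress port.
        return self.ports - {in_port}


switch = LearningSwitch(["blue", "yellow", "access"])
remote = "00:00:5e:00:53:01"  # hypothetical host behind the VPLS

switch.receive(remote, "blue")  # learned while blue was the active path
before = switch.egress_ports(remote, "access")

# The VPLS fails over to yellow, but the remote host stays silent,
# so nothing overwrites the stale entry.
stale = switch.egress_ports(remote, "access")

switch.receive(remote, "yellow")  # first frame from the host via yellow
after = switch.egress_ports(remote, "access")

print(before, stale, after)
```

Until a frame from the remote host shows up on the yellow port (or the entry is flushed or ages out), the switch keeps forwarding towards blue.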

I can't find a good solution to fix this problem.

A few approaches:

  1. You can somewhat lessen the impact by not making the blue/yellow link portfast, so that spanning tree can send a topology change when the interface goes down/up. When the interface does not go down physically, you're out of luck. On the other hand, the spanning-tree solution will bite you when the port comes up again: VPLS will take the site online, but the port needs to go through the STP learning stages before it forwards traffic.

  2. You could stack the two switches. This fixes the problem for the rest of the L2 infrastructure, as it always sends to the same switch (the stack). Still, the stack needs to know when to switch to the other uplink interface with the active VPLS instance.

  3. When doing planned maintenance (and if you have a stack), you can deactivate the primary link manually to switch over to the secondary link. Then you can lower the site-preference for the deactivated link on the router so that the now-active site becomes the new primary. The same applies when switching back. This is not ideal and does not work for unforeseen outages.

Any input on how to solve this is appreciated. (Waiting for EVPN/TRILL does not count. ;))

Best Answer

  1. Disable portfast on the PE-facing ports (on the CE)
  2. Enable RSTP across the CE network
  3. Favor the "blue link" with interface cost

Working this out in my head I believe it should react as follows:

When the blue link dies, the CE will stop sending/receiving BPDUs on the blue interface. The default RSTP hello timer is 2 seconds, and it waits for three missed hellos before calling that link "dead". Once three hellos (6 seconds) have been missed, it will re-establish the STP tree and age out the MAC addresses.

This is basically option 1 you've stated above, except that, the way I've read the comments and your original post, it sounds like you want the PE to participate in STP. I'm suggesting allowing the customer to build its own tree between all CEs.
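As a rough worked example of that timing, the sketch below computes the detection delay from the hello timer and then drops the entries learned on the failed port. The helper names and the sample MAC table are hypothetical, and flushing only the failed port's entries is a simplification of what RSTP does on a topology change.

```python
HELLO_TIME_S = 2   # default RSTP hello timer, as stated above
MISSED_HELLOS = 3  # hellos missed before the link is declared dead


def detection_delay(hello_time_s=HELLO_TIME_S, missed_hellos=MISSED_HELLOS):
    """Seconds until the CE declares the link dead from missing BPDUs alone."""
    return hello_time_s * missed_hellos


def flush_port(mac_table, port):
    """Return a copy of the table without the entries learned on `port`."""
    return {mac: p for mac, p in mac_table.items() if p != port}


# Hypothetical table: one remote host learned via blue, one local host.
mac_table = {"00:00:5e:00:53:01": "blue", "00:00:5e:00:53:02": "access"}

delay = detection_delay()
flushed = flush_port(mac_table, "blue")

print(delay)
print(flushed)
```

Once the stale entry is gone, frames for the remote host are flooded again and can reach it through the yellow link.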

Your network should fail over smoothly, and the client network would follow suit a few seconds later.

This feels too simple to be the answer... but that's what I can see based on your write-up.