Reasons to Avoid Using BFD – Network Design Considerations

cisco design ethernet juniper mpls

In looking to implement Bidirectional Forwarding Detection (BFD), it seems very flexible in terms of timer tuning and lightweight in terms of overhead, and the range of applications it covers appears very impressive.

So if, for example, it can be applied to detect link failure over Ethernet, over MPLS across multiple hops, at the network edge, for IGP convergence, for tunnels, and so on, why would it not be used in certain scenarios, and are there emerging alternatives to be aware of?

Best Answer

I am only directly aware of one issue with BFD, which is CPU demand. I am currently investigating an issue with a Cisco 7301: when it pushes more traffic during our peak hours than during the rest of the day, BFD sometimes times out and routing fails over to the next link.

It seems that under high traffic volumes the router's CPU usage rises (which isn't unusual), but at about 40-50% CPU, BFD packets aren't being given enough resources.

However, I have found the following information, which suggests additional issues with BFD (from this NANOG presentation; there is more in the presentation, and it's a good one, so give it a read!):

What are the caveats?

Two main ones:
1. BFD can have high resource demands depending on your scale.
2. BFD is not visible to Layer 2 bundling protocols. (Ethernet LAGs or POS bundles)

BFD Resource Demands

The number of BFD sessions on each linecard or router can impact how well BFD scales for you. Each unique platform has its own limits.
Bundled interfaces supporting min tx/rx of 250ms or 2 seconds have been seen.
In some cases, BFD instances on a router may need to be operated on the route-processor depending on the implementation (non-adjacency based BFD sessions).
Test your platform first before deploying BFD. Attempt to put load on the RP or LC CPU with your configured settings. This can be done by:
Executing CPU-heavy commands
Flooding packets to TTL expire on the destination

BFD Resource Demands (cont’d)

What values are safe to try?
Based upon speaking to several operators, 300ms with a multiplier of 3 (900ms detection) appears to be a safe value that works on most equipment fairly well.
This is a significant improvement over some of the alternatives.
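The 900ms figure above is just the interval multiplied by the multiplier. As a rough sketch, the asynchronous-mode detection time described in RFC 5880 can be computed as below; the function and parameter names are made up for illustration, and the example assumes both ends are configured with the same 300ms timers.

```python
def bfd_detection_time_ms(remote_detect_mult: int,
                          local_required_min_rx_ms: int,
                          remote_desired_min_tx_ms: int) -> int:
    """Approximate BFD asynchronous-mode detection time in milliseconds.

    The remote peer's transmit rate is limited by the slower of what it
    wants to send and what we are willing to receive, so the larger of
    the two intervals is used (names here are illustrative only).
    """
    agreed_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval_ms


# Symmetric 300ms timers with a multiplier of 3, as suggested above.
print(bfd_detection_time_ms(3, 300, 300))  # 900
```

If the two ends are configured differently, the slower interval wins, so the real detection time can be longer than the locally configured values suggest.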

BFD and L2 link-bundling

BFD is unaware of underlying L2 link bundle members.
A 4x10GigE L2 bundle (802.3ad) would appear as a single L3 adjacency. BFD packets would be transmitted on a single member link, rather than out all 4 links.
A failure of the link with BFD on it would result in the entire L3 adjacency failing.
However, in some scenarios the failed member link may result in only a single BFD packet being dropped. Subsequent packets may route over working member links.
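The difference between the last two points comes down to the multiplier. The following is a deliberately simplified model, not how any particular platform implements BFD: it assumes the session is declared down only after `multiplier` consecutive packets are missed, and the function name and the received/lost sequences are invented for illustration.

```python
def session_goes_down(received, multiplier=3):
    """Return True if `multiplier` consecutive BFD packets are missed.

    `received` is a sequence of booleans, one per expected packet:
    True means the packet arrived, False means it was lost.
    """
    consecutive_misses = 0
    for arrived in received:
        consecutive_misses = 0 if arrived else consecutive_misses + 1
        if consecutive_misses >= multiplier:
            return True
    return False


# One packet lost on the failed member, later packets use working members.
print(session_goes_down([True, False, True, True]))    # False
# Every packet keeps going out the failed member link.
print(session_goes_down([True, False, False, False]))  # True
```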

Summary

You didn't specify a protocol, so the answer is "it depends". If you're not using MPLS TE anywhere, the value of the Router ID doesn't seem to matter for OSPF, OSPFv3, BGP, ISIS or LDP. Technically in these cases, you can assign "255.255.255.255" as the 32-bit portion of the Router ID.

While these protocols are not strictly considered routing protocols, you cannot divorce underlying IGP choices from your ability to deploy MPLS TE. Therefore, if you are using MPLS TE with OSPF TE Extensions, CR-LDP, etc., it's recommended to assign each Router ID as an address on that same router.

Overall Guidance: Keep it simple for your coworkers and future service deployments

While IGPs allow you to choose any value for Router IDs, you shouldn't make life harder than necessary. While you could theoretically assign Router 1's Router ID to be a Loopback address on Router 2, don't do that unless you plan to earn a bad reputation for yourself.

Anyone who has to support the infrastructure after you will hate the aforementioned decision. Furthermore, you'd be making implementation of some MPLS TE services much harder, because people would have to reassign the Router IDs to get several of the MPLS TE services up.

RFC 1142 (ISIS) - A variable length field, from 1 to 8 octets

    ID      System identifier - a variable length field from 1 to
            8 octets (inclusive). Each routeing domain employing
            this protocol shall select a single size for the ID
            field and all Intermediate systems in the routeing
            domain shall use this length for the system IDs of all
            systems in the routeing domain.

RFC 2328, Section 5 (OSPF) - A 32-bit number

Only defines the Router ID as a 32-bit number, thus any 32-bit number can be used:

Router ID

    A 32-bit number that uniquely identifies this router in the AS.
    One possible implementation strategy would be to use the
    smallest IP interface address belonging to the router. If a
    router's OSPF Router ID is changed, the router's OSPF software
    should be restarted before the new Router ID takes effect.  In
    this case the router should flush its self-originated LSAs from
    the routing domain (see Section 14.1) before restarting, or they
    will persist for up to MaxAge minutes.
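To make the "any 32-bit number" point concrete, here is a small sketch of the implementation strategy the RFC mentions (picking the smallest interface address). The function name and the sample addresses are assumptions for illustration; real routers use their own vendor-specific selection rules.

```python
import ipaddress


def router_id_from_interfaces(addresses):
    """Pick the numerically smallest IPv4 interface address as the Router ID."""
    return min(ipaddress.IPv4Address(a) for a in addresses)


rid = router_id_from_interfaces(["192.0.2.1", "10.0.0.5", "172.16.0.1"])
print(rid)       # 10.0.0.5
print(int(rid))  # 167772165, the same Router ID as a plain 32-bit number

# Any 32-bit value can be written in dotted-quad form, address or not.
print(ipaddress.IPv4Address(0xFFFFFFFF))  # 255.255.255.255
```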

RFC 4271, Section 4.2 (BGP) - A 4-octet unsigned integer representing a valid unicast IP host address

The BGP ID is defined as a "4-octet unsigned integer" in the OPEN message.

 BGP Identifier:

    This 4-octet unsigned integer indicates the BGP Identifier of
    the sender.  A given BGP speaker sets the value of its BGP
    Identifier to an IP address that is assigned to that BGP
    speaker.  The value of the BGP Identifier is determined upon
    startup and is the same for every local interface and BGP peer.

Nevertheless, in order to be syntactically correct it must be "a valid unicast IP host address".

 6.2.  OPEN Message Error Handling

  ...
  If the BGP Identifier field of the OPEN message is syntactically
  incorrect, then the Error Subcode MUST be set to Bad BGP Identifier.
  Syntactic correctness means that the BGP Identifier field represents
  a valid unicast IP host address.
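A receiver could approximate that syntactic check along these lines. This is only a sketch: the RFC does not spell out what counts as "a valid unicast IP host address", so the rule used here (first octet between 1 and 223, which excludes 0.x.x.x, multicast, the reserved 240/4 block and the broadcast address) is an assumed interpretation, and the function name is hypothetical.

```python
import ipaddress


def looks_like_unicast_host(bgp_id: str) -> bool:
    """Rough test for a BGP Identifier that is a unicast host address.

    Assumed interpretation: the first octet must be in 1..223.
    """
    first_octet = int(ipaddress.IPv4Address(bgp_id)) >> 24
    return 1 <= first_octet <= 223


for candidate in ["10.0.0.1", "0.0.0.0", "224.0.0.5", "255.255.255.255"]:
    print(candidate, looks_like_unicast_host(candidate))
```

Under this interpretation, a value such as 255.255.255.255 would be rejected as a BGP Identifier even though it is a perfectly good 32-bit number for OSPF.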

RFC 2740, Section 2.2 (OSPFv3) - A 32-bit number

Explicitly disallows any relationship between the addresses in the protocol (IPv6) and the Router ID, which is only a 32-bit number.

2.2.  Removal of addressing semantics
 ...
 o   OSPF Router IDs, Area IDs and LSA Link State IDs remain at
     the IPv4 size of 32-bits. They can no longer be assigned as
     (IPv6) addresses.

RFC 4577, Section 4.2.2 (OSPF for BGP/MPLS IP VPNs) - A 32-bit number (valid OSPF RID)

4.2.2.  Router ID

    If a PE and a CE are communicating via OSPF, the PE will have an OSPF
    Router ID that is valid (i.e., unique) within the OSPF domain.  More
    precisely, each OSPF instance has a Router ID.  Different OSPF 
    instances may have different Router IDs.

RFC 5036, Section 3.1 (LDP) - 6 bytes, 4 of the bytes should be a valid IGP Router ID

LDP Identifier
  Six octet field that uniquely identifies the label space of the
  sending LSR for which this PDU applies.  The first four octets
  identify the LSR and MUST be a globally unique value.  It SHOULD
  be a 32-bit router Id assigned to the LSR and also used to
  identify it in Loop Detection Path Vectors.  The last two octets
  identify a label space within the LSR.  For a platform-wide label
  space, these SHOULD both be zero.
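The six-octet layout described above can be sketched as follows. The helper names are made up for illustration, and the `lsr-id:label-space` text form is just the conventional way these identifiers are usually displayed.

```python
import ipaddress
import struct


def ldp_identifier(lsr_id: str, label_space: int = 0) -> bytes:
    """Build a 6-octet LDP Identifier: 4-octet LSR ID + 2-octet label space."""
    return ipaddress.IPv4Address(lsr_id).packed + struct.pack("!H", label_space)


def format_ldp_identifier(raw: bytes) -> str:
    """Render a 6-octet LDP Identifier as 'lsr-id:label-space'."""
    lsr = ipaddress.IPv4Address(raw[:4])
    (space,) = struct.unpack("!H", raw[4:6])
    return f"{lsr}:{space}"


raw = ldp_identifier("10.0.0.1")   # platform-wide label space, so zero
print(raw.hex())                   # 0a0000010000
print(format_ldp_identifier(raw))  # 10.0.0.1:0
```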

RFC 3480, Section 2 (CR-LDP) - A stable IP address that is always reachable

Defines a Router ID as "a stable IP address of an LSR that is always reachable if there is any connectivity to the LSR." Thus it pretty much has to be a loopback address.

     In the context of this document, the term "Router ID" means a stable
     IP address of an LSR that is always reachable if there is any
     connectivity to the LSR.  This is typically implemented as a
     "loopback address"; the key attribute is that the address does not
     become unusable if an interface on the LSR is down.  In some cases,
     this value will need to be configured.  If one is using OSPF or ISIS
     as the IGP in support of traffic engineering, then it is RECOMMENDED
     for the Router ID to be set to the "Router Address" as defined in
     [OSPF-TE], or "Traffic Engineering Router ID" as defined in [ISIS-
     TE].

RFC 3630, Section 2.4.1 (OSPF-TE) - A stable IP address of the advertising router

The Router Address TLV requires a "stable IP address of the advertising router":

 2.4.1.  Router Address TLV

  The Router Address TLV specifies a stable IP address of the
  advertising router that is always reachable if there is any
  connectivity to it; this is typically implemented as a "loopback
  address".  The key attribute is that the address does not become
  unusable if an interface is down.  In other protocols, this is known
  as the "router ID," but for obvious reasons this nomenclature is
  avoided here.  If a router advertises BGP routes with the BGP next
  hop attribute set to the BGP router ID, then the Router Address
  SHOULD be the same as the BGP router ID.