Cisco 6500 VSS – Does It Improve or Hurt MTBF?

cisco vss

I understand the benefits of the Cisco 6500 VSS with the obvious selling points of single management, single routing instance, STP elimination, port-channels across chassis, etc. With two standalone Cisco 6500s that may have L3 and L2 port-channels between them, they at least have no operational dependency on one another through the control-plane.

In a VSS world (and I have no direct experience with this), we now have software and other protocols that control both switches. My designs assume that control-plane software will have bugs. Does VSS therefore lower the MTBF, as I suspect, as a trade-off for the capabilities gained, or am I missing how MTBF is improved?

Best Answer

Short version of the answer: a little bit of both, but it's not meant to be a technology that directly improves availability.

Long version of the answer: As others have pointed out, manufacturers' traditional definitions of MTBF and availability focus on hardware failures. Other factors -- human error, buggy software, planned maintenance, etc. -- matter when developing an architecture, but each user has to weigh them individually.

From a hardware-only perspective, VSS doesn't change availability. The same hardware is being used, so the same MTBF/MTTR numbers apply and the resulting availability equations are the same.
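To make the hardware-only point concrete, here is a minimal sketch of the usual steady-state availability arithmetic. The MTBF and MTTR figures are made-up placeholders, not Cisco numbers, and the pair formula assumes the two chassis fail independently.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of one device: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def redundant_pair(a: float) -> float:
    """Availability of two identical devices in parallel.

    The pair is down only when both are down, assuming independent failures.
    """
    return 1.0 - (1.0 - a) ** 2


# Hypothetical figures for illustration only (not vendor data).
chassis = availability(mtbf_hours=100_000, mttr_hours=4)
pair = redundant_pair(chassis)

print(f"single chassis: {chassis:.8f}")
print(f"redundant pair: {pair:.12f}")
```

The same inputs go into this calculation whether the two chassis run standalone or as a VSS pair, which is why the hardware-only result does not change.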

From a more holistic perspective, it's really a toss-up and will depend largely on your individual wants and needs. On one hand, you could consider it less reliable, since it's a complex piece of technology and a single "virtual point of failure" (i.e., the VSS control plane) will impact both pieces of redundant gear. On the other hand, it can be viewed as increasing availability, since a single virtual device makes the network much simpler and so makes it less likely for other things to go wrong (fewer devices to manage, no HSRP/VRRP, non-looped STP domain, simpler L3 topology, etc.).
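The "virtual point of failure" side of that trade-off can be sketched as a series/parallel model. Everything here is an assumption for illustration: the availability values are invented, the standalone control planes are treated as fully independent, and the simplification benefits (fewer human errors, no HSRP/VRRP or STP loops) are deliberately not modeled, so this shows only one half of the argument.

```python
from math import prod


def parallel(a: float, b: float) -> float:
    """Two redundant elements: down only if both are down (independence assumed)."""
    return 1.0 - (1.0 - a) * (1.0 - b)


def series(*parts: float) -> float:
    """Elements that must all be up at once: availabilities multiply."""
    return prod(parts)


# Hypothetical availabilities, not measured or vendor-supplied values.
hw = 0.99996            # one chassis, hardware only
standalone_sw = 0.9999  # control plane of one standalone switch
vss_sw = 0.9999         # single shared VSS control plane

# Standalone: each switch is hardware and software in series, and the two
# switches are redundant to each other.
standalone_pair = parallel(series(hw, standalone_sw), series(hw, standalone_sw))

# VSS: hardware is still redundant, but the shared control plane sits in
# series with the whole pair.
vss_pair = series(parallel(hw, hw), vss_sw)

print(f"standalone pair: {standalone_pair:.10f}")
print(f"VSS pair:        {vss_pair:.10f}")
```

Under these assumptions the shared control plane dominates the VSS result. Whether that holds in practice depends on how much the simpler topology reduces outages from misconfiguration and protocol interactions, which this sketch leaves out.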

The market has pretty much shown that most network engineers view VSS and similar technologies as an improvement over a traditional L2 distro/access topology, but there are other technologies you could go with. For example, a routed L3 access layer could achieve most of the benefits of VSS, but VLANs would be unable to span multiple access layer devices, making the solution potentially useless in some scenarios (e.g., virtualized data centers).