High-Resolution Measurement of Link Utilization on Cisco Nexus Switch Interfaces

bandwidth cisco cisco-nexus-5k snmp

For a research project I need fine-grained (short-timescale) interface utilization measurements on a Cisco Nexus 5000 switch and some fabric extenders.

Background: The switch is used to "simulate" an optical network, and traffic is forced to pass through this switch. I can then simulate fiber cuts by simply shutting down an interface.

To evaluate the results, I need to measure interface utilization on a rather short timescale. Sub-second resolution would be ideal but is not necessary; the goal is to get utilization at 1-second intervals.

My first idea was to use IF-MIB and query the interface byte counters via SNMP. Polling at a regular interval lets me calculate the average utilization over the last interval. However, I observed that the counters are not updated in real time, but only every 10 seconds or so, which is definitely not fine-grained enough. Furthermore, if the switch is under even a little load, the SNMP responses may arrive at irregular intervals, which distorts the results.
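For reference, the counter-delta calculation described above can be sketched as follows. This is only an illustration under some assumptions: the function name and parameters are made up for this example, the samples are assumed to come from a 64-bit octet counter such as ifHCInOctets, and the timestamps are assumed to be taken when each response actually arrives. Dividing by the measured elapsed time rather than the nominal polling interval reduces the distortion from irregular responses, but it cannot make the result finer than the rate at which the switch refreshes its counters.

```python
def utilization(prev_octets, curr_octets, prev_time, curr_time,
                link_bps, counter_bits=64):
    """Average link utilization (0.0 to 1.0) between two counter samples.

    Hypothetical helper: octet values are raw counter readings, times are
    in seconds, link_bps is the interface speed in bits per second.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # The modulo handles a single counter wrap between the two samples.
    delta_octets = (curr_octets - prev_octets) % (1 << counter_bits)
    return (delta_octets * 8) / elapsed / link_bps


# Example: 125,000,000 bytes in one second on a 10 Gbit/s link.
print(utilization(0, 125_000_000, 0.0, 1.0, 10_000_000_000))
```

With a 32-bit counter such as ifInOctets, pass counter_bits=32; at 10 Gbit/s such a counter can wrap more than once between polls, which this sketch cannot detect.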

Another idea was to query the device via NETCONF / XML; however, the agent is too slow, so polling at one-second intervals is not possible here.

My final idea is to use ERSPAN to capture the traffic of the different VLANs and send it to a management station. There, the VLAN tag can be used to identify the interface, and I would need to write a utility that adds up the packet sizes. I think this would allow one-second measurements, but it adds some overhead (an additional machine, and I don't know whether the Nexus will forward ~7GB/s of ERSPAN traffic).
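The counting utility mentioned above might look roughly like the sketch below. It assumes the ERSPAN encapsulation has already been removed and that the capture tool hands over (timestamp, frame) pairs, where each frame is the raw Ethernet frame still carrying its 802.1Q tag. The function names and the input format are hypothetical, chosen only for illustration.

```python
import struct
from collections import defaultdict

ETHERTYPE_8021Q = 0x8100


def vlan_id(frame):
    """Return the 802.1Q VLAN ID of an Ethernet frame, or None if untagged."""
    if len(frame) < 18:
        return None
    # Bytes 12-13 hold the EtherType/TPID, bytes 14-15 the tag control info.
    ethertype, tci = struct.unpack_from("!HH", frame, 12)
    if ethertype != ETHERTYPE_8021Q:
        return None
    return tci & 0x0FFF  # lower 12 bits are the VLAN ID


def bytes_per_vlan(packets, interval=1.0):
    """Sum frame sizes per (time bucket, VLAN ID).

    packets: iterable of (timestamp_seconds, frame_bytes) pairs (assumed format).
    """
    totals = defaultdict(int)
    for timestamp, frame in packets:
        vid = vlan_id(frame)
        if vid is None:
            continue  # untagged frames cannot be mapped to an interface here
        bucket = int(timestamp // interval)
        totals[(bucket, vid)] += len(frame)
    return dict(totals)
```

Note that len(frame) is the captured length. If the capture truncates packets or strips the frame check sequence, the original on-wire length would have to be used instead.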

Do you have any other ideas for how to measure bandwidth on an interface?

Best Answer

Your best bet is to tap the links themselves. Use an in-line optical tap or a powered copper tap. This will split the connection off to a secondary cable, which you can then plug into your server for monitoring. You'll have an instantly-accurate, infinitely-granular picture of what's going across the wire.

Edit: If you've got too many links to tap, the only other option I can think of is to SPAN all the ports to a single 10G or 40G interface and monitor that. Keep in mind that you're adding the processing time for the SPAN, and you may drop traffic depending on oversubscription ratios, etc. It just depends on how much accuracy you really need.
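As a rough way to judge the oversubscription risk, one can compare the sum of the expected mirrored traffic rates with the capacity of the SPAN destination port. The helper below is a hypothetical sketch, not a Cisco tool. It assumes you already have an estimate of the peak rate for each mirrored source, and that a source mirrored in both directions is listed once per direction.

```python
def span_oversubscription(source_rates_bps, dest_capacity_bps):
    """Ratio of total mirrored traffic to destination port capacity.

    A ratio above 1.0 means the destination port cannot carry all mirrored
    traffic at those rates, so some of it would be dropped.
    """
    if dest_capacity_bps <= 0:
        raise ValueError("destination capacity must be positive")
    return sum(source_rates_bps) / dest_capacity_bps


# Example: three mirrored streams at 4, 4 and 6 Gbit/s into a 10G port.
ratio = span_oversubscription([4e9, 4e9, 6e9], 10e9)
print(f"oversubscription ratio: {ratio:.2f}")
```

This only looks at sustained rates; short bursts can still overflow the destination port's buffers even when the ratio is below 1.0.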

The only way to get a 100% complete, guaranteed measurement of the bandwidth crossing a given link is to tap the link.