Cisco – Reliable method to detect short traffic bursts

cisco snmp

I have a server connected to a GigE port on a Cisco WS-C3560G-24TS switch. I need to check whether the server generates traffic bursts of more than 500 Mbps that last a few seconds. The server is not under my management.

At first, I wrote a simple script that stores the interface ingress byte counter in a variable (bps1), sleeps one second, stores the counter again in another variable (bps2), calculates the number of bits received in between, stores that in a variable named delta, and prints a log message if delta is larger than 500,000,000 bits, i.e. 500 Mbps if the interval really was one second:

while :; do
  for i in {1..2}; do
    declare bps"$i"=$(snmpwalk -Ov -v 2c -c public switch ifHCInOctets.10101 | sed 's/^.* //');
    sleep 1;
  done;
  delta=$(( (bps2 - bps1)*8 ));
  echo "$bps1" "$bps2" "$delta";
  (( delta > 500000000 )) && printf '%s\n' "$(date -u "+%d.%m.%y %H:%M:%S") UTC ingress traffic from customer was "$delta" bps";
done | tee -a bps.log

The echo "$bps1" "$bps2" "$delta"; line above is just for troubleshooting purposes. Cisco IOS updates the ifHCInOctets counter at 1-second intervals.

However, because one pass through the while loop takes longer than exactly one second, the script occasionally reports the traffic of two counter intervals as a single value. For example:

155268562689729 155268611695817 392048704
155268714010296 155268764441853 403452456
155268862787657 155268910277237 379916640
155269008492724 155269103039983 756378072
14.05.15 14:59:19 UTC ingress traffic from customer was 756378072
155269148645940 155269195558201 375298088
155269295068336 155269395399778 802651536
14.05.15 14:59:26 UTC ingress traffic from customer was 802651536
155269492138530 155269538915854 374218592
155269631823265 155269679591240 382143800

I guess such a method works only for longer polling periods? What other options are there for detecting short traffic bursts? A policer with counters on the switch/router? Some other clever method?

Best Answer

Depending on how accurate you want all this to be, and assuming you want to stay with the SNMP polling approach, you might want to recalculate the bps value yourself by taking the system timestamp each time you poll and then calculating the resulting average bandwidth use between the two timestamps:

BW [bits/s] = 8 * (InOctets2 - InOctets1) / (timestamp2 - timestamp1)

It's not entirely accurate because the timestamp is taken on your machine rather than on the switch at the point when it calculates InOctets, but at least you'll remove the implicit dependency on the while loop's duration.
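
As a rough sketch of that idea in bash (not a tested production script), the loop below records a local timestamp with every poll and divides the counter difference by the time that actually elapsed, multiplied by 8 to convert octets to bits. The host name, community string and ifIndex are copied from the question. The function names rate_bps and poll_loop are made up for this example, and the microsecond timestamp assumes GNU date (for %N) and the Net-SNMP snmpget command.

```shell
#!/usr/bin/env bash
# Sketch only: rate_bps and poll_loop are hypothetical helper names.

# rate_bps OCTETS1 US1 OCTETS2 US2
# Prints the average rate in bits per second between two counter samples
# whose timestamps are given in microseconds. Fails if no time has elapsed.
rate_bps() {
  local octets1=$1 us1=$2 octets2=$3 us2=$4
  local elapsed_us=$(( us2 - us1 ))
  (( elapsed_us > 0 )) || return 1
  echo $(( (octets2 - octets1) * 8 * 1000000 / elapsed_us ))
}

# Polls the counter about once per second and logs any interval whose
# average rate is above 500 Mbps. Each sample is compared with the
# previous one, so time spent in snmpget and sleep is accounted for.
poll_loop() {
  local prev_octets= prev_us= octets now_us bps
  while :; do
    # Assumption: same switch, community and ifIndex as in the question.
    if octets=$(snmpget -Oqv -v 2c -c public switch ifHCInOctets.10101); then
      now_us=$(( $(date +%s%N) / 1000 ))   # %N requires GNU date
      if [[ -n $prev_octets ]] &&
         bps=$(rate_bps "$prev_octets" "$prev_us" "$octets" "$now_us"); then
        if (( bps > 500000000 )); then
          printf '%s UTC ingress traffic from customer was %s bps\n' \
            "$(date -u '+%d.%m.%y %H:%M:%S')" "$bps"
        fi
      fi
      prev_octets=$octets
      prev_us=$now_us
    fi
    sleep 1
  done
}

# Start monitoring with: poll_loop | tee -a bps.log
```

For instance, if the two counter values behind the 756378072 line in the question had really been taken 2 seconds apart, rate_bps 155269008492724 0 155269103039983 2000000 would give 378189036, which is below the threshold. The timestamps still come from the polling host rather than from the switch, so this reduces the error without removing it.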