Diagnosing network latency issues

latency

I have two servers, both with gigabit network cards, that were experiencing severe latency issues when communicating with one another. The culprit eventually turned out to be that one of the servers was patched into a 100 Mb/s switch port.

Pinging the servers always returned <1 ms, so ping alone did not reveal the problem.

Is there a tool that could show the actual latency or transfer rate between the servers, compared to the maximum that should be possible?

Best Answer

Perform state monitoring, collect data, visualise.

  1. Your OS has tools to report the current state of your network interfaces (on Linux, for example, ethtool shows the negotiated link speed and duplex). Use them and compare the result to the expected state. Automate this.
  2. Use SNMP or native counters to collect samples. Use 64-bit counters for fast interfaces, or a really small sample interval: a 32-bit byte counter wraps in about 34 seconds at 1 Gb/s.
  3. When you collect stats, graph them. Graphite is quite the thing these days.
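The first two steps can be sketched in a few lines of Python. This is only an illustration, assuming a Linux host that exposes the negotiated link speed in /sys/class/net/<interface>/speed (in Mb/s); the function names, the sysfs_root parameter, and the interface name used in the note afterwards are hypothetical and not part of the original answer.

```python
from pathlib import Path


def read_link_speed_mbps(iface, sysfs_root="/sys/class/net"):
    """Return the negotiated link speed in Mb/s, or None if it is unavailable.

    Assumes the Linux sysfs layout; reading can fail when the link is down,
    and the kernel reports -1 when the speed is unknown.
    """
    try:
        speed = int((Path(sysfs_root) / iface / "speed").read_text().strip())
    except (OSError, ValueError):
        return None
    return speed if speed > 0 else None


def link_matches(iface, expected_mbps, sysfs_root="/sys/class/net"):
    """Step 1: compare the current state to the expected state."""
    return read_link_speed_mbps(iface, sysfs_root) == expected_mbps


def counter_delta(prev, curr, bits=64):
    """Difference between two counter samples, tolerating a single wrap.

    Only correct if the counter wrapped at most once between samples,
    which is why narrow counters need a short sample interval.
    """
    return (curr - prev) % (1 << bits)


def throughput_mbps(prev_bytes, curr_bytes, interval_s, bits=64):
    """Step 2: turn two byte-counter samples into an average rate in Mb/s."""
    return counter_delta(prev_bytes, curr_bytes, bits) * 8 / interval_s / 1e6
```

For the situation in the question, link_matches("eth0", 1000) would have returned False on the server patched into the 100 Mb/s port. The byte samples for throughput_mbps could come from SNMP or, on Linux, from /sys/class/net/<interface>/statistics/rx_bytes; dividing the result by the link speed gives utilisation relative to the maximum.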

Then realise that monitoring is never real-time. You're always looking at the past.

Also watch Jason Dixon's talk at devopsdays Rome, "The State of Open Source Monitoring: The good, the bad, the terrible, and a glimpse into our future".
