Traceroute Analysis – Tips for Reading and Understanding Results

traceroute

I'm pretty new to networking and I'm currently playing about with the traceroute command on the Windows command line. Do you happen to have any tips/know any resources that are helpful in learning how to analyse the data that comes from traceroutes?

For instance I have this data:

Tracing route to poiparau.oyster.net.ck [202.65.32.127]
over a maximum of 30 hops:

  1     2 ms     2 ms     4 ms  gw.wireless.iqsalford.quintain.lan [10.222.208.1] 
  2    14 ms     8 ms     8 ms  gw-vlan1577.man-xmr.tcw.ask4.net [78.109.190.225] 
  3     6 ms     2 ms     3 ms  man-xmr-edge1-r1.tcw.ask4.net [81.23.63.237] 
  4     *       48 ms    18 ms  lon-xmr-10ge.thn-tcw.core.ask4.net [81.23.51.246] 
  5     *        9 ms    15 ms  te0-7-0-16.ccr21.lon01.atlas.cogentco.com [149.6.185.233] 
  6     8 ms    11 ms     8 ms  be2871.ccr42.lon13.atlas.cogentco.com [154.54.58.185] 
  7   148 ms    83 ms    88 ms  be2490.ccr42.jfk02.atlas.cogentco.com [154.54.42.85] 
  8    84 ms    84 ms    85 ms  be2807.ccr42.dca01.atlas.cogentco.com [154.54.40.110] 
  9    97 ms    96 ms     *     be2113.ccr42.atl01.atlas.cogentco.com [154.54.24.222] 
 10   142 ms   112 ms   114 ms  be2690.ccr22.iah01.atlas.cogentco.com [154.54.28.130] 
 11   149 ms   149 ms   148 ms  be2066.ccr22.lax01.atlas.cogentco.com [154.54.7.54] 
 12   147 ms   148 ms   147 ms  be2017.rcr21.lax04.atlas.cogentco.com [154.54.0.237] 
 13   148 ms   150 ms   148 ms  te0-0-0-3.agr12.lax04.atlas.cogentco.com [154.24.35.14] 
 14   240 ms     *        *     38.104.210.158 
 15   213 ms     *      243 ms  72.234.202.81 
 16   226 ms   205 ms   218 ms  72.234.202.82 
 17   216 ms   222 ms   211 ms  64.110.51.131 
 18   339 ms   336 ms   336 ms  64.110.51.132 
 19   333 ms   341 ms   365 ms  poiparau.oyster.net.ck [202.65.32.127] 

Trace complete.

I'm aware that the number on the far left is the hop number, and that 3 packets are sent for each hop, with the times showing how long each one took to reach that hop and come back. And I can see that the rightmost column has the name/IP address, but can one tell any more from traceroute readings?

Are there ways of comparing one traceroute reading to another effectively? Are there particular times in the day when tracerouting makes for more interesting results? Does tracerouting from a computer with a slow connection affect the time taken?

Best Answer

I highly recommend the Traceroute Guide posted by dareuja. There isn't a more complete single resource on interpreting the results (that I know of, at least).

Here are a few tips I've picked up over the years for correctly reading the output. If you're unfamiliar with how traceroute works, there is some good info in this thread.

1. Understand the difference between traffic going TO a router vs THROUGH a router

Each line in a traceroute shows the round-trip time in milliseconds (ms) it took for that particular router to respond to the initiating client with a TTL Expired message. Each hop is usually tested 3 times, so you get 3 different values.
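To make the line format concrete, here is a minimal sketch of pulling one hop line apart in Python. It assumes the English-language Windows tracert layout shown in the question; the names `HOP_LINE` and `parse_hop` are made up for this illustration and are not part of any tool.

```python
import re

# One hop line of Windows tracert output, for example:
#   "  9    97 ms    96 ms     *     be2113... [154.54.24.222]"
# Each probe is either "*" (no reply) or a time such as "97 ms" or "<1 ms".
HOP_LINE = re.compile(
    r"^\s*(\d+)\s+"            # hop number
    r"(\*|<?\d+ ms)\s+"        # probe 1
    r"(\*|<?\d+ ms)\s+"        # probe 2
    r"(\*|<?\d+ ms)\s+"        # probe 3
    r"(.+?)\s*$"               # host name and/or IP address
)

def _rtt(token):
    """Convert one probe column to an int of milliseconds, or None for '*'."""
    if token == "*":
        return None
    return int(token.replace("<", "").replace("ms", "").strip())

def parse_hop(line):
    """Return (hop_number, [rtt1, rtt2, rtt3], host), or None if the line is not a hop."""
    m = HOP_LINE.match(line)
    if m is None:
        return None
    rtts = [_rtt(m.group(i)) for i in (2, 3, 4)]
    return int(m.group(1)), rtts, m.group(5)

line = "  9    97 ms    96 ms     *     be2113.ccr42.atl01.atlas.cogentco.com [154.54.24.222] "
print(parse_hop(line))
```

A missed probe becomes `None` rather than 0, so later arithmetic cannot mistake "no reply" for "very fast reply".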

The 10th router in the chain received the packet VIA the 9th router in the chain. And often, the TTL Expired response passed back through the 9th router in the chain on the way back to the initiating client (not always, but usually).

Either way, in this example, the 10th router in the chain is processing a packet sent TO it (sort of, I'll explain in a moment), while the 9th router in the chain is processing a packet that was sent THROUGH it.

 9    97 ms    96 ms     *     be2113.ccr42.atl01.atlas.cogentco.com [154.54.24.222] 
10   142 ms   112 ms   114 ms  be2690.ccr22.iah01.atlas.cogentco.com [154.54.28.130] 

A lot of people look at the output you posted and at the missed response in the 3rd attempt on hop 9 and claim there is a problem at hop 9.

But there is none, because the next three packets sent (to hop 10) all went THROUGH hop 9, and they got there just fine. So there isn't a problem with the missed response at router 9.
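The same check can be written down as a small Python sketch: a missed response only looks like real loss if it keeps showing up at every later hop as well. The function name `loss_persists` and the list-of-lists layout (with `None` for a `*`) are assumptions made for this example.

```python
def loss_persists(hops, start):
    """Return True if every hop from index `start` to the end lost at least one probe.

    `hops` is a list of per-hop RTT lists in ms, with None for a missed probe (*).
    """
    return all(None in rtts for rtts in hops[start:])

# Hops 9 and 10 from the trace above.
hops = [
    [97, 96, None],    # hop 9: third probe unanswered
    [142, 112, 114],   # hop 10: all three answered, and all went THROUGH hop 9
]

print(loss_persists(hops, 0))   # False: the loss does not carry on past hop 9
```

If the function returned True from some hop all the way to the destination, that hop (or the link just before it) would be a much better candidate for a real problem.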

But why the missed response? Good question... and the subject of the next tip:

2. A single missed response is usually not reason for concern

Router vendors these days spend millions of dollars on improving their hardware and software to the point where the routers are able to receive and forward packets at near-line speed.

When a packet is passing through a router like this, it is passing through a specially designed channel that is built specifically for super fast processing. This is usually known as the data plane.

When a router has to do something special to a packet, beyond simply forwarding it, the packet has to be passed to the router's CPU (or brain, if you will). This type of traffic has to be processed by the router's control plane.

In all cases, the router will put more effort into delivering a packet through its data plane than it does into processing a packet sent to its control plane.

As a result, at the moment of the traceroute attempt, that particular router may be busy processing millions of other packets going THROUGH it, and it won't interrupt that work to process the packet sent TO it. So the packet is simply discarded, and shows up in the traceroute as a missed response (*).
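As a rough illustration of that behaviour, here is a toy model in Python. It is not how any real router is implemented: the class `ToyRouter` and its fixed reply budget are invented for this sketch, and real control-plane limits vary by vendor and configuration. It only shows why transit traffic can be fine while a probe addressed to the router goes unanswered.

```python
class ToyRouter:
    """Toy model: forwarding is never limited, replies from the CPU are."""

    def __init__(self, replies_per_interval):
        # Hypothetical budget: how many TTL Expired replies the CPU can
        # spare in the current interval.
        self.budget = replies_per_interval

    def handle(self, ttl):
        """`ttl` is the packet's TTL when it arrives at this router."""
        if ttl > 1:
            return "forwarded"            # data plane: passes THROUGH the router
        if self.budget > 0:
            self.budget -= 1
            return "ttl-expired reply"    # control plane had time to answer
        return "dropped"                  # control plane busy: traceroute shows *

router = ToyRouter(replies_per_interval=2)
print([router.handle(ttl=1) for _ in range(3)])   # three probes sent TO the router
print(router.handle(ttl=5))                       # a packet going THROUGH the router
```

The three probes come back as two replies and one drop, much like the `97 ms  96 ms  *` pattern at hop 9, while the transit packet is still forwarded.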

So what SHOULD you concern yourself with when reading the traceroute? Good question... read on.

3. What SHOULD you concern yourself with in a typical traceroute

Traceroute is best suited to detect and determine where latency exists in the end to end path. But a single spike in latency usually means nothing. For example:

 1     2 ms     2 ms     4 ms  gw.wireless.iqsalford.quintain.lan [10.222.208.1] 
 2    14 ms     8 ms     8 ms  gw-vlan1577.man-xmr.tcw.ask4.net [78.109.190.225] 
 3     6 ms     2 ms     3 ms  man-xmr-edge1-r1.tcw.ask4.net [81.23.63.237] 

You can look at the first result of hop 2, see a jump to 14 ms, and think that was a big proportionate jump. But if you look at the response times for hop 3, you see 6 ms, 2 ms, and 3 ms. Well, each of these 3 attempts went THROUGH the router at hop 2, and if they were able to get TO hop 3, THROUGH hops 1 and 2, and back all in 6/2/3 ms, you have no problems there.

(Note: this isn't the best example, because 14 ms is still lightning fast, but it was the best example of it in the provided output.)

What you want to be concerned with is a CONSISTENT increase in latency at a SPECIFIC hop, where the higher response times persist throughout the rest of the traceroute.
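One way to look for that pattern mechanically is sketched below in Python, using the trace from the question. The function name `persistent_jumps`, the choice of the best (lowest) probe per hop, and the 50 ms threshold are all assumptions for illustration, not a standard method. The sketch also assumes every hop answered at least one probe.

```python
def persistent_jumps(hops, threshold_ms=50):
    """Return hop numbers where latency rises by at least `threshold_ms`
    over the previous hop AND stays that high at every later hop."""
    # Best probe per hop; a single slow reply from a busy router is ignored.
    best = [min(r for r in rtts if r is not None) for rtts in hops]
    flagged = []
    for i in range(1, len(best)):
        baseline = best[i - 1]
        if all(b - baseline >= threshold_ms for b in best[i:]):
            flagged.append(i + 1)   # hop numbers start at 1
    return flagged

# The 19 hops from the question; None marks a missed response (*).
trace = [
    [2, 2, 4], [14, 8, 8], [6, 2, 3], [None, 48, 18], [None, 9, 15],
    [8, 11, 8], [148, 83, 88], [84, 84, 85], [97, 96, None], [142, 112, 114],
    [149, 149, 148], [147, 148, 147], [148, 150, 148], [240, None, None],
    [213, None, 243], [226, 205, 218], [216, 222, 211], [339, 336, 336],
    [333, 341, 365],
]

print(persistent_jumps(trace))   # [7, 14, 18]
```

The 14 ms blip at hop 2 and the 48 ms probe at hop 4 are not flagged, because later hops answer faster. Hops 7, 14 and 18 are flagged only as places to look; whether a flagged hop is actually a problem still takes judgment about what kind of link it is.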

HOWEVER, sometimes the latency increase is expected and perfectly normal, specifically when you are crossing long WAN links. For example:

 6     8 ms    11 ms     8 ms  be2871.ccr42.lon13.atlas.cogentco.com [154.54.58.185] 
 7   148 ms    83 ms    88 ms  be2490.ccr42.jfk02.atlas.cogentco.com [154.54.42.85] 

Here you are jumping from LON to JFK, across the Atlantic Ocean. The jump in latency is expected. Had this hop been between two routers in closer proximity, then because the latency increase persisted throughout the rest of the traceroute, this would have been reason for concern, and a good indication of latency at a particular router.
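A quick back-of-the-envelope check shows why the Atlantic jump is expected. The figures in this Python sketch are approximations I'm assuming, not measurements: roughly 5,570 km great-circle distance between London and New York, and light in fiber travelling at about two thirds of its speed in a vacuum. Real cable routes are longer than the great-circle path, so the true floor is somewhat higher.

```python
FIBER_KM_PER_S = 200_000      # assumption: about 2/3 of the speed of light
LONDON_NEW_YORK_KM = 5_570    # assumption: approximate great-circle distance

def min_rtt_ms(distance_km, speed_km_per_s=FIBER_KM_PER_S):
    """Lowest possible round-trip time in ms over a fiber of this length."""
    return 2 * distance_km / speed_km_per_s * 1000

floor_ms = min_rtt_ms(LONDON_NEW_YORK_KM)
observed_increase_ms = 83 - 8   # best probe at hop 7 minus best probe at hop 6

print(round(floor_ms, 1))       # about 55.7 ms from propagation alone
print(observed_increase_ms)     # 75 ms seen in the trace
```

Most of the 75 ms increase between hops 6 and 7 is accounted for by distance alone, which is why this jump is not a sign of trouble.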
