On a large private network there seems to be excessive latency between two boxes that I control. It is causing timeouts, which stops the server from providing a web service to the client.
Here's a slow-running request logged in Charles:
Timing
Request Start Time 03/11/10 23:21:33
Request End Time 03/11/10 23:21:33
Response Start Time 03/11/10 23:21:42
Response End Time 03/11/10 23:21:42
Duration 8.99 sec
Request Duration 16 ms
Response Duration 0 ms
Latency 8.97 sec
Speed 1.30 KB/s
Response Speed ∞ KB/s
Size
Request Header Size 412 bytes
Response Header Size 151 bytes
Request Size -
Response Size 11.17 KB (11436 bytes)
Total Size 11.72 KB (11999 bytes)
Request Compression -
Response Compression -
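The figures in this trace are internally consistent, and a quick check shows where the time goes. Below is a small Python sketch of the arithmetic using only the values logged above. The variable names are mine, and the formulas are my reading of how these fields relate, not something taken from the Charles documentation.

```python
# Values copied from the Charles trace above.
request_header_bytes = 412
response_header_bytes = 151
response_body_bytes = 11436
duration_s = 8.99            # whole request/response cycle
request_duration_s = 0.016   # time spent sending the request
response_duration_s = 0.0    # time spent receiving the response once it started

# Total bytes exchanged (the request has no body).
total_bytes = request_header_bytes + response_header_bytes + response_body_bytes

# Assumed definition: latency is what remains of the duration after
# subtracting the time spent sending and receiving.
latency_s = duration_s - request_duration_s - response_duration_s

# Assumed definition: speed is total size over total duration, in KB/s.
speed_kb_s = total_bytes / duration_s / 1024

# Share of the total time spent waiting for the first response byte.
waiting_share = latency_s / duration_s

print(total_bytes, round(latency_s, 2), round(speed_kb_s, 2), round(waiting_share, 3))
```

If that reading is right, well over 99% of the 8.99 seconds is spent waiting for the response to start, not transferring the 11 KB payload.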
As a test I tried fetching a 423k image file via a browser, but it downloads so slowly that it stalls.
How can I troubleshoot where the problem lies?
I tried using PingPlotter.
It looks like hops 3 and 4 are the culprits. Where do I go from here?
ANSWER
This turned out to be a problem with the client box, which runs Windows Web Server 2008 R2: there is a known network speed problem when the router cannot handle TCP window scaling. Updating the router software resolved the problem.
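For anyone wondering why window scaling can hurt this much: TCP can have at most one receive window of data in flight per round trip, so throughput is bounded by window size divided by round-trip time. The sketch below illustrates one commonly described failure mode, where a device in the path drops or mangles the window scale option so that the receiver reads the 16-bit window field without applying the shift the sender intended. I am assuming that is roughly what happened here; the raw field value, shift count, and round-trip time are hypothetical, not measured on this network.

```python
def max_throughput_bytes_per_s(window_bytes, rtt_s):
    """Upper bound on TCP throughput: at most one window per round trip."""
    return window_bytes / rtt_s

# Hypothetical values for illustration only.
raw_window_field = 512   # value carried in the 16-bit TCP window field
scale_shift = 7          # shift count negotiated via the window scale option
rtt_s = 0.125            # round-trip time in seconds

# What the advertising host means: the field shifted left by the scale.
intended_window = raw_window_field << scale_shift

# What the peer sees if the scale option never took effect on its side.
misread_window = raw_window_field

good = max_throughput_bytes_per_s(intended_window, rtt_s)
bad = max_throughput_bytes_per_s(misread_window, rtt_s)
slowdown = good / bad

print(intended_window, misread_window, good, bad, slowdown)
```

With these made-up numbers the usable window shrinks by a factor of 2 to the power of the shift count, and the throughput ceiling shrinks by the same factor.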
Best Answer
Attempt to replicate the issue from as many different points as possible. You'll probably find something in common between them (be it the server they are using, a switch, etc). Collect traceroute data from each endpoint.
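Once you have traceroute output from several endpoints, finding the common element can be partly mechanical. Here is a minimal Python sketch, assuming you have already reduced each trace to a list of hop addresses and sorted the endpoints into slow and fast groups. The function name, endpoint names, and addresses are all hypothetical.

```python
def suspect_hops(slow_traces, fast_traces):
    """Return hops present in every slow trace and in no fast trace."""
    if not slow_traces:
        return set()
    # Hops shared by all of the slow paths.
    in_all_slow = set.intersection(*(set(hops) for hops in slow_traces.values()))
    # Hops that also carry traffic that works fine.
    in_any_fast = set().union(*(set(hops) for hops in fast_traces.values()))
    return in_all_slow - in_any_fast

# Hypothetical traces: endpoint name -> hop addresses in order.
slow_traces = {
    "client-a": ["10.0.0.1", "10.0.1.1", "10.0.9.9", "10.0.5.20"],
    "client-b": ["10.0.2.1", "10.0.9.9", "10.0.5.20"],
}
fast_traces = {
    "client-c": ["10.0.3.1", "10.0.5.20"],
}

suspects = suspect_hops(slow_traces, fast_traces)
print(suspects)
```

A hop that shows up on every slow path and on no fast path is where I would look first. It is only a hint, since a device can be on a healthy path and still misbehave for specific traffic.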
It's always better to use the onion approach: add or remove one layer at a time. You can start from the client and move closer and closer to the server, or go the other way.
Make sure to check the interface statistics for error counts. I find it very useful to fire up tcpdump/Wireshark on both ends and capture the packets from the TCP session, then compare both. Wireshark does a good job of pointing out the most obvious problems (like checksum errors or retransmissions).
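Comparing the two captures can also be scripted once you have exported the data. The sketch below assumes you have already filtered both captures down to the same TCP stream and direction and exported the sequence number of each data segment into a plain list. The function names and the sample numbers are hypothetical.

```python
from collections import Counter

def retransmitted(seqs):
    """Sequence numbers that appear more than once in a single capture."""
    counts = Counter(seqs)
    return sorted(seq for seq, n in counts.items() if n > 1)

def dropped_copies(sent_seqs, received_seqs):
    """Per sequence number, how many copies left the sender but never arrived."""
    # Counter subtraction keeps only the positive differences.
    return dict(Counter(sent_seqs) - Counter(received_seqs))

# Hypothetical sequence numbers exported from each side of the session.
sender_capture = [1, 1461, 2921, 4381, 2921]
receiver_capture = [1, 1461, 4381, 2921]

resent = retransmitted(sender_capture)
dropped = dropped_copies(sender_capture, receiver_capture)
print(resent, dropped)
```

A segment that the sender had to retransmit and that appears fewer times on the receiving side was lost somewhere between the two capture points, which narrows down which layer of the onion to peel next.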