Windows – IIS: How to tell if a slow time-taken is due to a slow network connection

Tags: iis, iis-7.5, networking, performance, windows

According to http://support.microsoft.com/kb/944884, "when a large response or large responses are sent to a client over a slow network connection, the value of the time-taken field may be more than expected".

I have a situation where a client will say, "I sent a request to your web server at 10:03:24 and it took 20 seconds, why?". I can see this in the IIS logs as well, but the server's ASP.NET module logged it as taking 100ms, and CPU and Disk counters were low.

I suspect that it's due to a slow network connection. How can I prove this?

Update:

1) These are SOAP Web Service requests, therefore no embedded graphics, just an HTTP POST with a single XML page of results.

2) Also, I've reproduced this by throttling network speed on the client side and the symptoms are exactly the same.

3) The problem is intermittent, meaning the same request is normally fast for the client but occasionally slow. I can't reproduce this myself other than by throttling the network. The server's ASP.NET logging shows it always fast, but IIS logging shows it slow when the client says it's slow.

4) I only have access to the server, and need to provide as much information as possible to the client so they accept that the issue was not on the server and know what logging/tools to run on the client to find root cause.

Best Answer

I have a situation where a client will say, "I sent a request to your web server at 10:03:24 and it took 20 seconds, why?". I can see this in the IIS logs as well, but the server's ASP.NET module logged it as taking 100ms, and CPU and Disk counters were low.

I suspect that it's due to a slow network connection. How can I prove this?

It starts with looking for packet drops between your client's browser and all the sources of images / scripts / HTML for the aforementioned web page. If you find consistent packet drops, then you know for sure there is something in the network that needs to be fixed, even if it is just an overloaded link. Packet drops are not the only reason for a slow network, but they are the most common one in my experience. Other causes could be a misconfigured proxy or cache engine. Sadly, I can't list all possible network culprits here.

However, people often blame the network when in fact the speed issues are well within their own control. Possible explanations:

  • The HTML for that page is poorly written and loads required scripts in the wrong order, so the whole page renders slowly even though almost all resources are in place.
  • The page is waiting for a resource that simply doesn't exist and times out while waiting.
  • A script is in a slow loop that blocks for a while.
  • A cache engine takes a long time delivering an image.
  • Your CGI is looking up something in a database, and the lookup itself is slow.
  • You're using Google Analytics, which slows things down due to the way the page is written.

I could go on, but the point is that you have to nail down the exact reason the page is slow yourself. A flawed network is possible; it is also possible that other factors are contributing to the slow performance.
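If you only have the server side to work with, the IIS log can at least narrow things down. What follows is a rough sketch, not a definitive tool: it assumes W3C-format logs with the optional sc-bytes and time-taken fields enabled, and `slow_requests` and the sample log lines are made up for illustration. For each slow entry it works out the effective bytes per second. A very low rate on a response that ASP.NET produced quickly is consistent with a slow client or network path, although it does not prove it.

```python
def slow_requests(log_lines, min_ms=5000):
    """Yield (date, time, uri, time_taken_ms, sc_bytes, bytes_per_sec)
    for W3C log entries whose time-taken is at least min_ms."""
    fields = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The #Fields directive names the columns of the lines that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        try:
            taken_ms = int(row["time-taken"])
            sent = int(row["sc-bytes"])
        except (KeyError, ValueError):
            continue  # field not logged, or a malformed line
        if taken_ms >= min_ms:
            # Effective delivery rate; max() avoids dividing by zero.
            rate = sent / (max(taken_ms, 1) / 1000.0)
            yield (row.get("date"), row.get("time"),
                   row.get("cs-uri-stem"), taken_ms, sent, rate)


# Hypothetical sample data, not taken from a real server.
sample = [
    "#Fields: date time cs-method cs-uri-stem sc-status sc-bytes time-taken",
    "2012-01-01 10:03:24 POST /svc/Orders.asmx 200 250000 20000",
    "2012-01-01 10:03:30 POST /svc/Orders.asmx 200 250000 110",
]

for entry in slow_requests(sample):
    print(entry)
```

Against a real log you would pass an open file object instead of the sample list, then line up the flagged timestamps (W3C logs use UTC) with the times the client reports.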

To diagnose further:

  • If the page loads well in Firefox, then the Network tab in Firebug is your friend (press F12, then go to the Network tab and reload the page). Firebug gives you a nice waterfall diagram showing how the page loads and where the delays are.
  • If the page loads well in Chrome, you can do something similar (press Ctrl+Shift+I, click on the Network tab, and reload the page).
  • If the page is only supported in IE (btw, shame on your HTML developers), your best bet is to start loading each of these ASP page elements individually with curl until you find something that looks way too slow, then find out why that particular element is slow.
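The element-by-element timing from the last bullet can also be scripted. This is a minimal sketch in Python rather than curl, using only the standard library; `time_request` is a name invented for this illustration and it only handles plain HTTP GETs. It separates the time until the response headers arrive (mostly server processing plus a round trip) from the time spent receiving the body (mostly network transfer).

```python
import http.client
import time


def time_request(host, path="/", port=80, timeout=30):
    """Return (connect_s, first_byte_s, total_s, body_bytes) for one GET.

    All three times are measured in seconds from the start of the call.
    """
    start = time.perf_counter()
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.connect()                # TCP handshake only
        connected = time.perf_counter()
        conn.request("GET", path)
        resp = conn.getresponse()     # returns once status line and headers are read
        first_byte = time.perf_counter()
        body = resp.read()            # drain the whole response body
        done = time.perf_counter()
    finally:
        conn.close()
    return (connected - start, first_byte - start, done - start, len(body))
```

For example, `time_request("www.example.com", "/")` (a placeholder host) returns the three timings and the body size. If the first-byte time is small but the total is large, the delay is in delivering the body rather than in generating it.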

BTW, the Chrome and Firefox examples used a CGI query from Debian.org; this is a good example of a delay that comes from a CGI lookup.

When all else fails, you can capture a .pcap with Wireshark and run it through tcptrace. However, while tcptrace is very good at analyzing packet dumps, there is no guarantee that you can isolate the issue with tcptrace alone. See this answer for information on using tcptrace diagnostics.