TCP Round-Trip Time – Why It Decreases 100-Fold with Larger Message Sizes

Tag: tcp

For a university telecommunications project, my partner and I built a simple command-line messaging application using Internet sockets in C. Part of the project was to use the application to test how the transmission time and transmission rate of a message sent via TCP change with message size.

To test this, we connected two computers directly with an Ethernet cable. Computer A starts a timer and sends a message to computer B; B, after receiving the whole message, sends it back to A, which stops the timer once it has received the entire message. The first test used a message size of 3 bytes, and progressively larger sizes were tested up to 32 MB (2^25 bytes). We also ran the same test code against the loopback IP address, where one computer played the roles of both A and B and no network transmission took place: the messages just moved around in memory. This served as a control, to measure how much non-network processing and our own implementation choices contributed to the transmission time.
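
The core of the measurement on computer A looks roughly like the sketch below (a simplified illustration rather than our exact code; send_all() and recv_all() here are just loops around send() and recv() until the whole buffer has been transferred):

```c
/* Simplified sketch of the measurement loop on computer A.
 * `sock` is an already-connected TCP socket; computer B sends the whole
 * message back to A once it has received all of it. */
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Keep calling send() until the whole buffer has gone out. */
static int send_all(int sock, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = send(sock, buf + done, len - done, 0);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}

/* Keep calling recv() until the whole buffer has been filled. */
static int recv_all(int sock, char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = recv(sock, buf + done, len - done, 0);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}

/* Round-trip time, in milliseconds, for one echoed message of `len` bytes. */
double time_round_trip(int sock, char *buf, size_t len)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    send_all(sock, buf, len);                 /* A -> B              */
    recv_all(sock, buf, len);                 /* B sends it back -> A */
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (end.tv_sec - start.tv_sec) * 1e3
         + (end.tv_nsec - start.tv_nsec) / 1e6;
}
```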

Here are the results.

[Test results: graphs of round-trip transmission time and transmission rate versus message size for the LAN and loopback tests]

The round-trip time for a 3-byte message is roughly a millisecond. Message sizes from about 4 or 5 bytes up to 1460 bytes (the usual TCP maximum segment size, noted in the graph legend as MSS) all take roughly the same time, just under 100 milliseconds; past the MSS, the round-trip time drops by two or so orders of magnitude, back to about a millisecond. Then there is a period of high volatility (most obvious in the third graph), during which the transmission rate increases until roughly 2^17 bytes and then plateaus, smoothing out. From then on the round-trip time increases in a roughly linear fashion. I should add that these results were operating-system independent: running the tests on any combination of Windows, Mac OS X and Ubuntu made no difference to the overall trend. Additionally, the message sizes here count data only and include neither TCP, IP nor Ethernet headers. Since the round-trip time of the loopback test remains multiple orders of magnitude below the LAN test over almost all message sizes, I've concluded that computer processing time had a negligible impact on the LAN test.

It seems to me that the trend after the MSS can be explained by the fact that, in the long run, the ratio between the number of message bytes and the number of header bytes approaches some constant, and that the larger the message size, the lower the variance of this ratio. There might also be other constant-time processing which has more of an impact on the transmission time when there are fewer bytes and fewer segments being sent.
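
As a back-of-the-envelope illustration, assuming roughly 20 bytes of TCP, 20 bytes of IP and 18 bytes of Ethernet framing per full 1460-byte segment (my assumption, not something we measured), the header-to-data ratio settles almost immediately:

```c
/* Rough header-overhead estimate for a few message sizes, assuming
 * full 1460-byte segments and ~58 bytes of headers per segment. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double mss = 1460.0;            /* data bytes per segment       */
    const double hdr = 20 + 20 + 18;      /* assumed header bytes/segment */

    for (int p = 11; p <= 25; p += 7) {   /* 2 KB, 256 KB, 32 MB          */
        double data     = (double)(1L << p);
        double segments = ceil(data / mss);
        printf("2^%d bytes: ~%.0f segments, header overhead ~%.2f%%\n",
               p, segments, 100.0 * segments * hdr / data);
    }
    return 0;   /* compile with -lm for ceil() */
}
```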

As for the trend before the MSS, I'm totally stumped. I've tried googling the issue, but nothing's come up; maybe I'm googling for the wrong things. Why is it that transmission time is roughly constant before the MSS, and drops so dramatically after the MSS? And, given this, why is it faster to send 3 bytes than 10 bytes or 20 bytes? Thanks in advance.

Best Answer

I will assume from your question's tag that you are using TCP.

A likely explanation, which does require some assumptions about your code, is that it's an interaction between Nagle's algorithm (an algorithm designed to prevent unnecessary small packets) and TCP delayed acknowledgements (an algorithm designed to avoid sending unnecessary bare ACKs when an ACK could share a packet with application data).

Let's say your application sends data to the OS in two (or more) parts: first it sends a header, then it sends the actual data. I don't know for sure that this is what you are doing, but it seems the most plausible explanation to me (a sketch of the kind of send pattern I mean follows the list below). For small messages, the following sequence of events happens:

  1. The client application sends the header to the client OS.
  2. The client OS sends the header to the server.
  3. The client application sends the body to the OS.
  4. Since there is unacknowledged data in flight, Nagle's algorithm kicks in and the OS holds back the data, hoping for more.
  5. The server OS receives the header; since it's a single packet, it holds back the ACK, hoping for an application response or a second packet to arrive from the client.
  6. We now have something of a deadlock. The client doesn't want to send out another small packet because there is already unacknowledged data in flight. The server doesn't want to send out an ACK because there has been no application response.
  7. Eventually a timeout breaks the deadlock.
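
To make that concrete, here is a guess at the kind of two-write send that would trigger the sequence above; the 4-byte length-prefix header is purely hypothetical, since I don't know what your code actually does:

```c
/* Hypothetical two-write send that would trigger the Nagle/delayed-ACK
 * stall described above. `sock` is a connected TCP socket with Nagle's
 * algorithm enabled (the default). */
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>

void send_message(int sock, const char *body, uint32_t body_len)
{
    uint32_t header = htonl(body_len);   /* e.g. a 4-byte length prefix */

    /* Write 1: the small header goes out immediately as its own packet. */
    send(sock, &header, sizeof header, 0);

    /* Write 2: with the header still unacknowledged, Nagle's algorithm
     * holds this back while the receiver delays its ACK -- steps 4-6. */
    send(sock, body, body_len, 0);
}
```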

At larger message sizes the interaction changes. Depending on the exact rules the server uses for delayed acknowledgements, there can still sometimes be delays, but they don't seem to be too significant in your case.

Possible fixes for this problem include using the TCP_NODELAY socket option and/or reworking your application code to build the complete message before handing it to the OS in one piece.
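
For example (sketches using standard Berkeley sockets calls, not code from your application):

```c
/* Option 1: disable Nagle's algorithm so small writes are sent immediately. */
#include <netinet/in.h>
#include <netinet/tcp.h>    /* TCP_NODELAY */
#include <stddef.h>
#include <sys/socket.h>
#include <sys/uio.h>        /* writev */

void disable_nagle(int sock)
{
    int one = 1;
    setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}

/* Option 2: hand header and body to the OS in a single call, so they can
 * leave together (writev gathers both buffers into one send). */
void send_in_one_piece(int sock, const void *hdr, size_t hdr_len,
                       const void *body, size_t body_len)
{
    struct iovec iov[2] = {
        { .iov_base = (void *)hdr,  .iov_len = hdr_len  },
        { .iov_base = (void *)body, .iov_len = body_len },
    };
    writev(sock, iov, 2);
}
```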
