TCP – How Does TCP React to a Retransmission Timeout?

Tags: congestion, layer4, protocol-theory, tcp, transport-protocol

Hi! Could someone explain to me what happens at the time-out? There the window size drops completely. I thought that a time-out would be handled the same way as 3 duplicate ACKs: halve the cwnd and go into additive increase. Is there a difference in how these indicators of congestion are handled?

Best Answer

What happens at the time-out is actually pretty clear from the drawing: the congestion window drops back to its initial value of 1 segment and slow start is run again.

The specifics of how a TCP stack will handle congestion events depend on what variant you are using. This drawing looks like an example of the TCP Reno algorithm.

When seeing 3 duplicate ACKs, TCP Reno concludes there is congestion, but that the network is still working: duplicate ACKs arriving before the retransmit timer expires mean that segments are still reaching the other side, even though some may have been lost (or re-ordered). In case of a time-out, the situation is worse: no ACKs came back at all, so the network seems completely unresponsive.

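So, in case of the 3 duplicate ACKs, the missing segment is retransmitted right away (fast retransmit), the congestion window is cut in half, and it is then increased linearly. This is known as fast recovery, and its goal is to keep data flowing without falling back to slow start or waiting for a retransmission timeout.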

When a retransmit time-out does occur, the reaction is more drastic. TCP Reno starts over with slow start from a congestion window of size 1, and a slow start threshold of half of the value of the congestion window when the time-out occurred. When the threshold is reached, increase becomes linear again (additive increase).
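To make the two reactions concrete, here is a minimal per-round sketch of the Reno behaviour described above. It is a simplification, not a real TCP stack: the window is counted in whole segments, one loop iteration stands for one round-trip time, the function name reno_trace, the event labels "dupack" and "timeout", and the initial threshold are all made up for illustration, and the temporary window inflation that real fast recovery performs is left out.

```python
def reno_trace(events, rounds, init_ssthresh=16):
    """Return the congestion window (in segments) at the start of each round.

    events maps a round index to "timeout" or "dupack" (hypothetical labels).
    This is a simplified model: one round = one RTT, no window inflation.
    """
    cwnd, ssthresh = 1, init_ssthresh
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        event = events.get(r)
        if event == "timeout":
            # Drastic reaction: remember half the window, restart slow start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif event == "dupack":
            # 3 duplicate ACKs: halve the window and continue linearly.
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow start: double per round, but do not overshoot the threshold.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: additive increase of one segment per round.
            cwnd += 1
    return trace


if __name__ == "__main__":
    print(reno_trace({6: "dupack", 10: "timeout"}, 16, init_ssthresh=8))
```

With a duplicate-ACK event in round 6 and a time-out in round 10, the trace shows the window halving from 11 to 5 in the first case and collapsing from 8 to 1 in the second, after which it doubles up to the new threshold of 4 and then grows by one per round.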

TCP Tahoe did not include fast recovery, and would react the same way in both cases, resetting the congestion window to its initial value and executing slow start. TCP Reno's fast recovery basically skips the slow start, immediately setting the congestion window to the threshold value and starting the linear increase.
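The difference between the two variants can be summed up in a few lines. The following is only a sketch under simplifying assumptions: the window is counted in segments, the function name react and the string labels for variants and events are invented for this example, and the threshold is floored at 2 segments.

```python
def react(variant, event, cwnd):
    """Return (new_cwnd, new_ssthresh) after a congestion event.

    variant: "tahoe" or "reno"; event: "dupack" or "timeout".
    All names here are illustrative, not taken from any real TCP stack.
    """
    if variant not in ("tahoe", "reno"):
        raise ValueError(f"unknown variant: {variant}")
    if event not in ("dupack", "timeout"):
        raise ValueError(f"unknown event: {event}")

    ssthresh = max(cwnd // 2, 2)
    if event == "timeout" or variant == "tahoe":
        # Tahoe always, and Reno on a time-out: back to slow start.
        return 1, ssthresh
    # Reno on 3 duplicate ACKs: skip slow start, resume at the threshold.
    return ssthresh, ssthresh


if __name__ == "__main__":
    for variant in ("tahoe", "reno"):
        for event in ("dupack", "timeout"):
            print(variant, event, react(variant, event, 16))
```

Starting from a window of 16 segments, only the combination of Reno and duplicate ACKs leaves the window at 8; the other three cases all restart from 1 with a threshold of 8.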

Note that many more variants exist and actual implementations can be more complex. Also, observing these algorithms at work is not easy because other TCP mechanisms can interfere.

I do not know whether what you had in mind (using fast recovery in both situations) exists as a known and implemented congestion avoidance algorithm. It was probably tested and discarded when Reno was implemented. Feel free to do some digging around in the scientific papers in this area.
