TCP Socket no connection timeout

network-programmingnetworkingtcptimeout

I open a TCP socket and connect it to another socket somewhere else on the network. I can then successfully send and receive data. I have a timer that sends something to the socket every second.

I then rudely interrupt the connection by forcibly losing the connection (pulling out the Ethernet cable in this case). My socket is still reporting that it is successfully writing data out every second. This continues for approximately 1hour and 30 minutes, where a write error is eventually given.

What specifies this time-out where a socket finally accepts the other end has disappeared? Is it the OS (Ubuntu 11.04), is it from the TCP/IP specification, or is it a socket configuration option?

Best Answer

Pulling the network cable will not break a TCP connection(1) though it will disrupt communications. You can plug the cable back in and once IP connectivity is established, all back-data will move. This is what makes TCP reliable, even on cellular networks.

When TCP sends data, it expects an ACK in reply. If none comes within some amount of time, it re-transmits the data and waits again. The time it waits between transmissions generally increases exponentially.

After some number of retransmissions or some amount of total time with no ACK, TCP will consider the connection "broken". How many times or how long depends on your OS and its configuration but it typically times-out on the order of many minutes.

From Linux's tcp.7 man page:

   tcp_retries2 (integer; default: 15; since Linux 2.2)
          The maximum number of times a TCP packet is retransmitted in
          established state before giving up.  The default value is 15, which
          corresponds to a duration of approximately between 13 to 30 minutes,
          depending on the retransmission timeout.  The RFC 1122 specified
          minimum limit of 100 seconds is typically deemed too short.

This is likely the value you'll want to adjust to change how long it takes to detect if your connection has vanished.

(1) There are exceptions to this. The operating system, upon noticing a cable being removed, could notify upper layers that all connections should be considered "broken".