Security – Real benefits of TCP TIME-WAIT and implications in a production environment

Tags: network-monitoring, networking, performance, security, tcp

SOME THEORY

I've been doing some reading on TCP TIME-WAIT (here and there), and what I read is that it is a timer set to 2 x MSL (maximum segment lifetime) that keeps a connection in the "connection table" for a while to guarantee that, "before you're allowed to create a connection with the same tuple, all the packets belonging to previous incarnations of that tuple will be dead".

Since segments received (apart from SYN under specific circumstances) while a connection is either in TIME-WAIT or no longer existing would be discarded, why not close the connection right away?

Q1: Is it because there is less processing involved in dealing with segments from old connections and less processing to create a new connection on the same tuple when in TIME-WAIT (i.e. are there performance benefits)?

If the above explanation doesn't stand, the only reason I can see for TIME-WAIT being useful would be if a client sends a SYN for a new connection before it sends the remaining segments of an old connection on the same tuple, in which case the receiver would re-open the connection, then get bad segments, and would have to terminate it.

Q2: Is this analysis correct?
Q3: Are there other benefits to using TIME-WAIT?

SOME PRACTICE

I've been looking at the munin graphs on a production server that I administrate. Here is one:
[Image: munin graph of TCP connections by state]

As you can see, there are more connections in TIME-WAIT than ESTABLISHED: around twice as many most of the time and, on some occasions, four times as many.

Q4: Does this have an impact on performance?
Q5: If so, is it wise/recommended to reduce the TIME-WAIT value (and what to)?
Q6: Is this ratio of TIME-WAIT / ESTABLISHED connections normal? Could this be related to malicious connection attempts?

Best Answer

In short, don't worry about TIME_WAIT. The overhead is close to zero, and it usually poses no problems.

On a busy server, port exhaustion is possible. In that case there is the sysctl option net.ipv4.tcp_tw_reuse = 1, which allows the kernel to reuse sockets that are still in TIME_WAIT for new outgoing connections as needed.
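To get a feel for when port exhaustion becomes a concern, here is a rough back-of-the-envelope sketch. It assumes every closed outgoing connection holds its local port for the full TIME_WAIT period and that all connections go to a single destination address and port. The function names are made up for illustration, and the example values (a local port range of 32768 to 60999 and a 60 second TIME_WAIT) are assumed typical Linux defaults, not figures from the server in the question.

```python
def max_outgoing_rate(port_low: int, port_high: int, time_wait_s: float) -> float:
    """Rough upper bound on new connections per second to ONE destination ip:port.

    Assumes each closed connection keeps its local port busy for time_wait_s.
    """
    usable_ports = port_high - port_low + 1
    return usable_ports / time_wait_s


def read_port_range(path: str = "/proc/sys/net/ipv4/ip_local_port_range") -> tuple[int, int]:
    """Read the ephemeral port range on Linux (the file holds two integers)."""
    with open(path) as f:
        low, high = f.read().split()
    return int(low), int(high)


# Assumed example values, not measurements.
rate = max_outgoing_rate(32768, 60999, 60.0)
print(f"about {rate:.0f} new connections per second to one destination")
```

Under those assumptions the bound works out to a few hundred connections per second per destination, which is why this mostly matters for machines that open many short-lived outgoing connections to the same backend.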

TIME_WAIT is part of the TCP specification, and is there to catch packets that may still be in transit (remember, not all connections are reliable, and that is what TCP aimed to solve). The timeout value may be very high for most modern uses, but it doesn't normally interfere with anything other than the output of netstat.

If you are in control of the socket yourself, and are certain you aren't waiting for data (e.g. you're the final sender, or you don't care about a response), you can close the socket after setting the SO_LINGER option with a linger time of zero, which terminates the connection with an RST and immediately discards the socket.
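As a minimal sketch of that abortive close, the snippet below sets SO_LINGER with a zero linger time and then closes the socket. The helper names abortive_close and demo are made up for illustration, and the struct layout (two C ints) is an assumption that holds on Linux and most Unix-like systems but not on Windows. The demo uses a throwaway loopback connection to show what the peer sees.

```python
import socket
import struct


def abortive_close(sock: socket.socket) -> None:
    """Close sock with an RST instead of the normal FIN handshake."""
    # struct linger { int l_onoff; int l_linger; }: enabled, zero timeout.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def demo() -> str:
    """Return what the peer observes after an abortive close on loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(2.0)  # avoid hanging if no reset ever arrives
    abortive_close(client)
    try:
        data = server_side.recv(1)
        return "eof" if data == b"" else "data"
    except ConnectionResetError:
        return "reset"
    finally:
        server_side.close()
        listener.close()
```

Because the closing side skips the FIN exchange, its socket never enters TIME_WAIT, but any unsent data is thrown away and the peer gets a connection reset rather than a clean end of stream.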

So your questions:

Q1, Q2, Q3: It's there to collect late packets, "just in case", because links can be unreliable. It's part of the spec, it keeps stray segments from an old connection from being mistaken for part of a new one on the same tuple, and adjusting it has no real benefit.

Q4: No

Q5: Don't worry about it; you have the option of forcing the reuse of these sockets if need be.

Q6: TIME_WAIT and ESTABLISHED aren't correlated, other than that the more short-lived connections you have, the greater that ratio will be. It could be caused by something malicious, but it's not an indicator any more than "excessive network activity" would be.
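If you want to check that ratio yourself rather than read it off a graph, a small sketch along these lines counts sockets per state on Linux. It assumes the /proc/net/tcp layout, where the fourth whitespace-separated column is the state as a hex code; the function names are made up for illustration, and IPv6 sockets (in /proc/net/tcp6) are ignored.

```python
from collections import Counter

# Linux TCP state codes as they appear in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(proc_net_tcp_text: str) -> Counter:
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3:
            code = fields[3].upper()
            counts[TCP_STATES.get(code, code)] += 1
    return counts


def count_local_states(path: str = "/proc/net/tcp") -> Counter:
    """Convenience wrapper that reads the live table on a Linux host."""
    with open(path) as f:
        return count_states(f.read())
```

Calling count_local_states() on the server returns a Counter, so count_local_states()["TIME_WAIT"] and count_local_states()["ESTABLISHED"] give the two figures behind the ratio in the graph.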
