What is the relationship between keep-alive on an HTTP request and a TCP socket in TIME_WAIT? Should they be correlated?
Furthermore, should system and web server settings be aligned, e.g. server.max-keep-alive-idle = 60? According to "How to reduce number of sockets in TIME_WAIT?", in Linux the TIME_WAIT state is hardcoded at 60 seconds (at least for Ubuntu/Debian values of Linux).
In lighttpd the default value is server.max-keep-alive-idle = 5, and they recommend going even lower under high load. It seems a waste to close an HTTP connection after 5 seconds if the TCP socket is available, assuming of course that the setting net.ipv4.tcp_tw_reuse = 1 does what it says on the tin.
This related question, "How does tcp keep a connection alive? [closed]", touches on the issue but doesn't fully answer it for me.
Best Answer
TCP is layer 4, HTTP layer 7.
In HTTP 1.0, HTTP Keep-Alive is used at layer 7 to simulate persistent connections using the Connection header (Connection: keep-alive).
In HTTP 1.1, connections are assumed to be persistent by default, and HTTP relies on TCP alone to do that job. Requests can be pipelined over the same TCP connection; one side then sets Connection: close in the headers of the last request or response, so both sides know that no more HTTP requests will be exchanged and that the connection will then be closed.
In the case of a web server, TIME_WAIT is usually the state it enters when, having decided to actively close the connection, it has received the client's FIN packet and sent the last ACK back in the four-way tear-down. It then waits for 2 * MSL, which is a way to be sure that the connection is really closed. That is where the 60s compiled into the kernel comes from. This way we can be sure that a new connection using the same 4-tuple will not receive out-of-sequence packets left over from the previous connection.
You don't want to change it.
On the other hand, server.max-keep-alive-idle is the timeout after which an ESTABLISHED connection is considered idle if no HTTP request comes in, and is then actively closed by the web server. Once that decision is made, the TCP tear-down takes place.
Be very careful with tcp_tw_recycle: if your visitors come from behind a large NAT, connections from different clients sharing the same public address can reach the server with out-of-order TCP timestamps, and the server will then silently drop those clients' connection attempts.
So the best option is to adjust the parameter you saw in lighttpd. System-wide, you can safely lower the FIN_WAIT2 timeout with net.ipv4.tcp_fin_timeout and raise the number of buckets for sockets in TIME_WAIT state with net.ipv4.tcp_max_tw_buckets.
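Before touching either setting it helps to know where you stand. Here is a rough Python sketch, assuming a Linux host where /proc/net/tcp and /proc/sys are readable. The helper names count_states, read_sysctl and report are my own, and it only looks at IPv4 sockets (/proc/net/tcp6 would need the same treatment).

```python
from collections import Counter
from pathlib import Path

# Hex codes from the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) > 3:
            code = fields[3].upper()
            counts[TCP_STATES.get(code, code)] += 1
    return counts


def read_sysctl(name, root="/proc/sys"):
    """Read an integer sysctl such as net.ipv4.tcp_fin_timeout."""
    return int((Path(root) / name.replace(".", "/")).read_text().split()[0])


def report():
    """Print TIME_WAIT and FIN_WAIT2 counts next to the related settings."""
    counts = count_states(Path("/proc/net/tcp").read_text())
    print("TIME_WAIT sockets:", counts["TIME_WAIT"],
          "of max", read_sysctl("net.ipv4.tcp_max_tw_buckets"))
    print("FIN_WAIT2 sockets:", counts["FIN_WAIT2"],
          "timeout (s):", read_sysctl("net.ipv4.tcp_fin_timeout"))
```

Calling report() on the web server under load shows how close the TIME_WAIT count gets to tcp_max_tw_buckets. If it is nowhere near the limit, there is probably nothing to tune at the system level, and the lighttpd keep-alive setting is the only knob worth adjusting.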