I am stress-testing one of my servers by hitting it with a constant stream of new network connections. tcp_fin_timeout is set to 60, so if I send a constant stream of roughly 100 requests per second, I would expect to see a rolling average of 6000 (60 * 100) connections in the TIME_WAIT state. This is happening, but looking in netstat (using -o) to see the timers, I see connections like:
TIME_WAIT timewait (0.00/0/0)
where the timeout has expired but the connection is still hanging around, and I eventually run out of connections. Does anyone know why these connections don't get cleaned up? If I stop creating new connections they do eventually disappear, but while I am constantly creating new ones they don't; it seems like the kernel isn't getting a chance to clean them up. Is there some other config option I need to set to remove the connections as soon as they have expired?
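For anyone wanting to reproduce this, the counts and timers can be pulled straight from netstat; the exact column layout may vary between net-tools versions, so adjust the field number if needed:

```shell
# Count current TCP sockets per state (state is the 6th column in
# this net-tools output; the first two lines are headers):
netstat -tan | awk 'NR > 2 {print $6}' | sort | uniq -c

# Show TIME_WAIT entries together with their timers; -o adds the
# "timewait (nn.nn/0/0)" countdown referred to above:
netstat -tano | grep TIME_WAIT | head
```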
The server is running Ubuntu and my web server is nginx. It also has iptables with connection tracking; I'm not sure if that would cause these TIME_WAIT connections to live on.
Thanks
Mark.
Best Answer
This problem was interesting, as it's something I've often wondered about myself. I ran a couple of tests and found some interesting results. If I opened one connection to a server and waited 60 seconds, it was invariably cleaned up (it never got to 0.00/0/0). If I opened 100 connections, they too were cleaned up after 60 seconds. If I opened 101 connections, I would start to see connections in the state you mentioned (which I've also seen before). They appear to last roughly 120 seconds, i.e. 2×MSL (where MSL is 60 seconds), regardless of what fin_timeout is set to. I did some digging in the kernel source code and found what I believe is the reason: there appears to be some code that limits the amount of socket reaping that happens per 'cycle'. The cycle frequency itself is set on a scale based on HZ:
In the actual timewait code you can see where it uses the quota to stop killing off TIME_WAIT connections if it has already reaped too many:
There's more information here on why HZ is set to what it is: http://kerneltrap.org/node/5411 But it isn't uncommon to increase it. I think, however, it's usually more common to enable tw_reuse/tw_recycle to get around this bucket/quota mechanism (which seems confusing to me now that I've read about it; increasing HZ would be a much safer and cleaner solution). I posted this as an answer, but I think there could be more discussion here about what the 'right way' to fix it is. Thanks for the interesting question!
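For completeness, this is what the tw_reuse route looks like in practice on the 2.6-era kernels discussed above. It's a sketch, not a recommendation; behaviour differs between kernel versions, so test before relying on it:

```shell
# tw_reuse lets new outgoing connections take over a socket still in
# TIME_WAIT when TCP timestamps show it is safe; it requires
# net.ipv4.tcp_timestamps to be enabled (the default).
sysctl -w net.ipv4.tcp_tw_reuse=1

# tcp_tw_recycle also exists on these kernels, but it is known to
# break clients behind NAT, so prefer tw_reuse:
# sysctl -w net.ipv4.tcp_tw_recycle=1

# Increasing HZ, by contrast, is a build-time option (CONFIG_HZ in
# the kernel config), not a runtime sysctl.
```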