Linux – tcp_tw_reuse and tcp_tw_recycle not working in specific environment

Tags: haproxy, linux, linux-networking, MySQL, tcp

We are running HAProxy to direct MySQL traffic from PHP to a set of MySQL servers. However, due to an issue with the way the MySQL client sends TCP packets, we get a huge number of connections in TIME_WAIT, with 50k+ ports in use concurrently. Under extremely heavy load HAProxy runs out of source ports and logs SOCKERR messages, as described here: http://blog.haproxy.com/2012/12/12/haproxy-high-mysql-request-rate-and-tcp-source-port-exhaustion/

The article above suggests enabling tcp_tw_reuse and tcp_tw_recycle on the HAProxy server. We did this in one test environment and it practically solved the issue: TIME_WAIT connections stayed below 1000 under heavy load. In another environment where we enabled the same two TCP settings, however, nothing changed: TIME_WAIT is still high and port utilization is still 50k+.
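To compare the two environments with actual numbers, the TIME_WAIT count can be read straight from /proc/net/tcp, where state code 06 means TIME_WAIT. The following is only a minimal sketch: the function name time_wait_by_remote_port is made up for illustration, and it covers IPv4 sockets only (IPv6 sockets are listed in /proc/net/tcp6).

```python
from collections import Counter

TIME_WAIT = "06"  # state code used in /proc/net/tcp for TIME_WAIT


def time_wait_by_remote_port(proc_net_tcp_text):
    """Count TIME_WAIT sockets per remote port from /proc/net/tcp-style text.

    Hypothetical helper for illustration; not part of any library.
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4 or fields[3] != TIME_WAIT:
            continue
        # rem_address looks like "0100007F:0CEA": hex IP, colon, hex port
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        counts[remote_port] += 1
    return counts


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            text = f.read()
    except OSError as exc:
        print(f"cannot read /proc/net/tcp: {exc}")
    else:
        for port, n in time_wait_by_remote_port(text).most_common(5):
            print(f"remote port {port}: {n} sockets in TIME_WAIT")
```

Run on the HAProxy host, a large count against the MySQL port (3306 by default) would point at the HAProxy-to-MySQL side as the source of the exhausted ports.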

Both environments run the same kernel and the same HAProxy version, and we can't figure out why the tcp_tw_reuse and tcp_tw_recycle changes have no effect in this one environment.

In both environments, we expanded ip_local_port_range to 1024-65535. This is on CentOS 6.4.
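As a rough back-of-the-envelope check on why the expanded range still runs out, the arithmetic can be sketched as below. It assumes the stock Linux TIME_WAIT duration of 60 seconds, a single source IP talking to a single destination IP and port, and no socket reuse; the function names are invented for this example.

```python
TIME_WAIT_SECONDS = 60  # assumption: stock Linux TIME_WAIT duration (TCP_TIMEWAIT_LEN)


def ephemeral_port_count(low, high):
    """Number of source ports in an inclusive ip_local_port_range."""
    return high - low + 1


def max_sustained_connection_rate(low, high, time_wait_seconds=TIME_WAIT_SECONDS):
    """Rough ceiling on new connections per second to ONE destination ip:port.

    Assumes every closed connection holds its source port for the full
    TIME_WAIT period and that nothing is reused early.
    """
    return ephemeral_port_count(low, high) / time_wait_seconds


if __name__ == "__main__":
    low, high = 1024, 65535
    print("ports available:", ephemeral_port_count(low, high))
    print("approx. max new connections/s:",
          round(max_sustained_connection_rate(low, high)))
```

With the range 1024-65535 this gives 64512 ports, or roughly 1075 new connections per second per backend before ports run out, which is why widening the range alone does not help under very heavy load.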

Please assist. We are stumped here, and I can provide more information if needed. Thank you.

Best Answer

Found the root cause. tcp_timestamps must be enabled on the local server as well as on whatever outbound server you are trying to reach. tcp_tw_reuse and tcp_tw_recycle depend on tcp_timestamps to determine which ports to reuse. See http://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux.html
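A quick way to spot this mismatch is to read the relevant sysctls from /proc/sys/net/ipv4 on each machine (the HAProxy host and every MySQL server) and flag the case where reuse or recycle is on while timestamps are off. This is a minimal sketch; read_settings and find_problems are hypothetical helper names, and tcp_tw_recycle may simply be absent on newer kernels (it was removed in Linux 4.12), which the code treats as "not set".

```python
from pathlib import Path

SYSCTL_DIR = Path("/proc/sys/net/ipv4")
NAMES = ("tcp_timestamps", "tcp_tw_reuse", "tcp_tw_recycle", "ip_local_port_range")


def read_settings(base=SYSCTL_DIR):
    """Return {name: list of value tokens, or None if the file is unreadable}."""
    settings = {}
    for name in NAMES:
        try:
            settings[name] = (base / name).read_text().split()
        except OSError:
            settings[name] = None  # e.g. tcp_tw_recycle on kernels >= 4.12
    return settings


def find_problems(settings):
    """Flag tcp_tw_* options that are enabled while tcp_timestamps is off."""
    ts = settings.get("tcp_timestamps")
    timestamps_on = bool(ts) and ts[0] != "0"
    problems = []
    for name in ("tcp_tw_reuse", "tcp_tw_recycle"):
        value = settings.get(name)
        if value and value[0] != "0" and not timestamps_on:
            problems.append(f"{name} is enabled but tcp_timestamps is off")
    return problems


if __name__ == "__main__":
    current = read_settings()
    for name, value in current.items():
        print(name, "=", " ".join(value) if value else "(not available)")
    for problem in find_problems(current):
        print("WARNING:", problem)
```

The script only checks the machine it runs on, so it has to be run on both ends of the connection; a clean result on the HAProxy host says nothing about the MySQL servers.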