Linux – Hitting Ephemeral TCP Port Exhaustion

linux networking tcp

We run a high-traffic website. Over the past few days, several customers have complained of sporadic downtime that we cannot reproduce. We have several web servers designated to receive traffic from our load balancer, and while investigating I realized that all servers were dropping over 20 connections per second. A sample of connections from one server looked like this:

  38452 TIME_WAIT
   7815 ESTABLISHED
    570 FIN_WAIT2
    105 FIN_WAIT1
    101 LAST_ACK
     36 SYN_RECV
     25 CLOSING
      4 SYN_SENT
      2 CLOSE_WAIT
      1 Foreign
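
A tally like the one above can be produced by counting the state column of `netstat -tan` output. The sketch below is one way to do it, not necessarily the command that was actually run; the function name `count_tcp_states` is made up for illustration. It only counts lines whose protocol column starts with `tcp`, so header lines are skipped (a header line is presumably where a stray entry such as `Foreign` comes from).

```shell
# Tally TCP connection states from `netstat -tan`-style output read on stdin.
# Assumes the state is the 6th whitespace-separated column, as in net-tools netstat.
count_tcp_states() {
    awk '$1 ~ /^tcp/ { counts[$6]++ }
         END { for (state in counts) printf "%7d %s\n", counts[state], state }' |
        sort -rn
}

# Example (hypothetical invocation on a live host):
#   netstat -tan | count_tcp_states
```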

Our ephemeral port range is currently set to 15000 61000 on all servers. It would appear, then, that the available ports are being exhausted, since the connections in the ESTABLISHED and TIME_WAIT states alone add up to 46267.
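
To make the arithmetic explicit: an inclusive range of 15000 to 61000 contains 61000 - 15000 + 1 ports, which is very close to the connection total quoted above. The sketch below computes that figure and, where the file exists, reads the live range from `/proc/sys/net/ipv4/ip_local_port_range`. The helper name `port_range_size` is made up for illustration.

```shell
# Number of ports in an inclusive range: high - low + 1.
port_range_size() {
    echo $(( $2 - $1 + 1 ))
}

port_range_size 15000 61000   # range quoted in the question; prints 46001

# On a Linux host, the kernel exposes the current range as "low high".
range_file=/proc/sys/net/ipv4/ip_local_port_range
if [ -r "$range_file" ]; then
    read -r low high < "$range_file"
    echo "current range $low-$high: $(port_range_size "$low" "$high") ports"
fi
```

For comparison, `port_range_size 1024 65535` gives 64512, which shows how little headroom widening the range can add.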

While we investigate traffic, what should we do about the dropped connections? Might it be wise to increase our port range? Decrease the amount of time closing connections wait? Both? Would doing either have any potentially negative consequences?

Best Answer

There are a couple of ways you can tackle this one.

The easiest way is to widen the ephemeral port range, but you have already done this to some extent, and there is a hard limit on how far it can go, since port numbers only run up to 65535.

Another solution is to add multiple IP addresses to your load balancer nodes and spread traffic across them with round-robin DNS. This is not always quick to put in place (you may need to wait for additional IPs to be allocated, for DNS changes to propagate, etc.).

Two things you can safely do immediately while you consider longer-term solutions are to lower your TCP timers and turn on tcp_tw_reuse.

tcp_tw_reuse is fairly safe to use on a load balancer. It lets the kernel reuse sockets in the TIME_WAIT state for new outgoing connections, and it relies on TCP timestamps (net.ipv4.tcp_timestamps), which are enabled by default. To turn it on, run this on your Linux box:

# sysctl -w net.ipv4.tcp_tw_reuse=1

To make it persist across reboots:

# echo "net.ipv4.tcp_tw_reuse = 1" >> /etc/sysctl.d/net.ipv4.tcp_tw_reuse.conf

Other kernel tuning parameters that may help are:

  • net.core.somaxconn (size of the listen queue)
  • net.ipv4.tcp_max_syn_backlog (number of remembered connection requests without ACK)
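
Before raising these two limits, it may be worth checking whether the listen queue is actually overflowing. On Linux, `/proc/net/netstat` carries `ListenOverflows` and `ListenDrops` counters in its `TcpExt` section; the sketch below pulls one counter out of that format. The helper name `tcpext_counter` is made up for illustration, and the parsing assumes the usual layout of a `TcpExt:` line of names followed by a `TcpExt:` line of values.

```shell
# Print one TcpExt counter from /proc/net/netstat-formatted text on stdin.
# $1 is the counter name, e.g. ListenOverflows.
tcpext_counter() {
    awk -v name="$1" '
        $1 == "TcpExt:" && !seen_header {
            # First TcpExt line: remember which column each counter name is in.
            for (i = 2; i <= NF; i++) col[$i] = i
            seen_header = 1
            next
        }
        $1 == "TcpExt:" {
            # Second TcpExt line: values in the same column order.
            if (name in col) print $(col[name])
            exit
        }'
}

if [ -r /proc/net/netstat ]; then
    for counter in ListenOverflows ListenDrops; do
        echo "$counter: $(tcpext_counter "$counter" < /proc/net/netstat)"
    done
fi
```

If these counters keep growing while connections are being dropped, raising the two parameters is more likely to help. Keep in mind that the backlog argument the application passes to `listen()` also caps the queue, so a larger net.core.somaxconn only takes effect if the application asks for it.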

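You can also lower net.ipv4.tcp_fin_timeout (60 seconds by default) to 1 or 2 seconds. This controls how long sockets are kept in the FIN_WAIT2 state when your host is the one that closed the connection. Note that it does not shorten the TIME_WAIT period itself.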
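
Before changing any of the settings mentioned in this answer, it can help to record their current values. This is a minimal sketch, assuming a Linux host with `/proc/sys` mounted. The helper name `sysctl_path` is made up for illustration; it simply maps a dotted sysctl key to its `/proc/sys` path and does not handle keys that contain interface names with dots.

```shell
# Map a dotted sysctl key to its file under /proc/sys.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

for key in net.ipv4.ip_local_port_range net.ipv4.tcp_tw_reuse \
           net.core.somaxconn net.ipv4.tcp_max_syn_backlog \
           net.ipv4.tcp_fin_timeout; do
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
        echo "$key = $(cat "$path")"
    else
        echo "$key: not available on this system"
    fi
done
```

The same values can be read one at a time with, for example, `sysctl net.ipv4.tcp_fin_timeout`.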

Hope it helps.