Linux – Why would a server not send a SYN/ACK packet in response to a SYN packet?

linux tcp web-server

Lately, we've become aware of a TCP connection issue that is mostly limited to Mac and Linux users who browse our websites.

From the user perspective, it presents itself as a really long connection time to our websites (>11 seconds).

We've managed to track down the technical signature of this problem, but can't figure out why it is happening or how to fix it.

The client's machine sends a SYN packet to establish the TCP connection, and the web server receives it but does not respond with a SYN/ACK packet. Only after the client has sent many SYN packets does the server finally respond with a SYN/ACK packet, and everything is fine for the remainder of the connection.

And, of course, the kicker: the problem is intermittent. It does not happen all the time, though it does happen between 10% and 30% of the time.

We are using Fedora 12 Linux as the OS and Nginx as the web server.

Screenshot of Wireshark analysis

Update:

Turning off window scaling on the client stopped the issue from happening. Now I just need a server-side resolution (we can't make all the clients do this) 🙂

Final Update:

The solution was to turn off both TCP window scaling and TCP timestamps on our servers that are accessible to the public.

Best Answer

We had this exact same problem. Just disabling TCP timestamps solved the problem.

sysctl -w net.ipv4.tcp_timestamps=0

To make this change permanent, add the line net.ipv4.tcp_timestamps = 0 to /etc/sysctl.conf so that it is applied again at boot.
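As a rough sketch of what that entry could look like in practice, the helper below appends a setting to a sysctl configuration file only if the exact line is not already there. The function name persist_sysctl and its optional second argument are made up for illustration; the file path and the sysctl keys come from this thread.

```shell
#!/bin/sh
# persist_sysctl is a hypothetical helper, not part of any standard tool.
# $1: the full "key = value" line to persist
# $2: config file to edit (assumed default: /etc/sysctl.conf)
persist_sysctl() {
    setting=$1
    conf=${2:-/etc/sysctl.conf}
    # -x matches the whole line, -F treats the setting as a literal string,
    # so the line is appended only when it is not already present.
    grep -qxF "$setting" "$conf" 2>/dev/null || printf '%s\n' "$setting" >> "$conf"
}

# Typical use, as root (left commented out so nothing changes by accident):
# persist_sysctl 'net.ipv4.tcp_timestamps = 0'
# sysctl -p    # reload /etc/sysctl.conf without rebooting
```

The same helper would work for net.ipv4.tcp_window_scaling = 0 if you also decide to disable window scaling, keeping in mind the performance warning in this answer.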

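Be very careful about disabling the TCP Window Scale option. This option is important for getting maximum performance over the internet: without it, the TCP window is limited to 65,535 bytes, so a sender can have at most that much data in flight per round trip. Someone with a 10 megabit/sec connection will get a suboptimal transfer rate if the round trip time (basically the same as ping time) is more than about 55 ms.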
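The round-trip threshold can be sanity-checked with a little arithmetic. This is only a back-of-the-envelope sketch: it assumes the full unscaled 65,535-byte window is usable and ignores protocol overhead and packet loss. The function names are made up for illustration.

```shell
#!/bin/sh
# Without window scaling the TCP window field is 16 bits wide, so at most
# 65535 bytes can be in flight per round trip.
MAX_WINDOW_BYTES=65535

# Largest round-trip time (ms, rounded down) at which a single unscaled
# window can still fill a link of the given speed.
# $1: link speed in bits per second
max_rtt_ms() {
    echo $(( $MAX_WINDOW_BYTES * 8 * 1000 / $1 ))
}

# Throughput ceiling (kbit/s, rounded down) for one connection.
# $1: round-trip time in milliseconds
max_kbps() {
    echo $(( $MAX_WINDOW_BYTES * 8 / $1 ))
}

max_rtt_ms 10000000   # 10 Mbit/s link: prints 52
max_kbps 100          # 100 ms round trip: prints 5242
```

Under these assumptions the break-even point for a 10 megabit/sec link comes out at about 52 ms, in the same range as the figure quoted above, and a 100 ms path would cap a single connection at roughly 5.2 megabit/sec.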

We really noticed this problem when there were multiple devices behind the same NAT. I suspect that the server might have been confused by seeing timestamps from Android devices and OS X machines at the same time, since they put completely different values in the timestamp fields.
