Ubuntu 12.04 randomly dropping network connections

networking, ubuntu

Good day,

A few days ago I installed a new server running Ubuntu 12.04 and migrated all my data and services to it from an old one (a fairly outdated server from 2011). Many different services are installed on it: nginx, mysql, memcached, varnish, etc.

Now I have an issue with network connections: a limited number of connections work normally, but the others do not; they just time out. Such requests do not appear in the web server's access log at all. Moreover, some outgoing cURL requests fail too, timing out at 30 seconds. Sometimes I cannot connect to the server at all: SSH connections occasionally time out too! Pinging google:

--- google.com ping statistics ---
33 packets transmitted, 0 received, 100% packet loss, time 32017ms

I have not installed or updated any additional firewall, nor have I changed any system network configuration options.

You can easily reproduce the problem by opening any page (e.g. http://en.advisor.travel/poi/Wroclaw_Palace-15356) and quickly refreshing it several times. You'll end up with a "Server is not available" browser error.

It almost seems that the server has some weird limit on network connections, or a rule that blocks connections made in quick succession from one IP, but I cannot find any relevant setting :/ Nothing interesting shows up in syslog or dmesg.

/etc/sysctl.conf has everything commented out except the default lines:

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

The default firewall, ufw, is stopped.

Any ideas about what's happening and what to do would be much appreciated!

Best Answer

When I make connections to the server, I find that some client port numbers work and other port numbers fail. The same port numbers work every time, but since normal clients use a new port number for each connection, the failures appear random.
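One way to check this kind of port-dependent behavior yourself is to open TCP connections from fixed client port numbers and record which ones succeed. The following is only a rough sketch, not the exact method used for the observations above: the function names are hypothetical, and it assumes you are free to bind the chosen local ports on the client machine.

```python
import socket


def probe_source_port(host, port, source_port, timeout=3.0):
    """Try one TCP connection from a fixed local (client) port.

    Returns "ok" if the handshake completes, "failed" if the connection
    times out or is refused, and "skipped" if the local port could not
    be bound (for example because it is already in use).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow re-binding a port that is still lingering from an earlier run.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.settimeout(timeout)
    try:
        try:
            sock.bind(("", source_port))
        except OSError:
            return "skipped"
        try:
            sock.connect((host, port))
        except OSError:  # includes timeouts and refused connections
            return "failed"
        return "ok"
    finally:
        sock.close()


def probe_range(host, port, source_ports, timeout=3.0):
    """Probe several client ports and return {source_port: status}."""
    return {p: probe_source_port(host, port, p, timeout) for p in source_ports}
```

Calling something like `probe_range("en.advisor.travel", 80, range(50000, 50032))` a few times and comparing the results should show whether the same client ports fail on every run, which is the pattern described above.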

This behavior could be a symptom of a broken bundle of network links. You want traffic to be load balanced over the links in a bundle, but you don't want packets in a TCP connection to overtake each other on the network, so each TCP connection should remain on the same link in the bundle.

That is why a broken link in a bundle causes some port numbers to work and other port numbers to fail.
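The effect can be illustrated with a small simulation. This is a sketch under stated assumptions: real switches and routers use their own, usually vendor-specific, hash over the packet headers; CRC32 is used here only as a stand-in, and the addresses, link count, and function names are made up for the illustration.

```python
import zlib


def pick_link(src_ip, src_port, dst_ip, dst_port, n_links):
    """Map a TCP flow to one link of the bundle.

    The hash depends only on the addresses and ports, so every packet of
    one connection takes the same link. CRC32 is an assumed stand-in for
    whatever hash the real equipment uses.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_links


def split_ports(src_ip, dst_ip, dst_port, ports, n_links, broken_links):
    """Return (working, failing) client ports given a set of broken links."""
    working, failing = [], []
    for p in ports:
        link = pick_link(src_ip, p, dst_ip, dst_port, n_links)
        (failing if link in broken_links else working).append(p)
    return working, failing


if __name__ == "__main__":
    # Hypothetical setup: 4 links in the bundle, link 2 is broken.
    ok, bad = split_ports("192.0.2.10", "198.51.100.20", 80,
                          range(50000, 50020), n_links=4, broken_links={2})
    print("working client ports:", ok)
    print("failing client ports:", bad)
```

Running it repeatedly gives the same split every time, because a given client port always hashes to the same link. A client that picks a fresh port for each connection would see this as random failures.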

traceroute shows that SYN packets are reliably sent from my computer all the way to the last hop before the server, but I cannot reliably get a SYN-ACK back. That means the problem is either on the last hop, from the last router to the server, or somewhere on the return path.

From the server you can run traceroute -n -p 80 name-of-another-server to see how far SYN packets from the server can be sent reliably.
