Lots of packets pruned and packets collapsed because of socket buffer low/overrun

netstat tcp

I've set up a test machine (Debian Squeeze, kernel 2.6.32, on a Linode 2048) that interacts with an API that returns large chunks of JSON. It calls the API asynchronously 3000 times per minute, and the API returns payloads of ~450kb.
There's also an HTTP server on the box to display the call results.

Running netstat -s shows (uptime is 20 days):

 254329 packets pruned from receive queue because of socket buffer overrun
 50678438 packets collapsed in receive queue due to low socket buffer

This didn't sound good to me so I've followed these tutorials to tweak the TCP parameters:

http://fasterdata.es.net/fasterdata/host-tuning/linux/test-measurement-host-tuning/

and

http://www.acc.umu.se/~maswan/linux-netperf.txt

but it doesn't seem to help.

Any advice, tutorial, or explanation about socket buffers that might help me understand and fix the problem?

thanks

Best Answer

It sounds like you are reaching the maximum network traffic your VPS can handle. Tweaking TCP parameters isn't magic - it can help a little, but probably not enough. Some tweaks may even be negated by running in a virtual machine - the traffic still passes through the hypervisor's real network card and is affected by its settings.

You say the incoming payload is 450kb per request. Is that in kilobits or kilobytes? Most tools measure size in bytes, but I'll do both calculations.

Assuming kilobits:

  • 3000 requests/minute = 50 requests/second
  • 50*450kbit = 22,500kbit/s = approx 22Mbit/s

Assuming kilobytes, multiply by 8 to get bits: 50*450*8 = 180,000kbit/s = approx 176Mbit/s.
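To make the arithmetic above easy to re-run with your own numbers, here is a small Python sketch. The function name and parameters are made up for illustration. It assumes 1 Mbit = 1024 kbit, which is the rounding that gives the 22 and 176 figures, and it ignores TCP and HTTP overhead, so the real load on the wire would be somewhat higher.

```python
def required_mbit_per_s(requests_per_minute, payload_size, unit_is_bytes):
    """Sustained inbound rate in Mbit/s for a given request rate.

    payload_size is in kilobits, or in kilobytes when unit_is_bytes is True.
    Assumption: 1 Mbit = 1024 kbit; protocol overhead is not counted.
    """
    requests_per_second = requests_per_minute / 60
    # Convert kilobytes to kilobits when needed (8 bits per byte).
    kbit_per_payload = payload_size * 8 if unit_is_bytes else payload_size
    return requests_per_second * kbit_per_payload / 1024


# 3000 requests/minute with ~450kb payloads, both interpretations.
print(round(required_mbit_per_s(3000, 450, unit_is_bytes=False), 1))  # 22.0
print(round(required_mbit_per_s(3000, 450, unit_is_bytes=True), 1))   # 175.8
```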

If it's kilobytes, you aren't going to be able to consistently do that on most VPS servers. Each server is going to have at least 10-20 VPSs on it. Linode uses two gigabit bonded connections to each server. With 2Gbit/s shared between 20 VPSs, your "fair share" on a full server would be around 100Mbit/s at best.

Even if it is kilobits, 22Mbit/s is a fair bit for most VPSs.

By doing so many requests so fast, you are probably doing the equivalent of DoSing your own server. Checking your actual incoming network traffic should give you an idea of how much you are really using. If you need real 100Mbit/s or even gigabit speeds, you may need to look at a dedicated server. Otherwise, you need to reduce the request rate until the server can keep up.
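One rough way to check the actual incoming traffic without extra tools is to read the receive byte counter in /proc/net/dev twice and take the difference. The sketch below is only an illustration: the helper names are made up, and the interface name eth0 is an assumption, so use whatever your box actually has. It reports decimal megabits (1 Mbit = 1,000,000 bits).

```python
import time


def rx_bytes(proc_net_dev_text, interface):
    """Return the received-bytes counter for one interface.

    Expects the contents of /proc/net/dev. The first two lines are headers,
    and the first counter after the colon is received bytes.
    """
    for line in proc_net_dev_text.splitlines()[2:]:
        # Split on ":" because older kernels may omit the space after it.
        name, sep, counters = line.partition(":")
        if sep and name.strip() == interface:
            return int(counters.split()[0])
    raise KeyError(interface)


def inbound_mbit_per_s(bytes_before, bytes_after, seconds):
    """Average inbound rate in decimal Mbit/s between two counter readings."""
    return (bytes_after - bytes_before) * 8 / seconds / 1_000_000


def sample(interface="eth0", seconds=10):
    """Measure the average inbound rate over `seconds` (Linux only)."""
    with open("/proc/net/dev") as f:
        before = rx_bytes(f.read(), interface)
    time.sleep(seconds)
    with open("/proc/net/dev") as f:
        after = rx_bytes(f.read(), interface)
    return inbound_mbit_per_s(before, after, seconds)
```

Calling sample() while the API client is running gives a figure you can compare against the estimates above. Counter wraparound on 32-bit kernels is not handled here, so keep the sampling interval short.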

You also need to check your memory and CPU usage. If either of those is maxed out, your server will start dropping packets because it simply doesn't have the resources to handle them. Start by looking at top and ntop to watch your CPU, memory and network usage for a while.
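If you would rather log these numbers over time than watch top by hand, the same information is in /proc/loadavg and /proc/meminfo. This is a minimal sketch with made-up helper names. It treats buffers and page cache as reclaimable, which is an approximation, and it avoids the MemAvailable field because a 2.6.32 kernel does not have it.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo contents into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def memory_used_fraction(meminfo):
    """Fraction of RAM in use, not counting buffers and page cache."""
    used = (meminfo["MemTotal"] - meminfo["MemFree"]
            - meminfo["Buffers"] - meminfo["Cached"])
    return used / meminfo["MemTotal"]


def load_per_cpu(loadavg_text, cpu_count):
    """1-minute load average from /proc/loadavg, divided by the CPU count."""
    return float(loadavg_text.split()[0]) / cpu_count
```

A load per CPU that stays above 1, or a used fraction close to 1, during the test runs would suggest the box itself is the bottleneck rather than the network.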