We are seeing the following:
[root@primary data]# netstat -s | grep buffer ; sleep 10 ; netstat -s | grep buffer
20560 packets pruned from receive queue because of socket buffer overrun
997586 packets collapsed in receive queue due to low socket buffer
20587 packets pruned from receive queue because of socket buffer overrun
998646 packets collapsed in receive queue due to low socket buffer
[root@primary data]#
Bear in mind, the above is a freshly rebooted box with about 1 hour of uptime. We recently had a box that had been up for 2 months, and these counters ran into the high millions (XXX million).
We have tried changing various sysctl variables…
Here are our sysctl variables which I believe are related:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
Does anyone know how to resolve these packets pruned due to socket buffer overrun, and the packets being collapsed (which I understand isn't as bad as the pruned packets)?
Thanks.
Best Answer
Judging from the information you have provided, and since you seem to have already increased the buffers, the problem most likely lies in your application. The fundamental problem here is that even though the OS receives the network packets, the application does not process them fast enough, so the receive queue fills up.
This does not necessarily mean that the application itself is too slow; it is also possible that it does not get enough CPU time because too many other processes are running on that machine.
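To judge how far behind the application is falling, it can help to turn the two samples from the question into per-second rates. The following is only a rough sketch: the helper names `parse_counters` and `rates_per_second` are made up for illustration, the counter lines are copied from the question, and the 10 second interval is assumed from the `sleep 10` (it ignores the time netstat itself takes to run).

```python
import re

# The two samples from the question, taken roughly 10 seconds apart.
BEFORE = """\
20560 packets pruned from receive queue because of socket buffer overrun
997586 packets collapsed in receive queue due to low socket buffer
"""
AFTER = """\
20587 packets pruned from receive queue because of socket buffer overrun
998646 packets collapsed in receive queue due to low socket buffer
"""


def parse_counters(text):
    """Map each counter description to its integer value (hypothetical helper)."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def rates_per_second(before, after, interval_seconds):
    """Return the per-second increase of every counter present in both samples."""
    old, new = parse_counters(before), parse_counters(after)
    return {
        name: (new[name] - old[name]) / interval_seconds
        for name in new
        if name in old
    }


if __name__ == "__main__":
    # 10 is an assumption taken from the `sleep 10` in the question.
    for name, rate in rates_per_second(BEFORE, AFTER, 10).items():
        print(f"{rate:8.1f}/s  {name}")
```

For the sample in the question this works out to about 2.7 prune events and 106 collapsed packets per second. Repeating the measurement while the application is under more or less load, or while other processes on the machine are stopped, should show whether the rate follows the application's own workload or the competition for CPU time.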