OpenVZ: What exactly does it mean when tcpsndbuf failcnt increases? Why must there be a minimum difference between limit and barrier?

openvz

When the failcnt of tcpsndbuf increases, what does this mean? Does it mean the system had to go past the barrier, or past the limit? Or, maybe, that the system failed to provide enough buffers, either because it needed to go past the limit, or because it needed to go past the barrier but couldn't because other VMs were using too many resources?

I understand the difference between barrier and limit only for disk space, where you can specify a grace period for which the system can exceed the barrier but not the limit. But in resources like tcpsndbuf, which have no such thing as a grace period, what is the meaning of barrier vs. limit?

Why does the difference between barrier and limit in tcpsndbuf have to be at least 2.5KB times tcpnumsock? I could understand it if, e.g., tcpsndbuf should be at least 2.5KB times tcpnumsock (either the barrier or the limit), but why should I care about the difference between the barrier and the limit?

Best Answer

You have quite a few questions here.

When the failcnt of tcpsndbuf increases, what does this mean? In the simplest terms, failcnt counts the number of times a resource went beyond its limit. For tcpsndbuf, the counter is incremented each time the container tries to go past its TCP send buffer size limit. If this counter keeps climbing over a prolonged period, you will start to notice network performance issues.
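To watch that counter, you can read it from /proc/user_beancounters on the host or inside the container. Below is a minimal Python sketch of how one might pull failcnt out of that table. The helper name parse_beancounters and the numbers in SAMPLE are made up for illustration; it assumes the usual column order (resource, held, maxheld, barrier, limit, failcnt) and a table for a single container.

```python
def parse_beancounters(text):
    """Return {resource: {held, maxheld, barrier, limit, failcnt}}.

    Hypothetical helper: assumes the text holds the table of one container.
    """
    fields = ("held", "maxheld", "barrier", "limit", "failcnt")
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # The first row of a container starts with "<uid>:"; drop that token.
        if parts and parts[0].endswith(":"):
            parts = parts[1:]
        # A data row is a resource name followed by five integers.
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            counters[parts[0]] = dict(zip(fields, map(int, parts[1:])))
    return counters


# Illustrative values only, not taken from a real system.
SAMPLE = """\
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize         803866    1246758    2457600    2621440          0
            tcpsndbuf        139520     319920     319488     524288         12
"""

counters = parse_beancounters(SAMPLE)
print(counters["tcpsndbuf"]["failcnt"])
```

On a real system you would pass the contents of /proc/user_beancounters instead of SAMPLE and compare failcnt between two readings, since the counter only ever grows.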

There is a difference between the barrier and limit.

Going past the barrier, you will just see degraded network performance. Applications will still function, but the network will be slow.

Going past the limit for a certain time period, you will start seeing dropped connections.

As for the 2.5KB: this gap is required to ensure there is enough buffer space for current network connections to send data successfully. Without it, connections may hang halfway through sending data.
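As a worked example of that rule, here is a small Python sketch that checks whether limit minus barrier is at least 2.5KB times tcpnumsock. The function names are hypothetical, the parameter name follows the question's wording, and it assumes that "2.5KB" means 2.5 * 1024 bytes; the barrier, limit and socket count are illustrative values.

```python
KB = 1024  # assumption: "2.5KB" is read as 2.5 * 1024 = 2560 bytes


def min_gap(tcpnumsock):
    """Smallest allowed (limit - barrier) in bytes for this many sockets."""
    return int(2.5 * KB) * tcpnumsock


def gap_ok(barrier, limit, tcpnumsock):
    """True if the barrier/limit gap satisfies the 2.5KB-per-socket rule."""
    return limit - barrier >= min_gap(tcpnumsock)


# Illustrative settings: 80 sockets need a gap of 2560 * 80 = 204800 bytes.
print(min_gap(80))
print(gap_ok(barrier=319488, limit=524288, tcpnumsock=80))
print(gap_ok(barrier=319488, limit=400000, tcpnumsock=80))
```

With these numbers the first pair of settings leaves a gap of exactly 204800 bytes, so it passes, while the second leaves only 80512 bytes and fails.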

__

  • Is failcnt increased or not? The failcnt field shows the number of refused "resource allocations" over the lifetime of the container. It increases each time tcpsndbuf hits the limit (this is what I meant by “past the tcp buffer size”).

  • Suppose the container asks for tcpsndbuf beyond the barrier. Is this granted or not? It is granted. However, bear in mind that setting a higher value for tcpsndbuf will not automatically improve network performance; it is also bounded by hardware limits. Performance starts to degrade because of the UBC consistency checks. If the constraints are not satisfied, transmission of data over the sockets may hang in some circumstances (not always, just sometimes).