Linux tc: unstable rate using tbf qdisc

linux, rate-limiting, tc, traffic-shaping

This is the first time I am experimenting with tc. What I am trying to achieve is to limit the download rate that passes through a Virtual Machine that acts as gateway. The VM has two Ethernet interfaces. Traffic coming from the users enters eth0 and it exits through eth1 to reach the Internet. These two interfaces are bridged. The set-up works fine and without using tc to impose any limits I can achieve ~10-11MB/s download/upload.

In order to limit the download traffic I apply a tbf qdisc on eth0. For example when I want to limit the download rate I use the following:

tc qdisc add dev eth0 handle 10: root tbf rate $R burst 15k latency 25ms

This works fairly OK for limiting the rate up to 2MB/s, however when placing a higher limit on the rate I start to see fluctuations. I experimented with different values for latency and burst and I noticed that if for larger rates I increase burst to 40K and latency to 50ms in some cases I get less fluctuation but this is not always the case.

In general, I notice the following:

  • In all cases the maximum rate I can achieve even with fluctuations is ~90% of the rates I set on the tbf.
  • Even for rates <2MB/s I see a small fluctuation. For example, placing a 1MB/s rate limit will give a fairly stable rate of 920KB/s but with minor instant drops down to 880KB/s and then back to 920-921KB/s. After investigation, while these fluctuations happen I notice using watch tc -p -s -d qdisc show dev eth0 that the backlog increases and when the drop in the rate happens the backlog decreases. In the case of the 1MB/s the backlog increases up to 30-40p then when it drops the download rate shows an instant fluctuation.
  • In the cases of larger rates which cause more fluctuation, when setting a rate limit of 6MB/s for example, the rate can drop down to 2MB/s and then constantly fluctuate between 2MB/s and 4MB/s. Not stabilizing at all. In this case I see that the backlog can reach up to 120p and when it drops to 0 the dropped packets increase and that causes large fluctuations.

The way I am testing whether the set-up works is using a machine whose traffic traverses this Gateway VM and I perform the following:

  • I use speedtest.com to measure speed which gives correct values regarding the rates i.e. the rate is always close to what I set.
  • I use wget to download large files from the Internet (linux ISO files) and monitor the download rate. These are the cases where I see most fluctuations, and the lower the limit the less the fluctuation. Also, downloading from some servers gives less fluctuation than others. I can understand that things have to do with the server but when I remove the tbf qdisc from eth0 the download rate I get is fairly stable at 10.2-10.6 MB/s. So the tbf and its configuration must have to do something with why the download rate fluctuates.
  • Finally, in order to rule out traffic congestion on the internet I use SCP to download ISO files from another PC in my office's network. Using SCP the rate seems to be stable. At least a lot more stable than in the approach with wget.

I have tried to read a lot of material on tc and tbf, I also tried using htb and cbq but these qdiscs give more unstable download rates in my case. In my tests until now only tbf has been the most stable, but problems start showing up with large rates.

Therefore I would like to ask:

  • In general, how stable should I expect a rate that I set using tc to be?
  • Is it normal to achieve only ~90% of the rate I specify using a tbf qdisc?
  • What is the meaning of the backlog? I have tried to find out what it is but I was not able to find any documentation on what it expresses. However, I believe it may be the reason for the unstable rate: while it stays at low values the rate seems stable at 90% of the specified one, and I only notice a fluctuation in the rate when it gets too big and then returns to zero/low values.
  • How can I move towards better optimizing the tbf to be more stable?
  • Is the approach I follow to test the rate limiting correct? If not, is there a better and more accurate way to test and make sure that my rate limiting works?

I know this is a lot of information, I have tried to be as detailed as possible in order to better explain my situation. I would be extremely grateful for any pointers on how to achieve stable download rates using tbf. Thank you in advance.

Best Answer

There is no "in general" expectation with tc; it depends on the qdisc you are using. As for TBF, "Journey in the center of the kernel" describes some of its limits (all traffic goes through a single queue).

According to lartc, you have to increase the tbf bucket size (the burst parameter) in line with your bandwidth.
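As a rough illustration of that sizing rule, the tc-tbf man page suggests that the minimum bucket size can be estimated by dividing the rate by the kernel's HZ value. The sketch below assumes the rate is given in bytes per second and that HZ is 250 when not specified; the real HZ of the gateway VM is not known here, and `min_burst_bytes` is a hypothetical helper name, not part of tc.

```shell
#!/bin/sh
# min_burst_bytes: hypothetical helper, not part of tc.
# Rule of thumb from tc-tbf(8): burst >= rate / HZ.
#   $1 = rate in bytes per second
#   $2 = kernel HZ (assumption: 250 if omitted; check your kernel config)
min_burst_bytes() {
    rate_bps=$1
    hz=${2:-250}
    # Integer division, rounded up so the bucket is never undersized.
    echo $(( (rate_bps + hz - 1) / hz ))
}

# Example, not executed here: a 6 MB/s limit on a HZ=250 kernel.
# min_burst_bytes 6000000 250
# tc qdisc replace dev eth0 handle 10: root tbf rate "$R" \
#     burst "$(min_burst_bytes 6000000 250)" latency 50ms
```

Under these assumptions a 6 MB/s limit would need at least 24000 bytes of burst on a HZ=250 kernel and 60000 bytes on a HZ=100 kernel, both above the 15k used in the question. Treat the result as a lower bound to experiment from, not a tuned value.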

The backlog is basically the length of the queue, i.e. the packets currently waiting in the qdisc; you can read more information here, for example.
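To correlate the backlog with the rate drops, it can help to log just the queued packet count instead of watching the full statistics. A minimal sketch, assuming the statistics contain a line of the form `backlog <bytes>b <packets>p` (the `p` suffix matches the 30-40p values quoted in the question); `backlog_pkts` is a hypothetical helper name.

```shell
#!/bin/sh
# backlog_pkts: hypothetical helper, not part of tc.
# Reads `tc -s qdisc show` output on stdin and prints the packet count
# from every "backlog <bytes>b <packets>p" line it finds.
backlog_pkts() {
    sed -n 's/.*backlog [0-9A-Za-z]*b \([0-9]*\)p.*/\1/p'
}

# Example, not executed here: sample the backlog of eth0 once per second.
# while sleep 1; do
#     tc -s qdisc show dev eth0 | backlog_pkts
# done
```

Logging this next to the download rate should show whether the drops line up with the queue filling to its limit and then draining.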

Your bandwidth drops while the backlog is emptying, which is not supposed to happen. It reminds me of this story, where garbage collection in a TCP queue was impacting latency. You could try to understand what is happening in your case.

The 90% figure could come from the overhead you specify and from your protocol ratio: if you are shaping the upload at 1MB/s and you receive 10 packets for each packet you send, you are effectively shaping your download at 10MB/s.
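Part of the gap can be estimated with simple arithmetic: the shaper counts whole packets, while a download tool reports only payload. The sketch below uses assumed figures that do not come from the question: a 1500-byte MTU, a 20-byte IP header, a 32-byte TCP header (with timestamps), and a 14-byte Ethernet header included in the length the qdisc sees. `payload_permille` is a hypothetical helper name.

```shell
#!/bin/sh
# payload_permille: hypothetical helper for a back-of-the-envelope estimate.
# Prints the payload share of each shaped packet in parts per thousand.
#   $1 = payload bytes per packet
#   $2 = total bytes per packet as counted by the shaper
payload_permille() {
    echo $(( $1 * 1000 / $2 ))
}

# Assumed figures (not taken from the question):
#   payload = 1500 - 20 (IP) - 32 (TCP with timestamps) = 1448 bytes
#   counted = 1500 + 14 (Ethernet header)               = 1514 bytes
# payload_permille 1448 1514   # prints 956, i.e. about 95.6%
```

With these assumptions header overhead alone accounts for roughly 4-5% of the configured rate; unit differences (for example a tool reporting 1024-byte kilobytes) and retransmissions after drops could account for more.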

I don't know much about tbf, though, so I can't help you with the configuration. Good luck!
