Iperf3 – discrepancy between TCP and UDP results

linux-networking

Recently I've done some network testing with iperf3 (3.1.7 on CentOS 7.5, kernel 3.10; I later also tried 3.7+ compiled from source), and I'm confused about the results.

The syntax I'm using is iperf3 -c <server> -b 500M, both for TCP and UDP testing (with -u added for the UDP runs). My understanding is that by default, iperf3 does some kind of internal pacing to maintain the desired rate.

I have read that in newer versions of iperf3 the timers that control the pacing mechanism have fairly good resolution (1 ms), so the traffic shouldn't be very bursty.

The thing is that for UDP testing I'm getting packet loss (1%–2%), while TCP testing reaches the full desired bandwidth and reports no retransmissions.

I don't have good insight into exactly how the pacing works, but I tend to think that if it is similar for both TCP and UDP and the issue were on the network, I should expect consistent results for both (i.e., retransmissions in the TCP case).

Has anyone experienced similar results? If so, what was the reason for such a discrepancy?

Best Answer

Packets can get lost at different points in the network, but I see no reason why the network hardware should have more trouble with UDP than with TCP.

I can imagine, though, that it is harder for the receiving system to deliver UDP packets to the receiving application in time. I am not familiar with the details, but the buffers for incoming packets are limited.
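As a starting point you can check what the kernel currently allows for socket receive buffers. The sysctl names below are the standard Linux ones; whether enlarging the buffer actually helps in your case is only a guess to verify:

```shell
# Show the default and maximum socket receive buffer sizes (in bytes).
# iperf3's -w option can request a larger per-socket buffer, capped by
# net.core.rmem_max, so a small maximum here limits what -w can do.
sysctl net.core.rmem_default net.core.rmem_max
```

If the maximum turns out to be small, raising it (sysctl -w, as root) and rerunning the UDP test with a larger -w is a cheap experiment.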

With TCP it is probably easier for the kernel to aggregate incoming packets into larger chunks of data that are handed to the application. If the application sends small amounts of data in UDP mode, so that each data unit fits into a single UDP packet, there is nothing for the receiving kernel to aggregate: because UDP preserves message boundaries, each datagram has to be read separately.

My guess is that the higher number of transfers between the kernel and the application in UDP mode is the bottleneck, and that the receiving kernel drops packets because the application does not read them in time. Have a look at the Recv-Q column in the output of ss -un on the receiving system.
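A small helper can pull that column out while the benchmark runs. This is just a sketch; in the ss versions I have seen, Recv-Q is the second column of ss -un output, so adjust the field number if yours differs:

```shell
# Print the Recv-Q value for each UDP socket shown by `ss -un`.
# A Recv-Q that keeps growing means the application is not draining the
# socket buffer fast enough and the kernel will start dropping datagrams.
recvq() {
    awk 'NR > 1 { print $2 }'
}

# On the receiving system, while the test runs:
#   ss -un | recvq
```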

You could also start tcpdump -p with a filter for just those UDP packets on both systems before you start the benchmark, and compare the number of packets seen. If the numbers are the same, then you know that the receiving kernel dropped the packets. If the app sends data chunks which do not fit into a single UDP packet, then you may have to disable GSO on both interfaces to see the correct numbers of UDP packets. Due to the additional CPU load, I would expect the packet loss to increase while tcpdump is running.
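A possible way to do that comparison (a sketch only; eth0 and port 5201, iperf3's default port, are placeholders to adapt, and the capture itself needs root):

```shell
# Run on BOTH hosts before starting the benchmark:
#
#   tcpdump -p -n -i eth0 -w /tmp/iperf-udp.pcap 'udp and port 5201' &
#   CAP_PID=$!
#   ... run the iperf3 UDP test ...
#   kill "$CAP_PID"
#
# If segmentation offload skews the counts, disable it first, e.g.:
#   ethtool -K eth0 gso off gro off
#
# Then count captured packets (tcpdump -r prints one line per packet)
# and compare the totals between sender and receiver:
count_packets() {
    wc -l | tr -d ' '
}
#   tcpdump -n -r /tmp/iperf-udp.pcap | count_packets
```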

You can also have a look at the RX "overrun" counter in the output of ip -s link. If that number is greater than zero, then packets were dropped by the receiving hardware because the kernel did not read them in time. Those packets would not show up in tcpdump.
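For example, assuming the usual ip -s link layout (the counters sit on the line after the "RX:" header, with overrun as the fifth field; newer iproute2 labels it "missed"), and with eth0 standing in for your interface:

```shell
# Print the RX overrun counter for an interface from `ip -s link` output.
# A non-zero value means the NIC dropped frames before the kernel saw them.
rx_overruns() {
    awk '/RX:/ { getline; print $5 }'
}

#   ip -s link show dev eth0 | rx_overruns
```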
