How is congestion avoided when using UDP

congestion, transport-protocol, udp

I understand how TCP goes about this – through various methods such as Congestion Window (CWND), Sliding Window, Slow Start and Fast Recovery – it's basically built into the protocol. I understand that when shaping or MSS clamping is applied along the path, TCP can adapt its segment size and transfer rate accordingly.

I also understand that QUIC (which runs over UDP) implements its own version of loss detection and congestion control. QUIC is sort of like a highly-tuned, low-latency, more extensible version of TCP, built for HTTP/3, but it can also transport any other application data.

However, it's not clear to me how applications that do not use QUIC can determine:

  • How many UDP datagrams can be sent at once
  • How large each UDP datagram can be (if PMTUD does not work due to ICMP replies being disabled/blocked by middleboxes)

It's not really clear to me how UDP, being a best-effort protocol, doesn't just flood the link/path between the sender and recipient, resulting in significant packet loss.

Or put another way: How can an application, which needs to send a high volume of data, either statically or dynamically determine the available bandwidth on the path between the sender and receiver, and adjust its datagram size and transfer rate accordingly?*

I understand that RTP has things like RTCP that allow it to limit/increase flow rates or change codecs to reduce/increase packet sizes, for quality control. When running Iperf tests using UDP, you can set the target bandwidth with the "-b" option. DNS and DHCP, which use UDP, have comparatively tiny datagram sizes, and will just retry until they receive a response or time out.
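For illustration, that kind of statically configured sending (what Iperf's "-b" does conceptually) is essentially just a pacing loop. This is a minimal sketch; the address, port, rate and datagram size below are made-up placeholders, not anything UDP itself prescribes:

    import socket
    import time

    TARGET = ("198.51.100.10", 5001)    # hypothetical receiver address
    TARGET_BITRATE = 10_000_000         # 10 Mbit/s, chosen up front by the user
    PAYLOAD = b"\x00" * 1200            # conservative datagram size
    INTERVAL = len(PAYLOAD) * 8 / TARGET_BITRATE   # seconds between datagrams

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    next_send = time.monotonic()

    while True:
        sock.sendto(PAYLOAD, TARGET)
        next_send += INTERVAL
        # Sleep until the next slot. Nothing on the path or at the receiver
        # feeds back into this rate; if 10 Mbit/s is more than the path can
        # carry, the excess is simply dropped somewhere along the way.
        time.sleep(max(0.0, next_send - time.monotonic()))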

*The answer, based on my current understanding, would be: The application would just be designed to use TCP if it needed to send/receive a high volume of data (such as backups or high-definition streaming), and simply wouldn't use UDP unless it also had a supporting out-of-band protocol like RTCP or could be certain of the bandwidth available (i.e. set by the user, like with Iperf).

Best Answer

In addition to Steffen's fine answer, perhaps some more direct replies:

how applications can determine:

  • How many UDP datagrams can be sent at once

They can't. Neither IP nor UDP provides any mechanism to determine that. A host can send UDP datagrams at any rate that its interface(s) allow. An application can exceed that rate as long as the OS's stack can buffer the send requests.

Generally, an application utilizing significant network bandwidth with UDP needs to implement some kind of congestion control at the application layer.
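As a sketch of what that can look like (nothing here is mandated by any standard; the feedback format, addresses, rates and thresholds are all assumptions): the sender paces its datagrams, the receiver periodically reports how many it actually got, and the sender backs off on loss and probes upward otherwise, similar in spirit to TCP's AIMD behaviour.

    import socket
    import struct
    import time

    TARGET = ("198.51.100.10", 5001)   # hypothetical receiver address
    PAYLOAD = b"\x00" * 1200           # conservative datagram size
    MIN_RATE, MAX_RATE = 250_000, 50_000_000   # bits per second

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)

    rate = 1_000_000                   # start modestly, e.g. 1 Mbit/s
    sent_since_report = 0
    next_send = time.monotonic()

    while True:
        now = time.monotonic()

        # Pace outgoing datagrams according to the current rate.
        while now >= next_send:
            sock.sendto(PAYLOAD, TARGET)
            sent_since_report += 1
            next_send += len(PAYLOAD) * 8 / rate

        # Consume any feedback report (assumed format: a 4-byte count of
        # datagrams the receiver got since its last report) and adjust the
        # rate AIMD-style: back off on loss, probe upward otherwise.
        try:
            report, _ = sock.recvfrom(64)
            (received,) = struct.unpack("!I", report[:4])
            lost = max(0, sent_since_report - received)
            if lost / max(1, sent_since_report) > 0.02:
                rate = max(MIN_RATE, rate // 2)        # multiplicative decrease
            else:
                rate = min(MAX_RATE, rate + 100_000)   # additive increase
            sent_since_report = 0
        except BlockingIOError:
            pass

        time.sleep(0.001)   # keep this toy loop from spinning at 100% CPU

Real protocols (RTCP feedback, QUIC's loss detection, DCCP) do the same job with far more care, but the division of labour is the same: UDP delivers datagrams, the application decides how fast to send them.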

  • How large each UDP datagram can be (if PMTUD does not work due to ICMP replies being disabled/blocked by middleboxes)

Using fragmentation, an application can send UDP datagrams up to the encapsulating IP packet's maximum size (65,535 bytes, i.e. just under 64 KiB).

Without fragmentation and with no feedback at all from either the network or the destination, the application can only guess. For IPv4, the guaranteed minimum fragment size (the smallest MTU every link must support) is just 68 bytes. For IPv6 that's 1280 bytes.
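To make those numbers concrete, here is the arithmetic an application might use when picking payload sizes, assuming plain headers (no IPv4 options, no IPv6 extension headers):

    IPV4_HEADER = 20    # bytes, assuming no IP options
    IPV6_HEADER = 40    # bytes, assuming no extension headers
    UDP_HEADER = 8

    # With fragmentation: the 16-bit IPv4 total-length field caps the whole
    # packet at 65,535 bytes, so the largest possible UDP payload is:
    max_ipv4_payload = 65_535 - IPV4_HEADER - UDP_HEADER    # 65,507 bytes

    # Without fragmentation and without feedback, only conservative guesses:
    # IPv6 guarantees a minimum link MTU of 1280 bytes.
    safe_ipv6_payload = 1280 - IPV6_HEADER - UDP_HEADER     # 1232 bytes

    # IPv4 only guarantees 68 bytes per link, but every host must be able to
    # reassemble a 576-byte datagram, which is why roughly 512-548 bytes is
    # the traditional safe choice (classic DNS over UDP stops at 512).
    safe_ipv4_payload = 576 - IPV4_HEADER - UDP_HEADER      # 548 bytes

    print(max_ipv4_payload, safe_ipv6_payload, safe_ipv4_payload)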
