In discussions that have spun off from other questions on this site, I've realised that I don't have a solid understanding of when Path MTU Discovery (PMTUD) is performed.
I know what it does: discover the lowest MTU on a path from client to server.
I know how it does it: send progressively larger packets with their "Don't Fragment" bit set, and see how big a packet you can get through without getting an "ICMP Need to Fragment" error.
My question is specifically then, when will a host perform PMTUD?
I'm looking for specific cases. Not just something generic like "when a host wants to discover the path MTU". Bonus points if you can provide a packet capture of a host doing it, or provide instructions for generating such a packet capture.
Also, I am specifically referring to IPv4. I know that in IPv6 transit routers aren't responsible for fragmentation, and I can imagine that PMTUD happens much more commonly there. But for now, I'm looking for specific examples of PMTUD in IPv4. (Although if the only packet capture you can put together of PMTUD is in IPv6, I would still love to see it.)
Best Answer
The answer is simple: whenever the host pleases. Really. It's that simple.
The explanation below assumes an IPv4-only environment, since IPv6 does away with fragmentation in the routers (forcing the host to always deal with fragmentation and MTU discovery).
There is no strict rule that governs when (or even if) a host does Path MTU Discovery. The reason that PMTUD surfaced is that fragmentation is considered harmful for various reasons. To avoid packet fragmentation, the concept of PMTUD was brought to life as a workaround. Of course, a nice operating system should use PMTUD to minimize fragmentation.
So, naturally, the exact semantics of when PMTUD is used depend on the sender's operating system - in particular, the socket implementation. I can only speak for the specific case of Linux, but other UNIX variants are probably not very different.
In Linux, PMTUD is controlled by the IP_MTU_DISCOVER socket option. You can retrieve its current status with getsockopt(2) by specifying the level IPPROTO_IP and the IP_MTU_DISCOVER option. Kernel-driven discovery applies to SOCK_STREAM sockets only (a SOCK_STREAM socket is a two-way, connection-oriented, reliable socket; in practice it's a TCP socket, although other protocols are possible), and when the option is set, Linux will perform PMTUD exactly as defined in RFC 1191.

Note that in practice, PMTUD is a continuous process: packets are sent with the DF bit set, including the 3-way handshake packets, so you can think of it as a connection property (although an implementation may be willing to accept a certain degree of fragmentation at some point and stop sending packets with the DF bit set). Thus, PMTUD is just a consequence of the fact that everything on that connection is being sent with DF.
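The getsockopt(2) query described above could be sketched in Python roughly as follows. This is only an illustration under a few assumptions: it targets Linux, the numeric fallback 10 for IP_MTU_DISCOVER and the mode values 0 to 3 are taken from the Linux headers (Python's socket module may not export them), and the helper names describe_pmtu_mode and read_pmtu_mode are made up for this example.

```python
import socket
import sys

# Assumption: Linux value from <linux/in.h>; used only if Python's socket
# module does not export the constant itself.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)

# Linux PMTUD modes as documented in ip(7).
PMTU_MODES = {
    0: "IP_PMTUDISC_DONT",   # never do Path MTU Discovery
    1: "IP_PMTUDISC_WANT",   # use per-route settings
    2: "IP_PMTUDISC_DO",     # always do Path MTU Discovery
    3: "IP_PMTUDISC_PROBE",  # set DF but ignore the path MTU
}


def describe_pmtu_mode(value):
    """Map a raw IP_MTU_DISCOVER value to its symbolic name (hypothetical helper)."""
    return PMTU_MODES.get(value, "unknown (%d)" % value)


def read_pmtu_mode(sock_type=socket.SOCK_STREAM):
    """Create an unconnected IPv4 socket and read its IP_MTU_DISCOVER setting."""
    with socket.socket(socket.AF_INET, sock_type) as s:
        return s.getsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER)


if __name__ == "__main__":
    if sys.platform.startswith("linux"):
        print("TCP socket:", describe_pmtu_mode(read_pmtu_mode()))
    else:
        print("IP_MTU_DISCOVER as used here is Linux-specific; nothing to query.")
```

Running it on a Linux host prints the mode a freshly created TCP socket starts with; the exact value depends on the system's configuration.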
What if you don't set IP_MTU_DISCOVER?

There's a default value. By default, IP_MTU_DISCOVER is enabled on SOCK_STREAM sockets. The default can be read or changed through /proc/sys/net/ipv4/ip_no_pmtu_disc. A zero value means that IP_MTU_DISCOVER is enabled by default in new sockets; a non-zero value means the opposite.

What about connectionless sockets?
This is tricky because connectionless, unreliable sockets do not retransmit lost segments. It becomes the user's responsibility to packetize the data in MTU-sized chunks. Also, the user is expected to make the necessary retransmits in case of a "Message too long" (EMSGSIZE) error. So, essentially, user code must reimplement PMTUD. Nevertheless, if you're up for the challenge, you can force the DF bit by setting the IP_MTU_DISCOVER option to IP_PMTUDISC_DO with setsockopt(2).

The bottom line