Key to answering your question is making a very important distinction: Duplicate Address Detection (DAD) and Multicast Listener Discovery (MLD) are completely separate protocols.
That said, I think this paragraph of RFC 2710 explains the "duplicate" transmission:
When a node starts listening to a multicast address on an interface,
it should immediately transmit an unsolicited Report for that address
on that interface, in case it is the first listener on the link. To
cover the possibility of the initial Report being lost or damaged, it
is recommended that it be repeated once or twice after short delays
[Unsolicited Report Interval]. (A simple way to accomplish this is
to send the initial Report and then act as if a Multicast-Address-
Specific Query was received for that address, and set a timer
appropriately).
RFC 3810 (MLDv2) updates RFC 2710, but still agrees with the above. (Thanks to @logion for pointing this out!) I'm leaving the quote from RFC 2710 in place because I think it is easier to understand without additional context. The part of RFC 3810 that confirms the behavior is here:
To cover the possibility of the State Change Report being missed by
one or more multicast routers, [Robustness Variable] - 1
retransmissions are scheduled, through a Retransmission Timer, at
intervals chosen at random from the range (0, [Unsolicited Report
Interval]).
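To make the RFC 3810 wording concrete, here is a minimal sketch of that retransmission schedule. The function name and parameters are my own for illustration, and the defaults (Robustness Variable of 2, Unsolicited Report Interval of 1 second) are what I understand the RFC 3810 defaults to be; a real stack may be configured differently.

```python
import random


def schedule_retransmissions(robustness_variable=2,
                             unsolicited_report_interval=1.0,
                             rng=random):
    """Return the delays (in seconds) before each retransmission.

    Hypothetical helper: [Robustness Variable] - 1 retransmissions, each
    after a random interval drawn from the Unsolicited Report Interval.
    Note: rng.uniform may return the endpoints, whereas the RFC describes
    an open range; that difference is ignored in this sketch.
    """
    retransmissions = robustness_variable - 1
    return [rng.uniform(0.0, unsolicited_report_interval)
            for _ in range(retransmissions)]


if __name__ == "__main__":
    delays = schedule_retransmissions()
    # Initial report plus the retransmissions = total reports on the wire.
    print("total reports sent:", 1 + len(delays))
    print("retransmission delays (s):", delays)
```

With a Robustness Variable of 2 there is exactly one retransmission, so the report shows up twice in a capture, which matches the "duplicate" you are seeing.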
That leaves DAD, and specifically why DAD is completely independent of MLD.
Before an IPv6 node uses an IP address, it must first check whether the address is already in use. This check is done by sending a Neighbor Solicitation (NS) message to the Solicited Node Multicast (SNM) address of the target IP address. The SNM address is a multicast address that any node owning the target IP address would be listening on.
The Destination IP of the Neighbor Solicitation packet is FF02::1:FFyy:yyyy (where yy:yyyy is the last 24 bits of the target IPv6 address), and the Source IP is ::, the "unspecified" IPv6 address. The node must use the unspecified address rather than the target address itself because, at this point, it doesn't yet know whether the address is unique, and sending from it could itself create a conflict.
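As a quick illustration of how the FF02::1:FFyy:yyyy destination is derived, here is a small sketch using Python's standard ipaddress module. The function name and the example address (from the 2001:db8::/32 documentation prefix) are my own choices, not something taken from the captures.

```python
import ipaddress

# Solicited-node prefix ff02::1:ff00:0/104; the low 24 bits are filled in
# from the target address.
SNM_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for a target address."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    return ipaddress.IPv6Address(SNM_BASE | low24)


if __name__ == "__main__":
    # Example target address (hypothetical, documentation prefix).
    print(solicited_node_multicast("2001:db8::abcd:1234:5678"))
    # prints ff02::1:ff34:5678
```

Note that only the last 24 bits matter, so a host's link-local and global addresses often map to the same SNM group when they share an interface identifier.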
If another host on the network already owns the target IPv6 address, it receives the Neighbor Solicitation message described above and responds by sending a Neighbor Advertisement (NA) indicating that the IP is already in use.
Here is the key: since the initial NS was sent from the unspecified address (::), the host that owns the target IPv6 address has no way of knowing where to send the responding NA. As a result, it must send the response to FF02::1, the "All Nodes" multicast address.
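The rule the defending host follows can be sketched in a few lines. This is only an illustration of the decision described above, with names I made up (na_destination, ALL_NODES); it is not taken from any real stack's source.

```python
import ipaddress

ALL_NODES = ipaddress.IPv6Address("ff02::1")  # link-local all-nodes group


def na_destination(ns_source: str) -> ipaddress.IPv6Address:
    """Pick where a Neighbor Advertisement goes, given the NS source."""
    src = ipaddress.IPv6Address(ns_source)
    if src.is_unspecified:
        # DAD probe: the sender has no usable address yet, so the reply
        # cannot be unicast and goes to every node on the link.
        return ALL_NODES
    # Ordinary address resolution: reply directly to the solicitor.
    return src


if __name__ == "__main__":
    print(na_destination("::"))       # DAD case
    print(na_destination("fe80::1"))  # normal neighbor resolution case
```

The DAD case is the only one where the reply is forced onto the All Nodes group, which is why the probing host never needs to have joined anything via MLD to hear it.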
This is why DAD still works even if you block all MLD messages. Joining the All Nodes multicast group doesn't require an MLD transaction, so every switch that understands IPv6 will automatically forward frames destined for the "All Nodes" address to all ports in the respective VLAN. DAD does not require listening on any other multicast group to be notified that an address is already in use.
If no one claims the IPv6 address, DAD concludes that the address is unique and allows the initiating host to start using it. This is what prompts the host to join the corresponding SNM group for the target address, using MLD messages sent to the MLD multicast address FF02::16 (and, according to the RFC above, this message is intended to be sent twice).
The Windows capture seems to use the address as the Source IP before sending out the Neighbor Advertisement that claims the IP on the network, which occurs at the end of DAD. I can't help but feel this is jumping the gun a little, since the NA is what marks the end of the DAD process and confirms that the address is indeed unique on the network.
The Linux capture has MLD messages sent before the DAD NS, but these are sent from the unspecified address, so they can't cause an IP conflict. If MLD snooping switches on the network only need to correlate a MAC address with a joined MLD group, then this would work just fine. (Disclaimer: I am not intimately familiar with the intricacies of MLD snooping, and cannot claim that this is exactly how it works.)
The BSD capture doesn't include the NA, but the two MLD messages occur 0.15 seconds after the DAD NS is sent out. To me that seems a bit too short, since the DAD process calls for a wait of just shy of a second before considering an address unique (according to the captures and testing I've done).
Best Answer
A1: Because normally you need to run the PIM protocol to achieve end-to-end connectivity on a multicast network. PIM doesn't normally run across ISP boundaries. That would be pretty complicated, and the protocol itself wasn't designed for that scale.
A2: It depends. From a networking point of view, I would say that the Internet is merely a collection of public routes, i.e. the BGP full view.