Responding to individual concerns in the post...
Regarding Path MTU Discovery
Ideally I would be relying on Path MTU discovery. But since the ethernet packets being generated are too large for any other machine to receive, there is no opportunity for IP Packet too big fragmentation messages to be returned.
Based on your diagram, I agree that PMTUD cannot function between two different PCs in the same LAN segment; PCs do not generate the ICMP error messages that PMTUD requires.
Jumbo frames
Some vendors (such as Cisco) have switch models which support ethernet payloads larger than 1500 bytes. Officially, the IEEE does not endorse this configuration, but the industry has valid needs to judiciously deviate from the original 1500 byte MTU. I have storage LAN / backup networks which leverage jumbo frames for good reason; however, I made sure that all MTUs matched inside the same VLAN when I deployed jumbo frames.
Mismatched MTUs within a broadcast domain
The bottom line is that you should never have mismatched ethernet MTUs inside the same ethernet broadcast domain; if you do, it's a bug or configuration error. Regardless of bug or error, you have to solve these problems, sometimes manually.
All that discussion leads to the next question...
Why is there a spec that intentionally creates invalid ethernet frames?
I'm not sure that I agree... I don't see how the IEEE 802.3 series, or RFC 894 create invalid frames. Host implementations or host misconfigurations create invalid frames. To understand whether your implementation is following the spec, we need a lot more evidence...
This diagram is at least prima facie evidence that your MTUs are mismatched inside a broadcast domain...
+------------------+     +----------------+     +------------------+
| Realtek PCIe GBe |     | NetGear 10/100 |     | Realtek 10/100   |
| (on-board)       |     | Switch         |     | (on-board)       |
|                  |     +----------------+     |                  |
| Windows 7        |          ^      ^          |                  |
|                  |          |      |          |                  |
| 192.168.1.98/24  |----------+      +----------| 192.168.1.10/24  |
| MTU = 1504 bytes |                            | MTU = 1500 bytes |
+------------------+                            +------------------+
How should an 802.3-compliant implementation respond to MTU mismatches?
What was it they [the writers of 'the spec'] expected people to do with devices that generate these too large packets?
MTU 1504 and MTU 1500 within the same broadcast domain is simply a misconfiguration; it should not be expected to work any more than mismatched IP netmasks or mismatched IP subnets can be expected to work. Your company will have to knuckle down and fix the root cause of the MTU mismatches... at this time it's hard to say whether the root cause is user error, an implementation bug, or some combination of the two.
If the affected Windows machines are successfully logging into an Active Directory Domain, one could write Windows login scripts to automatically fix MTU issues based on some well-constructed tests inside the domain login scripts (assuming the Domain Controller isn't part of the MTU issues).
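To make the "misconfiguration" test concrete, here is a minimal sketch that flags a broadcast domain whose hosts do not all share one MTU. The function names are my own, and the host-to-MTU mapping is assumed to have been collected by some other means; the two entries come from the diagram above.

```python
from collections import defaultdict

def group_hosts_by_mtu(host_mtus):
    """Group host addresses by their configured MTU."""
    groups = defaultdict(list)
    for host, mtu in host_mtus.items():
        groups[mtu].append(host)
    return dict(groups)

def has_mtu_mismatch(host_mtus):
    """True when more than one distinct MTU is present in the domain."""
    return len(set(host_mtus.values())) > 1

# The two hosts from the diagram; how the MTUs get collected is left open.
domain = {
    "192.168.1.98": 1504,
    "192.168.1.10": 1500,
}

if has_mtu_mismatch(domain):
    for mtu, hosts in group_hosts_by_mtu(domain).items():
        print(f"MTU {mtu}: {', '.join(hosts)}")
```

Any domain that produces more than one group needs attention; which group is the "wrong" one is a judgment call the script cannot make for you.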
If the machines are not logging into a domain, manual labor is another option.
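Whether scripted or done by hand, one plausible test is a ping with the Don't Fragment bit set, sized to exactly fill a given MTU. The sketch below only builds the command line; the helper names are hypothetical. It assumes a plain IPv4 header (20 bytes) and an ICMP echo header (8 bytes), the Windows ping flags -f, -l and -n, and the Linux iputils flags -M do, -s and -c.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header

def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IP packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES

def df_ping_command(target, mtu, windows=True):
    """Build a one-shot ping command with Don't Fragment set."""
    size = str(icmp_payload_for_mtu(mtu))
    if windows:
        # -f: Don't Fragment, -l: payload size, -n: number of echo requests
        return ["ping", "-f", "-l", size, "-n", "1", target]
    # -M do: prohibit fragmentation, -s: payload size, -c: number of requests
    return ["ping", "-M", "do", "-s", size, "-c", "1", target]

print(" ".join(df_ping_command("192.168.1.10", 1500)))
print(" ".join(df_ping_command("192.168.1.10", 1504)))
```

If the 1500-byte probe gets a reply and the 1504-byte probe does not, that at least suggests the peer (or something in between) will not accept the larger packet. Treat it as a hint, not proof.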
Other possibilities to contain the damage
Use a layer3 switch (see Note 1) to build a custom vlan for anything that has broken MTUs and set the layer3 switch's ethernet MTU to match the broken machines; this relies on PMTUD to resolve MTU issues at the IP layer. Layer3 switches generate the ICMP errors required by PMTUD.
This option works best if you can re-address the broken machines with DHCP; and you can identify the broken machines by mac-address.
... why did they bump it up to 1504 bytes, and create invalid packets, in the first place?
Hard to say at this point
802.1ad vs 802.1q
How is IEEE 802.1ad (aka VLAN Tagging, QinQ) valid, when the packets are too large?
I haven't seen evidence so far that you're using QinQ; from the limited evidence I have seen so far, you're using simple 802.1q encapsulation, which should work correctly in Windows, assuming the NIC driver supports 802.1q encap.
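For reference, the frame-size arithmetic behind the tagging question can be sketched as follows. The constants are the standard Ethernet header (14 bytes), FCS (4 bytes) and VLAN tag (4 bytes each); the function name is my own. Note that 1504 is exactly 1500 plus one 4-byte tag, but whether your driver is counting the tag as part of the MTU is only a guess on my part.

```python
ETH_HEADER_BYTES = 14  # destination MAC + source MAC + EtherType
ETH_FCS_BYTES = 4      # frame check sequence
VLAN_TAG_BYTES = 4     # TPID (2 bytes) + TCI (2 bytes)

def frame_size(payload, vlan_tags=0):
    """On-wire frame size (excluding preamble) for a payload and tag count."""
    return ETH_HEADER_BYTES + vlan_tags * VLAN_TAG_BYTES + payload + ETH_FCS_BYTES

print(frame_size(1500))               # untagged
print(frame_size(1500, vlan_tags=1))  # single 802.1q tag
print(frame_size(1500, vlan_tags=2))  # double tag (802.1ad / QinQ)
```

In both tagged cases the payload is still 1500 bytes; the tags enlarge the frame, not the IP MTU.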
End Notes:
Note 1: Any layer 3 switch should do... Cisco, Juniper, and Brocade could all perform this kind of function.
It's not exactly the answer to your question, but there is a simple (though limited) way to do what you want in certain cases.
I'm copy-pasting the -R option from the ping man page:
-R Record route. Includes the RECORD_ROUTE option in the ECHO_REQUEST packet and displays the route buffer on returned packets.
Note that the IP header is only large enough for nine such routes.
Many hosts ignore or discard this option.
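The nine-address limit follows from the IPv4 header layout, and the arithmetic can be sketched like this. The constants are the standard values (60-byte maximum header, 20-byte base header, 3 bytes of Record Route overhead for type, length and pointer); the function name is just for illustration.

```python
MAX_IP_HEADER_BYTES = 60    # IHL is a 4-bit count of 32-bit words: 15 * 4
BASE_IP_HEADER_BYTES = 20   # IPv4 header without options
RR_OVERHEAD_BYTES = 3       # option type, option length, pointer
IPV4_ADDRESS_BYTES = 4

def max_record_route_slots():
    """How many IPv4 addresses fit in a Record Route option."""
    option_space = MAX_IP_HEADER_BYTES - BASE_IP_HEADER_BYTES
    return (option_space - RR_OVERHEAD_BYTES) // IPV4_ADDRESS_BYTES

print(max_record_route_slots())
```

Those nine slots are shared by the outgoing and the returning path, so on longer routes the buffer fills up before the reply gets back.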
So you can also see the return path of the ECHO_REQUEST. That is not the exit interface you are asking about, unless the outgoing path is the same as the return path; only in that case do the addresses on the return path match the outgoing interfaces you are asking for.
Here is a real example from my internet provider's network. It may not be very clear, but I don't have any routers at hand right now to link together :)
traceroute 10.2.105.178
traceroute to 10.2.105.178 (10.2.105.178), 30 hops max, 60 byte packets
1 192.168.1.254 (192.168.1.254) 3.418 ms 3.575 ms 4.021 ms
2 10.189.48.1 (10.189.48.1) 11.237 ms * *
3 10.2.105.178 (10.2.105.178) 15.235 ms * *
ping -R 10.2.105.178
PING 10.2.105.178 (10.2.105.178) 56(124) bytes of data.
64 bytes from 10.2.105.178: icmp_req=5 ttl=253 time=74.1 ms NOP RR:
192.168.1.133
10.189.51.61
10.2.105.177
10.2.105.178
10.2.105.178
10.189.48.1
192.168.1.254
192.168.1.133
----omitted----
64 bytes from 10.2.105.178: icmp_req=6 ttl=253 time=13.0 ms NOP RR:
192.168.1.133
10.189.51.61
10.2.105.177
10.2.105.178
10.2.105.218 ## changes every time, I don't know why ##
10.189.48.1
192.168.1.254
192.168.1.133
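One rough way to read the lists above is to split them at the first appearance of the destination address: what comes before (and including) it is the outgoing path, the rest is the return path. This is only a heuristic I am assuming for illustration, and the function name is made up; it breaks if the destination never records itself or the buffer fills up first.

```python
def split_recorded_route(addresses, target):
    """Split a Record Route list into (outgoing, returning) at the target."""
    if target not in addresses:
        # Destination never appeared; treat everything as outgoing.
        return list(addresses), []
    turn = addresses.index(target)
    return addresses[: turn + 1], addresses[turn + 1 :]

# The first reply from the output above.
recorded = [
    "192.168.1.133", "10.189.51.61", "10.2.105.177", "10.2.105.178",
    "10.2.105.178", "10.189.48.1", "192.168.1.254", "192.168.1.133",
]
outgoing, returning = split_recorded_route(recorded, "10.2.105.178")
print("out: ", outgoing)
print("back:", returning)
```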
Best Answer
Ping uses ICMP, but traceroute doesn't necessarily use ICMP so different results are not necessarily unexpected. Some OSes use ICMP and some use something else like UDP.
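If you want to compare like with like, you can ask traceroute for a specific probe type. The sketch below just builds the command line, assuming the common Linux traceroute where UDP is the default, -I selects ICMP echo and -T selects TCP SYN (both of which may need root). The helper name and the target are placeholders.

```python
PROBE_FLAGS = {
    "udp": [],       # default probe type on Linux traceroute
    "icmp": ["-I"],  # ICMP echo, closest to what ping sends
    "tcp": ["-T"],   # TCP SYN probes
}

def traceroute_command(target, probe="udp"):
    """Build a Linux traceroute command for the requested probe type."""
    if probe not in PROBE_FLAGS:
        raise ValueError(f"unknown probe type: {probe}")
    return ["traceroute", *PROBE_FLAGS[probe], target]

# example.com is a placeholder target.
for probe in PROBE_FLAGS:
    print(" ".join(traceroute_command("example.com", probe)))
```

If the ICMP trace completes where the UDP trace stalls, a filter along the path is a more likely explanation than a routing fault.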
The traceroute seems to stop in the 185.52.26.xxx network on its way to the next network. Who owns that network, the ISP? It would be beneficial to have the owner of that network see what is happening since that is where the traceroute is failing. The network to which you are trying to trace could be blocking the traceroute, too.
Both ping and traceroute use small packets, so the odds of an MTU problem are fairly small.
Notice that the trace to Google goes a completely different route. There is some sort of routing problem.
EDIT:
For what it's worth, I can't ping or trace to the website, but I can load it in my browser. That has everything to do with my router configuration.