ICMP replies – ingress or egress interface (e.g. from a traceroute)

Tags: icmp, ipv4, networking, rfc, traceroute

When a traceroute is initiated and receives ICMP replies from the nodes along the path, which interface should each reply come from according to RFC 1812, and which interface do routers actually reply from: ingress (where they receive the packet) or egress (where the packet would have been sent out, i.e. toward the next node if the TTL were higher)?

Personal comments and research:

According to a published NANOG slide deck, RFC 1812 states that it should be the egress interface. I've read the ICMP section of RFC 1812 and could not find where it states that (I suspect my understanding of the terminology is off).
I've read that various routers (Junos, Cisco) reply from different interfaces, yet most reply from ingress (as stated on slide 10 of the NANOG deck).

I don't have a virtual Cisco lab nor do I think I have enough RAM to set up several VM routers in VirtualBox.

Best Answer

Hi I'm late but in the event you're still curious...

The quote from R Steenbergen's NANOG slide is correct. The behaviour is defined in Section 4.3.2.4 of RFC1812, which states:

the IP source address in an ICMP message originated by the router MUST be one of the IP addresses associated with the physical interface over which the ICMP message is transmitted

Depending on the traceroute implementation and on which node answers, the response to a probe may be an ICMP Time Exceeded (from each intermediate hop, for both Unix and Windows traceroute) or an ICMP Destination Unreachable (from the final destination, when Unix-style UDP probes are used). I believe this is covered in Steenbergen's presentation. Since neither of the RFC sections covering these message types makes any provision for the source address of the ICMP response, we assume that Section 4.3.2.4 holds for these specific response types.

Picture this scenario and assume the following:

Assume all links between circles (routers) are equal-cost layer-3 links (in particular, that the link between R1 and R2 is not a LAG/EtherChannel/etc)
Routing within example network is such that packets go from Sender S to Receiver R over the lower path, and return via the higher path, in the directions indicated

The Traceroute under modern day implementations would look like this:

traceroute to R 
 1  A 0.329 ms  A 0.425 ms A 0.471 ms
 2  C 0.349 ms  C 0.435 ms C 0.473 ms
 3  F 0.359 ms  G 0.445 ms F 0.481 ms
 4  R 0.369 ms  R 0.455 ms R 0.491 ms

And the trace if routers were coded to the spec would look like this:

traceroute to R 
 1  B 0.329 ms  B 0.425 ms B 0.471 ms
 2  D 0.349 ms  E 0.435 ms D 0.481 ms
 3  H 0.369 ms  H 0.445 ms H 0.491 ms
 4  R 0.389 ms  R 0.455 ms R 0.496 ms

So in a more colloquial sense, modern implementations tell us how we reached a particular host. The original specification would tell us how we left a router, but would not tell us how we got there.
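
To make the difference concrete, here is a minimal Python sketch of the two source-address choices for a single router. The interface names, the addresses (taken from documentation ranges) and the function name icmp_source are all made up for illustration; it only models the rule quoted from Section 4.3.2.4, where the relevant interface is the one the ICMP message itself is transmitted on.

```python
def icmp_source(interfaces, ingress, icmp_egress, policy):
    """Pick the source address of a router-originated ICMP error.

    interfaces:  dict of interface name -> configured IP (hypothetical model)
    ingress:     interface the expiring probe arrived on
    icmp_egress: interface the ICMP error is transmitted on, back to the sender
    policy:      "rfc1812" -> address of the transmitting interface
                 "ingress" -> address of the receiving interface
    """
    if policy == "rfc1812":
        return interfaces[icmp_egress]
    if policy == "ingress":
        return interfaces[ingress]
    raise ValueError(f"unknown policy: {policy!r}")


# Hypothetical router with asymmetric routing: the probe arrives on eth0,
# but the route back to the sender points out of eth1.
example_interfaces = {"eth0": "192.0.2.1", "eth1": "198.51.100.1"}

print(icmp_source(example_interfaces, "eth0", "eth1", "rfc1812"))  # 198.51.100.1
print(icmp_source(example_interfaces, "eth0", "eth1", "ingress"))  # 192.0.2.1
```

With symmetric routing both policies return the same address, which is why the difference only shows up on paths like the one in the scenario above.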

Note that we might think this would cause Ping to break, but the specification covers that case explicitly:

The IP source address in an ICMP Echo Reply MUST be the same as the
specific-destination address of the corresponding ICMP Echo Request
message.

In other words, for Ping, the ICMP Echo Reply source address shouldn't be an address associated with the egress interface as specified by Section 4.3.2.4, but should instead use a source address derived from the destination address of the original ICMP Echo Request.
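
The Echo Reply rule is simple enough to sketch as an address swap. This is only an illustration: the dict layout and the name build_echo_reply are hypothetical, and the addresses come from documentation ranges.

```python
ICMP_ECHO_REPLY = 0    # ICMP type 0
ICMP_ECHO_REQUEST = 8  # ICMP type 8


def build_echo_reply(request):
    """Build the address fields of an Echo Reply from an Echo Request.

    The reply's source is the address the request was sent to, regardless
    of which interface the reply is transmitted on.
    """
    if request["type"] != ICMP_ECHO_REQUEST:
        raise ValueError("not an echo request")
    return {"type": ICMP_ECHO_REPLY, "src": request["dst"], "dst": request["src"]}


# Hypothetical ping aimed at a router's loopback address.
request = {"type": ICMP_ECHO_REQUEST, "src": "192.0.2.10", "dst": "203.0.113.1"}
reply = build_echo_reply(request)
print(reply["src"])  # 203.0.113.1
```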

Related Solutions

Networking – Why Network Stack Ignores ICMP Replies from Non-Default Interface

Thanks to Rafał Ramocki, the solution is simple: you have to turn off rp_filter on the eth2 interface:

echo 0 > /proc/sys/net/ipv4/conf/eth2/rp_filter

From kernel docs:

rp_filter
---------

Integer value determines if a source validation should be made. 1 means yes, 0
means no.  Disabled by default, but local/broadcast address spoofing is always
on.

If you  set this to 1 on a router that is the only connection for a network to
the net,  it  will  prevent  spoofing  attacks  against your internal networks
(external addresses  can  still  be  spoofed), without the need for additional
firewall rules.

While nice for preventing (at least some) spoofing attacks, it definitely breaks some functionality if you have more than one Internet connection.
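
One thing worth checking after changing the setting: more recent kernel documentation than the text quoted above says that the maximum of conf/all/rp_filter and conf/<interface>/rp_filter is what gets applied to an interface, so zeroing only eth2 may not be enough if the "all" value is non-zero. A small Python sketch for reading the effective value follows; the function name and the base parameter are my own, and base is only there so the function can be pointed at a test directory.

```python
from pathlib import Path


def effective_rp_filter(iface, base="/proc/sys/net/ipv4/conf"):
    """Return the rp_filter value applied to iface.

    Assumes the rule from current kernel docs: the larger of the
    per-interface value and the "all" value is used.
    """
    root = Path(base)
    per_iface = int((root / iface / "rp_filter").read_text().strip())
    global_value = int((root / "all" / "rp_filter").read_text().strip())
    return max(per_iface, global_value)


if __name__ == "__main__":
    # eth2 is the interface name from the answer above; adjust as needed.
    try:
        print(effective_rp_filter("eth2"))
    except OSError as exc:
        print(f"could not read rp_filter: {exc}")
```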

Iptables – Does traceroute use UDP or ICMP or both

The type of packet that is sent differs depending on the implementation. By default Windows tracert uses ICMP and both Mac OS X and Linux traceroute use UDP. I don't have BSD or Solaris machines or any other OS on hand to check but the man page for the Mac OS X version mentions its provenance is BSD 4.3.

The Mac and Linux versions I have offer the ability to choose a variety of different protocols including ICMP, TCP, UDP and GRE packets. Other protocols can be specified by their name or number but traceroute doesn't know anything about how other protocols work. It just blindly sends them.

They can also both change the payload and the source and destination ports in order to avoid firewalls or discover which router along the path is dropping packets of a certain size.

All versions of traceroute rely on ICMP type 11 (Time exceeded) responses from each hop along the route. If ICMP type 11 responses are being blocked by your firewall, traceroute will not work. These packets are inbound, not outbound.
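
For the UDP flavour, the outbound half is just a datagram with a small TTL. Below is a rough Python sketch of that half only; the helper name make_udp_probe is made up, and port 33434 is the traditional traceroute base port rather than something stated above. Reading the ICMP type 11 replies needs a raw socket (and usually root), so that part is left out.

```python
import socket

TRACEROUTE_BASE_PORT = 33434  # conventional default, assumed here


def make_udp_probe(ttl):
    """Create a UDP socket whose outgoing packets carry the given IP TTL."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return sock


def send_probe(dest_ip, ttl, port=TRACEROUTE_BASE_PORT):
    """Send one empty probe; the hop where the TTL expires should answer
    with ICMP type 11 if nothing on the way back filters it."""
    sock = make_udp_probe(ttl)
    try:
        sock.sendto(b"", (dest_ip, port))
    finally:
        sock.close()
```

Calling send_probe with TTL 1, 2, 3 and so on gives the classic hop-by-hop walk.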

ICMP type 30 is specifically designated "Traceroute" (RFC 1393; "Information Request" is type 15), but I haven't been able to find anywhere where it is actually used, and IANA now lists it as deprecated. The man page for the Mac OS X and Linux versions says that -I will send ICMP type 8 (echo request). Wikipedia says that Windows tracert also uses ICMP echo requests. ICMP type 8 packets are outbound, not inbound.

ICMP type 0 (echo reply) may come back as the very last packet, once the TTL is large enough for an ICMP echo request probe to reach the destination. Traceroute will know it has finished when it receives one of these. This is an inbound packet.

TCP SYN packets will cause either a RST packet or a SYN ACK packet in response when they reach their destination. If you receive a SYN ACK packet, it's polite to follow up with a RST packet so as not to leave a half-open connection on the server.

It is possible to get ICMP type 3 code 4 responses back instead of ICMP type 11 responses if you send a large packet with the "Do not fragment" flag set; however, this is likely only to allow you to find the hop with the smallest MTU. You will normally only get this sort of response from one hop along the route, not all of them.
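
Putting the inbound cases together, a traceroute-like tool could sort replies roughly as in the Python sketch below. The function name and the returned labels are made up, and type 3 code 3 (port unreachable, which is what the destination sends back for UDP probes) is my addition to the cases listed above. The input is assumed to be the ICMP message with the IP header already stripped.

```python
import struct


def classify_icmp(message):
    """Classify an inbound ICMP message by its type and code bytes."""
    if len(message) < 2:
        raise ValueError("ICMP message too short")
    icmp_type, code = struct.unpack("!BB", message[:2])
    if icmp_type == 11:
        return "hop"          # time exceeded: an intermediate router
    if icmp_type == 0:
        return "destination"  # echo reply to an ICMP probe
    if icmp_type == 3 and code == 3:
        return "destination"  # port unreachable, reply to a UDP probe
    if icmp_type == 3 and code == 4:
        return "mtu"          # fragmentation needed but DF set
    return "other"


print(classify_icmp(bytes([11, 0, 0, 0])))  # hop
print(classify_icmp(bytes([3, 4, 0, 0])))   # mtu
```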
