The difference between aggregate labels and normal labels is that a normal label points directly to L2 rewrite details (an interface and an L2 address). This means the egress PE node label switches a packet carrying a normal label straight out of the interface, without doing an IP lookup.
Conversely, an aggregate label can represent many different egress options, so L2 rewrite information is not associated with the label itself. This means the egress PE node must perform an IP lookup on the packet to determine the appropriate L2 rewrite information.
Typical reasons why you might have an aggregate label instead of a normal label are:
- Need to perform neighbor discovery (IPv4 ARP, IPv6 ND)
- Need to perform ACL lookup (egress ACL in customer interface)
- Running whole VRF under single label (table-label)
Some of these restrictions (particularly the second one) do not apply on all platforms.
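To make the distinction concrete, here is a minimal Python sketch of the two lookups on an egress PE. It is only a toy model: the label values, VRF name, prefixes, interfaces and MAC addresses are all made up for illustration, and real platforms implement this in hardware forwarding tables.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Tuple


@dataclass(frozen=True)
class Rewrite:
    interface: str
    l2_address: str


# Hypothetical tables on an egress PE (all values are illustrative).
# A normal label maps straight to L2 rewrite details.
NORMAL_LABELS = {100: Rewrite("Gi0/1", "00:11:22:33:44:55")}
# An aggregate label only identifies the VRF to do the IP lookup in.
AGGREGATE_LABELS = {200: "CUSTOMER_A"}
VRF_ROUTES = {
    "CUSTOMER_A": [
        (ip_network("10.0.0.0/8"), Rewrite("Gi0/2", "00:aa:bb:cc:dd:01")),
        (ip_network("10.1.0.0/16"), Rewrite("Gi0/3", "00:aa:bb:cc:dd:02")),
    ]
}


def egress_lookup(label: int, dst_ip: str) -> Tuple[Rewrite, bool]:
    """Return (rewrite, ip_lookup_was_needed) for a packet arriving with `label`."""
    if label in NORMAL_LABELS:
        # Normal label: label switch directly out, no IP lookup.
        return NORMAL_LABELS[label], False
    # Aggregate label: longest-prefix match inside the VRF.
    vrf = AGGREGATE_LABELS[label]
    addr = ip_address(dst_ip)
    matches = [(net, rw) for net, rw in VRF_ROUTES[vrf] if addr in net]
    if not matches:
        raise LookupError(f"no route to {dst_ip} in VRF {vrf}")
    _, rewrite = max(matches, key=lambda m: m[0].prefixlen)
    return rewrite, True
```

With these made-up tables, `egress_lookup(100, "10.1.2.3")` never looks at the destination address, while `egress_lookup(200, "10.1.2.3")` has to pick the most specific route in the VRF first.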
Traceroute is affected in an MPLS VPN environment because a transit P node, when generating the TTL exceeded message, does not know how to return it (it has no routing table entry for the sender). So the transit P node sends the TTL exceeded message with the original label stack all the way to the egress PE node, in the hope that the egress PE node knows how to return the TTL exceeded message to the sender.
This feature is automatically on in Cisco IOS but needs 'icmp-tunneling' configured in Juniper JunOS.
Based on this, I suspect that your CE devices are not accepting packets whose source address is in a P node link network, and because they do not accept the ICMP message, it never gets returned to the sender.
A possible way to test this theory would be to enable a per-VRF label:
- IOS: mpls label mode all-vrfs protocol bgp-vpnv4 per-vrf
- JunOS: set routing-instances FOO vrf-table-label
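To show why the per-VRF label changes things, here is a small Python sketch of the route the TTL exceeded message takes in both cases. The node names are hypothetical, and it assumes the return path is simply the forward path reversed, which need not be true in a real network.

```python
from typing import List


def ttl_exceeded_path(forward_path: List[str], expiring_node: str,
                      egress_pe: str, egress_does_ip_lookup: bool) -> List[str]:
    """Nodes visited by the TTL exceeded message generated at a transit P node.

    Assumption: the return path is the forward path reversed.
    """
    i = forward_path.index(expiring_node)
    e = forward_path.index(egress_pe)
    if i >= e:
        raise ValueError("expiring node must be a transit node before the egress PE")
    # The message rides the original label stack up to the egress PE.
    outbound = forward_path[i:e + 1]
    turn = e
    if not egress_does_ip_lookup:
        # Normal label: the egress PE label switches the message straight
        # out to the CE, which has to accept it and route it back.
        outbound.append(forward_path[e + 1])
        turn = e + 1
    # From the turning point the message is routed back to the sender side.
    return outbound + forward_path[:turn][::-1]


# Hypothetical topology; the sender sits behind CE1.
PATH = ["CE1", "PE1", "P1", "P2", "PE2", "CE2"]
via_ce = ttl_exceeded_path(PATH, "P1", "PE2", egress_does_ip_lookup=False)
via_pe = ttl_exceeded_path(PATH, "P1", "PE2", egress_does_ip_lookup=True)
```

In the first case the message passes through CE2, so a CE that drops packets sourced from P link networks breaks the traceroute. In the second case (per-VRF label, so the egress PE does the IP lookup) the CE is never involved.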
Generally speaking, I do not recommend propagating TTL, especially in a VPN environment. At least in our case, customers get confused and anxious about it: they worry about why their VPN is showing foreign addresses.
Another thing that confuses people, and causes them to open a support ticket, is running a traceroute from, say, the UK to the US and seeing >100ms latency between two core routers in the UK. They do not realize that the whole path shows the same latency all the way to the west coast of the US, because all the TTL exceeded packets take a detour through there.
This issue is mostly unfixable by design. However, in IOS you can set how many labels at most to pop (mpls ip ttl-expiration pop N) when generating the TTL exceeded message. This gives a somewhat decent approximation if INET == 1 label and VPN == >1 label: you can configure it so that VPN traffic is tunnelled and INET traffic is returned directly, without the egress PE node detour. But as I said, this is only an approximation of the desired functionality, as features like in-transit repairs may mean your label stack is not always the same size for the same service.
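As a rough illustration of both the latency effect and the pop N approximation, here is a Python sketch. The link delays are invented numbers, the return path is assumed to be symmetric, and the rule "N or fewer labels means return directly" is my reading of the command rather than a quote from the vendor documentation.

```python
from typing import List


def reply_is_tunnelled(label_count: int, pop_n: int) -> bool:
    """Assumed semantics: more than pop_n labels -> send the TTL exceeded
    message along the original label stack to the egress PE."""
    return label_count > pop_n


def apparent_rtt_ms(link_delays_ms: List[int], hop: int, tunnelled: bool) -> int:
    """RTT traceroute shows for the node `hop` links away from the sender.

    link_delays_ms holds one-way delays from the sender to the egress PE.
    Assumption: the reply follows the same links back.
    """
    if tunnelled:
        # Probe goes to the hop, the reply continues to the egress PE and
        # then comes all the way back: always the full round trip.
        return 2 * sum(link_delays_ms)
    return 2 * sum(link_delays_ms[:hop])


# Hypothetical one-way delays in ms: UK access, two UK core links,
# transatlantic, US cross-country, last link to the egress PE.
DELAYS = [1, 2, 3, 40, 25, 2]
POP_N = 1

inet = apparent_rtt_ms(DELAYS, 2, reply_is_tunnelled(1, POP_N))
vpn = apparent_rtt_ms(DELAYS, 2, reply_is_tunnelled(2, POP_N))
# INET traffic that picked up an extra label from an in-transit repair
# now looks like VPN traffic to this check.
inet_repaired = apparent_rtt_ms(DELAYS, 2, reply_is_tunnelled(2, POP_N))
```

With these numbers the second hop shows 6 ms for INET traffic but 146 ms for VPN traffic, and INET traffic carrying an extra repair label is misclassified and also shows 146 ms.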
Best Answer
LFA - Loop-Free Alternate(s) (sometimes also called LFA-FRR, which may add to your confusion) is a feature of IGPs such as OSPF and IS-IS that allows nodes to calculate alternative paths to each prefix in the event of a link failure. These backup paths are maintained by each node and rapidly installed upon link/node failure, so that traffic flow is preserved while the SPF algorithm is re-run.
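For a feel of how a node picks such a backup, here is a minimal Python sketch of the basic loop-free condition from RFC 5286: a neighbor N of S is a loop-free alternate for destination D if dist(N, D) < dist(N, S) + dist(S, D). The topology and costs are invented, and a real IGP implementation also handles node protection, ECMP, SRLGs and so on.

```python
import heapq
from typing import Dict, List

Graph = Dict[str, Dict[str, int]]


def dijkstra(graph: Graph, source: str) -> Dict[str, int]:
    """Shortest-path cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def loop_free_alternates(graph: Graph, s: str, dest: str) -> List[str]:
    """Neighbors of s usable as a backup next hop towards dest."""
    dist_s = dijkstra(graph, s)
    alternates = []
    for n, cost in graph[s].items():
        dist_n = dijkstra(graph, n)
        if dest not in dist_n:
            continue
        if cost + dist_n[dest] == dist_s[dest]:
            continue  # n is already a primary next hop
        # Loop-free condition: n's best path to dest does not go back via s.
        if dist_n[dest] < dist_n[s] + dist_s[dest]:
            alternates.append(n)
    return sorted(alternates)


# Hypothetical topology with symmetric link costs.
TOPOLOGY: Graph = {
    "S": {"A": 5, "B": 10, "C": 5},
    "A": {"S": 5, "D": 5},
    "B": {"S": 10, "D": 5},
    "C": {"S": 5},
    "D": {"A": 5, "B": 5},
}
backups = loop_free_alternates(TOPOLOGY, "S", "D")
```

Here S normally reaches D via A. B qualifies as a backup because its own shortest path to D avoids S, while C does not, because C would just send the traffic back to S.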
In the case of an MPLS network, LFA ensures that the IP underlay of the network recovers quickly, preventing, say, an LDP-based path from re-converging until the underlay is stable again.
FRR - Fast Re-Route is the name traditionally given to a similar process used in RSVP-based MPLS networks. Headend nodes for an LSP configured with FRR calculate an alternate, diverse (where possible) path for the LSP to follow, and quickly switch to it in the event of the primary path failing.
To summarise though, the difference between LFA-FRR and MPLS-FRR is essentially that LFA-FRR operates on the underlying IP network, while MPLS-FRR operates on the Label-Switched overlay (provided you are using RSVP for label distribution).