Cisco – Losing unicast traffic with “Input DA reject”

I have a rather complicated issue with an old Cisco 5509 switch and a few KVM/QEMU-based virtual machines. First, the setup of the system looks like this:

|--------------------------------------------     -------------|
||----------|             VMHOST            |     |5509        |
||VM1       |                               |     |            |
||    ------|                               |     |         101|-------Juniper
||    |vmnic|---vnet0<->br0<->eth3.101--eth3|-----|Trunk       |
||----------|                               |     |            |
|                                           |     |            |
|--------------------------------------------     --------------

The virtual machine is not VLAN-aware. It connects to a bridge (br0) whose uplink, eth3.101, tags the traffic with VLAN 101. The tagged traffic enters the 5509 on a trunk port (3/8), and the switch sends it out on a port in VLAN 101 (4/1), removing the tag and delivering it to the Juniper.
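
To make the tagging step concrete, here is a minimal Python sketch of what happens to a frame on its way out through eth3.101, and what the 5509 undoes on the untagged port. The function names are made up for illustration, and the example frame is just a zero-filled ARP-sized payload. The kernel and the switch do this internally, so this only models the frame layout.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def add_vlan_tag(frame: bytes, vid: int, pcp: int = 0) -> bytes:
    """Insert a 4-byte 802.1Q tag after the destination and source MACs."""
    if not 0 <= vid <= 0x0FFF or not 0 <= pcp <= 7:
        raise ValueError("VLAN ID must fit in 12 bits, priority in 3 bits")
    tci = (pcp << 13) | vid  # DEI bit left at 0
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]


def strip_vlan_tag(frame: bytes):
    """Return (vlan_id, untagged_frame) for an 802.1Q-tagged frame."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        raise ValueError("frame is not 802.1Q tagged")
    return tci & 0x0FFF, frame[:12] + frame[16:]


# Hypothetical ARP-sized frame: broadcast destination, vm1 as source,
# EtherType 0x0806 (ARP) and 28 bytes of placeholder payload.
dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("00163e3e0211")
untagged = dst + src + struct.pack("!H", 0x0806) + bytes(28)

tagged = add_vlan_tag(untagged, vid=101)
print(len(untagged), len(tagged), tagged[12:16].hex())
```

The 42-byte untagged frame grows to 46 bytes once tagged, which matches the "length 46" that tcpdump reports for the ARP requests in the eth3 captures further down.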

This setup works well for broadcast traffic: I can arping between vm1 and the Juniper. Unicast traffic, however, is lost somewhere between eth3 and the Juniper, and only in the direction from vm1 to the Juniper!

Some logs:
Arping and ping from vm1 to the Juniper (192.168.0.2):

sudo arping 192.168.0.2
ARPING 192.168.0.2
60 bytes from 00:05:85:cc:f2:10 (192.168.0.2): index=0 time=3.354 msec
60 bytes from 00:05:85:cc:f2:10 (192.168.0.2): index=1 time=3.739 msec
60 bytes from 00:05:85:cc:f2:10 (192.168.0.2): index=2 time=1.511 msec
^C
--- 192.168.0.2 statistics ---
3 packets transmitted, 3 packets received,   0% unanswered (0 extra)
PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data.
^C
--- 192.168.0.2 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2004ms

And the dump from eth3 while running the two commands:

sudo tcpdump -ei eth3
tcpdump: WARNING: eth3: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth3, link-type EN10MB (Ethernet), capture size 65535 bytes
19:25:06.871102 00:16:3e:3e:02:11 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 101, p 0, ethertype ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
19:25:06.872563 00:05:85:cc:f2:10 (oui Unknown) > 00:16:3e:3e:02:11 (oui Unknown), ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, Reply 192.168.0.2 is-at 00:05:85:cc:f2:10 (oui Unknown), length 46
19:25:07.871848 00:16:3e:3e:02:11 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 101, p 0, ethertype ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
19:25:07.874369 00:05:85:cc:f2:10 (oui Unknown) > 00:16:3e:3e:02:11 (oui Unknown), ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, Reply 192.168.0.2 is-at 00:05:85:cc:f2:10 (oui Unknown), length 46
19:25:08.872454 00:16:3e:3e:02:11 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 101, p 0, ethertype ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
19:25:09.028734 00:05:85:cc:f2:10 (oui Unknown) > 00:16:3e:3e:02:11 (oui Unknown), ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, Reply 192.168.0.2 is-at 00:05:85:cc:f2:10 (oui Unknown), length 46
19:25:13.686148 00:16:3e:3e:02:11 (oui Unknown) > 00:05:85:cc:f2:10 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.1 > 192.168.0.2: ICMP echo request, id 1002, seq 1, length 64
19:25:14.690923 00:16:3e:3e:02:11 (oui Unknown) > 00:05:85:cc:f2:10 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.1 > 192.168.0.2: ICMP echo request, id 1002, seq 2, length 64
19:25:15.690788 00:16:3e:3e:02:11 (oui Unknown) > 00:05:85:cc:f2:10 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.1 > 192.168.0.2: ICMP echo request, id 1002, seq 3, length 64

Ping in the other direction:

run ping 192.168.0.1    
PING 192.168.0.1 (192.168.0.1): 56 data bytes
^C
--- 192.168.0.1 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss

And the associated packet dump from eth3, showing that the requests get from the Juniper to vm1 and that the replies make it all the way back to eth3 before they disappear:

sudo tcpdump -ei eth3
tcpdump: WARNING: eth3: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth3, link-type EN10MB (Ethernet), capture size 65535 bytes
19:27:46.960138 00:05:85:cc:f2:10 (oui Unknown) > 00:16:3e:3e:02:11 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.2 > 192.168.0.1: ICMP echo request, id 61736, seq 0, length 64
19:27:46.970773 00:16:3e:3e:02:11 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 101, p 0, ethertype ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
19:27:46.972689 00:05:85:cc:f2:10 (oui Unknown) > 00:16:3e:3e:02:11 (oui Unknown), ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, Reply 192.168.0.2 is-at 00:05:85:cc:f2:10 (oui Unknown), length 46
19:27:46.973052 00:16:3e:3e:02:11 (oui Unknown) > 00:05:85:cc:f2:10 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.1 > 192.168.0.2: ICMP echo reply, id 61736, seq 0, length 64
19:27:47.959952 00:05:85:cc:f2:10 (oui Unknown) > 00:16:3e:3e:02:11 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.2 > 192.168.0.1: ICMP echo request, id 61736, seq 1, length 64
19:27:47.960300 00:16:3e:3e:02:11 (oui Unknown) > 00:05:85:cc:f2:10 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.1 > 192.168.0.2: ICMP echo reply, id 61736, seq 1, length 64
19:27:49.048280 00:05:85:cc:f2:10 (oui Unknown) > 00:16:3e:3e:02:11 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.2 > 192.168.0.1: ICMP echo request, id 61736, seq 2, length 64
19:27:49.048618 00:16:3e:3e:02:11 (oui Unknown) > 00:05:85:cc:f2:10 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.1 > 192.168.0.2: ICMP echo reply, id 61736, seq 2, length 64
8 packets captured
8 packets received by filter
0 packets dropped by kernel
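
The two captures can also be summarised mechanically. The following Python sketch tallies frames by source MAC, broadcast versus unicast destination, and packet type. It is an illustration only: the regular expression is an assumption fitted to the exact `tcpdump -e` layout pasted above, and `summarise` and `classify` are hypothetical helper names.

```python
import re
from collections import Counter

# Fitted to the "tcpdump -e" lines for 802.1Q-tagged frames shown above.
LINE_RE = re.compile(
    r"^\S+ (?P<src>[0-9a-f:]{17}) .*? > (?P<dst>[0-9a-f:]{17}|Broadcast)\b"
    r".*?vlan (?P<vlan>\d+),.*?ethertype (?P<proto>\w+), (?P<rest>.*)$"
)


def classify(proto: str, rest: str) -> str:
    """Reduce the decoded payload to a short label."""
    if proto == "ARP":
        return "ARP " + rest.split()[0].lower()  # "request" or "reply"
    if "ICMP echo request" in rest:
        return "ICMP echo request"
    if "ICMP echo reply" in rest:
        return "ICMP echo reply"
    return proto


def summarise(lines):
    """Count frames per (source MAC, broadcast/unicast, packet type)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # tcpdump banner and summary lines
        cast = "broadcast" if m["dst"] == "Broadcast" else "unicast"
        counts[(m["src"], cast, classify(m["proto"], m["rest"]))] += 1
    return counts


# Three lines copied from the first capture above.
SAMPLE = """
19:25:06.871102 00:16:3e:3e:02:11 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 101, p 0, ethertype ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28
19:25:06.872563 00:05:85:cc:f2:10 (oui Unknown) > 00:16:3e:3e:02:11 (oui Unknown), ethertype 802.1Q (0x8100), length 64: vlan 101, p 0, ethertype ARP, Reply 192.168.0.2 is-at 00:05:85:cc:f2:10 (oui Unknown), length 46
19:25:13.686148 00:16:3e:3e:02:11 (oui Unknown) > 00:05:85:cc:f2:10 (oui Unknown), ethertype 802.1Q (0x8100), length 102: vlan 101, p 0, ethertype IPv4, 192.168.0.1 > 192.168.0.2: ICMP echo request, id 1002, seq 1, length 64
"""

summary = summarise(SAMPLE.splitlines())
for key, count in sorted(summary.items()):
    print(count, *key)
```

Fed the full first capture, this should count three echo requests from vm1 and no echo replies. Fed the second, it should count three echo requests from the Juniper and three echo replies from vm1, even though the Juniper reports 100% packet loss.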

Some relevant configuration. First vm1:

eth1      Link encap:Ethernet  HWaddr 00:16:3e:3e:02:11  
          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::216:3eff:fe3e:211/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1953 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3933 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:153032 (153.0 KB)  TX bytes:315162 (315.1 KB)
          Interrupt:10 Base address:0x6000 

Then vmhost:

brctl show
bridge name bridge id       STP enabled interfaces
br0     8000.001e68a9b341   no      eth3.101
                            vnet0

And the interfaces and bridges (irrelevant output removed):

eth3      Link encap:Ethernet  HWaddr 00:1e:68:a9:b3:41  
          inet6 addr: fe80::21e:68ff:fea9:b341/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:4306 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4870 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:329486 (329.4 KB)  TX bytes:419680 (419.6 KB)
          Interrupt:47 Base address:0xc000 

eth3.101  Link encap:Ethernet  HWaddr 00:1e:68:a9:b3:41  
          inet6 addr: fe80::21e:68ff:fea9:b341/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:2082 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3697 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:159118 (159.1 KB)  TX bytes:306482 (306.4 KB)

br0       Link encap:Ethernet  HWaddr 00:1e:68:a9:b3:41  
          inet6 addr: fe80::490:41ff:fea8:25bd/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:4006 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:296858 (296.8 KB)  TX bytes:468 (468.0 B)

vnet0     Link encap:Ethernet  HWaddr fe:16:3e:3e:02:11  
          inet6 addr: fe80::fc16:3eff:fe3e:211/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:3940 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2004 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:315680 (315.6 KB)  TX bytes:184138 (184.1 KB)

The relevant part of the 5509 configuration

set vlan 100-104
set spantree disable all
set trunk 3/8  on dot1q 101-104,201-204,301-304,401-404,501-504,1002-1005
set vlan 101  4/1

And finally the juniper box

fe-0/0/0 {
    unit 0 {
        family inet {
            address 192.168.0.2/24;
        }
    }
}

At this point I am starting to strongly suspect a configuration issue somewhere in the 5509 or the bridge setup, but I can't even begin to imagine what might cause this. Can anybody with networking experience suggest a way to attack this problem? If you need any more information, just ask.

Edit:

Some further debugging suggests that this is related to the filters in the Juniper box. I still don't know what's going on, but the "Input DA rejects" counter increases with each lost packet.

run show interfaces fe-0/0/0 extensive    
Physical interface: fe-0/0/0, Enabled, Physical link is Up
  Interface index: 129, SNMP ifIndex: 118, Generation: 130
  Link-level type: Ethernet, MTU: 1514, Link-mode: Half-duplex, Speed: 100mbps,
  MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled,
  Flow control: Enabled
  Device flags   : Present Running
  Interface flags: SNMP-Traps Internal: 0x4000
  CoS queues     : 8 supported, 8 maximum usable queues
  Hold-times     : Up 0 ms, Down 0 ms
  Current address: 00:05:85:cc:f2:10, Hardware address: 00:05:85:cc:f2:10
  Last flapped   : 2011-01-24 19:03:05 CET (16:10:25 ago)
  Statistics last cleared: Never
  Traffic statistics:
   Input  bytes  :               109620                    0 bps
   Output bytes  :               331366                    0 bps
   Input  packets:                 2035                    0 pps
   Output packets:                 5611                    0 pps
  Input errors:
    Errors: 1, Drops: 0, Framing errors: 0, Runts: 0, Policed discards: 0,
    L3 incompletes: 1, L2 channel errors: 0, L2 mismatch timeouts: 0,
    FIFO errors: 0, Resource errors: 0
  Output errors:
    Carrier transitions: 7, Errors: 0, Drops: 0, Collisions: 0, Aged packets: 0,
    FIFO errors: 0, HS link CRC errors: 0, MTU errors: 0, Resource errors: 0
  Egress queues: 8 supported, 4 in use
  Queue counters:       Queued packets  Transmitted packets      Dropped packets
    0 best-effort                 5611                 5611                    0
    1 expedited-fo                   0                    0                    0
    2 assured-forw                   0                    0                    0
    3 network-cont                   0                    0                    0
  Active alarms  : None
  Active defects : None
  MAC statistics:                      Receive         Transmit
    Total octets                             0           331926
    Total packets                            0             5611
    Unicast packets                          0             3234
    Broadcast packets                        0             2377
    Multicast packets                        0                0
    CRC/Align errors                         0                0
    FIFO errors                              0                0
    MAC control frames                       0                0
    MAC pause frames                         0                0
    Oversized frames                         0
    Jabber frames                            0
    Fragment frames                          0
    VLAN tagged frames                       0
    Code violations                          0
  Filter statistics:
    Input packet count                   64407
    Input packet rejects                 62371
    Input DA rejects                     62371
    Input SA rejects                         0
    Output packet count                                       0
    Output packet pad count                                   0
    Output packet error count                                 0
    CAM destination filters: 1, CAM source filters: 0
  Autonegotiation information:
    Negotiation status: Complete
    Link partner:
        Link mode: Full-duplex, Flow control: None, Remote fault: OK,
        Link partner Speed: 100 Mbps
  Packet Forwarding Engine configuration:
    Destination slot: 0
    Direction : Output 
    CoS transmit queue               Bandwidth               Buffer Priority   Limit
                              %            bps     %           usec
    0 best-effort            95       95000000    95              0      low    none
    3 network-control         5        5000000     5              0      low    none

  Logical interface fe-0/0/0.0 (Index 68) (SNMP ifIndex 136) (Generation 133)
    Flags: SNMP-Traps Encapsulation: ENET2
    Traffic statistics:
     Input  bytes  :               222600
     Output bytes  :               331366
     Input  packets:                 2035
     Output packets:                 5611
    Local statistics:
     Input  bytes  :               112980
     Output bytes  :               328006
     Input  packets:                 1995
     Output packets:                 5571
    Transit statistics:
     Input  bytes  :               109620                    0 bps
     Output bytes  :                 3360                    0 bps
     Input  packets:                   40                    0 pps
     Output packets:                   40                    0 pps
    Protocol inet, MTU: 1500, Generation: 139, Route table: 0
      Flags: None
      Addresses, Flags: Is-Preferred Is-Primary
        Destination: 192.168.0/24, Local: 192.168.0.2, Broadcast: 192.168.0.255,
        Generation: 140
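
One way to put numbers on that counter is to pull the filter statistics out of the output above. The following Python sketch is illustrative only: `filter_stats` is a hypothetical helper, and the regular expression assumes the two-column layout exactly as pasted above, which may differ between Junos versions.

```python
import re

WANTED = (
    "Input packet count",
    "Input packet rejects",
    "Input DA rejects",
    "Input SA rejects",
)


def filter_stats(text: str) -> dict:
    """Extract the input filter counters from 'show interfaces ... extensive'."""
    stats = {}
    for line in text.splitlines():
        # A label, at least two spaces, then a single integer value.
        m = re.match(r"\s*(.+?)\s{2,}(\d+)\s*$", line)
        if m and m.group(1) in WANTED:
            stats[m.group(1)] = int(m.group(2))
    return stats


# Copied from the "Filter statistics" section of the output above.
FILTER_BLOCK = """\
  Filter statistics:
    Input packet count                   64407
    Input packet rejects                 62371
    Input DA rejects                     62371
    Input SA rejects                         0
"""

stats = filter_stats(FILTER_BLOCK)
accepted = stats["Input packet count"] - stats["Input packet rejects"]
da_share = stats["Input DA rejects"] / stats["Input packet count"]
print(f"accepted={accepted} DA-reject share={da_share:.1%}")
```

For the output above this gives 62371 DA rejects out of 64407 frames seen by the filter, leaving 2036 accepted, which is close to the 2035 input packets reported under traffic statistics.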

Best Answer

The problem turned out to be a bad CompactFlash card in the Juniper box. The card that stored the system image had become corrupted, possibly worn out by too many writes. Most likely the corrupt image loaded broken code onto the line cards, which in turn made them misbehave.

Replacing the CompactFlash card with a new one, loading a fresh image onto it, and then restoring the configuration got everything working.