TCP Handshake error: SYN and SYN/ACK packets are not recognised

I have a very interesting problem:

I have a Proxmox hypervisor with two Linux VMs on it:

  • The first VM has several NICs in the main bridge; each NIC is attached to the VM with a specific VLAN tag set on the hypervisor.
  • The second VM has only one NIC in the main bridge, but it has VLAN interfaces configured inside the VM.

The network settings are identical, but on the second VM the TCP handshake does not complete. ICMP and UDP, on the other hand, work fine.

The problem occurs in every traffic direction involving the second machine:

  • from the second VM to the outside world.
  • from the second VM to the router (and the reverse).
  • from the second VM to the first VM (and the reverse).

How did I test it?

  • ping: works fine
  • nslookup (udp mode): works fine
  • nslookup (tcp mode): timeout error
  • telnet or ssh: timeout error
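
The TCP part of these checks can be reproduced without telnet. The following is a minimal sketch, not the method used in the question: `probe_tcp` is a hypothetical helper that attempts a handshake and reports whether it completed, was refused, or timed out.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP handshake and classify the outcome."""
    try:
        # create_connection() returns only after the three-way handshake succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No usable SYN/ACK was accepted before the deadline.
        return "timeout"
    except ConnectionRefusedError:
        # The peer answered with RST: reachable, but nothing is listening.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. "no route to host".
        return f"error: {exc}"
```

With the addresses from this question, `probe_tcp("192.168.32.70", 80)` would be expected to return "open" and `probe_tcp("192.168.32.80", 80)` to return "timeout", assuming a web server listens on both VMs as in the captures below.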

Then I decided to capture the traffic and analyze it in Wireshark.

I saw the same problem everywhere:
the SYN and SYN/ACK packets exchanged with the second virtual machine are not recognized.
It looks as if these packets never arrive, yet the captures show that they arrive perfectly well (see below).

Here are four captures, taken while I tried to connect to port 80 on each VM via telnet.

  • Router: 192.168.32.1
  • First vm: 192.168.32.70
  • Second vm: 192.168.32.80

Successful connection to the first VM (capture from the client):

No.     Time        Source                Destination           Protocol Length Info
      2 1.311927    192.168.32.1          192.168.32.70         TCP      74     38873→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54904054 TSecr=0 WS=32
      3 1.347181    192.168.32.70         192.168.32.1          TCP      74     80→38873 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=57170781 TSecr=54904054 WS=128
      4 1.347223    192.168.32.1          192.168.32.70         TCP      66     38873→80 [ACK] Seq=1 Ack=1 Win=13856 Len=0 TSval=54904058 TSecr=57170781

Successful connection to the first VM (capture from the server):

No.     Time        Source                Destination           Protocol Length Info
      1 0.000000    192.168.32.1          192.168.32.70         TCP      74     38873→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54904054 TSecr=0 WS=32
      2 0.000128    192.168.32.70         192.168.32.1          TCP      74     80→38873 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=57170781 TSecr=54904054 WS=128
      3 0.051272    192.168.32.1          192.168.32.70         TCP      66     38873→80 [ACK] Seq=1 Ack=1 Win=13856 Len=0 TSval=54904058 TSecr=57170781

Unsuccessful connection to the second VM (capture from the client):

No.     Time        Source                Destination           Protocol Length Info
     25 0.889659    192.168.32.1          192.168.32.80         TCP      74     37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54864760 TSecr=0 WS=32
     27 0.925075    192.168.32.80         192.168.32.1          TCP      74     80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=210548 TSecr=54864760 WS=128
     34 1.880028    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54864860 TSecr=0 WS=32
     35 1.915204    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294967049 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=210795 TSecr=54864760 WS=128
     51 2.912418    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294966799 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=211045 TSecr=54864760 WS=128
     63 3.880067    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54865060 TSecr=0 WS=32
     64 3.917480    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294966549 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=211295 TSecr=54864760 WS=128
     67 5.912529    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294966049 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=211795 TSecr=54864760 WS=128
     73 7.890030    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54865461 TSecr=0 WS=32
     74 7.925401    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294965546 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=212298 TSecr=54864760 WS=128

Unsuccessful connection to the second VM (capture from the server):

No.     Time        Source                Destination           Protocol Length Info
      1 0.000000    192.168.32.1          192.168.32.80         TCP      74     37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54864760 TSecr=0 WS=32
      2 0.000105    192.168.32.80         192.168.32.1          TCP      74     80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=210548 TSecr=54864760 WS=128
      3 0.990176    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54864860 TSecr=0 WS=32
      4 0.990240    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=210795 TSecr=54864760 WS=128
      5 1.987305    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=211045 TSecr=54864760 WS=128
      6 2.991251    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54865060 TSecr=0 WS=32
      7 2.991317    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=211295 TSecr=54864760 WS=128
      8 4.987338    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=211795 TSecr=54864760 WS=128
     11 7.000116    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54865461 TSecr=0 WS=32
     12 7.000184    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=212298 TSecr=54864760 WS=128
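
One detail in the two failing captures may be worth a closer look: on the server side every retransmitted SYN/ACK shows Seq=0, while on the client side the same packets show a different relative sequence number each time. The sketch below uses only the Seq and TSval values printed in the client-side capture and compares how far each one moved. It is an observation about the numbers, not a confirmed diagnosis; the helper name `seq_drift` is made up for this example.

```python
MOD = 2 ** 32

# (Seq, TSval) of each SYN/ACK as shown in the client-side capture above.
client_synacks = [
    (0, 210548),
    (4294967049, 210795),
    (4294966799, 211045),
    (4294966549, 211295),
    (4294966049, 211795),
    (4294965546, 212298),
]


def seq_drift(samples):
    """Return (tsval_delta, seq_delta) pairs relative to the first sample."""
    seq0, ts0 = samples[0]
    result = []
    for seq, ts in samples[1:]:
        ts_delta = ts - ts0
        # Sequence numbers wrap at 2**32, so read the difference as signed.
        seq_delta = (seq - seq0) % MOD
        if seq_delta >= MOD // 2:
            seq_delta -= MOD
        result.append((ts_delta, seq_delta))
    return result


drift = seq_drift(client_synacks)
for ts_delta, seq_delta in drift:
    print(f"TSval {ts_delta:+6d}   Seq {seq_delta:+6d}")
```

For these values, each sequence number has moved back by exactly the amount TSval moved forward (247, 497, 747, 1247 and 1750). Since the server-side capture shows an unchanged sequence number, this suggests the segments are being altered somewhere between the second VM's capture point and the client.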

I don't understand why it doesn't work. Any ideas?

At MarkoPolo's request, here is the output of some commands on the second VM:

ifconfig

ens18     Link encap:Ethernet  HWaddr 12:7c:7f:a1:8a:b4  
          inet addr:10.10.100.80  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::107c:7fff:fea1:8ab4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:185759 errors:0 dropped:27 overruns:0 frame:0
          TX packets:1186 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:23593488 (23.5 MB)  TX bytes:173539 (173.5 KB)

ens18.32  Link encap:Ethernet  HWaddr 12:7c:7f:a1:8a:b4  
          inet addr:192.168.32.80  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::107c:7fff:fea1:8ab4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1821 errors:0 dropped:0 overruns:0 frame:0
          TX packets:52 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:195803 (195.8 KB)  TX bytes:3718 (3.7 KB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:160 errors:0 dropped:0 overruns:0 frame:0
          TX packets:160 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:11840 (11.8 KB)  TX bytes:11840 (11.8 KB)

route -n

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.10.100.1     0.0.0.0         UG    0      0        0 ens18
10.10.100.0     0.0.0.0         255.255.255.0   U     0      0        0 ens18
192.168.32.0    0.0.0.0         255.255.255.0   U     0      0        0 ens18.32
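
To make explicit which interface the test traffic uses, here is a small sketch that applies longest-prefix matching to the routing table above. It is a simplified model (it ignores metrics and policy routing), and the names `ROUTES` and `lookup` are invented for this example.

```python
import ipaddress

# Routing table of the second VM, transcribed from the `route -n` output above.
ROUTES = [
    ("0.0.0.0/0", "10.10.100.1", "ens18"),
    ("10.10.100.0/24", None, "ens18"),
    ("192.168.32.0/24", None, "ens18.32"),
]


def lookup(dst: str):
    """Return (network, gateway, iface) of the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(net), gateway, iface)
        for net, gateway, iface in ROUTES
        if addr in ipaddress.ip_network(net)
    ]
    # The route with the longest prefix wins.
    return max(candidates, key=lambda route: route[0].prefixlen)
```

Under this model, `lookup("192.168.32.1")` selects the directly connected 192.168.32.0/24 route on ens18.32, so replies to the router leave through the VLAN interface rather than through the default gateway on ens18.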

Best Answer

I resolved this issue by adding ens18 to a bridge, br0, and creating the VLAN interface on the bridge as br0.32.

It seems to be an Ubuntu kernel bug: I tested the same setup with an Arch Linux ISO and it works properly. I will file a bug report on Launchpad...
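
The answer does not say which tool or configuration file was used to build the bridge. Purely as an illustration, the sketch below prints iproute2 commands that would create an equivalent, non-persistent layout. The interface names, VLAN ID and addresses are taken from the question; the helper `bridge_vlan_commands` is hypothetical, and moving the untagged 10.10.100.80 address onto br0 is an assumption, not something the answer states.

```python
def bridge_vlan_commands(nic, bridge, vlan_id, vlan_addr, untagged_addr=None):
    """Build (but do not run) iproute2 commands for a VLAN-on-bridge layout."""
    vlan_if = f"{bridge}.{vlan_id}"
    cmds = [
        # Create the bridge and enslave the physical NIC to it.
        f"ip link add name {bridge} type bridge",
        f"ip link set dev {nic} master {bridge}",
        # Create the VLAN sub-interface on the bridge instead of on the NIC.
        f"ip link add link {bridge} name {vlan_if} type vlan id {vlan_id}",
        f"ip addr add {vlan_addr} dev {vlan_if}",
    ]
    if untagged_addr:
        # Assumption: the untagged address moves from the NIC to the bridge.
        cmds.append(f"ip addr add {untagged_addr} dev {bridge}")
    cmds += [f"ip link set dev {dev} up" for dev in (nic, bridge, vlan_if)]
    return cmds


for cmd in bridge_vlan_commands(
    "ens18", "br0", 32, "192.168.32.80/24", "10.10.100.80/24"
):
    print(cmd)
```

The script only prints the commands. Before running them by hand as root, the existing ens18.32 interface and the address on ens18 would have to be removed, and the default route re-added via br0.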