IIS – TCP Initial Handshake Incomplete (ACK Missing)

iis mikrotik mvc tcpip

I am puzzled by the following scenario:

I am running an MVC application on an IIS 10 web server. When I request the URI, there are about 10 seconds of idle time before the application is actually called. Having no real explanation for this idle period, I dug deeper using Wireshark and stumbled across the following phenomenon:

The Initial TCP Handshake is incomplete (Missing ACK)

  1. Client -> Server (SYN)
  2. Server -> Client (SYN, ACK)
  3. Client -> Server (ACK never reaches the server)
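The three steps above can be checked mechanically against a capture. The following is a minimal sketch, assuming the segments of one connection have already been extracted into simple records; the `Segment` type and `handshake_status` function are hypothetical helpers, not part of Wireshark or any library.

```python
from dataclasses import dataclass

SYN, ACK = 0x02, 0x10  # TCP flag bits


@dataclass
class Segment:
    sender: str           # "client" or "server" (hypothetical labels)
    flags: int            # raw TCP flags byte
    payload_len: int = 0  # bytes of application data carried


def handshake_status(segments):
    """Classify the start of one connection as seen at one capture point.

    Returns (status, number_of_SYN_ACK_retransmissions).
    """
    syn_seen = False
    synack_count = 0
    for seg in segments:
        if seg.sender == "client" and seg.flags == SYN:
            syn_seen = True
        elif seg.sender == "server" and seg.flags == (SYN | ACK):
            synack_count += 1
        elif seg.sender == "client" and seg.flags & ACK and synack_count:
            # Any client segment with ACK set acknowledges the SYN,ACK,
            # whether or not it also carries data.
            status = ("completed by data segment" if seg.payload_len
                      else "completed by bare ACK")
            return status, synack_count - 1
    if not syn_seen:
        return "no SYN seen", 0
    return "incomplete", max(synack_count - 1, 0)


# What the server side might see when the bare ACK is lost:
server_view = [
    Segment("client", SYN),
    Segment("server", SYN | ACK),
    Segment("server", SYN | ACK),      # retransmission 1
    Segment("server", SYN | ACK),      # retransmission 2
    Segment("client", 0x18, 400),      # PSH,ACK with request data
]
print(handshake_status(server_view))
```

Running the same check on the client-side and server-side captures would show whether the bare ACK leaves the client but never arrives at the server.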

Note: The topology is as described below. The client gets redirected to Server2 after the first request.

Client -> Gateway -> Router -> Server1 -> Server2

I used Wireshark on the client side as well as on the server side. The server resends the SYN, ACK segment twice; after about 10 s of idle time, the connection is established nonetheless. As far as I understand the relevant RFC, this behavior is normal (the client acknowledges indirectly, because its first data segment also carries the ACK).
The ACK never reaches the router, so where could it possibly get lost? And why does it get lost EVERY time?
Could it be some router (MikroTik) setting, like No-ACK? Is the missing ACK the reason for the 10 s idle delay?
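Two SYN, ACK retransmissions followed by roughly 10 s of idle time would at least be consistent with a retransmission timer that doubles on each attempt. The sketch below only does the arithmetic; the initial timeout of 3 seconds is an assumption for illustration and is not taken from the traces or from any documented server setting.

```python
def synack_retransmit_offsets(initial_rto_s, retransmits):
    """Seconds after the first SYN,ACK at which each retransmission is sent,
    assuming the timeout doubles after every attempt (exponential backoff)."""
    offsets = []
    elapsed = 0.0
    rto = float(initial_rto_s)
    for _ in range(retransmits):
        elapsed += rto
        offsets.append(elapsed)
        rto *= 2
    return offsets


# ASSUMPTION: 3 s initial timeout, two retransmissions as observed in the trace.
print(synack_retransmit_offsets(3, 2))
```

Comparing these offsets with the actual timestamps of the retransmitted SYN, ACK segments in the server-side trace would confirm or rule out this reading.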

EDIT:

I edited the topology above. The Wireshark traces are below:

Due to the anonymization process in TraceWrangler, the IP addresses vary from trace to trace in the following way:

Client IP: 192.168.248.249 <=> 172.23.147.181 <=> 192.168.201.209 <=> 10.206.108.221
Router IP: 10.194.30.227 <=> 172.17.84.111
Server1 IP: 172.31.124.208 <=> 10.100.24.4
Server2 IP: 172.20.78.56 <=> 192.168.204.149
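Because every trace uses a different anonymized address for the same host, it can help to normalize addresses to roles before comparing captures. A minimal sketch using only the addresses listed above; the `ALIASES` table and `role_of` helper are hypothetical names.

```python
# Anonymized addresses per host, copied from the list above.
ALIASES = {
    "client": {"192.168.248.249", "172.23.147.181",
               "192.168.201.209", "10.206.108.221"},
    "router": {"10.194.30.227", "172.17.84.111"},
    "server1": {"172.31.124.208", "10.100.24.4"},
    "server2": {"172.20.78.56", "192.168.204.149"},
}

# Reverse lookup: address -> role.
ROLE_BY_IP = {ip: role for role, ips in ALIASES.items() for ip in ips}


def role_of(ip):
    """Return the host role for an anonymized address, or "unknown"."""
    return ROLE_BY_IP.get(ip, "unknown")


print(role_of("10.206.108.221"))
print(role_of("10.100.24.4"))
```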

You may use the following filters to get a clear view of the initial handshake:

Filter Client: (((ip.dst == 192.168.248.249) && (ip.src == 10.194.30.227)) || ((ip.dst == 10.194.30.227) && (ip.src == 192.168.248.249))) && (tcp.flags.syn == 1) || (tcp.flags == 0x0010 && tcp.seq == 1 && tcp.ack == 1)

Filter Router: (tcp.flags.syn == 1) || (tcp.flags == 0x0010 && tcp.seq == 1 && tcp.ack == 1)

Filter Server1: (((ip.dst == 172.31.124.208) && (ip.src == 192.168.201.209)) || ((ip.dst == 192.168.201.209) && (ip.src == 172.31.124.208)) || ((ip.dst == 172.31.124.208) && (ip.src == 172.20.78.56)) || ((ip.dst == 172.20.78.56) && (ip.src == 172.31.124.208))) && (tcp.flags.syn == 1) || (tcp.flags == 0x0010 && tcp.seq == 1 && tcp.ack == 1)

Filter Server2: (((ip.dst == 10.100.24.4) && (ip.src == 10.206.108.221)) || ((ip.dst == 10.206.108.221) && (ip.src == 10.100.24.4)) || ((ip.dst == 10.100.24.4) && (ip.src == 192.168.204.149)) || ((ip.dst == 192.168.204.149) && (ip.src == 10.100.24.4))) && (tcp.flags.syn == 1) || (tcp.flags == 0x0010 && tcp.seq == 1 && tcp.ack == 1)
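Long filters like these are easy to mistype, and depending on how `&&` and `||` group, the trailing ACK clause may match traffic between any hosts rather than only the listed pairs. The sketch below generates a variant in which both flag tests are explicitly grouped under the address restriction; that grouping is an assumption about the intent, and `pair_clause` and `handshake_filter` are hypothetical helpers. Note that `tcp.seq == 1` relies on Wireshark's relative sequence numbers being enabled.

```python
# Matches SYN / SYN,ACK segments, or the bare ACK that completes the handshake.
HANDSHAKE = ("(tcp.flags.syn == 1 || "
             "(tcp.flags == 0x0010 && tcp.seq == 1 && tcp.ack == 1))")


def pair_clause(a, b):
    """Match traffic between hosts a and b in either direction."""
    return (f"((ip.src == {a} && ip.dst == {b}) || "
            f"(ip.src == {b} && ip.dst == {a}))")


def handshake_filter(pairs):
    """Build a display filter for handshake segments between the given pairs."""
    hosts = " || ".join(pair_clause(a, b) for a, b in pairs)
    return f"({hosts}) && {HANDSHAKE}"


# Client-side trace: client <-> router
print(handshake_filter([("192.168.248.249", "10.194.30.227")]))

# Server1 trace: client <-> Server1 and Server1 <-> Server2
print(handshake_filter([("192.168.201.209", "172.31.124.208"),
                        ("172.31.124.208", "172.20.78.56")]))
```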

Best Answer

You won't believe it, but the problem got solved in the meantime. I don't really understand why the following change in RouterOS ensures that the ACK is no longer lost and the application loads quickly, but I am really relieved about it: in the RouterOS route list, the gateway is now entered as an IP address instead of an interface name / DNS name.

Do you have any explanation for this issue? Does the translation/lookup take that many seconds, with the ACK being ignored in the meantime? I always had a feeling that it had to be an issue in RouterOS, but I had no idea how to track it down. What a lucky coincidence that our admin was playing with the routing tables and asked me to check the load time again. Can this really be the only change? I was able to confirm in Wireshark that the ACK is no longer lost.

EDIT:

I exported the ip route settings below to see the differences in the route:

  • Working: add distance=1 dst-address=33.2.1.0/24 gateway=33.2.4.1 pref-src=33.2.4.211
  • Old state: add distance=1 dst-address=33.2.1.0/24 gateway=ETH2 pref-src=33.2.4.211

The gateway is not explicitly defined in the address list, only the router:

  • Address: 33.2.4.211/24 | Network: 33.2.4.0 | Interface: ETH2
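The relationship between the addresses in the two route entries can be checked with Python's standard `ipaddress` module. This sketch only shows which addresses sit on the network attached to ETH2; it does not by itself explain the delay. One possible reading, offered as an assumption rather than a confirmed diagnosis, is that a route pointing only at ETH2 makes the router treat 33.2.1.0/24 as directly reachable on that interface, while the working route forwards via a next hop that really is on-link.

```python
import ipaddress

# Values copied from the exported settings above.
router_on_eth2 = ipaddress.ip_interface("33.2.4.211/24")
destination = ipaddress.ip_network("33.2.1.0/24")
gateway = ipaddress.ip_address("33.2.4.1")

connected_net = router_on_eth2.network  # 33.2.4.0/24

# The next hop of the working route lies on the network attached to ETH2.
gateway_on_link = gateway in connected_net

# The routed destination is a different subnet, not part of that network.
destination_on_link = destination.subnet_of(connected_net)

print(connected_net, gateway_on_link, destination_on_link)
```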

So, what happens technically, and where does the delay and missing ACK come from?
