What causes duplicate ACK records

tcp windows-server-2003 windows-xp wireshark

We're reviewing Wireshark captures from a few client machines that show multiple duplicate ACK records, which then trigger retransmissions and out-of-sequence packets.

These are shown in the following screenshot, where .26 is the client and .252 is the server.

[Screenshot of the Wireshark capture]

What causes the duplicate ACK records?

More background if it helps:

We're investigating network throughput concerns at one particular client site. From the users' perspective, data is being transmitted slowly despite an underutilized 1 Gbps WAN connection.

Almost all of the client machines have the same issue; we have tested more than 20. We did find two machines that do not have the problem, and we're in the process of identifying what is different in their configuration. On those two machines we only ever saw at most one duplicate ACK record, whereas the machines that have the problem usually show three. One notable difference is that the machines that work fine all belong to members of the network operations team, and all of the other machines belong to "regular" employees. The machines are supposed to be standard, but the network admins could have made changes on their local systems, which is another aspect we're researching.

We tried changing the TcpMaxDupAcks setting on the server, but the value we really need is 5 and the valid range is only 1-3.

Server is Windows Server 2003. Clients are all enterprise managed Windows XP. All clients, including the two working ones, have Symantec anti-virus installed.

This is the only client site out of hundreds that has exhibited this problem.

pathping shows a 56 ms RTT and consistently 0/100 packet loss, even from the problem machines.

Thanks,

Sam

Best Answer

Note: I'm assuming that this capture was taken on the client machine.

A brief summary of TCP sequencing: TCP reliably delivers streams of bytes between two applications. "Reliably" in this case means that, among other things, TCP guarantees never to deliver out-of-order data to a listening application.

In-order, reliable delivery is implemented through the use of sequence numbers. Every byte in each stream is numbered, and each segment carries the 32-bit sequence number of its first byte (remember that TCP is effectively two independent streams of data, A->B and B->A). If A sends an ACK to B, the value in the ACK field is the next sequence number A expects to see from B.
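To make the ACK rule concrete, here is a minimal Python sketch of how a receiver's ACK numbers behave when one segment goes missing. The function name, the segment sizes, and the starting sequence number are all made up for illustration, and the model deliberately ignores delayed ACKs, SACK, and 32-bit wraparound.

```python
def receiver_acks(initial_seq, segments):
    """Return the ACK number sent after each arriving segment.

    segments: (seq, length) pairs in arrival order.
    Simplified model: one ACK per segment, no delayed ACKs, no SACK,
    and no handling of 32-bit sequence number wraparound.
    """
    next_expected = initial_seq
    buffered = {}  # out-of-order segments held back: seq -> length
    acks = []
    for seq, length in segments:
        if seq == next_expected:
            next_expected += length
            # Release any buffered segments that are now contiguous.
            while next_expected in buffered:
                next_expected += buffered.pop(next_expected)
        elif seq > next_expected:
            # A gap exists, so hold this segment and re-ACK the gap.
            buffered[seq] = length
        # seq < next_expected is an old duplicate and changes nothing.
        acks.append(next_expected)
    return acks


# Hypothetical arrival order: the segment at 1100 is lost and only
# arrives at the end, as a retransmission.
arrivals = [(1000, 100), (1200, 100), (1300, 100), (1400, 100), (1100, 100)]
print(receiver_acks(1000, arrivals))
```

In this made-up trace the receiver keeps acknowledging 1100 (one original ACK plus three duplicates) until the missing segment arrives, at which point the ACK jumps to 1500.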

From the above, it appears that at least one TCP segment sent from the server to the client was lost. The client sends a duplicate ACK for each later segment that arrives while the gap is still open, and the three duplicate ACKs in sequence are what trigger a fast retransmit. When a TCP sender receives 3 duplicate acknowledgements for the same piece of data (i.e. 4 ACKs with the same acknowledgement number, which is behind the most recently sent data), it can reasonably assume that the segment immediately after the acknowledged data was lost in the network, and it retransmits that segment immediately instead of waiting for the retransmission timer to expire.
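The sender side of this can be sketched in a few lines of Python. This is a simplified illustration rather than how any particular TCP stack implements it: the function name is hypothetical, and real stacks apply extra conditions before counting an ACK as a duplicate (for example, that it carries no data and that unacknowledged data is outstanding).

```python
DUP_ACK_THRESHOLD = 3  # conventional fast-retransmit threshold


def fast_retransmit_points(acks, threshold=DUP_ACK_THRESHOLD):
    """Return the sequence numbers a sender would fast-retransmit.

    acks: ACK numbers in the order the sender receives them.
    Simplified model: any repeated ACK number counts as a duplicate.
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Fire exactly once, when the threshold is first reached.
            if dup_count == threshold:
                retransmits.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


# One original ACK plus three duplicates: the segment starting at 1100
# is retransmitted without waiting for a timeout.
print(fast_retransmit_points([1100, 1100, 1100, 1100, 1500]))

# Only one duplicate: the default threshold is never reached.
print(fast_retransmit_points([1100, 1100, 1500]))
```

The `threshold` parameter is roughly analogous to the TcpMaxDupAcks setting mentioned in the question: lowering it makes the sender retransmit sooner, but it does nothing about the underlying loss.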

In this case, the re-transmission gets through, and is identified by Wireshark as out-of-order.

As mentioned by joeqwerty, packet loss is most often caused by congestion. It may also be a result of CRC or other errors on a link, due to a bad interface card, loose cable, etc. I'd look at the stats of every link along the path to see if any are highly utilized and/or are experiencing large numbers of errors.

If you can't see any obvious candidates, perform concurrent packet captures at multiple points along the path to try and isolate where the loss is occurring.
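As a rough sketch of how two concurrent captures could be compared, the Python below assumes you have already exported the TCP sequence numbers for one direction of one connection from each capture (for example, the `tcp.seq` field in Wireshark). The function name and the sample numbers are hypothetical.

```python
from collections import Counter


def lost_between(upstream_seqs, downstream_seqs):
    """Count segments seen at the upstream point but not downstream.

    Both arguments are lists of sequence numbers for the same
    connection and direction, one list per capture point. Counting
    occurrences (rather than using sets) means a segment that was
    dropped once and later retransmitted successfully still shows up.
    """
    # Counter subtraction keeps only positive differences.
    lost = Counter(upstream_seqs) - Counter(downstream_seqs)
    return dict(lost)


# Hypothetical data: the segment at 1100 left the upstream point twice
# (original plus retransmission) but reached the downstream point once.
upstream = [1000, 1100, 1200, 1300, 1400, 1100]
downstream = [1000, 1200, 1300, 1400, 1100]
print(lost_between(upstream, downstream))
```

If the result is empty between two capture points but non-empty between the next pair, the loss is most likely occurring on the segment of the path between that second pair.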

What kind of WAN connection is in use here? Is it a dedicated line? MPLS VPN link? IPsec VPN over the public internet? Something else?