Linux – TCP Window Full / Zero Window probe in CentOS 5.4 behind a Squid proxy

Tags: centos, linux, networking, squid, tcp

A friend's fresh install of CentOS 5.4 is misbehaving behind his university's proxy. If he takes the machine home and connects it directly to the internet, it works fine. If he installs some other OS (even an older version of CentOS) on the machine, it works fine behind the proxy. Only when it's behind the proxy and only when it's running CentOS 5.4 does it have seemingly random connection timeouts and extremely poor throughput.

I had him take a small packet capture just to see what's happening. This is what it turned up:

   9291 532.192095  10.74.88.99           161.112.232.22        TCP      40560 > ndl-aas [ACK] Seq=206 Ack=74213 Win=3328 Len=0 TSV=3733959 TSER=77264420
   9292 532.193750  161.112.232.22        10.74.88.99           TCP      [TCP segment of a reassembled PDU]
   9293 532.193812  161.112.232.22        10.74.88.99           TCP      [TCP segment of a reassembled PDU]
   9295 532.234080  10.74.88.99           161.112.232.22        TCP      40560 > ndl-aas [ACK] Seq=206 Ack=77109 Win=384 Len=0 TSV=3734001 TSER=77264424
   9296 532.658579  161.112.232.22        10.74.88.99           TCP      [TCP Window Full] [TCP segment of a reassembled PDU]
   9297 532.658666  10.74.88.99           161.112.232.22        TCP      [TCP ZeroWindow] 40560 > ndl-aas [ACK] Seq=206 Ack=77493 Win=0 Len=0 TSV=3734426 TSER=77264471
   9298 533.091240  161.112.232.22        10.74.88.99           TCP      [TCP ZeroWindowProbe] [TCP segment of a reassembled PDU]
   9299 533.091407  10.74.88.99           161.112.232.22        TCP      [TCP ACKed lost segment] 40560 > ndl-aas [ACK] Seq=206 Ack=77494 Win=2176 Len=0 TSV=3734859 TSER=77264514
   9300 533.092361  161.112.232.22        10.74.88.99           TCP      [TCP segment of a reassembled PDU]
   9301 533.092397  161.112.232.22        10.74.88.99           HTTP     HTTP/1.0 200 OK  (application/x-rpm)

(161.112.232.22 is the proxy, 10.74.88.99 is the CentOS box, and ndl-aas is port 3128, the port Squid listens on at the proxy.)
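To make the pattern in the capture easier to see, here is a minimal Python sketch that pulls the `Ack=` and `Win=` values out of the client-side lines and compares the bytes newly acknowledged with the window advertised in the previous ACK. The function names and the regular expression are my own, for illustration only, and the sketch assumes a Python 3 interpreter and the Wireshark text layout shown above (whitespace condensed).

```python
import re

# Client-side (10.74.88.99) lines copied from the capture above, whitespace condensed.
CAPTURE = """\
9291 532.192095 10.74.88.99 161.112.232.22 TCP 40560 > ndl-aas [ACK] Seq=206 Ack=74213 Win=3328 Len=0 TSV=3733959 TSER=77264420
9295 532.234080 10.74.88.99 161.112.232.22 TCP 40560 > ndl-aas [ACK] Seq=206 Ack=77109 Win=384 Len=0 TSV=3734001 TSER=77264424
9297 532.658666 10.74.88.99 161.112.232.22 TCP [TCP ZeroWindow] 40560 > ndl-aas [ACK] Seq=206 Ack=77493 Win=0 Len=0 TSV=3734426 TSER=77264471
9299 533.091407 10.74.88.99 161.112.232.22 TCP [TCP ACKed lost segment] 40560 > ndl-aas [ACK] Seq=206 Ack=77494 Win=2176 Len=0 TSV=3734859 TSER=77264514
"""

# frame number, timestamp, source IP, (destination IP skipped), Ack=, Win=
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d+\.\d+)\s+(\S+)\s+\S+\s+TCP\s+.*?\bAck=(\d+)\s+Win=(\d+)"
)


def advertised_windows(text, source_ip):
    """Return (frame, time, ack, win) for every ACK line sent by source_ip."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(3) == source_ip:
            rows.append((int(match.group(1)), float(match.group(2)),
                         int(match.group(4)), int(match.group(5))))
    return rows


def window_usage(rows):
    """For each ACK after the first, return (frame, bytes newly acknowledged,
    window advertised by the previous ACK)."""
    usage = []
    for prev, cur in zip(rows, rows[1:]):
        usage.append((cur[0], cur[2] - prev[2], prev[3]))
    return usage


if __name__ == "__main__":
    rows = advertised_windows(CAPTURE, "10.74.88.99")
    for frame, newly_acked, prev_win in window_usage(rows):
        print("frame %d: %d bytes newly acked, previous window %d"
              % (frame, newly_acked, prev_win))
```

On these four lines, the 384 bytes acknowledged between frames 9295 and 9297 exactly match the 384-byte window advertised in frame 9295, which is consistent with the [TCP Window Full] flag on frame 9296. The single byte acknowledged in frame 9299 matches the zero-window probe in frame 9298.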

Assuming this is what causes the connection timeouts in all applications (Firefox, yum update, etc.), I'm wondering why it would happen only on this CentOS 5.4 machine, and only behind the Squid proxy.

The proxy is squid/3.0.STABLE19 running on Linux, one hop away on the network, and it is explicitly configured on the client side (by setting the http_proxy environment variable or the appropriate application-specific configuration).

Help, anyone?

Best Answer

You could check the following values in /proc, which relate to socket buffer sizes and TCP window scaling:

/proc/sys/net/core/rmem_default
/proc/sys/net/core/rmem_max
/proc/sys/net/core/wmem_default
/proc/sys/net/core/wmem_max
/proc/sys/net/ipv4/tcp_window_scaling

See if these values vary between a machine that works OK with the proxy and the machine you are trying to set up.
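One way to do that comparison is to dump the five values in a diff-friendly form on each machine. The following is only a sketch, assuming a Python 3 interpreter is available on both boxes. `read_params` and `diff_params` are hypothetical helper names, not part of any existing tool, and a parameter that cannot be read is reported as `None`.

```python
import os

# The five parameters listed above, relative to /proc/sys.
PARAMS = [
    "net/core/rmem_default",
    "net/core/rmem_max",
    "net/core/wmem_default",
    "net/core/wmem_max",
    "net/ipv4/tcp_window_scaling",
]


def read_params(root="/proc/sys", names=PARAMS):
    """Read each parameter under root; unreadable entries map to None."""
    values = {}
    for name in names:
        try:
            with open(os.path.join(root, name)) as fh:
                values[name] = fh.read().strip()
        except OSError:
            values[name] = None
    return values


def diff_params(a, b):
    """Return {name: (value_in_a, value_in_b)} for parameters that differ."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # One "name = value" line per parameter, so the output of two
    # machines can be saved to files and compared with diff.
    for name, value in sorted(read_params().items()):
        print("%s = %s" % (name, value))
```

Run it on the working machine and on the CentOS 5.4 machine, redirect each output to a file, and diff the two files. The same values can of course be read by hand with cat on each path.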

Also, I don't know if this is related, but keep in mind that CentOS 5 has SELinux enabled by default. It has personally caused me many tricky-to-diagnose problems, so you might want to set it to permissive or disabled for testing. (ref: http://wiki.centos.org/HowTos/SELinux)
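To see which mode a machine is currently in before changing anything, the enforce flag can be read from the SELinux pseudo-filesystem. This is a rough sketch, assuming Python 3 and assuming the filesystem is mounted at /selinux (as on CentOS 5) or at /sys/fs/selinux (as on newer kernels). `selinux_mode` is a hypothetical helper name, and the getenforce command reports the same information.

```python
# Assumed selinuxfs mount points: /selinux on older releases such as
# CentOS 5, /sys/fs/selinux on newer kernels.
ENFORCE_PATHS = ("/selinux/enforce", "/sys/fs/selinux/enforce")


def selinux_mode(paths=ENFORCE_PATHS):
    """Return 'enforcing', 'permissive', or 'disabled or unavailable'."""
    for path in paths:
        try:
            with open(path) as fh:
                flag = fh.read().strip()
        except OSError:
            continue  # not mounted at this location, try the next one
        return "enforcing" if flag == "1" else "permissive"
    # No enforce file found: SELinux is disabled or selinuxfs is not mounted.
    return "disabled or unavailable"


if __name__ == "__main__":
    print("SELinux mode: %s" % selinux_mode())
```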

P.S. Is this more of a comment than an "answer"? I can't comment yet.
