WAN Performance – Methodologies for Performance-Testing a WAN Link

networking performance sftp

We have a pair of new diversely-routed 1Gbps Ethernet links between locations about 200 miles apart. The 'client' is a new reasonably-powerful machine (HP DL380 G6, dual E56xx Xeons, 48GB DDR3, R1 pair of 300GB 10krpm SAS disks, W2K8R2-x64) and the 'server' is a decent enough machine too (HP BL460c G6, dual E55xx Xeons, 72GB, R1 pair of 146GB 10krpm SAS disks, dual-port Emulex 4Gbps FC HBA linked to dual Cisco MDS9509s then onto dedicated HP EVA 8400 with 128 x 450GB 15krpm FC disks, RHEL 5.3-x64).

Using SFTP from the client, we're only seeing about 40Kbps of throughput with large (>2GB) files. Server to 'other local server' tests show around 500Mbps through the local switches (Cat 6509s). We're going to run the same test on the client side, but that's a day or so away.

What other testing methods would you use to prove to the link providers that the problem is theirs?

Best Answer

Tuning an Elephant:
This could require tuning, though as pQd says, that is probably not the issue here. This sort of link is known as a "Long, Fat Pipe" or elephant (see RFC 1072). Because this is a fat gigabit pipe going over a distance (distance really means time/latency in this case), the TCP receive window needs to be large (see TCP/IP Illustrated, Volume 1, TCP Extensions section, for pictures).

To figure out what the receiving window needs to be, you calculate the bandwidth delay product:

Bandwidth * Delay = Product

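If there is 10 ms of latency, this calculator estimates you want a receive window of about 1.2 MBytes. We can do the calculation ourselves with the above formula (1 Gbps is 1,000,000,000 bits/s, multiplied by the delay in seconds, then divided by 8 to convert bits to bytes):

echo $(( (1000000000.0*.01)/8 ))
1250000 #Bytes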
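
The same arithmetic can be wrapped in a small helper for trying other link speeds and latencies. This is only a sketch: `bdp_bytes` is a name made up for illustration, the 10 ms and 200 ms round-trip times are assumed values rather than measurements of this link, and it uses awk because bash's `$(( ))` is integer-only.

```shell
#!/bin/sh
# bdp_bytes BITS_PER_SECOND RTT_SECONDS
# Hypothetical helper: prints the bandwidth-delay product in bytes,
# i.e. the receive window needed to keep the pipe full.
bdp_bytes() {
    awk -v bw="$1" -v rtt="$2" 'BEGIN { printf "%.0f\n", bw * rtt / 8 }'
}

# 1 Gbps with an assumed 10 ms round-trip time
bdp_bytes 1000000000 0.010    # prints 1250000

# 1 Gbps with an assumed 200 ms round-trip time
bdp_bytes 1000000000 0.2      # prints 25000000
```

Either figure is well above the 64 KB that TCP can advertise without window scaling, which is why the scaling option matters on a link like this.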

So once you figure out whatever the larger problem is, you might want to run a packet dump to check that TCP window scaling (the TCP extension that allows for larger windows) is being negotiated correctly, and then tune from there.
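
One way to check for window scaling is to capture only the SYN and SYN-ACK packets of a transfer and look for the `wscale` option, since scaling is in effect only when both sides send it during the handshake. The sketch below is an assumption-laden example: `eth0` and `192.0.2.10` are placeholders, and `syn_wscale` and `effective_window` are hypothetical helper names. The tcpdump command is left as a comment because it needs root and a live interface.

```shell
#!/bin/sh
# Capture handshake packets for the transfer (run as root; eth0 and
# 192.0.2.10 are placeholders for your interface and the remote host):
#   tcpdump -n -i eth0 'host 192.0.2.10 and tcp[tcpflags] & tcp-syn != 0'

# syn_wscale: hypothetical helper. Reads tcpdump text output on stdin and
# prints the wscale shift count of every line that carries the option.
# Lines without "wscale N" print nothing, which itself is the warning sign.
syn_wscale() {
    sed -n 's/.*wscale \([0-9][0-9]*\).*/\1/p'
}

# effective_window ADVERTISED_WINDOW SHIFT
# Hypothetical helper: the real window behind a value advertised after
# the handshake is the 16-bit window field shifted left by the wscale count.
effective_window() {
    echo $(( $1 << $2 ))
}
```

For example, piping the capture through `syn_wscale` should print one number per handshake packet. If it prints nothing for either direction, the connection is limited to a 64 KB window.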

Window Bound:
If this is the problem, that is, you are bound by the window size because no window scaling is in place, I would expect the following results with about 200ms of latency, regardless of the pipe size:

Throughput = Receive Window/Round Trip Time

So:

echo $(( 65536/.2 ))
327680 #Bytes/second
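
To see how quickly an unscaled window caps throughput at other latencies, the formula can be put in a small function. This is a sketch only: `window_bound_throughput` is a made-up name, and the round-trip times are assumed values, not measurements of this link.

```shell
#!/bin/sh
# window_bound_throughput RWIN_BYTES RTT_SECONDS
# Hypothetical helper: prints the maximum throughput in bytes/second
# when the sender can have at most one receive window in flight per RTT.
window_bound_throughput() {
    awk -v rwin="$1" -v rtt="$2" 'BEGIN { printf "%.0f\n", rwin / rtt }'
}

# 64 KB window, assumed 200 ms round-trip time
window_bound_throughput 65536 0.2      # prints 327680

# 64 KB window, assumed 10 ms round-trip time
window_bound_throughput 65536 0.010    # prints 6553600
```

Even at 10 ms, an unscaled window tops out at roughly 6.5 MBytes/s (about 52 Mbps), far below what a gigabit link can carry.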

For a 64 KB window alone to explain the results you are seeing, the latency would have to be very high. Solving for it:

RTT = RWIN/Throughput

So (for 40 kBytes/s):

echo $(( 65536.0/40000.0 )) 
1.64 #Seconds of Latency
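
The question quotes the rate as "40Kbps", which could mean kilobits or kilobytes, so it may be worth running the numbers both ways. A minimal sketch, with `rtt_for_throughput` as a made-up helper name and the 64 KB unscaled window as the assumption:

```shell
#!/bin/sh
# rtt_for_throughput RWIN_BYTES THROUGHPUT_BYTES_PER_SECOND
# Hypothetical helper: prints the round-trip time in seconds at which a
# window-bound connection would achieve exactly the given throughput.
rtt_for_throughput() {
    awk -v rwin="$1" -v tput="$2" 'BEGIN { printf "%.2f\n", rwin / tput }'
}

# Reading 40Kbps as 40 kBytes/s
rtt_for_throughput 65536 40000    # prints 1.64

# Reading 40Kbps as 40 kbit/s, i.e. 5000 bytes/s
rtt_for_throughput 65536 5000     # prints 13.11
```

Either way, the implied round-trip time is far beyond anything plausible for a 200-mile link, which suggests the window size alone does not account for the throughput being seen.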

(Please check my math. These figures, of course, don't include protocol/header overhead.)
