Linux – NFS performance woes on Debian

linux networking nfs

I am having very inconsistent performance with NFS between two wheezy machines, and I can't seem to nail it down.

Setup:

Machine 1 'video1': Dual 5506 with 12GB RAM; XFS on 8x3TB RAID6, exported as 'video1' from '/mnt/storage'

Machine 2 'storage1': Phenom X2 @ 3.2GHz with 8GB RAM; ZFS on 5x2TB, exported as 'storage1' from '/mnt/storage1-storage'

Local write performance:

mackek2@video1:/mnt/storage/testing$ dd if=/dev/zero of=localwrite10GB bs=5000k count=2000
2000+0 records in
2000+0 records out
10240000000 bytes (10 GB) copied, 16.7657 s, 611 MB/s

Local read performance:

Both are connected to the same HP gigabit switch, and iperf gives a rock-solid 940Mbps both ways.

My problem is that when I write to the video1 export from storage1, performance is all over the place. For the first few (5-7) GB of a transfer (I'm hoping to move 30-120GB AVCHD or MJPEG files as quickly as possible), performance starts around 900Mbps, then drops to 150-180Mbps, and sometimes as low as 30Mbps. If I restart the NFS kernel server, performance picks back up for a few more GB.

mackek2@storage1:/mnt/video1/testing$ dd if=/dev/zero of=remoteWrite10GB count=2000 bs=5000K
2000+0 records in
2000+0 records out
10240000000 bytes (10 GB) copied, 223.794 s, 45.8 MB/s
mackek2@storage1:/mnt/video1/testing$ dd if=/dev/zero of=remoteWrite10GBTest2 count=2000 bs=5000K
2000+0 records in
2000+0 records out
10240000000 bytes (10 GB) copied, 198.462 s, 51.6 MB/s
mackek2@storage1:/mnt/video1/testing$ dd if=/dev/zero of=bigfile776 count=7000 bs=2000K
7000+0 records in
7000+0 records out
14336000000 bytes (14 GB) copied, 683.78 s, 21.0 MB/s
mackek2@storage1:/mnt/video1/testing$ dd if=/dev/zero of=remoteWrite15GB count=3000 bs=5000K
3000+0 records in
3000+0 records out
15360000000 bytes (15 GB) copied, 521.834 s, 29.4 MB/s
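One caveat worth noting about these numbers (my assumption, not shown above): without a sync flag, dd can report the rate at which the page cache absorbed the data rather than what actually reached the server. Adding conv=fdatasync makes dd flush before printing its rate:

```shell
# Same write test, but dd won't report its throughput until the data has
# actually been flushed out (conv=fdatasync). Using oflag=direct instead
# would bypass the client-side cache entirely.
dd if=/dev/zero of=/mnt/video1/testing/syncWrite10GB bs=5000k count=2000 conv=fdatasync
```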

When things are going fast, nfsiostat on the client reports average RTTs of a few milliseconds, but the RTT shoots up to over 1.5 seconds as soon as performance drops. Additionally, the CPU run queue depth jumps to over 8 while the write is happening.
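To correlate the slowdown with what the server is doing, it may help to sample both sides while a transfer runs. A rough sketch (nfsiostat's exact output varies by version, and the interpretation here is my assumption):

```shell
# Client side: per-mount RTT and queue stats every 5 seconds
# (nfsiostat ships with the nfs-common tools on Debian).
nfsiostat 5 /mnt/video1

# Server side: sample the load average and the kernel's dirty/writeback
# page totals. A Dirty figure that balloons and then stalls would point
# at writeback on the RAID6 array rather than at the network.
while sleep 5; do
    awk '{print "loadavg:", $1}' /proc/loadavg
    grep -E '^(Dirty|Writeback):' /proc/meminfo
done
```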

Now, when reading from the same export, I get a steady 890Mbps, give or take a few Mbps, for the entire read.

mackek2@storage1:/mnt/video1/testing$ dd if=remoteWrite10GBTest2 of=/dev/null
20000000+0 records in
20000000+0 records out
10240000000 bytes (10 GB) copied, 89.82 s, 114 MB/s
mackek2@storage1:/mnt/video1/testing$ dd if=remoteWrite15GB of=/dev/null
30000000+0 records in
30000000+0 records out
15360000000 bytes (15 GB) copied, 138.94 s, 111 MB/s

The same thing happens the other way around with storage1 as the NFS server: the CPU queue jumps up, speeds drop to a crawl, and I pull my hair out.

I have tried increasing the number of NFS daemons to as many as 64, but throughput still sputters out after a few GB.
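For reference, on Debian the thread count can be set persistently via RPCNFSDCOUNT (assuming the stock nfs-kernel-server init script), or changed live with rpc.nfsd:

```shell
# Persist the setting across restarts (Debian's init script reads this file):
sudo sed -i 's/^RPCNFSDCOUNT=.*/RPCNFSDCOUNT=64/' /etc/default/nfs-kernel-server
sudo service nfs-kernel-server restart

# Or bump it on the fly and confirm the kernel picked it up:
sudo rpc.nfsd 64
cat /proc/fs/nfsd/threads
```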

Best Answer

You don't include your mount or export options, so there are a number of NFS settings that could be impacting performance. Based on my experience, I'd recommend the following options for maximum NFS performance and reliability:

  • Mount Options: tcp,hard,intr,nfsvers=3,rsize=32768,wsize=32768

  • Export Options: async
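Concretely, that would look something like this (paths taken from the question; no_subtree_check is my addition, and note that async trades crash safety for speed, since the server acknowledges writes before they hit disk):

```shell
# Client-side /etc/fstab entry:
# video1:/mnt/storage  /mnt/video1  nfs  tcp,hard,intr,nfsvers=3,rsize=32768,wsize=32768  0  0

# Server-side /etc/exports entry, then re-export with `exportfs -ra`:
# /mnt/storage  storage1(rw,async,no_subtree_check)
```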