NFS – How to set I/O priority for NFS client processes

io, ionice, nfs, nice

The configuration is a Linux server and a NAS box (Netgear) acting as the NFS server.

It is easy for a single process on the Linux server to use all the I/O bandwidth simply by copying a file from the NFS share to the NFS share. The I/O channel is jammed, and all other processes on the server nearly halt waiting for I/O. Load climbs to 10-20 (on four cores) and more and more pdflush processes appear… until someone stops the file copy.

How can I limit the I/O bandwidth the cp process uses? nice will not help, of course, but ionice -c3 has no effect either. Does ionice affect NFS mounts at all? Is there something like nfsnice?

Best Answer

What are your "rsize" and "wsize" values set to?

Normally, modern Linux NFS clients negotiate the maximum values with the server, but sometimes they end up way off base. For example, we had rsize=1m,wsize=1m in /proc/mounts, not knowing that the NAS could not support more than 32768. We saw the same slowness and the same skyrocketing load that you describe.

Setting both values down to 32k immediately solved the slowness and the rising load for us; the desktop remained perfectly responsive even while copying gigabytes over NFS. And we have our home directories on NFS...
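If you want to check what your client actually negotiated, something along these lines might help. It is only a rough sketch: the function names are made up for illustration, it assumes the usual six-field /proc/mounts layout, and the 32768 limit is just the value that applied to our NAS, not a general rule.

```python
def nfs_transfer_sizes(mounts_text):
    """Return {mount_point: (rsize, wsize)} for NFS entries in /proc/mounts-style text."""
    sizes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        opts = {}
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            opts[key] = value
        # A missing or non-numeric value is reported as None
        rsize = int(opts["rsize"]) if opts.get("rsize", "").isdigit() else None
        wsize = int(opts["wsize"]) if opts.get("wsize", "").isdigit() else None
        sizes[fields[1]] = (rsize, wsize)
    return sizes


def oversized(sizes, limit=32768):
    """Return the mount points whose rsize or wsize exceeds limit (an assumed cap)."""
    return sorted(
        mount_point
        for mount_point, (rsize, wsize) in sizes.items()
        if (rsize or 0) > limit or (wsize or 0) > limit
    )
```

On the Linux client you would feed it the contents of /proc/mounts, e.g. `oversized(nfs_transfer_sizes(open("/proc/mounts").read()))`, and then remount anything it lists with smaller rsize/wsize values.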

Perhaps your NAS's NFS server implementation is showing off a little by offering a larger size than it can handle?

Cheers

Related Solutions

NFS IO priority on ZFS/Solaris

AFAIK, to ZFS all I/O is I/O. What I mean by that is that it won't differentiate between your local operations and the ones NFS is asking ZFS to do.

You could play with the scheduling classes to somehow slow down your userland process that is copying all this data locally.

BTW, your dedicated 1TB disk as a write log device won't help you at all, unless that specific disk is much faster than the rest (e.g. SATA 7200 vs SAS 15k). We usually use SSDs for log/cache devices, or nothing at all.

Linux – NFS client has unbalanced read and write speeds

Adding the noac NFS mount option in fstab was the silver bullet. The total throughput has not changed and is still around 100 MB/s, but my reads and writes are much more balanced now, which I have to imagine will bode well for Postgres and other applications.

[Image not preserved]

You can see I marked the various "block" sizes I used when testing, i.e. the rsize/wsize buffer size mount options. I found that an 8k size had the best throughput for the dd tests, surprisingly.

These are the nfs mounts options I'm now using, per /proc/mounts:

nfsc:/vol/pg003 /mnt/peppershare nfs rw,sync,noatime,nodiratime,vers=3,rsize=8192,wsize=8192,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,noac,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.x.x.x,mountvers=3,mountport=4046,mountproto=tcp,local_lock=none,addr=172.x.x.x 0 0
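For anyone who wants to verify this on their own mounts, here is a minimal sketch that inspects an option string like the one above. The function name is hypothetical, and treating "all four ac* timeouts are 0" as equivalent to noac for caching purposes is my own assumption about how to read the options, not something the mount line itself states.

```python
def attribute_caching_disabled(options):
    """Return True if an NFS option string indicates attribute caching is off.

    Assumption: either an explicit noac flag, or all four attribute-cache
    timeouts set to 0, counts as "disabled".
    """
    opts = {}
    for opt in options.split(","):
        key, _, value = opt.partition("=")
        opts[key] = value
    if "noac" in opts:
        return True
    timeouts = ("acregmin", "acregmax", "acdirmin", "acdirmax")
    return all(opts.get(name) == "0" for name in timeouts)
```

Pass it the fourth field of a /proc/mounts entry (the comma-separated options), not the whole line.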

FYI, the noac option man entry:

ac / noac

Selects whether the client may cache file attributes. If neither option is specified (or if ac is specified), the client caches file attributes.

To improve performance, NFS clients cache file attributes. Every few seconds, an NFS client checks the server's version of each file's attributes for updates. Changes that occur on the server in those small intervals remain undetected until the client checks the server again. The noac option prevents clients from caching file attributes so that applications can more quickly detect file changes on the server.

In addition to preventing the client from caching file attributes, the noac option forces application writes to become synchronous so that local changes to a file become visible on the server immediately. That way, other clients can quickly detect recent writes when they check the file's attributes.

Using the noac option provides greater cache coherence among NFS clients accessing the same files, but it extracts a significant performance penalty. As such, judicious use of file locking is encouraged instead. The DATA AND METADATA COHERENCE section contains a detailed discussion of these trade-offs.

I read mixed opinions on attribute caching around the web, so my only thought is that it's an option that is necessary for, or plays well with, a NetApp NFS server and/or Linux clients with newer kernels (>2.6.5). We didn't see this issue on SLES 9, which has a 2.6.5 kernel.

I also read mixed opinions on rsize/wsize. Usually you take the default, which is currently 65536 on my systems, but 8192 gave me the best test results. We'll be doing some benchmarks with Postgres too, so we'll see how these various buffer sizes fare.
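If you want to repeat this kind of comparison without dd, a rough write-throughput timer could look like the sketch below. It is not the exact test I ran: the function name, the zero-filled data and the fsync at the end are all assumptions for illustration. Note also that block_size here is the application's write size; the rsize/wsize values themselves are mount options, so you would remount with each setting and rerun the same measurement.

```python
import os
import time


def write_throughput(path, block_size, total_bytes):
    """Write total_bytes to path in block_size chunks; return MB/s (1 MB = 10**6 bytes)."""
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            # The last chunk may be shorter than block_size
            chunk = block[: total_bytes - written]
            # os.write may write fewer bytes than requested, so count what it reports
            written += os.write(fd, chunk)
        # Include the time needed to flush the data out of the client's cache
        os.fsync(fd)
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6
```

For example, `write_throughput("/mnt/peppershare/testfile", 8192, 10**9)` would write about 1 GB to the share. A single run is noisy, so repeat each setting a few times before drawing conclusions.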
