Linux – How dangerous is NFS async when RAID BBU and UPS are present

linux · nfs · ubuntu

I have an NFSv3 server and around 15 clients. I am looking for the pros and cons of enabling async on the server side. I have read about it, but it is still a bit unclear to me. I know it can lead to data corruption if the server crashes in the middle of a write operation. However, I have also read that the client keeps a cache of that same operation and can recover it if needed. My questions are:

  • What exactly would happen if my server crashes (e.g. would it lose data that was pending to be written, would it corrupt the underlying filesystem, etc.)?
  • What would happen if both the server and the client crash at the same time (e.g. a power failure that the UPS fails to handle)?
  • What if the server crashes but I have a RAID controller with a BBU? Would the server recover safely?
  • Is there any way to detect such corruption (something similar to fsck, maybe)?
  • What if the server is shut down gracefully by the UPS? Would there still be a chance of data corruption?
  • What do you guys use – sync or async?

All machines run Ubuntu 10.04.

I tried to find a similar question here, to no avail. I have read the NFS Home Page and took a quick look at the book Managing NFS and NIS, 2nd Edition.

Best Answer

What the NFSv3 spec says is, basically, that for the following two NFS data operations:

  • WRITE operation with the stable bit set
  • COMMIT

the server is allowed to return success to the client only after the data has hit stable storage. This is what the Linux NFS server implements with the default "sync" export option. With "async", the server can cheat and return success even though data is not on stable storage.
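For reference, a minimal /etc/exports sketch contrasting the two options; the paths and client subnet here are invented purely for illustration:

```
# sync (default): the server acknowledges WRITE/COMMIT only after data is on stable storage
/srv/data     192.168.1.0/24(rw,sync,no_subtree_check)

# async: the server may acknowledge before data reaches stable storage (faster, riskier)
/srv/scratch  192.168.1.0/24(rw,async,no_subtree_check)
```

After editing /etc/exports, `exportfs -ra` re-exports the shares with the new options.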

That is, the potential corruption issue with async is basically something along the following lines:

  1. Server returns success for a WRITE or COMMIT operation
  2. Client sees the success, and at some point deletes the pages from its own cache (why waste space keeping them around since they are already on server storage, it thinks)
  3. Server crashes, thus losing the data which was not committed to stable storage
  4. Client reconnects to the server, but since there is no log of which data was written and which was not, it cannot know exactly which data was lost.

Now, that last point is the serious one: there is no way to know which data was lost or corrupted and which wasn't.

OTOH, if the client crashes, then any dirty data in the client cache that hasn't been flushed will be lost, but the client programmer can work around it: only after fsync() or close() returns success can the programmer assume the data is on stable storage.
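A minimal sketch of that client-side pattern in plain POSIX C (the file path is made up, and error handling is trimmed to the calls that matter here):

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const char buf[] = "important record\n";

    /* Hypothetical file on an NFS mount. */
    int fd = open("/mnt/nfs/data.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return EXIT_FAILURE; }

    if (write(fd, buf, sizeof(buf) - 1) != (ssize_t)(sizeof(buf) - 1)) {
        perror("write");   /* the data may still be only in the client cache */
        return EXIT_FAILURE;
    }

    /* Only after fsync() (or close()) returns success may the program
     * assume the data has reached stable storage on the server. */
    if (fsync(fd) != 0) { perror("fsync"); return EXIT_FAILURE; }
    if (close(fd) != 0) { perror("close"); return EXIT_FAILURE; }

    return EXIT_SUCCESS;
}
```

Note that this guarantee only holds as the spec intends when the export uses "sync"; with "async" the server may acknowledge the flush before the data is actually on disk.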