I have an NFSv3 server and around 15 clients. I am looking for the pros and cons of enabling async on the server side. I have read about it, but it is still a bit unclear to me. I know it can lead to data corruption if the server crashes in the middle of a write operation. However, I have also read that the client keeps a cache of that same operation and can recover it if needed. My questions are:
- What exactly would happen if my server crashes (e.g. would it lose pending-to-be-written data, would it corrupt the underlying filesystem, etc.)?
- What would happen if both the server and the client crash at the same time (e.g. a power failure that the UPS fails to handle)?
- What if the server crashes, but I have a RAID BBU? Would the server recover safely?
- Is there any way to detect such corruption (something similar to fsck, maybe)?
- What if the server is shut down gracefully by the UPS? Would there still be a chance of data corruption?
- What do you guys use: sync or async?
All machines run Ubuntu 10.04.
I tried to find a similar question here, to no avail. I have read the NFS Home Page and taken a quick look at the book Managing NFS and NIS, 2nd Edition.
Best Answer
So what the NFSv3 spec says is basically that for two NFS data operations (a WRITE whose stable flag is not UNSTABLE, and a COMMIT), the server is allowed to return success to the client only after the data has hit stable storage. This is what the Linux NFS server implements with the default "sync" export option. With "async", the server can cheat and return success even though the data is not yet on stable storage.
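To check which mode each export is actually using, something like the following sketch could help. It assumes the common one-line `/etc/exports` layout (`path client(options) ...`) and deliberately ignores quoted paths, line continuations and default-option lines. The function name `export_write_modes` is made up for this illustration. An entry that names neither option is reported as "sync", in line with the default mentioned above.

```python
import re

def export_write_modes(exports_text):
    """Map (path, client) -> 'sync' or 'async' for a simplified exports file.

    Assumption: each entry is 'path client(opt,opt,...) ...' on a single line.
    """
    modes = {}
    for raw in exports_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        path, *clients = line.split()
        for spec in clients:
            m = re.fullmatch(r"([^()]*)(?:\(([^()]*)\))?", spec)
            if not m:
                continue                       # skip anything we cannot parse
            client = m.group(1) or "*"        # bare '(opts)' means any host
            opts = (m.group(2) or "").split(",")
            # "sync" is treated as the default when "async" is not listed
            modes[(path, client)] = "async" if "async" in opts else "sync"
    return modes

sample = """
/srv/nfs 192.168.1.0/24(rw,sync,no_subtree_check) 10.0.0.5(rw,async)
# scratch space
/data *(ro)
"""
for (path, client), mode in export_write_modes(sample).items():
    print(path, client, mode)
```

On a real server you would pass in the contents of `/etc/exports`; the paths and addresses in `sample` are placeholders.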
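That is, the potential corruption issue with async runs roughly as follows: the server acknowledges a write before the data is on stable storage, the client treats the write as done and drops its cached copy, and the server then crashes before flushing.
Now, the serious part is what comes after: there is no way to know which data was lost or corrupted and which was not.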
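To make that failure mode concrete, here is a toy model in Python. It is not real NFS code: `ToyServer`, `run` and the single-block "client cache" are all invented for illustration, and a crash is reduced to losing whatever was still in the server's volatile cache.

```python
class ToyServer:
    """Toy stand-in for an NFS server; not the real protocol."""

    def __init__(self, async_mode):
        self.async_mode = async_mode
        self.page_cache = {}   # volatile: lost on crash
        self.disk = {}         # stable storage: survives a crash

    def flush(self):
        self.disk.update(self.page_cache)
        self.page_cache.clear()

    def write(self, key, data):
        self.page_cache[key] = data
        if not self.async_mode:
            self.flush()       # "sync": reach stable storage before replying
        return "OK"            # "async": replies OK with data only in cache

    def crash_and_reboot(self):
        self.page_cache.clear()


def run(async_mode):
    server = ToyServer(async_mode)
    client_dirty = {"block0": b"important"}
    for key, data in list(client_dirty.items()):
        if server.write(key, data) == "OK":
            del client_dirty[key]   # client trusts the reply, drops its copy
    server.crash_and_reboot()
    return server.disk, client_dirty


print("sync: ", run(async_mode=False))
print("async:", run(async_mode=True))
```

In the async run, neither the server's disk nor the client's cache still holds the block, and nothing on either side records that it ever existed.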
OTOH, if the client crashes, then any dirty data in the client cache that hasn't been flushed will be lost, but the application programmer can work around that: only after fsync() or close() returns success can the programmer assume the data is on stable storage.
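On the client side, that workaround might look like the sketch below. The helper name `write_durably` is made up, and the path in the usage line is a placeholder for a file on the NFS mount.

```python
import os

def write_durably(path, data):
    """Write data and return only once fsync() and close() have succeeded."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # move Python's userspace buffer to the kernel
        os.fsync(f.fileno())   # raises OSError if the flush fails
    # leaving the with-block closes the file; a close error also raises

# Hypothetical usage on an NFS mount:
# write_durably("/mnt/nfs/report.dat", b"payload")
```

Note that this only gives the guarantee described above when the server honours it: with an async export, the server may report success to fsync() before the data is actually on stable storage.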