Should I disable write caching on the Windows 2008 VM?

disk-cache windows-server-2008

I have a Windows Server 2008 x64 Standard virtual machine that runs on a machine with a hardware RAID controller, a Perc 6/i, which has a battery on-board.

Doing everything I can for additional performance, I think I should disable this. Is this very dangerous though?

My understanding is that Battery Backed Write Caching gives a performance boost to the host OS by telling it that writes are complete while they are still sitting in flash waiting to be written.

I can't see how it would be detrimental to performance, but is there a gain (even if marginal) from enabling or disabling it?

P.S. The machine has backup power.

Here is a screenshot for clarification:

screenshot

Best Answer

Whenever a Windows application writes data to disk, the data is written to host memory first.

A "normal" write request returns as soon as the data is in memory and a queue entry has been created indicating that the data needs to be flushed to persistent storage. A mechanism called the Lazy Writer processes this queue periodically (by default, it flushes 1/8th of the queue every second). This is the mechanism you disable by unchecking "Enable write caching on the disk": every write request then has to wait until the storage device acknowledges it as "written" before it returns.
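To get a feel for what "1/8th of the queue every second" means, here is a minimal sketch in Python. It is a toy model, not the actual cache manager algorithm: the function name, the page counts and the rounding rule are all assumptions made for illustration.

```python
import math

def simulate_lazy_writer(dirty_pages, seconds, fraction=1 / 8):
    """Return the number of dirty pages still queued after each one-second pass.

    Hypothetical model: each pass flushes `fraction` of the current queue,
    rounded up so that a short queue still drains completely.
    """
    remaining = dirty_pages
    history = []
    for _ in range(seconds):
        flushed = math.ceil(remaining * fraction)
        remaining -= flushed
        history.append(remaining)
    return history

# 800 dirty pages queued and no new writes arriving:
print(simulate_lazy_writer(800, 3))  # [700, 612, 535]
```

The point of the model is that data "written" several seconds ago can still be sitting in volatile memory when the power goes out.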

Applications with specific requirements for data integrity (databases, filesystem drivers) have options for a more intelligent approach to caching. For writes which need immediate persistence (NTFS journal, database transaction logs), FILE_FLAG_WRITE_THROUGH can be specified when the file is opened. In this case, a write call does not return until the data is actually committed to persistent storage, unless you tick the "Enable advanced performance" checkbox. That setting causes the cache manager to ignore FILE_FLAG_WRITE_THROUGH, return the call immediately, and hand the data to the lazy writer like any other "normal" write.
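As a rough illustration of the difference between a "normal" write and one that insists on persistence, here is a Python sketch. Note the assumption: Python's standard library does not expose FILE_FLAG_WRITE_THROUGH directly, so this uses os.fsync() instead, which asks the OS to flush the file after the write (on Windows it is implemented with _commit()). That is a related but different mechanism, and the function names are made up for the example.

```python
import os
import tempfile

def write_cached(path, data):
    """'Normal' write: may return while the data is still only in the OS cache."""
    with open(path, "wb") as f:
        f.write(data)

def write_durable(path, data):
    """Write, then ask the OS to push the data to the storage device.

    Stand-in for a write-through request; not FILE_FLAG_WRITE_THROUGH itself.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's own buffer into the OS cache
        os.fsync(f.fileno())  # ask the OS to flush its cache to the device

with tempfile.TemporaryDirectory() as tmp:
    # A transaction log record is the kind of data that needs persistence.
    write_durable(os.path.join(tmp, "transaction.log"), b"COMMIT 42\n")
    # Scratch data can safely stay in the cache for a while.
    write_cached(os.path.join(tmp, "scratch.dat"), b"not critical\n")
```

Whether the flush request is actually honored still depends on every layer below the application, which is exactly what the device policy checkboxes influence.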

Because you have two additional layers of caching [1] (number one is your host operating system running the KVM hypervisor, number two is your storage controller with a BBWC/FBWC), things get more complicated. Each of these layers offers similar choices, and because every write request has to pass through all of them, the weakest link in the chain determines your data's integrity.

Application developers, by and large, know and understand the effects of caching and write-through calls. Truly critical data is therefore written with FILE_FLAG_WRITE_THROUGH, while anything written without this flag can be considered safe to cache in volatile memory. The trouble starts when FILE_FLAG_WRITE_THROUGH is ignored at any layer and the data is actually lost during a power outage or a software failure. Such conditions usually result in corrupted filesystems and transaction logs, leading to unpredictable results and possibly requiring a restore from backup, so this obviously should be avoided. If your storage controller's cache is "battery-backed" or "flash-backed", it can be considered "non-volatile" to a certain degree, so it is generally considered safe to use its write-back cache even for write-through requests [2].
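The "weakest link" reasoning can be written down as a small Python model. This is only a sketch under simplifying assumptions: the class, its fields and the layer names are hypothetical, and real layers have more states than two booleans can capture.

```python
from dataclasses import dataclass

@dataclass
class CacheLayer:
    name: str
    honors_write_through: bool  # passes the request down before acknowledging?
    non_volatile: bool = False  # does cached data survive a power loss (BBWC/FBWC)?

def weakest_links(layers):
    """Names of layers that acknowledge write-through data held only in volatile cache."""
    return [l.name for l in layers
            if not (l.honors_write_through or l.non_volatile)]

def write_through_is_safe(layers):
    """Safe only if no layer in the chain is a weak link."""
    return not weakest_links(layers)

# The setup from the question, with default Windows settings in the guest.
chain = [
    CacheLayer("guest Windows cache manager", honors_write_through=True),
    CacheLayer("KVM host OS", honors_write_through=True),
    CacheLayer("RAID controller write-back cache",
               honors_write_through=False, non_volatile=True),
]
print(write_through_is_safe(chain))  # True

# Same chain, but with "Enable advanced performance" ticked in the guest.
chain[0] = CacheLayer("guest Windows cache manager", honors_write_through=False)
print(weakest_links(chain))  # ['guest Windows cache manager']
```

One layer ignoring the request is enough to make the whole chain unsafe, no matter how careful the other layers are.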

The bottom line: it is generally safe to "Enable write caching on the disk" unless you are dealing with broken applications that do not use FILE_FLAG_WRITE_THROUGH but need every write to be persistent. Disabling it would not hurt much in your case, as most calls should be handled by the storage controller's write cache and return almost immediately (although you would likely see some additional overhead, and the cache size would be limited by the controller's DRAM). You should never "Enable advanced performance" or "Turn off Windows write-cache buffer flushing on the device" on a system where you value uptime or require data integrity.

Further reading:

MSDN Library - Windows File Caching
Smallvoid blog - description of the hard disk cache


[1] Actually, there is yet another layer of caching in the hard disk itself, but in most cases the write-through request is honored no matter what the drive's cache settings are. Some flash drives are notable (read: broken) exceptions to this rule though. With flash SSDs, writes are usually cached and reported as written immediately while only being committed to volatile cache, not just for performance reasons but also to coalesce writes and prolong the lifetime of the flash cells. The "enterprise" versions of flash SSDs usually have capacitors that ensure the drive has enough power to flush the cache to the flash cells; the "consumer" versions often don't, so beware of those.

[2] Obviously, it is not safe under all circumstances: if the battery is defective and this goes undetected, if there is a bug in the controller's logic for handling power failures, if the power outage lasts longer than the battery can provide power, or if the supercaps or the flash cells of the FBWC go bust, data is going to be lost. But these occurrences are usually rare enough to justify taking the risk.