Is ReFS ready to host production VHDXs on Hyper-V 2012 R2 clusters?

Tags: cluster-shared-volumes, hyper-v-server-2012-r2, refs, windows-server-2012-r2

One of the new features that I didn't see listed in all the "Windows Server 2012 R2" posts is that Clustering now supports CSVs that are formatted with ReFS. So, naturally, I would like to change the CSVs where I store the VHDX files to ReFS. The catch is that those VHDX files hold database files for VMs running SQL Server 2012.

The thought is that I would then have RAID at the hardware level, protecting against sudden drive failure. Above that, the host OS (Hyper-V Server 2012 R2) would maintain the CSVs as ReFS volumes, which would protect the data on those drives against bitrot. Finally, the VHDXs are formatted as NTFS internally, which means the applications inside the VMs continue to use the filesystem they rely on.

So far, the best I can find is that this is technically supported, because Hyper-V reports that you must turn off the "data integrity" setting on the VHDX file (Set-FileIntegrity cmdlet) when you try to use it from the ReFS volume. But I can't find any more solid information than that. Is it really ready for prime time, or is it effectively just a tech preview for clustering?

Edit: 2014-01-22

I found that ReFS on its own only detects bitrot. To have ReFS both detect and automatically repair it, you must also use Storage Spaces to create a mirrored (RAID-1 style) space and put the ReFS volume on top of it. So it's looking like my solution is evolving into having the hardware RAID controller present its disks as JBOD, with Windows taking care of the RAID-1 part. I'll be testing whether this is a viable setup in Production over the next month or so.
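For reference, a rough sketch of that layout in PowerShell, assuming the controller exposes the disks as poolable JBOD and that only one Storage Spaces subsystem is visible on the host. The pool and space names are hypothetical, and the extra steps for making the pool clustered and adding the volume as a CSV are left out.

```powershell
# Hypothetical names; adjust to your environment.
$poolName  = 'VmPool'
$spaceName = 'VmMirror'

# Disks exposed as plain JBOD that are eligible for pooling.
$disks = Get-PhysicalDisk -CanPool $true

# Assumption: exactly one Storage Spaces subsystem matches on this host.
$subsystem = Get-StorageSubSystem -FriendlyName '*Spaces*'

New-StoragePool -FriendlyName $poolName `
    -StorageSubSystemFriendlyName $subsystem.FriendlyName `
    -PhysicalDisks $disks

# Mirror resiliency is what gives ReFS a second copy to repair from.
New-VirtualDisk -StoragePoolFriendlyName $poolName -FriendlyName $spaceName `
    -ResiliencySettingName Mirror -UseMaximumSize

# Initialize, partition, and format the new space as ReFS.
Get-VirtualDisk -FriendlyName $spaceName | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -AssignDriveLetter -UseMaximumSize |
    Format-Volume -FileSystem ReFS -Confirm:$false
```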

Best Answer

The answer is a very clear "No".

ReFS only detects bit rot in user data if the file in question has "Integrity Streams" enabled (Sources: official TechNet docs, everyone's favorite blog post, and another spot). Oh, and you also lose COW (Copy-On-Write) when Integrity Streams are disabled. Since you cannot use a VHDX residing on a ReFS volume unless Integrity Streams is disabled, you cannot protect a VHDX against bit rot. Game Over.
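If you want to see where a volume stands, the Get-FileIntegrity cmdlet reports the setting per file or folder. A minimal sketch, assuming a hypothetical CSV path; VHDXs that are currently in use may return an error instead of a result.

```powershell
# Hypothetical path; substitute a folder on your own ReFS volume.
$folder = 'C:\ClusterStorage\Volume1\VMs'

# Integrity-stream setting of the folder itself (new files inherit it).
Get-FileIntegrity -FileName $folder

# Integrity-stream setting of each VHDX under that folder.
Get-ChildItem -Path $folder -Filter *.vhdx -Recurse |
    ForEach-Object { Get-FileIntegrity -FileName $_.FullName } |
    Format-Table FileName, Enabled, Enforced -AutoSize
```

An `Enabled` value of `False` on a VHDX means ReFS is not checksumming that file's data.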

It's like the same person who thought a Clustered Storage Space Pool should require at least 3 disks was also the one that made the decision to make the best thing about ReFS something you could turn off, and then got the Hyper-V people to require it to be disabled. It's hard to imagine that amount of "dumb" spread out so far across core teams like that.

Ancillary

While doing some testing, I found the following that may be useful to people who still wish to move forward:

  • You can only SLM (Storage Live Migrate) an in-use VHDX to a ReFS-mirror volume if your destination is a folder where Integrity Streams have been disabled.
    • If you attempt an SLM onto a ReFS-mirror folder where Integrity Streams are enabled, you'll get an error containing: "The destination '...' is not valid because it is configured with the integrity stream attribute. Select a destination that does not have the integrity stream attribute to continue." You get the same error when attempting it via PowerShell.
  • Copying/Moving a file onto a ReFS-mirror will result in the file having its "integrity bit" set to match the setting from the destination folder.
  • You cannot get/set the integrity bit of a VHDX that is in use.
  • Otherwise, the performance of a ReFS-mirror volume appears to be good enough (my opinion, of course) for Production. My "differences" test is here if anyone cares.
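Putting the points above together, one way to prepare a destination and run the SLM from PowerShell might look like this. It is only a sketch: the VM name and destination path are hypothetical, and it assumes the destination folder sits on the ReFS-mirror volume.

```powershell
# Hypothetical names; adjust to your environment.
$vmName = 'SQLVM01'
$dest   = 'C:\ClusterStorage\Volume2\SQLVM01'

# Create the destination folder and disable Integrity Streams on it,
# so files placed in it pick up the "disabled" setting.
New-Item -ItemType Directory -Path $dest -Force | Out-Null
Set-FileIntegrity -FileName $dest -Enable $false

# Confirm the folder setting before migrating.
Get-FileIntegrity -FileName $dest

# Storage Live Migrate the VM's files into the prepared folder.
Move-VMStorage -VMName $vmName -DestinationStoragePath $dest
```

Setting the attribute on the folder first matters because the bit on the VHDX cannot be changed once the file is in use.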