ReFS / storage spaces drive being dropped under heavy load

hard drive, refs, storage-spaces, windows 10

I have a Windows 10 workstation used within my business for things like image processing (Photoshop) and software development (Eclipse). It's an i7-2600K based computer with a Gigabyte GA-B75M-D3H B75 motherboard and 16 GB RAM. The OS is on a Samsung 850 Pro SSD; there's another 850 Pro for data, a WD Black for data, plus two 4 TB HGST drives, each on a SATA 3 port, formatted ReFS, in a Storage Spaces mirror. The array has 1.63 TB used and 1.99 TB free.

Recently the ReFS drives in the storage spaces mirror have started dropping – so far three times in a month. This usually occurs under moderate to heavy load, after an extended period. None of the other disks drop under load as far as I can tell, so I assume it's ReFS, Storage Spaces, or a problem with an underlying disk. A reboot brings the disk online.

I can see errors in the event viewer such as those below. These are not all in one place, and while there are NTFS and Storage Spaces log areas under "application and services log -> microsoft -> windows" there doesn't seem to be one for ReFS.
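One way to gather these scattered entries into a single timeline is `Get-WinEvent`. The following is only a rough sketch: the drop time is a placeholder (the date is not known from the entries), the ten-minute window is arbitrary, and the log name `Microsoft-Windows-StorageSpaces-Driver/Operational` is assumed to match the Event Viewer path mentioned above.

```powershell
# Sketch: merge System and Storage Spaces events around a drop into one timeline.
# Assumptions: $dropTime is a placeholder to be replaced with the real drop time,
# and the Storage Spaces operational log has the name used below on this build.
$dropTime = Get-Date '2016-01-01 16:27:05'   # placeholder date, not from the logs
$window   = New-TimeSpan -Minutes 10         # arbitrary window on each side

$filter = @{
    LogName   = 'System', 'Microsoft-Windows-StorageSpaces-Driver/Operational'
    StartTime = $dropTime - $window
    EndTime   = $dropTime + $window
    Level     = 1, 2, 3                      # critical, error, warning
}

# Sort oldest first so the sequence of failures is easier to follow.
Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, ProviderName, Id, LevelDisplayName, Message -Wrap
```

Running it from an elevated PowerShell prompt avoids access errors on some logs.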

I'd appreciate help tracking down what's causing these problems, and resolving them, so my system stays up.

16:27.05 (under event viewer -> application and services log -> microsoft -> windows -> storagespaces-driver -> operational):
Virtual disk {26bf58b3-1cb9-4b93-a945-1b89331bb565} requires a data integrity scan.
Data on the disk is out-of-sync and a data integrity scan is required. To start the scan, run the following command:

Get-ScheduledTask -TaskName "Data Integrity Scan for Crash Recovery" | Start-ScheduledTask                  

Once you have resolved the condition listed above, you can online the disk by using the following commands in PowerShell:                  

Get-VirtualDisk | ?{ $_.ObjectId -Match "{26bf58b3-1cb9-4b93-a945-1b89331bb565}" } | Get-Disk | Set-Disk -IsReadOnly $false                  
Get-VirtualDisk | ?{ $_.ObjectId -Match "{26bf58b3-1cb9-4b93-a945-1b89331bb565}" } | Get-Disk | Set-Disk -IsOffline  $false

16:27.05 (windows system event log): The file system was unable to write metadata to the media backing volume R:. A write failed with status "A device which does not exist was specified." ReFS will take the volume offline. It may be mounted again automatically.
16:27.06 (windows system event log): The file system detected a checksum error and was not able to correct it. The name of the file or folder is "<unable to determine file name>".
18:35.50 (windows system event log): Failed to connect to the driver: (-2147024894) The system cannot find the file specified. 
18:35.50 (Kernel PNP) The driver \Driver\WudfRd failed to load for the device SWD\WPDBUSENUM\_??_USBSTOR#Disk&Ven_Generic&Prod_STORAGE_DEVICE&Rev_9451#7&2a9fd895&0#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}.

18:35.58: Virtual disk {26bf58b3-1cb9-4b93-a945-1b89331bb565} could not be repaired because there is not enough free space in the storage pool.                  
Replace any failed or disconnected physical disks. The virtual disk will then be repaired automatically or you can repair it by running this command in PowerShell:                  
Get-VirtualDisk | ?{ $_.ObjectId -Match "{26bf58b3-1cb9-4b93-a945-1b89331bb565}" } | Repair-VirtualDisk
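Since the messages above mention failed or disconnected physical disks, it may help to check what Storage Spaces itself reports for the pool and its members. This is a minimal sketch using the standard Storage module cmdlets; the choice of columns is just a suggestion, and nothing here is specific to this particular pool.

```powershell
# Sketch: show health as reported by Storage Spaces for the pool,
# the virtual disk, and each physical disk in the pool.
# Read-only queries; run in an elevated PowerShell session.

# Non-primordial pools are the ones created by the user.
Get-StoragePool -IsPrimordial $false |
    Format-Table FriendlyName, HealthStatus, OperationalStatus, Size, AllocatedSize

# The mirror itself (the virtual disk).
Get-VirtualDisk |
    Format-Table FriendlyName, ResiliencySettingName, HealthStatus, OperationalStatus

# The physical disks backing the pool.
Get-StoragePool -IsPrimordial $false | Get-PhysicalDisk |
    Format-Table FriendlyName, SerialNumber, MediaType, Usage, HealthStatus, OperationalStatus
```

A physical disk whose OperationalStatus is anything other than OK right after a drop would point toward the disk, cable, or port rather than ReFS.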

UPDATE
As yagmoth points out, this error includes something about USB. The scenarios where I recall this error happening are:
a) When backing up to an external USB disk
b) When running CrashPlan backups to another internal SATA disk

Best Answer

Storage Spaces seems very sensitive to write latency: if latency spikes too much, the volume can be dropped.
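To see whether latency really does spike under load, you could sample the per-disk write latency counter while a backup is running. This is a rough sketch, not a definitive test: the one-minute sampling window is arbitrary, the counter path assumes an English Windows installation (counter names are localized), and no particular threshold is implied.

```powershell
# Sketch: sample per-disk write latency once a second for a minute and
# report the worst value seen for each disk.
# Assumptions: English counter names; 60 samples is an arbitrary window.
$samples = Get-Counter -Counter '\PhysicalDisk(*)\Avg. Disk sec/Write' `
                       -SampleInterval 1 -MaxSamples 60

$samples.CounterSamples |
    Where-Object { $_.InstanceName -ne '_total' } |   # skip the aggregate instance
    Group-Object InstanceName |
    ForEach-Object {
        $max = ($_.Group | Measure-Object CookedValue -Maximum).Maximum
        [pscustomobject]@{
            Disk       = $_.Name
            MaxWriteMs = [math]::Round($max * 1000, 2)  # counter is in seconds
        }
    } |
    Sort-Object MaxWriteMs -Descending
```

Start it just before kicking off the workload that usually triggers the drop, then compare the mirror's disks against the others.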

This seems to be a known problem when using consumer SSDs, as you can find here