Cluster Fails When Enabling Storage Spaces Direct on Server 2016

failovercluster storage-spaces-direct windows-server-2016

I'm trying to set up a two-node hyper-converged failover cluster on Windows Server 2016 with Storage Spaces Direct (S2D).

I can validate the cluster with no errors (S2D tests included) and create it, but when I run Enable-S2D on the cluster, the cluster fails.

What I can see is that during S2D setup the Cluster Service starts restarting repeatedly.
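
For reference, the sequence described above looks roughly like the sketch below. The node names, cluster name, and witness share are hypothetical placeholders, and the enable step is written with the full cmdlet name, Enable-ClusterStorageSpacesDirect.

```powershell
# Hypothetical names: replace with your own nodes, cluster name, and witness share.
$nodes   = 'node1', 'node2'
$cluster = 'S2D-CLU'
$witness = '\\fileserver\witness'

# 1. Validate, including the Storage Spaces Direct tests.
Test-Cluster -Node $nodes -Include 'Storage Spaces Direct', 'Inventory', 'Network', 'System Configuration'

# 2. Create the cluster without adding any disks to it.
New-Cluster -Name $cluster -Node $nodes -NoStorage

# 3. Configure the file share witness.
Set-ClusterQuorum -Cluster $cluster -FileShareWitness $witness

# 4. Enable S2D. This is the step after which the Cluster Service starts crashing.
Enable-ClusterStorageSpacesDirect -CimSession $cluster
```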

On both nodes I get the following errors in the System event log:

7032
The Service Control Manager tried to take a corrective action (Restart the service) after the unexpected termination of the Cluster Service service, but this action failed with the following error:
The service cannot be started, either because it is disabled or because it has no enabled devices associated with it.

7031
The Cluster Service service terminated unexpectedly. It has done this "x" time(s). The following corrective action will be taken in 15000 milliseconds: Restart the service.

7024
The Cluster Service service terminated with the following service-specific error:
The cluster join operation was aborted.

There is also an Application event 1000:

Faulting application name: clussvc.exe, version: 10.0.14393.2273, time stamp: 0x5ae40d1f
Faulting module name: clussvc.exe, version: 10.0.14393.2273, time stamp: 0x5ae40d1f
Exception code: 0xc0000409
Fault offset: 0x00000000000332f1
Faulting process id: 0x16d0
Faulting application start time: 0x01d3ef1c502ea68c
Faulting application path: C:\WINDOWS\Cluster\clussvc.exe
Faulting module path: C:\WINDOWS\Cluster\clussvc.exe
Report Id: a1700bc1-bf18-464e-b35c-b759832e1382
Faulting package full name:
Faulting package-relative application ID:
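
For anyone who wants to collect the same data, the events above can be pulled with something like the following. This is only a sketch: the two-hour look-back window and the C:\Temp output folder are assumptions, and Get-ClusterLog needs the cluster to be reachable when it runs.

```powershell
# Assumptions: a 2-hour look-back window and C:\Temp as the output folder.
$since = (Get-Date).AddHours(-2)

# Service Control Manager events 7024, 7031 and 7032 from the System log.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'Service Control Manager'
    Id           = 7024, 7031, 7032
    StartTime    = $since
} | Sort-Object TimeCreated | Format-List TimeCreated, Id, Message

# clussvc.exe crash records (Application Error, event 1000).
Get-WinEvent -FilterHashtable @{
    LogName      = 'Application'
    ProviderName = 'Application Error'
    Id           = 1000
    StartTime    = $since
} | Where-Object Message -like '*clussvc.exe*' | Format-List TimeCreated, Message

# Generate cluster.log files covering the last 120 minutes.
New-Item -ItemType Directory -Path 'C:\Temp' -Force | Out-Null
Get-ClusterLog -Destination 'C:\Temp' -TimeSpan 120 -UseLocalTime
```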

I've destroyed and recreated the cluster several times, with no luck.

My drives are all clean and report "CanPool = True". I have a file share witness configured and validated.
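
The drive and witness state can be checked along these lines. It is a minimal sketch, meant to be run on each node; the column selection is just my choice, and the boot disk is expected to show CanPool = False.

```powershell
# Run on each node. Lists every physical disk with its pooling eligibility.
Get-PhysicalDisk |
    Sort-Object FriendlyName |
    Format-Table FriendlyName, SerialNumber, MediaType, BusType, Size, CanPool, CannotPoolReason -AutoSize

# Any disk that cannot be pooled, with the reason reported by the storage stack.
Get-PhysicalDisk | Where-Object { -not $_.CanPool } |
    Format-Table FriendlyName, CannotPoolReason -AutoSize

# Current quorum configuration (should show the file share witness resource).
Get-ClusterQuorum | Format-List Cluster, QuorumResource
```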

Best Answer

Don't run S2D with only two nodes; it's asking for trouble. Failover isn't reliable, and you can't afford to lose the second node (obviously) or a second disk/SSD after the first failure. In real life that means your cluster is very likely to collapse during patching, and you'll have a hard time getting your data back. In your particular case you have to re-create the S2D pool and create a virtual disk, and only once you have working shared storage should you start on the failover cluster part.
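
A rough sketch of how the pool and virtual disk state could be inspected, and a volume created once the pool is healthy. It assumes the pool carries the default "S2D on ..." name; the volume name and size are placeholders, not a recommendation.

```powershell
# Read-only checks: is there a leftover non-primordial pool or virtual disk?
Get-StoragePool -IsPrimordial $false |
    Format-Table FriendlyName, OperationalStatus, HealthStatus, IsReadOnly -AutoSize
Get-VirtualDisk |
    Format-Table FriendlyName, ResiliencySettingName, OperationalStatus, HealthStatus -AutoSize

# Once a healthy pool exists, create a volume (and its virtual disk) on it.
# 'S2D*' assumes the default pool name; 'Volume1' and 1TB are placeholders.
New-Volume -StoragePoolFriendlyName 'S2D*' -FriendlyName 'Volume1' -FileSystem CSVFS_ReFS -Size 1TB
```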