Windows Server 2012 failover cluster: replacing a shared disk

cluster-shared-volumes failovercluster replace windows-server-2012

We have a two-node cluster (Windows Server 2012) with the file server role. A resource group configured in the cluster contains multiple shared disks. One of the disks has a file system error, so we want to replace it with a new disk (a LUN clone of the existing disk). The cloned LUN is already presented to the server. The problem we run into is:

  1. We took the problematic disk offline in Failover Cluster Manager.
  2. When we remove that disk from the resource group, all the other healthy disks go offline and are removed from the cluster group.

We have checked the dependencies in the file server role properties, and the role has no dependency on the problematic LUN.

Thanks in advance.

Best Answer

I see a couple of potential issues here. Let me address them one by one:

"One of the disks has a file system error, so we want to replace it with a new disk (a LUN clone of the existing disk). The cloned LUN is already presented to the server."

Two remarks about this:

  1. You should present the new disk to all cluster nodes, not just one.
  2. It seems that you have cloned the disk at the hardware/block level. This introduces the risk that the filesystem problem was cloned onto the new disk along with the data. When you experience filesystem problems, I would strongly recommend taking a file-level backup and restoring it to the new disk rather than relying on block-level operations. I would also recommend running chkdsk on the new disk, but be aware that chkdsk can remove files in order to repair the filesystem.

"When we remove that disk from the resource group, all the other healthy disks go offline and are removed from the cluster group."

This could mean that the other disks have a dependency on the disk you took offline and removed. That would be unusual: a disk should only depend on another disk when you use mount points, in which case it depends on the disk that hosts the mount point. Disks mounted on drive letters should not have such a dependency. Check the dependency report for each of the disks that went offline.
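To illustrate why one disk can pull others down with it, here is a minimal Python sketch of a dependency check. It assumes you have copied the dependencies of each resource into a simple map, for example from the dependency report or from the output of the `Get-ClusterResourceDependency` cmdlet. The resource names and the layout below are entirely hypothetical and are not taken from your cluster; the function `affected_by` is made up for this example.

```python
from collections import deque

# Hypothetical dependency map: resource -> resources it depends on.
# In this made-up layout, "Cluster Disk 1" hosts the mount points
# for disks 2 and 3, while disk 4 is mounted on a drive letter.
DEPENDS_ON = {
    "Cluster Disk 1": [],
    "Cluster Disk 2": ["Cluster Disk 1"],
    "Cluster Disk 3": ["Cluster Disk 1"],
    "Cluster Disk 4": [],
    "File Server": ["Cluster Disk 4"],
}


def affected_by(resource, depends_on):
    """Return every resource that depends on `resource`,
    directly or through a chain of dependencies."""
    # Invert the map: resource -> resources that depend on it.
    dependents = {}
    for res, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(res)

    # Breadth-first walk over the inverted map.
    affected = set()
    queue = deque([resource])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return sorted(affected)


if __name__ == "__main__":
    # Lists the resources that would follow "Cluster Disk 1" offline.
    print(affected_by("Cluster Disk 1", DEPENDS_ON))
```

If the problematic disk shows up with dependents in a check like this, those dependencies would need to be moved to another disk before the problematic disk is removed. An empty result for the problematic disk would suggest that the cause lies elsewhere.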

If you have further questions, please update your post (not in comments) and I will update this answer.

HTH, Edwin.