Completely replacing (upgrading) a RAID 5 array of disks on an ESXi server

hard-drive raid5 vmware-esxi

I have a development server that runs several VMs on ESXi 5. It has a RAID 5 array in which all of the disks are currently the same size. I would like to greatly expand storage on this box, but I am not sure what the smartest way to go about it would be. My current plan is to:

  1. Turn off all VMs
  2. Copy the VM folders from the server to another location
  3. Verify that I can mount all the VMs from the new location (i.e. that the copy went OK)
  4. Replace all the disks with new, bigger ones
  5. Reinstall ESXi 5
  6. Copy the VMs back over

This seems like it might take a while to accomplish and is not terribly slick, especially since I will have to reconfigure ESXi 5. Is there a smarter alternative?

Best Answer

Your current plan is the route I'd take too, though if I may get on a soapbox for a moment:

  1. This very thing is why every reasonably modern server I've seen from a quality OEM has 6 or more drive bays: 2 for a mirrored (RAID 1) pair to put the OS and installed programs on, and 4 for a RAID 5/6/10 with your data. That way you don't have to reinstall everything when you need to add storage capacity.

  2. It's not 1999 anymore, don't use RAID 5 in production. Upgrade it to a RAID 6 or 10. (While you're at it anyway, right?)

Anyhow, another option is to force the RAID array to rebuild onto bigger disks: pull the disks one at a time, replace each with a higher-capacity disk, and wait for the rebuild to complete before moving on to the next, until all your disks are the higher-capacity ones. At that point, most newer, good RAID cards will let you expand the array to include the rest of the usable space on the drives.

It can be problematic for three reasons. Because it's RAID 5, there's a chance of hitting a URE (Unrecoverable Read Error) during a rebuild, and with one disk already pulled the array has no redundancy left to recover from it. Rebuilding from parity also takes a long time, so you'll probably be waiting about one work day per disk. And the RAID card in question may not support this at all, or may not support it well.
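To put a rough number on the URE risk, here is a back-of-the-envelope sketch in Python. It assumes a spec-sheet rate of one unrecoverable error per 10^14 bits read (a figure commonly quoted for consumer drives; check your own drives' datasheet) and treats bit errors as independent, which is a simplification. The function name and the example disk sizes are made up for illustration, and real drives often do better than the spec-sheet bound.

```python
import math

def p_ure_during_rebuild(surviving_disks, disk_tb, bits_per_ure=1e14):
    """Estimate the chance of at least one URE while rebuilding a RAID 5 array.

    surviving_disks: disks that must be read in full (array size minus one)
    disk_tb:         capacity of each disk in decimal terabytes
    bits_per_ure:    ASSUMED spec-sheet rate, one error per this many bits read
    """
    bits_read = surviving_disks * disk_tb * 1e12 * 8
    # 1 - (1 - 1/bits_per_ure) ** bits_read, computed in a numerically stable way
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_ure))

if __name__ == "__main__":
    # Hypothetical 4-disk RAID 5: each rebuild reads the 3 surviving disks.
    for size_tb in (1, 2, 4):
        p = p_ure_during_rebuild(surviving_disks=3, disk_tb=size_tb)
        print(f"{size_tb} TB disks: about {p:.0%} chance of a URE per rebuild")
```

Keep in mind that the one-disk-at-a-time approach repeats this rebuild once per disk in the array, so the per-rebuild risk is taken several times over.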

In my experience, it's usually easier, faster, and much less hassle just to do what you've proposed.

The other advantage of replacing the whole array at once is that if things go bad, you can always put the old disks back in and return things to the old state very quickly. If you rebuild the array one disk at a time, that's usually not an option, because you'll end up with inconsistent data on the original disks and won't have a working array if you pop them all back in. Don't underestimate the value of "playing it safe" in this profession.
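In the same "playing it safe" spirit, the copy in step 2 of the proposed plan can be checked with checksums before any disks are pulled. The following is a minimal sketch, assuming both the original VM folders and the copy are reachable as ordinary filesystem paths from a machine with Python installed, and that the VMs are powered off so the files are not changing. The function names and example paths are hypothetical. It complements, rather than replaces, actually registering and booting the VMs from the copy.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large .vmdk files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_trees(src, dst):
    """Return a list of (relative_path, problem) for files that did not copy cleanly."""
    src, dst = Path(src), Path(dst)
    problems = []
    for src_file in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        if not dst_file.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif sha256_of(src_file) != sha256_of(dst_file):
            problems.append((rel.as_posix(), "checksum mismatch"))
    return problems

if __name__ == "__main__":
    # Hypothetical mount points for the original datastore and the backup copy.
    for rel, problem in compare_trees("/mnt/datastore1", "/mnt/backup/datastore1"):
        print(f"{problem}: {rel}")
```

An empty result means every file under the source exists in the copy with identical contents; extra files that exist only in the copy are not reported.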