Disadvantages of more RAID 5 arrays in a RAID 50 array

raid raid5 storage

We have a 24-disk SAN, currently configured as RAID 50: a RAID 0 stripe across two RAID 5 arrays of 11 disks each. The two remaining disks are allocated as hot spares, one for each RAID 5 array.

Initial RAID 50 setup with two RAID 5 arrays

I'd like to move this setup to a RAID 50 with three RAID 5 arrays inside the RAID 0 stripe. This increases the number of disks that can fail before the array is lost (one per RAID 5 array, so up to three instead of two) and decreases the chance that two failed disks land in the same RAID 5 array. It may also have performance benefits.

Desired RAID 50 setup with three RAID 5 arrays

What are the disadvantages of moving to a greater number of RAID 5 arrays in a RAID 50 setup? Obviously, you sacrifice usable capacity for increased resiliency, but are there any other disadvantages in going from a RAID 50 with two larger RAID 5 arrays to a RAID 50 with three (or more) smaller RAID 5 arrays?

Best Answer

Moving this RAID 50 array to smaller R5 subsets would bring a few advantages. In most implementations, R5 doesn't scale well past five or so drives: returns diminish sharply, and an array of, say, nine drives can sometimes perform worse than one of five. Eleven is pushing it, though these numbers depend heavily on the controller you're using.

With each R5 running more efficiently and predictably, the R0 can combine their performance easily, and you should see a measurable performance difference between the two layouts. And as you mentioned, the array will be able to survive more drive failures, provided they fall in different R5 sets.
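To put rough numbers on the failure side of this, here is a minimal sketch. It assumes the proposed layout is three 7-disk R5 sets plus three hot spares (the question doesn't state the exact split), and it treats a second failure as equally likely on any remaining member disk, ignoring hot-spare rebuilds. The function name `raid50_summary` is made up for illustration.

```python
from fractions import Fraction

def raid50_summary(sets: int, disks_per_set: int) -> dict:
    """Summarise a RAID 50 layout built from equal-sized RAID 5 sets."""
    members = sets * disks_per_set
    return {
        # Each RAID 5 set gives up one disk's worth of capacity to parity.
        "data_disks": sets * (disks_per_set - 1),
        # Best case only: exactly one failure in every set.
        "max_failures": sets,
        # After one failure, the chance that a second, uniformly random
        # member failure hits the same (already degraded) set.
        "p_second_same_set": Fraction(disks_per_set - 1, members - 1),
    }

current = raid50_summary(sets=2, disks_per_set=11)   # 2 x 11 + 2 hot spares
proposed = raid50_summary(sets=3, disks_per_set=7)   # assumed: 3 x 7 + 3 hot spares

for name, s in (("current", current), ("proposed", proposed)):
    print(name, s["data_disks"], s["max_failures"], float(s["p_second_same_set"]))
```

Under those assumptions, the chance that a second failure lands in the already-degraded set drops from 10/21 (about 48%) to 3/10 (30%), at the cost of two data disks' worth of capacity.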

However, I would generally advise avoiding RAID 5 arrays. R5 has several serious flaws that affect whole-array integrity (see the RAID 5 write hole). The most important stems from the design itself: an array with a single missing member is degraded, and it takes only one further error in that degraded array to damage data or bork the entire array. With many large disks in an array, a single data error, whether it spans an entire drive or only a limited area, is statistically quite likely. Hot spares mitigate this by shortening the time spent degraded, but you're still leaving a rather large window for bad things to happen during rebuilds or power failures. Additionally, R5 performs quite poorly when degraded, and is sub-optimal for random-write-heavy workloads even when not degraded.
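The "quite likely statistically" part can be illustrated with the usual back-of-the-envelope estimate for hitting an unrecoverable read error (URE) while rebuilding a degraded R5 set. Everything numeric here is an assumption, not something from the question: a 2 TB disk size, a spec-sheet style URE rate of one error per 1e14 bits read, and independent bit errors (a simplification that tends to be pessimistic). The function name is hypothetical.

```python
import math

def p_rebuild_read_error(surviving_disks: int, disk_tb: float,
                         ure_per_bit: float = 1e-14) -> float:
    """Estimate the chance of at least one URE while reading every
    surviving disk in full during a RAID 5 rebuild.

    Assumes independent bit errors at a constant rate (a rough model).
    """
    bits_read = surviving_disks * disk_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed in a numerically stable way.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# Hypothetical 2 TB disks; a rebuild reads all surviving members of the set.
p_11_disk_set = p_rebuild_read_error(surviving_disks=10, disk_tb=2.0)
p_7_disk_set = p_rebuild_read_error(surviving_disks=6, disk_tb=2.0)

print(f"11-disk R5 set: {p_11_disk_set:.2f}")
print(f" 7-disk R5 set: {p_7_disk_set:.2f}")
```

The absolute figures depend entirely on the assumed disk size and error rate, so treat them as illustrative only. The useful takeaway is the trend: smaller R5 sets read less data per rebuild, which lowers the exposure but doesn't remove it.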

RAID 10 would be a lot more stable and would perform much faster, but has obvious expense concerns. RAID 60 could be a great middle ground depending on your workload, and wouldn't seriously impact your storage costs. Using two R6 sets would allow up to four drive failures (two per set), so a single failure is no longer a potential imminent catastrophe, and R6 will perform much better when scaled to that number of drives.
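For a side-by-side view of the capacity trade-off, here is a small sketch comparing the options on this 24-disk shelf. The exact splits are assumptions chosen to keep at least two hot spares (22 member disks, or 21 for the three-set R5 layout), and the `Layout` class is made up for illustration. "Guaranteed" means the number of failures survived no matter which disks fail; "best case" assumes failures are spread evenly across sub-arrays.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Layout:
    name: str
    groups: int           # sub-arrays striped together by RAID 0
    disks_per_group: int
    redundancy: int       # failures each sub-array survives (R5: 1, R6: 2, 2-way mirror: 1)

    @property
    def data_disks(self) -> int:
        return self.groups * (self.disks_per_group - self.redundancy)

    @property
    def guaranteed_failures(self) -> int:
        # Worst case: every failure lands in the same sub-array.
        return self.redundancy

    @property
    def best_case_failures(self) -> int:
        return self.groups * self.redundancy

layouts = [
    Layout("RAID 50, 2 x 11", groups=2, disks_per_group=11, redundancy=1),
    Layout("RAID 50, 3 x 7", groups=3, disks_per_group=7, redundancy=1),
    Layout("RAID 60, 2 x 11", groups=2, disks_per_group=11, redundancy=2),
    Layout("RAID 10, 11 x 2", groups=11, disks_per_group=2, redundancy=1),
]

for l in layouts:
    print(f"{l.name:<16} data={l.data_disks:>2} "
          f"guaranteed={l.guaranteed_failures} best_case={l.best_case_failures}")
```

With these splits, RAID 60 over two 11-disk sets gives the same 18 data disks as the three-set RAID 50, but survives any two failures rather than any one. RAID 10 drops to 11 data disks.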