Maximum Number of Disks in a RAID Set – Guidelines

raid

In any of the RAID levels that use striping, increasing the number of physical disks usually increases performance, but it also increases the chance that any one disk in the set will fail. I have this idea that I shouldn't use more than about 6-8 disks in a given RAID set, but that's more passed-down knowledge than hard fact from experience. Can anyone give me good rules, with the reasons behind them, for the maximum number of disks in a set?

Best Answer

The recommended maximum number of disks in a RAID system varies a lot. It depends on a variety of things:

  • Disk technology: SATA tolerates smaller arrays than SAS/FC does, but this is changing.
  • RAID controller limits: The RAID controller itself may have fundamental maximums. If it is SCSI-based and each visible disk is a LUN, the 7/14-devices-per-channel rule holds true. If it is FibreChannel-based, it can have 120 or more visible disks.
  • RAID controller processor: If you go with any kind of parity RAID, the CPU on the RAID card will be the limiter on how fast you can write data. There will be a fundamental maximum for the card. You'll see it when a drive fails in a RAID5/6 LUN, as the performance drop will affect all LUNs associated with that RAID card.
  • Bus bandwidth: U320 SCSI has its own limits, as does FibreChannel. For SCSI, keeping RAID members on different channels can enhance parallelism and improve performance, if the controller supports it.
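To see why parity RAID loads the controller CPU, here is a minimal sketch of the RAID5 parity math in Python (illustrative only; the block sizes and helper names are made up). Parity is just the XOR of the data blocks in a stripe, so every full-stripe write, and every rebuild of a lost disk, costs an XOR pass over all member blocks:

```python
from functools import reduce

def parity(blocks):
    """XOR all data blocks in a stripe to produce the parity block."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

def recover(surviving_blocks, parity_block):
    """Rebuild a lost block by XORing the survivors with the parity block."""
    return parity(surviving_blocks + [parity_block])

stripe = [b"\x0f" * 4, b"\xf0" * 4, b"\xaa" * 4]  # three toy data blocks
p = parity(stripe)
# Losing any one block is recoverable from the rest plus parity:
assert recover(stripe[:2], p) == stripe[2]
```

The more spindles in the set, the more blocks each XOR pass touches, which is why the card's processor becomes the bottleneck during writes and rebuilds.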

For SATA-based RAID, you don't want more than about 6.5TB of raw disk if you're using RAID5. Go past that and RAID6 is a much better idea. The reason is the non-recoverable read error rate: if the array is too large, the chance of hitting a non-recoverable read error during the rebuild after a disk loss gets higher and higher, and if that happens the rebuild fails. RAID6 greatly reduces this exposure. However, SATA drives have been improving in quality lately, so this may not hold true for much longer.
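The back-of-the-envelope math behind that 6.5TB figure can be sketched like this, assuming the commonly quoted consumer-SATA unrecoverable read error (URE) rate of one per 10^14 bits read (an assumption; enterprise drives are typically rated an order of magnitude better):

```python
URE_RATE = 1e-14  # assumed unrecoverable errors per bit read (consumer SATA)

def rebuild_failure_probability(raw_tb):
    """P(at least one URE) when reading raw_tb terabytes during a rebuild."""
    bits_read = raw_tb * 1e12 * 8
    return 1 - (1 - URE_RATE) ** bits_read

# Reading ~6 TB of surviving data during a RAID5 rebuild:
print(f"{rebuild_failure_probability(6.0):.0%}")  # -> 38%
```

Roughly a one-in-three chance that a RAID5 rebuild of that size trips over a URE and fails, which is why RAID6 (which can absorb the error) becomes attractive at these capacities.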

The number of spindles in an array doesn't really worry me overmuch, as it's pretty simple to get to 6.5TB with 500GB drives on U320. If doing that, it would be a good idea to put half of the drives on one channel and half on the other, just to reduce I/O contention on the bus side. SATA-2 speeds are such that even just two disks transferring at max rate can saturate a bus/channel.
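The bus arithmetic behind that last point can be sketched as follows. The figures are nominal: U320 SCSI peaks at about 320 MB/s per channel, and "max rate" here is taken as the SATA-2 interface ceiling of roughly 300 MB/s (a burst figure; sustained platter rates are much lower):

```python
import math

U320_CHANNEL_MBPS = 320   # nominal U320 SCSI per-channel peak
SATA2_IFACE_MBPS = 300    # SATA-2 interface ceiling (burst, not platter rate)

def disks_to_saturate(channel_mbps, per_disk_mbps):
    """Smallest number of streaming disks whose aggregate fills the channel."""
    return math.ceil(channel_mbps / per_disk_mbps)

print(disks_to_saturate(U320_CHANNEL_MBPS, SATA2_IFACE_MBPS))  # -> 2
```

Splitting the members across two channels doubles the available bus bandwidth, which is why the half-and-half layout above helps.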

SAS disks have a lower failure rate (higher MTBF) than SATA (again, this is beginning to change), so the rules are less firm there.

There are FC arrays that use SATA drives internally. The RAID controllers there are very sophisticated, which muddies the rules of thumb. For instance, the HP EVA line of arrays groups disks into 'disk groups' on which LUNs are laid out. The controllers purposefully place blocks for the LUNs in non-sequential locations, and perform load-leveling on the blocks behind the scenes to minimize hot-spotting. Which is a long way of saying that they do a lot of the heavy lifting for you with regards to multiple channel I/O, spindles involved in a LUN, and dealing with redundancy.

Summing up: disk failure rates don't drive the rules for how many spindles go into a RAID group; performance does. For the most part.