RAID – Is RAID5 More Robust Than RAID1?

hardware-raid raid raid1 raid5 software-raid

I am about to replace an old hardware RAID5 array with a Linux software RAID1 array. I was talking to a friend and he claimed that RAID5 was more robust than RAID1.

His claim was that RAID5 reads the parity data on every read to make sure that all the drives are returning the correct data. He further claimed that with RAID1, errors occurring on a drive go unnoticed because no such checking is done.

I can see how this could be true, but can also see that it all depends on how the RAID systems in question are implemented. Surely a RAID5 system doesn't have to read and check the parity data on a read and a RAID1 system could just as easily read from all drives on read to check they were all holding the same data and therefore achieve the same level of robustness (with a corresponding loss of performance).

So the question is: what do RAID5/RAID1 systems in the real world actually do? Do RAID5 systems check the parity data on reads? Are there RAID1 systems that read from all drives and compare the data on read?

Best Answer

RAID-5 is a fault-tolerance solution, not a data-integrity solution.

Remember that RAID stands for Redundant Array of Inexpensive Disks. Disks are the atomic unit of redundancy -- RAID doesn't really care about data. You buy solutions that employ filesystems like WAFL or ZFS to address data redundancy and integrity.

The RAID controller (hardware or software) does not normally verify the parity of blocks at read time; parity is consulted only when a drive has failed or reports a read error. This is a major risk of running RAID-5 -- if you encounter a partial media failure on a drive (a block that has gone bad without being marked "bad"), the array returns the damaged data as if it were good, and your data has been silently corrupted.
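To make the distinction concrete, here is a minimal Python sketch of single-parity (XOR) striping. It is a toy model, not how any particular controller or the Linux md driver is written; the three-disk stripe, the block contents, and the function names (`xor_blocks`, `normal_read`, `parity_matches`) are all made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical 3-disk stripe: two data blocks plus one parity block.
data = [b"AAAA", b"BBBB"]
parity = xor_blocks(data)

# Fault tolerance: if disk 0 dies, its block is rebuilt from the survivors.
rebuilt = xor_blocks([data[1], parity])

# Silent corruption: disk 1 returns wrong bytes without reporting an error.
corrupted = [data[0], b"BBxB"]

def normal_read(stripe):
    # Models a typical read path: data blocks are returned, parity is never read.
    return b"".join(stripe)

def parity_matches(stripe, parity_block):
    # This comparison happens only if something explicitly asks for it.
    return xor_blocks(stripe) == parity_block
```

In this model `normal_read(corrupted)` happily returns the damaged bytes, while `parity_matches(corrupted, parity)` is false. Even when the mismatch is noticed, single parity only says that the stripe is inconsistent; it cannot say which block is the wrong one.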

Sun's RAID-Z/ZFS actually provides end-to-end data integrity: every block is checksummed and the checksum is verified on each read. I suspect other filesystems and RAID systems will provide this feature in the future as the number of cores available on CPUs continues to increase.
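The idea behind checksum-on-read can be sketched in a few lines of Python. This is only an illustration of the general technique, not ZFS's actual code or on-disk format: SHA-256 is a stand-in for whatever checksum the filesystem uses, and `checksum` and `read_verified` are hypothetical names.

```python
import hashlib

def checksum(block: bytes) -> bytes:
    # Stand-in checksum; assumed to be stored separately from the block itself.
    return hashlib.sha256(block).digest()

def read_verified(copies, expected):
    """Return the first mirrored copy that matches `expected`, then repair the rest."""
    for copy in copies:
        if checksum(copy) == expected:
            good = copy
            break
    else:
        raise IOError("all copies failed checksum verification")
    # Self-heal: overwrite every copy with the verified data.
    for i in range(len(copies)):
        copies[i] = good
    return good

# Hypothetical two-way mirror where the first disk silently returns bad data.
expected = checksum(b"hello")
mirror = [b"hellx", b"hello"]
result = read_verified(mirror, expected)
```

Because the checksum identifies which copy is correct, the read returns good data and the bad copy can be rewritten. A plain RAID-1 mirror that compared its two copies could only tell that they differ, not which one to trust.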

If you're using RAID-5, you're being cheap, in my opinion. RAID-1 performs better, offers greater protection, and doesn't impact production when a drive fails -- for a marginal cost difference.