Linux software-RAID4/RAID5 and CPU usage

raid5 software-raid

With RAID4 or RAID5, a parity bit is stored for each stripe of data bits. For example, if I write 0 to drive A and 1 to drive B, then parity bit 1 is stored on drive C. Isn't this a huge load on the CPU with Linux software RAID, if a parity bit needs to be calculated for every bit of data? For example, if I write a 1GB file to a RAID5 array, do 8000000000 XOR calculations need to be performed by the CPU?

Best Answer

As TomTom has said, it's not as brutal as it used to be; but then disc drives have got bigger while CPUs were getting faster.

That is why it's not a good idea to do RAID-5 in software unless you really don't care about performance. RAID-5 in hardware at least ensures there's a reserved processor whose sole job is to do those parity calculations. The hardware will also often have things like NVRAM to prevent array corruption, and the ability to optimise the calculations, e.g. by knowing that a whole stripe is being written and skipping the (hugely expensive) read-modify-write cycle (read the old data and old parity, then write the new data and new parity) in favour of a simple parity recalculation.
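To make the two write paths concrete, here is a minimal Python sketch of RAID-5 style parity. It is only an illustration, not how any real controller or the Linux md driver is written, and the function names (`xor_blocks`, `full_stripe_parity`, `rmw_parity`) are made up for this example. It shows that parity is XORed over whole blocks rather than bit by bit, and that a partial update needs the old data and old parity while a full-stripe write needs neither.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    acc = 0
    for b in blocks:
        # Treat each block as one big integer so the XOR covers
        # the whole block in a single operation.
        acc ^= int.from_bytes(b, "little")
    return acc.to_bytes(length, "little")


def full_stripe_parity(data_blocks: list[bytes]) -> bytes:
    """Whole stripe is being written: parity comes from the new data alone."""
    return xor_blocks(*data_blocks)


def rmw_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Partial write: old data and old parity must be read from disk first."""
    return xor_blocks(old_parity, old_data, new_data)


# Hypothetical two-data-drive stripe (drives A and B, parity on C).
a = bytes([0x00, 0x12])
b = bytes([0x01, 0x34])
parity = full_stripe_parity([a, b])

# Overwrite only B: the read-modify-write result matches a full recalculation.
new_b = bytes([0xFF, 0x00])
assert rmw_parity(parity, b, new_b) == full_stripe_parity([a, new_b])

# If drive B is lost, its contents are rebuilt from A and the parity.
assert xor_blocks(parity, a) == b
```

The XOR itself is cheap in both cases. What makes the partial write expensive is the extra disk reads needed before the new parity can be computed.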

Even with hardware RAID acceleration, applications that modify very small chunks of data at a time - particularly databases - can perform very badly indeed on RAID-5 (and RAID-6, which is even more expensive in terms of parity calculation). For that kind of application, just put your hand in your pocket, get the extra discs, and do RAID-1+0.
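A rough back-of-the-envelope sketch of why small random writes hurt, using the usual idealised write-penalty figures (2 disk I/Os per small write for RAID-1+0, 4 for RAID-5, 6 for RAID-6). The disk count and per-disk IOPS below are made-up illustrative numbers, and the model ignores caches, full-stripe writes and controller optimisations, so treat it as an estimate only.

```python
# Disk I/Os needed for one small (sub-stripe) write, idealised.
SMALL_WRITE_IOS = {
    "raid10": 2,  # write both mirror copies
    "raid5": 4,   # read old data + old parity, write new data + new parity
    "raid6": 6,   # same as RAID-5, but with two parity blocks
}


def effective_write_iops(level: str, disks: int, iops_per_disk: float) -> float:
    """Estimate small random-write IOPS for an array, ignoring caching."""
    return disks * iops_per_disk / SMALL_WRITE_IOS[level]


# Hypothetical array: 8 disks at 150 random IOPS each.
for level in ("raid10", "raid5", "raid6"):
    print(level, effective_write_iops(level, disks=8, iops_per_disk=150))
```

Under these assumptions the same eight discs deliver roughly half the small-write throughput on RAID-5 that they would on RAID-1+0, and a third on RAID-6.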
