Linux – RAID 10 – how many disks should I span for read performance?

linux, performance, raid10

We manage a mid-sized array of images (roughly 4 KB to 1 MB each) on DAS hardware that is NFS-exported to various web servers. We have traditionally used RAID 5 for the safety net and the storage gains, rather than losing capacity for the sake of better speeds with RAID 10. We have reached a point where our 16 TB arrays are suffering: read times are slowing down and becoming noticeable to the end user.

I am considering purchasing a new array to test with various RAID 10 configurations, but before I can get approval for that, I will need to spec the approximate capacity for the new system, which leads me to my ultimate question:

How is disk performance affected by spanning N drives? I know that write and read performance differ, so for the sake of this conversation we will limit it to reads.

E.g., would 4 drives in a RAID 10 outperform 6 drives in a RAID 10, etc.?

Is there a metric for establishing this, or known constraints to consider? I know there is no silver bullet that works everywhere, but I am hoping for some best-practice thoughts.

Thanks for helping me to wrap my head around this one!

Best Answer

If you mean strictly RAID 10 with stripe units consisting of two-drive mirrors, then the usable capacity is always half the drives' combined raw capacity. The worst-case read performance is twice that of a single drive. The best-case read performance scales with the number of drives.
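As a rough illustration of how that scales (a back-of-the-envelope sketch, not a benchmark; the 4 TB drive size, the 150 MB/s per-drive read rate, and the two-drive mirror layout are all assumptions):

```python
# Back-of-the-envelope estimate for a RAID 10 array built from two-drive
# mirror pairs. All figures are illustrative assumptions, not measurements.

def raid10_estimate(n_drives, drive_capacity_tb, drive_read_mbps):
    """Return usable capacity plus worst/best-case aggregate read throughput."""
    if n_drives < 4 or n_drives % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives (>= 4)")
    usable_tb = n_drives * drive_capacity_tb / 2   # mirroring halves usable space
    worst_read = 2 * drive_read_mbps               # one stream confined to a single mirror pair
    best_read = n_drives * drive_read_mbps         # enough parallel reads to hit every spindle
    return usable_tb, worst_read, best_read

for drives in (4, 6, 8):
    usable, worst, best = raid10_estimate(drives, drive_capacity_tb=4, drive_read_mbps=150)
    print(f"{drives} drives: {usable:.0f} TB usable, reads roughly {worst}-{best} MB/s")
```

So going from 4 to 6 drives raises the best case (many concurrent readers, which an NFS-exported image store usually has) but not the worst case of a single sequential stream.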

With both RAID 10 and RAID 5, you can read from all the drives, though RAID 10 handles small reads better. The biggest difference is write performance, particularly for small writes.
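To make the small-write difference concrete, here is the classic write-penalty arithmetic (again a sketch; the 150 IOPS per 7.2k SATA drive is an assumed figure): RAID 10 writes each block twice, once per mirror, while RAID 5 has to read the old data and old parity and write both back, i.e. four back-end I/Os per small write.

```python
# Classic back-end write-penalty comparison for small random writes.
# RAID 10 penalty = 2 (one write per mirror copy);
# RAID 5 penalty = 4 (read old data, read old parity, write data, write parity).
# 150 IOPS per drive is an assumed figure for a 7.2k SATA disk.

def random_write_iops(n_drives, per_drive_iops, write_penalty):
    return n_drives * per_drive_iops / write_penalty

drives, per_drive = 6, 150
print("RAID 10:", random_write_iops(drives, per_drive, write_penalty=2), "small-write IOPS")
print("RAID 5: ", random_write_iops(drives, per_drive, write_penalty=4), "small-write IOPS")
```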

Note that in many cases, large RAID 5 arrays are no longer adequately safe. Even with diligent replacement, the time to rebuild a failed drive may be so long that the risk of a second drive failing in that window, forcing a restore from backup and the loss of recent data, is unacceptable.
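To put that rebuild window in numbers (a rough sketch; the drive size and sustained rebuild rates are assumptions, and real rebuilds are usually slower because the array keeps serving production I/O):

```python
# Rough estimate of how long the array stays degraded after a drive failure.
# Rebuild rates are assumptions; production load typically slows this down further.

def rebuild_hours(drive_capacity_tb, rebuild_rate_mbps):
    capacity_mb = drive_capacity_tb * 1_000_000   # decimal TB -> MB
    return capacity_mb / rebuild_rate_mbps / 3600

for rate in (50, 100, 200):                       # sustained rebuild rate in MB/s
    print(f"4 TB drive at {rate} MB/s: ~{rebuild_hours(4, rate):.0f} h degraded")
```

A degraded window measured in many hours or days is where the second-failure risk comes from; a RAID 10 rebuild only has to copy the surviving half of one mirror rather than read every remaining drive in the set.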