One large RAID 10 vs several smaller arrays

performance, raid

My question is: when is it appropriate to simply create one large array with fast read & write performance, such as RAID 10, instead of creating smaller arrays for specific applications?

It seems to me that if my overall I/O requirements are not terribly heavy, a single array with excellent read and write performance could give every application better performance overall, except at those (perhaps rare) times when applications with different access patterns peak at the same time (for example, copying large amounts of large files while a database is getting slammed).

If I dedicate a pair of spindles to a particular task, such as transaction logs, and they're not even breaking a sweat with the workload…why not just put that workload onto a larger RAID 10? Those spindles would then be able to contribute to other workloads instead of sitting around scratching themselves 60% of the time.

PS: in my particular case the cost overhead of RAID 10 isn't a factor, because I'm looking at creating multiple RAID 1 arrays and one smallish RAID 5. Going RAID 10 for the amount of space I need would cost about the same.

Best Answer

Knowing how to set up your storage is all about measuring and budgeting the IOPS and bandwidth. (I'm being simplistic here, because the percentage mix of reads and writes, the average IO size, the RAID stripe size, and the cache hit percentages all matter greatly. If you can get those numbers you can make your calculations even more accurate.)
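As a rough illustration of how the read/write mix feeds into an IOPS budget, here is a minimal sketch using the conventional RAID write penalties (2 backend IOs per host write for RAID 1/10, 4 for RAID 5, 6 for RAID 6). The function name, the 150 IOPS per disk figure, and the 70% read mix are assumptions chosen for the example, not measurements, and the model deliberately ignores cache hits, IO size, and stripe size.

```python
# Conventional write penalties: backend disk IOs generated per host write.
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4, "raid6": 6}


def usable_random_iops(disks, iops_per_disk, read_fraction, raid_level):
    """Estimate host-visible random IOPS for an array.

    Simplified model: every read costs one backend IO and every write
    costs WRITE_PENALTY[raid_level] backend IOs. Cache is ignored.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    penalty = WRITE_PENALTY[raid_level]
    raw_iops = disks * iops_per_disk          # total backend IOPS available
    write_fraction = 1.0 - read_fraction
    # Average backend IOs consumed per host IO for this read/write mix.
    backend_per_host_io = read_fraction + penalty * write_fraction
    return raw_iops / backend_per_host_io


if __name__ == "__main__":
    # Hypothetical numbers: 8 disks at an assumed 150 random IOPS each, 70% reads.
    for level in ("raid10", "raid5"):
        estimate = usable_random_iops(8, 150, 0.70, level)
        print(f"{level}: about {estimate:.0f} host IOPS")
```

With the same disks and mix, the RAID 5 estimate comes out noticeably lower than the RAID 10 one, which is why the write percentage matters so much when comparing layouts.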

There's a really nice little IO calculator here that I frequently use when planning out storage. wmarow's storage directory is also nice for getting some fairly contemporary disk performance numbers.

"If I dedicate a pair of spindles to a particular task, such as transaction logs, and they're not even breaking a sweat with the workload...why not just put that workload onto a larger RAID 10?"

Remember that putting sequential IO onto a spindle with random IO makes that sequential IO random. Your transaction log disks may look like they're not breaking a sweat because you're seeing sequential IO operations. Sequential reads and writes to a RAID-1 volume will be quite fast, so if you're basing "not breaking a sweat" on disk queue length, for example, you're not getting the whole story.

Measure or calculate the maximum possible random IOPS for the intended destination volume, take a baseline of the current workload on that volume, and then decide whether the remaining random IOPS leave enough headroom to absorb the transaction log IOPS. Also, be sure to budget the space necessary for the workload (obviously). If you're so inclined, build in a percentage of additional "headroom" in your IO workload / space allocation.

Continue with this methodology for all of the other workloads that you want to put into the destination RAID-10 volume. If you run out of random IOPS, then you're piling too much into the volume: add more disks or put some of the workload on dedicated volumes. If you run out of space, add more disks.
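The budgeting loop described above can be sketched roughly as follows. Everything here is illustrative: the `Workload` class, the `plan_consolidation` function, the 20% headroom, and all of the numbers are hypothetical, and each workload's IOPS is counted as random IO, since sequential IO stops being sequential once it shares spindles with other workloads.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    random_iops: float   # measured peak IOPS, treated as random once shared
    space_gb: float


def plan_consolidation(max_iops, capacity_gb, baseline_iops, baseline_gb,
                       workloads, headroom=0.20):
    """Place workloads on the shared volume until IOPS or space runs out.

    headroom is the fraction of IOPS and space deliberately left unused.
    Returns (placed names, rejected names, remaining IOPS, remaining GB).
    """
    iops_budget = max_iops * (1 - headroom) - baseline_iops
    space_budget = capacity_gb * (1 - headroom) - baseline_gb
    placed, rejected = [], []
    for w in workloads:
        if w.random_iops <= iops_budget and w.space_gb <= space_budget:
            iops_budget -= w.random_iops
            space_budget -= w.space_gb
            placed.append(w.name)
        else:
            # Candidate for a dedicated volume, or a reason to add disks.
            rejected.append(w.name)
    return placed, rejected, iops_budget, space_budget


if __name__ == "__main__":
    # Hypothetical figures for a destination RAID 10 volume.
    candidates = [
        Workload("transaction logs", 200, 50),
        Workload("file share", 150, 800),
        Workload("reporting db", 300, 200),
    ]
    placed, rejected, iops_left, gb_left = plan_consolidation(
        max_iops=900, capacity_gb=2000, baseline_iops=300, baseline_gb=500,
        workloads=candidates)
    print("placed:", placed)
    print("rejected:", rejected)
    print(f"left over: {iops_left:.0f} IOPS, {gb_left:.0f} GB")
```

Workloads are considered in the order given, so list the ones you most want consolidated first; anything rejected either stays on a dedicated volume or is the cue to add disks.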