Split RAID array at controller, or at LVM

Tags: hardware-raid, raid

I have a CentOS box with ten 2TB drives and an LSI RAID controller, used as an NFS server.

I know I'm going to mirror the drives (RAID 1 pairs), which gives me 10TB of usable space. But in terms of performance, reliability, and management, which is better: creating a single 10TB array on the controller, or creating five 2TB arrays and using LVM to regroup them into one (or more) VGs?

I'm particularly interested in hearing why you would pick one approach or the other.

Thanks!

Best Answer

If the controller will let you provision a single 10-disk RAID 10 (rather than a smaller unit with disks left over), that would probably be the best bet. It's simple to manage, you get good write performance from the battery-backed cache, and the RAID card does all the heavy lifting, monitoring, and management. Just install the RAID card's management agent in the OS so you can reconfigure and monitor status from within the OS, and you should be set.
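Since this is an LSI controller, the agent is typically MegaRAID Storage Manager plus the MegaCli (or newer storcli) command-line tool. As a rough sketch of day-to-day monitoring from the OS (binary names and install paths vary by controller generation and package, so treat these as illustrative):

    # Logical drive (array) health and state
    /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL
    # Physical drive status (media errors, predictive failure counts)
    /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL
    # Battery-backed cache (BBU) status -- write-back caching depends on this
    /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL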

Putting everything in the care of the RAID card makes the quality of the card's firmware the most important factor. I have had RAID cards crash, causing the whole I/O subsystem to "go away" and requiring a server reboot. I've even seen a card completely lose its array configuration, requiring either careful reconfiguration from the console or a full restore from backups. The chances that you, with your one server, would hit any particular problem are low, but if you ran hundreds or thousands of servers you would probably see these kinds of problems periodically. Maybe newer hardware is better; I haven't had these kinds of problems in a while.

On the other hand, it is possible, even probable, that the I/O scheduling in Linux is better than what's on the RAID card, so presenting each disk individually (or as five RAID 1 units) and using LVM to stripe across them might give the best read performance. Battery-backed write cache is critical for good write performance, though, so I wouldn't suggest any configuration that lacks it. Even if you can present the disks as a JBOD with battery-backed write cache still enabled, there is additional management overhead and complexity in using Linux software RAID and smartd for drive monitoring. It's easy enough to set up, but you need to work through the procedure for handling drive failures, including the boot drive. It's not as simple as popping out the disk with the yellow blinking light and swapping in a replacement. Extra complexity creates room for error.
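For the five-RAID-1-plus-LVM layout, the striping itself is straightforward. A minimal sketch, assuming the controller exposes the five mirrored pairs as /dev/sdb through /dev/sdf (hypothetical device names; confirm with lsblk or fdisk -l) and using vg_nfs/lv_export as names chosen purely for illustration:

    # Initialize the five RAID 1 LUNs as LVM physical volumes
    pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
    # Group them into one volume group
    vgcreate vg_nfs /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
    # Create a logical volume striped across all five PVs (-i 5)
    # with a 256 KiB stripe size (-I 256), using all available space
    lvcreate -i 5 -I 256 -l 100%FREE -n lv_export vg_nfs
    mkfs.ext4 /dev/vg_nfs/lv_export
    mount /dev/vg_nfs/lv_export /export

The 256 KiB stripe size is only a starting point; it's worth tuning against your actual NFS workload.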

So I recommend a 10-disk RAID 10 if your controller can do it, or five RAID 1s with LVM striping if it can't. If you test your hardware and find that JBOD plus Linux software RAID works better, then use that, but be sure to test random write performance across a large portion of the disk using something like sysbench rather than just sequential reads with dd.
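As a rough example of that kind of test (sysbench's fileio syntax differs between the old 0.4/0.5 releases and 1.0+; the flags below are the older form), run it against a file set much larger than the controller's cache so you're measuring the disks rather than the cache:

    # Create the test file set on the mounted array
    sysbench --test=fileio --file-total-size=200G prepare
    # Random writes for 5 minutes; compare configurations on this number
    sysbench --test=fileio --file-total-size=200G --file-test-mode=rndwr \
        --max-time=300 --max-requests=0 run
    # Remove the test files afterwards
    sysbench --test=fileio --file-total-size=200G cleanup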