Dell PERC H700 and 8 disk slots: what are the options to increase disk fault tolerance?

dell dell-perc fault-tolerance raid

We've got a Dell PowerEdge T510 which has 8 disk slots in the front. It is currently utilising 6 of these slots, and the disks are controlled by a PERC H700 controller. The disks are currently running in the following configuration:

  • 2x500GB in RAID 1 (OS partition)
  • 4x2000GB in RAID 5 (file storage partition)

Considering that this server is our domain controller which hosts the Active Directory, I'd hate to see the server go down due to disk failures. Thus, I'd like to upgrade the server with disk fault tolerance in mind (and not storage capacity or speed). However, I'm not quite sure what my options are.

I was thinking of simply adding 2 more 500GB disks and converting the OS partition from RAID 1 to RAID 6, as this partition is the most critical. We do have backups of the important data on the RAID 5 partition, and it can be recreated within about 24 hours, but the OS partition would take longer, as it would require a complete restore of Windows Server, including Active Directory, from backups.

I also acknowledge that RAID 5 has a habit of "acting up" during rebuilds, and I would of course love to convert that array to RAID 6 as well. However, the chassis only has 8 slots.
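
To make the slot constraint concrete, here is a rough tally of the layouts mentioned above. It is only a sketch: the "both on RAID 6" layout is a hypothetical one that assumes a fifth 2000GB disk would be added so that usable capacity stays the same, and the function and variable names are made up for illustration.

```python
CHASSIS_SLOTS = 8  # front bays on the T510, per the question

# Drive failures each level survives without data loss
# (RAID 1 here means a plain two-disk mirror).
FAILURES_TOLERATED = {1: 1, 5: 1, 6: 2}


def usable_gb(level, disks, size_gb):
    """Usable capacity of one array, for the RAID levels discussed here."""
    if level == 1:
        return size_gb                # mirror: one disk's worth
    if level == 5:
        return (disks - 1) * size_gb  # one disk's worth of parity
    if level == 6:
        return (disks - 2) * size_gb  # two disks' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


def slots_used(layout):
    """Total drive bays consumed by a list of (level, disks, size_gb) arrays."""
    return sum(disks for _level, disks, _size_gb in layout)


# Each layout is a list of (RAID level, number of disks, disk size in GB).
LAYOUTS = {
    "current": [(1, 2, 500), (5, 4, 2000)],
    "OS on RAID 6": [(6, 4, 500), (5, 4, 2000)],
    # Hypothetical: a 5-disk RAID 6 keeps the 6000GB of the 4-disk RAID 5.
    "both on RAID 6": [(6, 4, 500), (6, 5, 2000)],
}

if __name__ == "__main__":
    for name, layout in LAYOUTS.items():
        used = slots_used(layout)
        verdict = "fits" if used <= CHASSIS_SLOTS else "does not fit"
        print(f"{name}: {used} of {CHASSIS_SLOTS} slots ({verdict})")
        for level, disks, size_gb in layout:
            print(f"  RAID {level}, {disks} x {size_gb}GB: "
                  f"{usable_gb(level, disks, size_gb)}GB usable, "
                  f"survives {FAILURES_TOLERATED[level]} failure(s)")
```

Under these assumptions, moving only the OS array to RAID 6 fills all 8 slots, and moving both arrays to RAID 6 at the same data capacity would need a ninth slot.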

What are my options to secure this system with regards to disk fault tolerance?

Thanks!

UPDATE:

  • Even though not stated initially, we do of course keep backups of the OS partition (Active Directory). I know that RAID is not a backup system.
  • A second DC is not an option due to budget and licensing restrictions on SBS 2011.

The question was aimed towards my disk fault tolerance options. In other words, I'm looking for a good RAID setup.

Best Answer

To answer your question directly: the best way to improve your system's tolerance to hard drive failure is to use the two free slots for hot spares. With such a low disk count, the chance of two simultaneous drive failures is low enough that you can rely on backups for that case. A hot spare, however, lets the array start rebuilding as soon as a drive fails, so the system recovers from a single drive failure much more quickly (and safely). You'll need two spares, though: one of each size.
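
As a back-of-the-envelope illustration of why the shorter exposure window matters, the sketch below compares the chance of a second drive failing while the RAID 5 array is degraded, with and without a hot spare. All the numbers are assumptions chosen for illustration (a 3% annual failure rate, a 12-hour rebuild, a 72-hour wait for a replacement disk), and the model assumes independent failures at a constant rate. It ignores unrecoverable read errors and correlated failures, so treat the result as a relative comparison only.

```python
HOURS_PER_YEAR = 8766  # 365.25 days

# Assumed values, not measurements from this server.
ASSUMED_AFR = 0.03            # annual failure rate per disk
REBUILD_HOURS = 12            # time to rebuild onto a new disk
REPLACEMENT_WAIT_HOURS = 72   # time until someone swaps in a disk


def p_second_failure(remaining_disks, exposure_hours, afr=ASSUMED_AFR):
    """Probability that at least one of the remaining disks fails
    during the exposure window, assuming independent failures
    at a constant rate."""
    # Chance that a single disk survives the whole window.
    survive_one = (1.0 - afr) ** (exposure_hours / HOURS_PER_YEAR)
    # Chance that every remaining disk survives, then the complement.
    return 1.0 - survive_one ** remaining_disks


if __name__ == "__main__":
    remaining = 3  # 4-disk RAID 5 with one disk already failed

    # With a hot spare the rebuild starts immediately.
    with_spare = p_second_failure(remaining, REBUILD_HOURS)
    # Without one, the array stays degraded until a disk is swapped in.
    without_spare = p_second_failure(
        remaining, REPLACEMENT_WAIT_HOURS + REBUILD_HOURS
    )

    print(f"with hot spare:    {with_spare:.5%}")
    print(f"without hot spare: {without_spare:.5%}")
```

Both probabilities are small under these assumptions, which matches the point above, but the window without a spare is several times longer and the risk grows with it.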

I have to add, however, that it sounds like you're heading in the wrong direction here. First, you need to understand that RAID is not a backup system. Many things can go wrong with a RAID array and cause total data loss: operator error, controller failure, viruses, or fire or water damage to the server itself. You need a real backup system.

Second, the best way to increase redundancy for AD is not to improve the disk subsystem or the backups but to add a second domain controller, if possible at a different physical location. It will improve uptime much more than anything you can do to a single-server setup.