RAID hard drive preventive replacement

raid

From experience, I have learned that every hard drive will fail; it's just a matter of time.

I learned my lesson the hard way, and now I keep backups.

When I buy new drives, I usually group my drive list by warranty period. Hard drive manufacturers are there to make money, and most of the time they presumably design their drives to last at least the warranty period. After that period, I expect the failure rate to be higher. I have already had 2 of the 3 drives in a RAID 5 array fail at almost the same time (the second drive failed while the array was rebuilding, and yes, I had a recent backup).
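To illustrate what I mean by grouping drives by warranty period, here is a minimal Python sketch. The drive labels, purchase dates, and warranty lengths are made up for the example, and the function names are hypothetical; it only shows the bookkeeping, not any real monitoring tool.

```python
from datetime import date

# Hypothetical inventory: (label, purchase date, warranty length in years).
DRIVES = [
    ("disk0", date(2019, 3, 1), 3),
    ("disk1", date(2021, 6, 15), 5),
    ("disk2", date(2018, 11, 20), 2),
]


def warranty_end(purchased, years):
    """Return the date the warranty expires (Feb 29 falls back to Feb 28)."""
    try:
        return purchased.replace(year=purchased.year + years)
    except ValueError:
        # Purchased on Feb 29 and the target year is not a leap year.
        return purchased.replace(year=purchased.year + years, day=28)


def split_by_warranty(drives, today):
    """Split drive labels into (in_warranty, out_of_warranty) lists."""
    in_warranty, out_of_warranty = [], []
    for label, purchased, years in drives:
        if today <= warranty_end(purchased, years):
            in_warranty.append(label)
        else:
            out_of_warranty.append(label)
    return in_warranty, out_of_warranty


covered, expired = split_by_warranty(DRIVES, date(2023, 1, 1))
print("in warranty:", covered)
print("out of warranty:", expired)
```

The drives in the second list are the ones I would consider candidates for preventive replacement.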

My question is:
What is the best practice for preventive replacement of hard drives in a RAID array once the warranty has expired?

Do you care about it? How many drives in the array do you replace?


Notes on responses
When creating a new array: use drives from different manufacturers / batches.
When you already have an old array: add a new spare.

Best Answer

It depends on whether you're talking about server-class gear or desktop-class gear.

If it's a desktop machine built with your own money and off-the-shelf drives, and you're not worried about compatibility, then yes, your strategy is sound. Every X years, go out and buy all-new drives to replace your current drives. They're going to be faster, quieter, and larger. You could replace the drives one at a time, letting the array rebuild itself after each swap, and then, once all the rebuilds are complete, reconfigure your array to be larger. (Not all RAID adapters support operations like this - online rebuilds and size changes.)
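As a rough illustration of why the extra space only shows up after the last drive is swapped: RAID 5 usable capacity is (number of drives - 1) times the smallest member. The sketch below assumes a three-drive RAID 5 array like the one in the question, with made-up 2 TB and 4 TB drive sizes, and a hypothetical helper name. It computes the capacity the array could offer after reconfiguration, not what any particular adapter will expose automatically.

```python
def raid5_usable_tb(drive_sizes_tb):
    """Usable RAID 5 capacity: (n - 1) times the smallest member drive."""
    if len(drive_sizes_tb) < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (len(drive_sizes_tb) - 1) * min(drive_sizes_tb)


# Hypothetical starting point: three 2 TB drives, replaced one by one with 4 TB.
array = [2, 2, 2]
new_size_tb = 4

print(f"before: {array} -> {raid5_usable_tb(array)} TB usable")
for slot in range(len(array)):
    array[slot] = new_size_tb  # swap one drive, then let the array rebuild
    print(f"after replacing drive {slot}: {array} -> {raid5_usable_tb(array)} TB usable")
```

With these numbers the usable capacity stays at 4 TB through the first two swaps and only reaches 8 TB once the third drive is replaced, because the smallest remaining drive limits the whole array.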

If it's a server-class machine like an HP Proliant or IBM System X, it gets more complicated. You may need to use hard drives on the compatibility list for your RAID adapter. In that case, the drives are going to be expensive, either because they're probably no longer produced or because server-class parts are just plain expensive to begin with. Even worse, you might be buying refurbished gear from your reseller without knowing it - this isn't uncommon with server resellers.

Plus, you may be discarding drives with plenty of life left in them and replacing them with drives that are destined for trouble. Rather than proactively replacing those, it makes more sense to build the server with a hot spare to begin with, and make sure your RAID array supports automatic rebuilds using a hot spare. Then the rebuild will happen before you even get out of bed to make it into the datacenter, and you can replace the dead drive at your leisure without spending money or time on preventive swaps.