RAID1, one disk displaying SMART errors

hardware-raid raid1

I have an HP Pavilion machine, and I'm using the RAID controller built into the motherboard (H-RS880-uATX, some consumer-grade RAID controller; I'm not sure who makes it).

I got a slew of event log errors from the AMD RAID API the other night. I ran a SMART test from the BIOS on both drives and the second drive failed on reads.

I think the right answer here is to break the RAID array, run off a single drive for a while, buy a new drive to replace the one that is failing SMART testing, and rebuild the mirror.

However, I'm not sure, so I'm asking. It would be nice to salvage the second drive if possible. Could I run a checkdisk on it and continue using it as part of the RAID array? What about as an extra non-RAID data disk?


(H-RS880-uATX specs: http://h10025.www1.hp.com/ewfrf/wc/document?docname=c01925486&lc=en&dlc=en&cc=us&product=4195937#N33)

(Event log error: Task 20 disk error on port 4 target 1 at LBA 0x053e065f3 (Length 0xf3) with status 51; Error register: 40)
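
For reference, the "status 51" and "Error register: 40" values in that log line can be read as bit fields. The sketch below assumes the values are hexadecimal and that the driver is reporting the raw ATA status and error registers; the log itself does not confirm either point. The bit names follow the standard ATA register layout, and `decode` is a hypothetical helper, not part of any vendor tool.

```python
# Standard ATA status register bits (assumption: the log reports this register raw).
STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data (obsolete)
    0x02: "IDX",   # index (obsolete)
    0x01: "ERR",   # error; details are in the error register
}

# Standard ATA error register bits.
ERROR_BITS = {
    0x80: "ICRC",   # interface CRC error
    0x40: "UNC",    # uncorrectable data error
    0x20: "MC",     # media changed
    0x10: "IDNF",   # requested sector ID not found
    0x08: "MCR",    # media change request
    0x04: "ABRT",   # command aborted
    0x02: "TK0NF",  # track 0 not found
    0x01: "AMNF",   # address mark not found
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in table.items() if value & bit]


# Values from the event log line, interpreted as hexadecimal (assumption).
print("status:", decode(0x51, STATUS_BITS))  # ['DRDY', 'DSC', 'ERR']
print("error: ", decode(0x40, ERROR_BITS))   # ['UNC']
```

Under those assumptions, the drive reported an uncorrectable read error (UNC) at the listed LBA, which is consistent with the SMART test failing on reads.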

Best Answer

Actually, your answer is not completely correct. In all likelihood you do not want to break the RAID.

First, if the system is under warranty, call HP. If the warranty has only recently expired, I would still call and see if they will cut you a break. Make sure to tell them that you are actually using the hardware RAID.

1/ Get a replacement drive. If the system is under warranty, HP should send one. If not, go buy one or order one online. The replacement should be the same model if at all possible. If not, it needs to be the same size or bigger.

2/ If you haven't already, take a backup. At the very least, get a Dropbox or SugarSync account and get a copy of your important stuff off the machine.

3/ If you don't have one, create a Recovery Disk. The specific procedure depends on your OS.

4/ If you haven't already, figure out specifically which drive has failed. The error codes might make it clear, or the RAID array management utility might tell you.

5/ I presume it is not a hot-swappable RAID controller. So, turn off the system, swap the failed drive for the new drive, and start the system. Go into whichever tool you used to create the RAID set, and confirm that it sees the new drive and is rebuilding.

I would not reuse the bad drive in any production system. You could run SpinRite on it and possibly resolve the issue and keep it around as a cold spare, but I wouldn't.