RAID 1 disk failure recovery on Dell PowerEdge 2850 – how to repair

drive-failure raid

I've just spotted an amber disk error message on my 2850: E0D76 BP drive 4 fail. The drives are Ultra320 SCSI. It's been a while since this server was set up, so I cannot be absolutely sure my memory is accurate, but I think it was:

Drive 0 73GB
Drive 1 73GB paired as RAID 1

Drive 2 146GB
Drive 3 146GB paired as RAID 1

Drive 4 146GB as hot spare

(I had a dodgy 146GB drive that was giving me a flashing amber light as a predicted failure, but I thought it was better than nothing to leave it as the hot spare in slot 4.)

I think I had the config as:

ID   RAID Ch-0
0    ONLIN A00-00
1    ONLIN A00-01
2    ONLIN A01-00
3    ONLIN A01-01
4    HOTSP

So on checking the config I now see:

[two screenshots of the current RAID configuration]

Seeing drive 4 as failed, I removed it, reseated it and rebooted, but it still showed as failed. So I rebooted without it in, which gave a POST warning but changed the LED from amber to blue.

My question is: can someone with a clue help me figure out what has happened, and how I can recover from it?

[EDIT] What's the best way to monitor for hardware RAID failure? It's a PERC 4e/Di controller and the OS is Windows Web Server 2008 R2. Can the state of the RAID array be monitored from within Windows? Is there an error written to the event log that I can hook a warning event on to?

Best Answer

Logical drive 0 (RAID 1) has a failed hard drive or has not been rebuilt. Drive 4 appears to be the mirror of drive 1. Be very careful here and make sure you have a backup of all your data before proceeding. I'd consider placing drive 0 in slot 4 and seeing if it rebuilds. However, I can't verify from the screenshots which physical drives belong to which logical drives or what sizes they are. At this point, be very sure of what you're doing.

EDIT: Looking at the screenshots again, it appears that LD0 is using slots 1 and 4 and LD1 is using slots 2 and 3. Confirm the hard drive sizes in the slots and proceed accordingly. (Have a backup!)
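
To make the "confirm the sizes" step concrete, here is a rough Python sketch that only does the bookkeeping on paper; it does not talk to the controller. The slot sizes are the ones remembered in the question and the logical drive membership is my reading of the screenshots, so both are assumptions to check against the real hardware. The function names are made up for illustration. The one rule it relies on is that a RAID 1 replacement must be at least as large as the smallest member of the mirror.

```python
# Slot sizes in GB as remembered in the question (assumption, unverified).
SLOT_SIZE_GB = {0: 73, 1: 73, 2: 146, 3: 146, 4: 146}

# Logical drive membership as read from the screenshots (assumption).
LOGICAL_DRIVES = {"LD0": [1, 4], "LD1": [2, 3]}


def summarize(slot_size_gb, logical_drives, failed_slots):
    """Describe each RAID 1 logical drive: member sizes, usable size, health."""
    report = {}
    for name, slots in logical_drives.items():
        sizes = [slot_size_gb[s] for s in slots]
        surviving = [s for s in slots if s not in failed_slots]
        report[name] = {
            "member_sizes_gb": sizes,
            # A mirror can only use as much space as its smallest member.
            "usable_gb": min(sizes),
            "sizes_match": len(set(sizes)) == 1,
            "degraded": len(surviving) < len(slots),
            "surviving_slots": surviving,
        }
    return report


def replacement_candidates(slot_size_gb, logical_drives, ld_name):
    """List slots that belong to no logical drive and are big enough for ld_name."""
    in_use = {s for slots in logical_drives.values() for s in slots}
    needed_gb = min(slot_size_gb[s] for s in logical_drives[ld_name])
    return sorted(
        slot
        for slot, size in slot_size_gb.items()
        if slot not in in_use and size >= needed_gb
    )


if __name__ == "__main__":
    for name, info in summarize(SLOT_SIZE_GB, LOGICAL_DRIVES, {4}).items():
        print(name, info)
    print("Candidates for LD0:",
          replacement_candidates(SLOT_SIZE_GB, LOGICAL_DRIVES, "LD0"))
```

If the remembered sizes are right, this flags LD0 as a degraded mirror made of mismatched drives (73GB and 146GB) and leaves the drive in slot 0 as the only unassigned drive large enough to rebuild onto, which is why I suggested trying drive 0. If the real sizes differ from what you remember, the conclusion changes, so check them first.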
