2 drives “failed” on a 3 drive RAID 5

data-loss raid

But I don't believe it.

The machine is a Dell PowerEdge 2600 server running the 32-bit trial of Windows Server 2008 (yeah, it's not supposed to… but it works! [well, it used to]).

For the sake of confusion: the drives are numbered 0, 1 and 2.

I was coding away as usual when I noticed the Dell logo on the front of the case was orange. So I opened the case door and saw that the HD vents were completely covered with dust (I know it's not related to the orange light… but I hate dust). Since the drives are hot-swappable, I yanked drive 2 out, cleaned off the dust, and put it back in. I then yanked drive 1 out, cleaned the dust off that one, and put it back in. Someone asked me to help set up a printer on their machine, so I got up, and 20 minutes later I came back to see "No boot device available – strike F1 to retry boot, F2 for setup utility" displayed on the server's monitor. I looked down at the drives, and drives 1 and 2 had orange lights instead of the green ones!

Since then here is what I have tried:

  • Installed the drives into a Dell PowerEdge 2500. The drives were detected fine. Got a message stating "Missing operating system".
  • Reset the BIOS on the original PowerEdge 2600 (pulled the BIOS battery out). All drives appear fine. I get the "Missing operating system" message when booting. All drive lights are green.
  • Booted Ubuntu from a CD to inspect the drives. 2 of the drives are displayed in Computer. Since the data is striped, the files/folders on the drives are gibberish.
  • Booted Ubuntu, opened Terminal, and ran sudo fdisk -l, which listed the 3 drives. For the 3rd drive listed it reports "Disk identifier: 0x00000000" and "Disk /dev/sdb doesn't contain a valid partition table".
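
For reference, here is a rough Python sketch of the two MBR fields behind the fdisk message above: the 32-bit disk identifier at byte offset 440 and the 0x55 0xAA boot signature in the last two bytes of the first 512-byte sector. The sectors below are synthetic and the function names are made up for illustration; nothing here was read from the actual drives. Also note that a single member of a striped array may not carry a valid partition table on its own, so this check alone probably says little about whether the disk is healthy.

```python
import struct

SECTOR_SIZE = 512  # classic MBR sector size


def has_mbr_signature(sector: bytes) -> bool:
    """True if the sector ends with the MBR boot signature 0x55 0xAA."""
    return len(sector) >= SECTOR_SIZE and sector[510:512] == b"\x55\xaa"


def disk_identifier(sector: bytes) -> int:
    """Read the 32-bit little-endian disk identifier at offset 440."""
    return struct.unpack_from("<I", sector, 440)[0]


# Synthetic sector resembling what fdisk complained about: all zeros.
blank_sector = bytes(SECTOR_SIZE)

# Synthetic sector with a made-up identifier and a valid boot signature.
buf = bytearray(SECTOR_SIZE)
struct.pack_into("<I", buf, 440, 0x1234ABCD)  # hypothetical identifier
buf[510:512] = b"\x55\xaa"
valid_sector = bytes(buf)

for name, sector in (("blank", blank_sector), ("valid", valid_sector)):
    print(name, hex(disk_identifier(sector)), has_mbr_signature(sector))
```

To try it against a real disk you would read the first 512 bytes of the device read-only (for example from /dev/sdb as root) and pass them to the same two functions.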

Do you think the drives ARE actually toast?
Could it be SCSI or other hardware failure?
Could it be incorrect System Settings?
Is there any way to create a virtual RAID in Ubuntu on the 2 drives that are "valid" so I can copy the data to a network share?
Should I try reinstalling the Windows Server OS (eek!)?
Do you have any suggestions that I can try?


UPDATE

After doing lots of googling I came across Raid Reconstructor. I first tried the program on my Dell PowerEdge 2600, booted from a Windows XP CD, but it did not work (no drives detected). I then installed two of the drives into the PowerEdge 2500 alongside the 2500's existing single-drive RAID 0 running Windows Server 2003. I then installed and activated Raid Reconstructor and used it to create a virtual image of the RAID-5 array, opened the image with Captain Nemo, and backed up my C:/Websites directory to another computer… with ALL files intact (so far)!!!
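
Why two drives out of three can be enough: in RAID 5 each stripe holds data blocks plus one parity block, and the parity is the XOR of the data blocks, so any single missing block in a stripe can be recomputed from the others. Below is a minimal Python sketch of that idea for one stripe. The block contents and names are made up, and this is not a description of how Raid Reconstructor works internally; a real tool also has to work out drive order, stripe size, and parity rotation.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe of a hypothetical 3-drive RAID 5: two data blocks plus parity.
data_a = b"WEBSITES"
data_b = b"backup!!"
parity = xor_blocks(data_a, data_b)

# Pretend the drive holding data_b is missing: rebuild it from the other two.
rebuilt_b = xor_blocks(data_a, parity)

# The same works if the drive holding data_a is the missing one.
rebuilt_a = xor_blocks(data_b, parity)

print(rebuilt_b == data_b, rebuilt_a == data_a)
```

With only one of the three blocks available there is nothing to XOR against, which is why a RAID 5 array cannot survive two members being genuinely lost at the same time.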

Hopefully I will be able to restore the drives 100%.

Lessons learned:

  • I don't care if the server can "hot-swap" drives. DON'T FREAKING DO IT!
  • Back up your data, dummy!

Thanks for all your help, answers, and comments (and for being wrong about the data loss. haha)!

Best Answer

Biggest 'Doh!' of the week I reckon - sorry dude.

The drives themselves won't be physically broken; you've simply killed the array by removing a second disk before the first one had rebuilt. I'm >90% sure your array is toast. Basically, you shouldn't have removed them at all while live; if you absolutely had to, you should have waited for the array to rebuild before pulling the second disk.

It's reinstall/restore time I'm afraid - your data is gone.