X345 and ServeRaid 5i – Defunct Drives

data-recovery ibm

I have an old IBM X345 server with a ServeRaid 5i card that I was running as a fileserver/test-server.

It was running VMware ESXi 3.5, with a virtualised OpenSolaris machine inside it (as well as other machines), using ZFS as the filesystem.

Anyhow, recently, after a reboot, ServeRaid complained on startup that two logical arrays are offline. I have six SCSI drives in the machine – four 147 GB ones, and two 72 GB ones. From memory, the first four drives are in one array/logical disk, and the fifth drive is in one of its own. The sixth drive isn't used.

Yes, this was running in RAID 0…shame on me, I know.

I booted up the ServeRaid Support CD, and the first and fifth drives were marked as defunct.

The first drive (147 GB) was listed as "defunct drive, I/O subsystem error", and the fifth drive (72 GB) as "defunct drive, physical drive not found". When I right-clicked, I did see an option to mark the drives as online again (I'm not exactly sure how this would work with "physical drive not found"?), however, there was a warning about data loss if I proceeded. I'm assuming this wasn't the option I wanted to recover the disks.

What exactly does the I/O subsystem error mean? If I mark the drive online, will it blow away the disk?

Also, on the Lightpath console on the server, I have an orange error light on the DASD. I'm guessing that's not such a good sign? Or is it? Could it just be the ServeRaid card? Because if so, I can just replace that… sigh, really hoping it is.

Anyhow, how exactly should I proceed from here? Is there any way to recover either of the defunct drives (I/O subsystem error or physical drive not found) back to their initial state? Or can I somehow rebuild the data from the remaining drives? (I suppose I'd have a corrupted VMFS (VMware filesystem) after that, and then I'd have to rebuild that, and then rebuild the virtual disk images inside of that…?)

Any recommendations on how to proceed? (And yes, some people are probably going to say…oh…just reformat, and use RAID1 next time…lol. Hoping for something a little more hopeful than that…haha).

Thanks,
Victor

Ps: I've also made the Support Archive from ServeRaid available here:

http://www.victorhooi.com/files/Support.zip

(This is basically all the controller logs, and configuration information).

Best Answer

The orange LED doesn't necessarily mean that the drive is broken: it means that the controller has marked it as failed.

Recently I had the same issue on a ServeRaid 6i: two drives disappeared. The array was RAID 5. I put one of the two drives back online and rebuilt the second. At the end of the process I had my array rebuilt. Of course, the cause was not a broken disk but a weird bug in the controller or in the disks.
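For anyone wondering why a rebuild was possible in my RAID 5 case at all, here is a minimal Python sketch of the general idea behind parity reconstruction. It is only an illustration, not how the ServeRaid firmware actually does it: the four-drive layout, the block contents and the `xor_blocks` helper are all made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-drive RAID 5 set:
# three data blocks plus one parity block (contents are invented).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Pretend the drive holding the second block dropped out of the array.
# XOR-ing the surviving blocks with the parity gives the lost block back.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data_blocks[1]
```

A RAID 0 stripe has no parity block, so there is nothing to XOR the survivors against. In that case the only hope is that the data on the "defunct" drive is still intact and the drive can be brought back online.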

Some disks seem to have broken firmware that causes the disk to detach from the array at random.