Linux Software RAID1 Rebuild Completes, but After Reboot It's Degraded Again

md, mdadm, raid, raid1, software-raid

I have been beating my head with an issue here, and I'm now turning to the internet for help.

I have a system running Mandrake Linux, with the following configuration:

  • /dev/hda – This is an IDE drive. It has several partitions on it that boot the system and make up most of the filesystem.
  • /dev/sda – This is drive 1 of 2 in a software RAID array, /dev/md0.
  • /dev/sdb – This is drive 2 of 2 in the software RAID array /dev/md0.

md0 gets mounted via fstab as /data-storage, so it is not critical to the system's ability to boot. We can comment it out of fstab and the system works just fine either way.

The problem is, we had a failed sdb drive. So I shut the box down, pulled the failed disk, and installed a new one.

When the system boots up, /proc/mdstat shows only sda as part of the RAID. I then run the various commands to rebuild the RAID onto /dev/sdb. Everything rebuilds correctly, and upon completion /proc/mdstat shows both drives, sda1(0) and sdb1(1). Everything looks great.
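For reference, the rebuild steps I'm running look roughly like this (the exact partitioning step is an assumption; adjust device names to your layout):

```shell
# Copy the partition table from the surviving disk to the new one.
sfdisk -d /dev/sda | sfdisk /dev/sdb

# Add the new partition back into the array; mdadm starts the resync.
mdadm --manage /dev/md0 --add /dev/sdb1

# Watch the rebuild progress until it completes.
cat /proc/mdstat
```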

Then you reboot the box … UGH!!!

Once rebooted, sdb is missing from the RAID again. It's as if the rebuild never happened. I can walk through the commands to rebuild it again, and it will work, but after the next reboot the box makes sdb vanish again!

The really odd thing is: if, after a reboot, I pull sda out of the box and try to boot the system with only the rebuilt sdb drive installed, the system throws an error just after GRUB, says something about a drive error, and has to shut down.

Thoughts? I'm starting to wonder if GRUB has something to do with this mess, i.e. that the drive isn't being set up within GRUB to be visible at boot. This RAID array isn't necessary for the system to boot, but when the replacement drive is in there without sda, the system won't boot, so it makes me believe there is something to that. On top of that, there just seems to be something wonky about the drive falling out of the RAID after a reboot.

I've hit the point of pounding my head on the keyboard. Any help would be greatly appreciated!

Best Answer

Maybe it's too late now, but did you update your mdadm.conf file after adding your new drive? If you change a disk, your array won't have the same UUID anymore, and at reboot the system will be looking for the old drive, not knowing that the new drive is there.

Here's the command to generate the ARRAY lines for mdadm.conf:

mdadm --detail --scan
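A minimal sketch of refreshing the config (the path /etc/mdadm.conf is typical on Mandrake-era systems; on other distros it may be /etc/mdadm/mdadm.conf):

```shell
# Keep a backup of the old config, then append the current ARRAY line(s).
cp /etc/mdadm.conf /etc/mdadm.conf.bak
mdadm --detail --scan >> /etc/mdadm.conf
```

Afterwards, edit /etc/mdadm.conf and delete any stale ARRAY line still referencing the old drive's UUID, so only the current array definition remains.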

About the boot problem: your computer must be using GRUB from sda to boot hda. You have to change this in the BIOS, and make sure GRUB is installed on hda as well.
