Linux – Cannot install grub to RAID1 (md0)

grub, linux, raid

I have a RAID1 array on Ubuntu 12.04 LTS, and the /dev/sda disk was replaced several days ago. I used these commands to replace it:

# go to superuser
sudo bash
# see RAID state
mdadm -Q -D /dev/md0
# State should be "clean, degraded"
# remove broken disk from RAID
mdadm /dev/md0 --fail /dev/sda1
mdadm /dev/md0 --remove /dev/sda1
# see partitions
fdisk -l
# shutdown computer
shutdown now
# physically replace the old disk with the new one
# start system again
# see partitions
fdisk -l
# copy partitions from sdb to sda
sfdisk -d /dev/sdb | sfdisk /dev/sda
# recreate id for sda
sfdisk --change-id /dev/sda 1 fd
# add sda1 to RAID
mdadm /dev/md0 --add /dev/sda1
# see RAID state
mdadm -Q -D /dev/md0
# State should be "clean, degraded, recovering"
# to see status you can use
cat /proc/mdstat

This is my mdadm output after the sync:

/dev/md0:
        Version : 0.90
  Creation Time : Wed Feb 17 16:18:25 2010
     Raid Level : raid1
     Array Size : 470455360 (448.66 GiB 481.75 GB)
  Used Dev Size : 470455360 (448.66 GiB 481.75 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Nov  1 15:19:31 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 92e6ff4e:ed3ab4bf:fee5eb6c:d9b9cb11
         Events : 0.11049560

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

After the rebuild completed, "fdisk -l" says that /dev/md0 does not contain a valid partition table.
This is my fdisk -l output:

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00057d19

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *          63   940910984   470455461   fd  Linux raid autodetect
/dev/sda2       940910985   976768064    17928540    5  Extended
/dev/sda5       940911048   976768064    17928508+  82  Linux swap / Solaris

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000667ca

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *          63   940910984   470455461   fd  Linux raid autodetect
/dev/sdb2       940910985   976768064    17928540    5  Extended
/dev/sdb5       940911048   976768064    17928508+  82  Linux swap / Solaris

Disk /dev/md0: 481.7 GB, 481746288640 bytes
2 heads, 4 sectors/track, 117613840 cylinders, total 940910720 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/md0 doesn't contain a valid partition table

This is my grub-install output:

root@answe:~# grub-install /dev/sda
/usr/sbin/grub-setup: warn: Attempting to install GRUB to a disk with multiple partition labels or both partition label and filesystem.  This is not supported yet..
/usr/sbin/grub-setup: error: embedding is not possible, but this is required for cross-disk install.
root@answe:~# grub-install /dev/sdb
Installation finished. No error reported.

Some version information:

grub-install (GRUB) 1.99-21ubuntu3.4
3.2.0-32-generic #51-Ubuntu SMP Wed Sep 26 21:33:09 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

So:

1) "update-grub" finds Linux only on /dev/sda and /dev/sdb, not on /dev/md0

2) "dpkg-reconfigure grub-pc" says "GRUB failed to install the following devices /dev/md0"

I cannot boot my system except from /dev/sdb1 (and only by hand, not automatically), and then only in DEGRADED mode…

Can anybody help resolve this issue? It has been a big headache.

UPDATE: after wiping the new disk with zeroes and copying the partitions with sfdisk, grub-install says:

root@answe:~# grub-install /dev/sda
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
Installation finished. No error reported.

Now update-grub generates the same errors:

root@answe:~# update-grub
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
Generating grub.cfg ...
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
Found linux image: /boot/vmlinuz-3.2.0-32-generic
Found initrd image: /boot/initrd.img-3.2.0-32-generic
...
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
Found memtest86+ image: /boot/memtest86+.bin
No volume groups found
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
Found Ubuntu 12.04.1 LTS (12.04) on /dev/sda1
Found Ubuntu 12.04.1 LTS (12.04) on /dev/sdb1
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
error: found two disks with the index 2 for RAID md0.
done

Best Answer

The warning points out the root cause. The new disk that you used for the replacement probably used to have a filesystem on it and grub-setup is now confused by the metadata that is probably still in there.

So, just wipe out everything at the beginning of the disk up to the first partition. I took the number 62 from the fdisk -l output: the first partition starts at sector 63, so the sectors before it can be cleaned out. Note that this also erases the partition table in sector 0, which is why it is recreated in the next step.

dd if=/dev/zero of=/dev/sda bs=512 count=62
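
If you want to see exactly what that dd invocation touches before pointing it at a real disk, here is a minimal sketch that rehearses it on a temporary file. SCRATCH, START and COUNT are names I made up for the illustration, and the 128-sector file size is arbitrary; only the start sector 63 and the count 62 come from the output above.

```shell
#!/bin/sh
# Rehearsal of the dd step on a scratch file instead of a real disk.
# SCRATCH, START and COUNT are illustrative names, not part of any tool.
SCRATCH=$(mktemp)
START=63                 # start sector of sda1 in the fdisk -l output
COUNT=$((START - 1))     # 62, the count used in the answer

# Fill 128 sectors with a non-zero byte so the wiped region is visible.
head -c $((128 * 512)) /dev/zero | tr '\0' 'x' > "$SCRATCH"

# conv=notrunc stops dd from truncating the regular file after the written
# region; it is only there for this rehearsal, the command above omits it.
dd if=/dev/zero of="$SCRATCH" bs=512 count="$COUNT" conv=notrunc 2>/dev/null

# Count bytes that are still non-zero inside the wiped region, and bytes
# that changed after it.
LEFT_IN_GAP=$(head -c $((COUNT * 512)) "$SCRATCH" | tr -d '\0' | wc -c | tr -d ' ')
CHANGED_AFTER=$(tail -c +$((COUNT * 512 + 1)) "$SCRATCH" | tr -d 'x' | wc -c | tr -d ' ')
echo "non-zero bytes in wiped region: $LEFT_IN_GAP"
echo "changed bytes after it: $CHANGED_AFTER"
rm -f "$SCRATCH"
```

With bs=512 and count=62 the write covers sectors 0 through 61, so nothing at or beyond sector 63 (the start of sda1) is modified.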

Then recreate the partition table:

sfdisk -d /dev/sdb | sfdisk /dev/sda

sfdisk will probably complain about "the kernel may be using the old partition table" but you can ignore the warning since you are not really changing the partition table.

Then you should be able to grub-install /dev/sda.
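
It may be worth confirming that the array has finished resyncing before you run grub-install. The following is only a sketch: array_ready is a hypothetical helper, and it assumes the mdadm -Q -D layout shown in the question, where a healthy two-disk array reports State "clean" and two "active sync" members.

```shell
#!/bin/sh
# array_ready is a hypothetical helper, not an mdadm feature. It reads
# `mdadm -Q -D` text on stdin and succeeds only if the State line is exactly
# "clean" and exactly two members are listed as "active sync".
array_ready() {
    awk '
        /^ *State :/  { sub(/^ *State : */, ""); sub(/ *$/, ""); state = $0 }
        /active sync/ { members++ }
        END           { exit !(state == "clean" && members == 2) }
    '
}

# Intended use (not executed here, it needs root and the real array):
#   mdadm -Q -D /dev/md0 | array_ready && grub-install /dev/sda
```

The check is deliberately strict: any other State value, such as "clean, degraded, recovering", makes it fail, so you may want to relax it for your setup.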

Update:

If you are still getting errors, try taking the disk out of the RAID and zeroing out more data from the beginning and the end. Or just zero out the whole disk (dd if=/dev/zero of=/dev/sda). Then add it back to the RAID as you did before, starting from sfdisk -d /dev/sdb | sfdisk /dev/sda. And consider switching to metadata format 1.0 as was recommended in a few other places.
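
To make the order of operations explicit, here is a dry-run sketch that only prints the sequence and executes nothing. The variable names MD, NEW and GOOD and the plan function are made up for the illustration; the device names and the commands themselves are the ones already used in the question and in this answer.

```shell
#!/bin/sh
# Dry run: prints the command sequence described above, runs none of it.
# MD, NEW, GOOD and plan are illustrative names based on the question's layout.
MD=/dev/md0      # the RAID1 array
NEW=/dev/sda     # replaced disk (this is the one that gets wiped)
GOOD=/dev/sdb    # healthy disk used as the partition-table template

plan() {
    cat <<EOF
mdadm $MD --fail ${NEW}1
mdadm $MD --remove ${NEW}1
dd if=/dev/zero of=$NEW
sfdisk -d $GOOD | sfdisk $NEW
sfdisk --change-id $NEW 1 fd
mdadm $MD --add ${NEW}1
cat /proc/mdstat
grub-install $NEW
EOF
}

plan
```

Review the printed lines, double-check which disk is the new one, and run them by hand as root. Wait until /proc/mdstat shows the resync has finished before the final grub-install.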
