Linux – mdadm RAID 1: GRUB only on sda

centos, grub, linux, mdadm, raid1

I just finished setting up a CentOS 6.3 64-bit server with mdadm, but then a lightbulb went on and I realised GRUB would only be installed on the first drive, which is about as much use as an ashtray on a motorbike.

I had a look to confirm my suspicion:

grub> find /grub/stage1
find /grub/stage1
 (hd0,0)

So I updated my device map to look like:

(fd0)   /dev/fd0
(hd0)   /dev/sda
(hd1)   /dev/sdb

(Note: the (hd1) entry was added by me.)

So then I tried to install GRUB on /dev/sdb, and I got:

grub> root (hd1,0)
root (hd1,0)
 Filesystem type is ext2fs, partition type 0x83
grub> setup (hd1)
setup (hd1)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... no

Error 15: File not found

So I did some googling (sadly, Google has done too good a job and turned up hundreds of generic grub install examples, which don't help here).

After finding a few clues I tried:

# grub-install --recheck /dev/sdb

Probing devices to guess BIOS drives. This may take a long time.
Installation finished. No error reported.
This is the contents of the device map /boot/grub/device.map.
Check if this is correct or not. If any of the lines is incorrect,
fix it and re-run the script `grub-install'.

(fd0)   /dev/fd0
(hd0)   /dev/sda
(hd1)   /dev/sdb

# grub-install /dev/sdb
Installation finished. No error reported.
This is the contents of the device map /boot/grub/device.map.
Check if this is correct or not. If any of the lines is incorrect,
fix it and re-run the script `grub-install'.

(fd0)   /dev/fd0
(hd0)   /dev/sda
(hd1)   /dev/sdb

This sort of suggests GRUB is now installed on /dev/sdb too; however, if I take another look I still get:

grub> find /grub/stage1
find /grub/stage1
 (hd0,0)
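
For what it's worth, a rough way to double-check whether any boot code was actually written to sdb's first sector would be something like this (GRUB Legacy's stage1 contains a literal "GRUB" string, so it should show up if stage1 is really there):

dd if=/dev/sda bs=512 count=1 2>/dev/null | strings | grep GRUB
dd if=/dev/sdb bs=512 count=1 2>/dev/null | strings | grep GRUB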

parted output for the two drives:

SDA

Partition Table: gpt

Number  Start   End     Size    File system  Name  Flags
 1      17.4kB  500MB   500MB   ext3         1     boot
 2      500MB   81.0GB  80.5GB               2     raid
 3      81.0GB  85.0GB  4000MB               3     raid
 4      85.0GB  3001GB  2916GB               4     raid

SDB

Partition Table: gpt

Number  Start   End     Size    File system  Name  Flags
 1      17.4kB  500MB   500MB   ext3         1
 2      500MB   81.0GB  80.5GB               2     raid
 3      81.0GB  85.0GB  4000MB               3     raid
 4      85.0GB  3001GB  2916GB               4     raid

And the contents of /proc/mdstat:

Personalities : [raid1]
md1 : active raid1 sdb3[1] sda3[0]
      3905218 blocks super 1.1 [2/2] [UU]

md2 : active raid1 sdb4[1] sda4[0]
      2847257598 blocks super 1.1 [2/2] [UU]

md0 : active raid1 sda2[0] sdb2[1]
      78612189 blocks super 1.1 [2/2] [UU]
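
If it helps, the array membership can also be queried directly with mdadm; notably, sda1 and sdb1 are not members of any array, so --examine on them should simply report that no md superblock is present:

mdadm --detail /dev/md0
mdadm --examine /dev/sda1 /dev/sdb1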

Is anyone able to throw some light on the situation? It feels like I am 99% there and just missing something obvious.

Thanks.

Edit/update:

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md0               74G   18G   53G  25% /
tmpfs                 580M     0  580M   0% /dev/shm
/dev/sda1             462M   98M  341M  23% /boot
xenstore              580M   64K  580M   1% /var/lib/xenstored

/ is on md0, which is made up of sda2 and sdb2
swap is md1, which is sda3 and sdb3
md2 is LVM
however /boot is only on sda1

I suppose that is the problem. Would the resolution be to create md4 and have it contain sda1 and sdb1?
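
As a quick sanity check, I could also mount sdb1 somewhere temporary and see whether it contains any grub files at all; I suspect it doesn't, which would explain why setup (hd1) cannot find /grub/stage1:

mkdir -p /mnt/sdb1            # temporary mountpoint, any empty directory will do
mount /dev/sdb1 /mnt/sdb1
ls /mnt/sdb1/grub/ 2>/dev/null || echo "no grub files on sdb1"
umount /mnt/sdb1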

Perhaps I have things a little mixed up in my head, but I assumed GRUB was not installed on a partition but in the first few blocks of the drive, i.e. sda or hd0/hd1.

Any clarification and advice is appreciated.

Best Answer

This should be your problem:

root (hd1,0)
 Filesystem type is ext2fs, partition type 0x83

Take the following steps:

  • Create the two /boot partitions, /dev/sda1 and /dev/sdb1, with type fd (Linux RAID autodetect); use your favourite tool (fdisk, cfdisk, gparted, ...). For GPT the equivalent type is fd00.
  • Remember to turn on the bootable flag on both partitions, sda1 and sdb1 (not needed for GPT).
  • Force the partitions to start as a brand-new RAID by wiping any old metadata:

    mdadm --zero-superblock /dev/sda1 
    mdadm --zero-superblock /dev/sdb1
    
  • While creating the RAID array that will hold your /boot partition, use metadata version 0.9, as Linux cannot autodetect newer metadata versions (without an initial ramdisk).

    mdadm --create /dev/md0 --level=1 --raid-disks=2 /dev/sda1 /dev/sdb1 --metadata=0.9
    
  • Format it using ext2 or ext3 (see the sketch after this list).

  • Install your Linux of choice, WITHOUT formatting the /boot partition.
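
For example, a minimal sketch of the verify-and-format step, assuming the new /boot array comes up as /dev/md0 as above (adjust the device name if your system assigns a different one):

    mdadm --examine /dev/sda1 /dev/sdb1   # both members should now show a version 0.90 superblock
    mkfs.ext3 /dev/md0                    # or mkfs.ext2; either is fine for a small /boot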

After your distro's first boot:

  • Fix your /etc/fstab to point /boot to /dev/md0 (this may not be necessary).
  • Install grub in the MBR of both disks:

    # grub
     grub> root (hd0,0)
     grub> setup (hd0)
     grub> quit

    # grub
     grub> root (hd1,0)
     grub> setup (hd1)
     grub> quit
    
  • Edit your bootloader (these instructions are for GRUB 1 / GRUB Legacy).

  • Find the "default" line and add the "fallback" option below it:

    vi /boot/grub/menu.lst
    default 0
    fallback 1
    
  • Add another entry to your bootloader (again, in my case I've chosen GRUB 1 since it's less complicated and good enough for my needs), each one pointing to a different boot partition that is a member of the RAID:

    title           Debian GNU/Linux, kernel 2.6.32-5-686  (default)
    root            (hd0,0)
    kernel          /vmlinuz-2.6.32-5-686 root=/dev/mapper/vg-root ro quiet
    initrd          /initrd.img-2.6.32-5-686
    
    title           Debian GNU/Linux, kernel 2.6.32-5-686  (fallback)
    root            (hd1,0)
    kernel          /vmlinuz-2.6.32-5-686 root=/dev/mapper/vg-root ro quiet
    initrd          /initrd.img-2.6.32-5-686 
    
  • Note that in my case, I have an LVM layer on top of my / RAID.

Done. This should be enough for you to have a "redundant" bootloader.
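
As a final sanity check (a sketch of what you should expect, assuming everything above went to plan), the grub shell should now find stage1 on both drives:

    # grub
     grub> find /grub/stage1
      (hd0,0)
      (hd1,0)
     grub> quit

You could also test the fallback entry by temporarily booting with only one of the drives attached.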