Linux – Setting up a bootable multi-device (RAID 1) using Linux Software RAID

Tags: linux, raid, software-raid

I'm trying to set up a bootable software RAID that will contain the root filesystem and boot Linux Mint Qiana. It will be used to run a few graphical monitoring applications in a small datacenter, as well as a simple terminal for accessing other LAN nodes.

I have two 500GB SATA drives (/dev/sda and /dev/sdb) which I will use to construct the RAID 1 array. There seem to be many ways to do this, but it's unclear to me how to create an md0 device that is bootable.

My first approach was to boot from a Linux Mint Live installation CD, switch over to a bash prompt, and manually partition /dev/sda using sfdisk. I created a simple partition table containing a single primary partition plus a swap partition, then cloned the partition table from /dev/sda to /dev/sdb:

sfdisk -d /dev/sda | sfdisk /dev/sdb
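
(For reference, a minimal sketch of the partitioning step itself; the sizes are approximate and this assumes the script syntax of a recent util-linux sfdisk:)

# One large bootable Linux partition, then swap taking the rest.
sfdisk /dev/sda <<'EOF'
,492G,L,*
,,S
EOF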

Okay, so now I have two drives ready to be assembled into a RAID array. I first create the array:

mdadm --create --verbose --metadata=0.90 /dev/md0 --level=mirror \
   --raid-devices=2 /dev/sda /dev/sdb

About an hour later, the array is finished syncing.
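
(The resync progress can be watched in the meantime with:)

watch -n 5 cat /proc/mdstat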

I can now initialize /dev/md0 by giving it a filesystem:

mke2fs -t ext4 /dev/md0

Okay, so now everything seems good. I switch back to the Live CD installer and install Linux onto /dev/md0. Everything works until the installer attempts to run grub-install, which fails with a fatal error.

So I've been researching, trying to understand the cause. I'm not entirely sure why this happens, but my understanding is that it has something to do with the fact that "one does not simply boot from /dev/md0". It seems that in order to create a bootable multi-device RAID 1 array, you need to either create a separate non-RAID /boot partition or use an initramfs.

Unfortunately, I don't exactly understand what this entails. Firstly, I don't want to create a separate non-RAID /boot partition, because the whole point of booting from md0 is the redundancy. Secondly, my understanding is that the initramfs approach is needed so that mdadm can assemble the array before the root filesystem is mounted. But when I boot from the Live CD and create my RAID array, mdadm is already loaded into memory, so I don't understand why the installer still gets a fatal error from grub-install.
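
(For context, on Debian-derived systems such as Mint, the initramfs part usually amounts to recording the array in mdadm.conf and regenerating the initramfs, run from the installed system or a chroot into it; a sketch:)

mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u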

Can someone explain what steps I'm missing here, or provide an outline of how to set up a multi-device array that can boot?

Best Answer

Booting with software RAID almost always requires a separate /boot partition, especially with older versions of GRUB. Specifying --metadata=0.90 when creating the RAID-1 array for /boot is also required: the 0.90 superblock lives at the end of the partition, so the bootloader can read the /boot filesystem as if it were on a plain partition.

/boot should be the first partition on the disk and kept small, mostly for legacy BIOS boot reasons. The other reason /boot should be a separate partition is that it can normally be mounted read-only via /etc/fstab; you only remount it read-write before doing a kernel upgrade.
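
A sketch of the corresponding /etc/fstab entry (the device name and extra options here are just an example):

/dev/md0   /boot   ext4   ro,noatime   0   2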

The typical drive setup I use is:

/dev/sda1 + /dev/sdb1 = /dev/md0 Software RAID-1 with metadata 0.90
/dev/sda2 + /dev/sdb2 = /dev/md### Software RAID-1 with LVM on top

I always use /dev/md0 for the /boot partition. For the LVM area, I use a random number below 125 for the mdadm device number, mostly to keep things from breaking if this drive is ever attached to a different server at bootup (e.g. via USB during recovery).
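
A sketch of creating that layout, assuming the partitions already exist (the device number 42 and the volume group name vg0 are arbitrary examples):

# /boot mirror with 0.90 metadata (superblock at the end, readable by GRUB legacy)
mdadm --create /dev/md0 --metadata=0.90 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
# main mirror; "42" is an arbitrary number below 125
mdadm --create /dev/md42 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
# LVM on top of the main mirror
pvcreate /dev/md42
vgcreate vg0 /dev/md42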

After setting up your RAID-1 on /boot, you have to install GRUB onto each drive in the RAID-1 array.
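
On a GRUB 2 system such as Mint Qiana, that is roughly the following (assuming BIOS/MBR disks):

grub-install /dev/sda
grub-install /dev/sdb
update-grub

With GRUB legacy, the procedure is shown below.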

How to boot after RAID failure (software RAID)?

This particular example shows how to make all three drives of a 3-way RAID-1 mirror bootable with GRUB legacy. The trick is to remap each drive to (hd0) in turn, so the MBR written to each disk treats that disk as the first BIOS drive.

# grub
grub> find /grub/stage1
 (hd0,0)
 (hd1,0)
 (hd2,0)
grub> device (hd0) /dev/sda
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> device (hd0) /dev/sdc
grub> root (hd0,0)
grub> setup (hd0)
grub> quit