Linux – Ubuntu 14.04 Software RAID 1 – md0 inactive

Tags: linux, mdadm, raid, ubuntu

I have my root filesystem on /dev/sdc, and (I think) a software RAID 1 spanning /dev/sda and /dev/sdb. Today I physically moved my computer and ran software updates (either could be the culprit), then noticed that my RAID array was no longer available. mdadm has marked it inactive, though I'm not sure why, and I am also unable to mount it. I see other suggestions out there, but none that match my situation exactly, and I'm worried about losing data.

I have not edited any configuration files and this configuration was previously working (with the exception that the RAID was not auto-mounted, which didn't bother me much).

Edit: I should also mention that I originally tried setting up software RAID when I built the machine. Something went wrong and I think I accidentally destroyed the data on the RAID, so I set up another software RAID and have been using that ever since. I believe that's the reason for the two entries. And now that I look at it, my data may not even be mirrored across the two drives. Is it just two separate RAID 1s on one drive each somehow?

Edit 2: Based on its update time of today, the array on /dev/sdb looks like the RAID configuration I want. The array consisting of /dev/sda1 and /dev/sdb1 looks like the old configuration: its update time is from February, when I built the machine.

cat /proc/mdstat

root@waffles:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md127 : inactive sda1[0](S)
      976630488 blocks super 1.2

md0 : inactive sdb[1](S)
      976631512 blocks super 1.2

unused devices: <none>

mdadm --examine --scan --config=/etc/mdadm/madadm.conf

root@waffles:~# mdadm --examine --scan --config=/etc/mdadm/madadm.conf
ARRAY /dev/md/0 metadata=1.2 UUID=dd54a7bd:15442724:ffd24430:0c1444b3 name=waffles:0
ARRAY /dev/md/0 metadata=1.2 UUID=047187c2:2a72494b:57327e8e:7ce78e9c name=waffles:0

cat /etc/mdadm/mdadm.conf

root@waffles:~# cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
#ARRAY /dev/md/0 metadata=1.2 UUID=047187c2:2a72494b:57327e8e:7ce78e9c name=waffles:0

# This file was auto-generated on Fri, 20 Feb 2015 10:00:12 -0500
# by mkconf $Id$
ARRAY /dev/md0 metadata=1.2 name=waffles:0 UUID=dd54a7bd:15442724:ffd24430:0c1444b3

cat /proc/mounts

root@waffles:~# cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=16379004k,nr_inodes=4094751,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=3278828k,mode=755 0 0
/dev/disk/by-uuid/28631011-e1c9-4152-85b6-82073656a9ee / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
none /sys/fs/pstore pstore rw,relatime 0 0
systemd /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,name=systemd 0 0
/home/todd/.Private /home/todd ecryptfs rw,nosuid,nodev,relatime,ecryptfs_fnek_sig=b12c61ee79f0f7fc,ecryptfs_sig=2b32246c98b2f7ca,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs 0 0
gvfsd-fuse /run/user/1000/gvfs fuse.gvfsd-fuse rw,nosuid,nodev,relatime,user_id=1000,group_id=1000 0 0

cat /etc/fstab

root@waffles:~# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sdc1 during installation
UUID=28631011-e1c9-4152-85b6-82073656a9ee /               ext4    errors=remount-ro 0       1
# swap was on /dev/sdc5 during installation
#UUID=d662ea5e-38f3-4a71-8a56-fa658c32b2eb none            swap    sw              0       0
/dev/mapper/cryptswap1 none swap sw 0 0

mount /dev/md0 /media/raid1/

root@waffles:~# mount /dev/md0 /media/raid1/
mount: /dev/md0: can't read superblock

grep 'md0' /var/log/syslog

root@waffles:~# grep 'md0' /var/log/syslog
Dec 21 13:50:16 waffles kernel: [    1.043320] md/raid1:md0: active with 2 out of 2 mirrors
Dec 21 13:50:16 waffles kernel: [    1.043327] md0: detected capacity change from 0 to 1000070512640
Dec 21 13:50:16 waffles kernel: [    1.050982]  md0: unknown partition table
Dec 21 14:20:16 waffles mdadm[1921]: DeviceDisappeared event detected on md device /dev/md0
Dec 21 14:32:26 waffles mdadm[2426]: DeviceDisappeared event detected on md device /dev/md0
Dec 21 14:37:17 waffles kernel: [  302.004127] EXT4-fs (md0): unable to read superblock
Dec 21 14:37:17 waffles kernel: [  302.004198] EXT4-fs (md0): unable to read superblock
Dec 21 14:37:17 waffles kernel: [  302.004244] EXT4-fs (md0): unable to read superblock
Dec 21 14:37:17 waffles kernel: [  302.004294] FAT-fs (md0): unable to read boot sector
Dec 21 14:45:26 waffles mdadm[1917]: DeviceDisappeared event detected on md device /dev/md0
Dec 21 15:38:31 waffles kernel: [ 3190.749438] EXT4-fs (md0): unable to read superblock
Dec 21 15:38:31 waffles kernel: [ 3190.749609] EXT4-fs (md0): unable to read superblock
Dec 21 15:38:31 waffles kernel: [ 3190.749679] EXT4-fs (md0): unable to read superblock
Dec 21 15:38:31 waffles kernel: [ 3190.749749] FAT-fs (md0): unable to read boot sector

mdadm --examine /dev/sda1

root@waffles:~# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 047187c2:2a72494b:57327e8e:7ce78e9c
           Name : waffles:0  (local to host waffles)
  Creation Time : Thu Feb 12 15:43:00 2015
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953260976 (931.39 GiB 1000.07 GB)
     Array Size : 976630336 (931.39 GiB 1000.07 GB)
  Used Dev Size : 1953260672 (931.39 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 0b0a69b7:3c3900c0:6e26b3e4:91155d98

    Update Time : Fri Feb 20 09:36:16 2015
       Checksum : 9bfb3aa - correct
         Events : 27


   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing)

mdadm --examine /dev/sdb1

root@waffles:~# mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 047187c2:2a72494b:57327e8e:7ce78e9c
           Name : waffles:0  (local to host waffles)
  Creation Time : Thu Feb 12 15:43:00 2015
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953260976 (931.39 GiB 1000.07 GB)
     Array Size : 976630336 (931.39 GiB 1000.07 GB)
  Used Dev Size : 1953260672 (931.39 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 2fdaaf8c:30d5c44e:893f9a5a:11d8170c

    Update Time : Fri Feb 20 09:36:16 2015
       Checksum : 576cfb5c - correct
         Events : 27


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing)

mdadm --examine /dev/sdb (given the update times here, I think this is the one I care about)

root@waffles:~# mdadm --examine /dev/sdb
/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : dd54a7bd:15442724:ffd24430:0c1444b3
           Name : waffles:0  (local to host waffles)
  Creation Time : Fri Feb 20 10:03:33 2015
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
     Array Size : 976631360 (931.39 GiB 1000.07 GB)
  Used Dev Size : 1953262720 (931.39 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f2e16155:49caff6d:d13115a6:379d2fc8

    Update Time : Mon Dec 21 13:14:19 2015
       Checksum : d5017b27 - correct
         Events : 276


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing)

Any suggestions to get this mounted again? It could be a bad drive from the move, but I was careful when moving the computer and see others have solved similar issues in software.

Best Answer

You've got a rather perplexing system there. /dev/sdb (the entire volume) and /dev/sdb1 (the first partition on that volume) are both being detected as RAID devices. This is confusing the OS, and it's creating two RAID arrays: /dev/md0 is a degraded RAID 1 array consisting of /dev/sdb, and /dev/md127 is a degraded RAID 1 array consisting of /dev/sda1. Since they're degraded, the OS won't automatically start them.
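To see at a glance which superblock belongs to which array, it can help to reduce each `mdadm --examine` report to the three fields that matter here. The following is only a sketch: `summarize_examine` is a made-up helper name, and it assumes the report uses the `Field : value` layout shown in the question.

```shell
# Read `mdadm --examine` output on stdin and print
# "Array UUID | Update Time | Events" on one line.
summarize_examine() {
    awk -F' : ' '
        /Array UUID/  { uuid = $2 }
        /Update Time/ { updated = $2 }
        /Events/      { events = $2 }
        END { printf "%s | %s | %s\n", uuid, updated, events }
    '
}

# Example use (as root):
#   for d in /dev/sda1 /dev/sdb1 /dev/sdb; do
#       printf '%s: ' "$d"
#       mdadm --examine "$d" | summarize_examine
#   done
```

Devices that print the same Array UUID belong to the same array, and the one with the newer update time and higher event count is the one that was in use most recently.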

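The first step in recovering from this is to make a volume-level backup: image each whole disk with dd (one run with if=/dev/sda, one with if=/dev/sdb), writing the images to a separate disk with enough free space. That way, if things go wrong, you won't be any worse off than you currently are.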
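A minimal sketch of that backup step follows. The helper name `image_disk` and the destination directory `/mnt/backup` are assumptions for illustration, and the destination needs room for two 1 TB images.

```shell
# image_disk SRC DEST: copy SRC to the image file DEST, block for block.
image_disk() {
    src=$1
    dest=$2
    dd if="$src" of="$dest" bs=1M
}

# Example use (as root; /mnt/backup is a hypothetical mount point
# on a disk that is not part of either array):
#   image_disk /dev/sda /mnt/backup/sda.img
#   image_disk /dev/sdb /mnt/backup/sdb.img
```

Imaging the whole disks rather than the partitions preserves the partition tables and both sets of md metadata, so the current state can be restored exactly.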

Once you've done that, you can activate your arrays in read-only mode with mdadm --run --readonly /dev/md0 and mdadm --run --readonly /dev/md127, then mount each one read-only and see what it contains.
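The inspection step could be scripted roughly as below. This is a sketch, not a tested recipe: the `run` and `inspect_array` helpers and the `/mnt/md0` and `/mnt/md127` mount points are invented for the example, and the mdadm invocation is taken as written above. By default the commands are only printed; set DRY_RUN=0 to execute them.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# inspect_array MD MOUNTPOINT: start MD read-only, then mount it read-only.
inspect_array() {
    md=$1
    mnt=$2
    run mdadm --run --readonly "$md"
    run mkdir -p "$mnt"
    run mount -o ro "$md" "$mnt"
}

# Example use (as root; mount points are hypothetical):
#   DRY_RUN=0 inspect_array /dev/md0   /mnt/md0
#   DRY_RUN=0 inspect_array /dev/md127 /mnt/md127
```

Mounting with `-o ro` in addition to starting the array read-only means nothing is written to either disk while you look around.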

Assuming that you're correct that /dev/sdb is the RAID array you're using, the next step is to figure out what it was using as the second volume of the RAID array: the metadata clearly states that when you shut it down, it was a two-disk RAID 1 array with both disks present.

If you can't figure it out, or don't want to use whatever the missing piece is, and you're correct that /dev/sda1 contains nothing important, the next step is to add it to /dev/md0:

  1. Wipe out the partition table and md metadata as a safety precaution: dd if=/dev/zero of=/dev/sda bs=1M count=1024
  2. Add it to the array: mdadm --manage /dev/md0 --add /dev/sda and let the array rebuild.
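Put together, the two steps above might look like the sketch below. It makes two assumptions that go beyond the steps as written: that /dev/md127 is still holding /dev/sda1 and has to be stopped first, and that /dev/md0 was started read-only and has to be switched back to read-write before it can rebuild. The `run` and `rebuild_mirror` names are made up. The commands are only printed unless DRY_RUN=0 is set, because the dd line irreversibly destroys whatever is on /dev/sda.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# rebuild_mirror: DESTROYS the first 1 GiB of /dev/sda, then adds the
# whole disk to /dev/md0 as the second half of the mirror.
rebuild_mirror() {
    # Assumption: md127 still holds /dev/sda1, so release it first.
    run mdadm --stop /dev/md127
    # Step 1: wipe the partition table and old md metadata.
    run dd if=/dev/zero of=/dev/sda bs=1M count=1024
    # Assumption: md0 was started read-only earlier.
    run mdadm --readwrite /dev/md0
    # Step 2: add the disk; the kernel starts resyncing onto it.
    run mdadm --manage /dev/md0 --add /dev/sda
    # Rebuild progress is reported in /proc/mdstat.
    run cat /proc/mdstat
}

# Example use: call rebuild_mirror once to review the printed commands,
# then DRY_RUN=0 rebuild_mirror (as root) to execute them.
```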

The final step is to wipe out the md superblock on /dev/sdb1. According to the mdadm man page, mdadm --zero-superblock /dev/sdb1 will work, but since the superblock is inside an existing array, I'd be very nervous about actually doing this.
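One way to reduce that nervousness is to work out where the /dev/sdb1 superblock actually sits on the disk. The `--examine` output in the question shows a Super Offset of 8 sectors and a Data Offset of 262144 sectors for both devices, so if /dev/sdb1 starts early on the disk, its superblock would fall in the unused gap before the /dev/md0 data area. The sketch below does that arithmetic. The helper name is invented, the 8-sector (4 KiB) superblock size is an assumption, and the partition start has to be read from your own system.

```shell
# superblock_in_gap PART_START SUPER_OFFSET DATA_OFFSET
# All values are in 512-byte sectors. Succeeds if the partition's md
# superblock lies after the whole-disk superblock and before the
# whole-disk array's data area.
superblock_in_gap() {
    part_start=$1
    super_offset=$2
    data_offset=$3
    sb_size=8   # assumption: a v1.2 superblock fits in 4 KiB (8 sectors)
    first=$((part_start + super_offset))
    last=$((first + sb_size - 1))
    # The whole-disk superblock is assumed to occupy sectors
    # super_offset .. super_offset + sb_size - 1 of the disk.
    [ "$first" -ge $((super_offset + sb_size)) ] && [ "$last" -lt "$data_offset" ]
}

# Example use, reading the real partition start from sysfs:
#   start=$(cat /sys/class/block/sdb1/start)
#   superblock_in_gap "$start" 8 262144 && echo "outside md0's data area"
```

If the check fails, the partition's superblock overlaps data that /dev/md0 is using, and zeroing it would corrupt the array. Either way, take the backup first.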