Ubuntu – mdadm on Ubuntu 10.04 – raid5 of 4 disks, one disk missing after reboot


I'm having a problem with the raid array in a server (Ubuntu 10.04).

I've got a raid5 array of 4 disks – sd[cdef], created like this:

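# partition disks (shown for sdc; the same steps were repeated for sdd, sde and sdf)
parted /dev/sdc mklabel gpt
parted /dev/sdc mkpart primary ext2 1 2000GB
parted /dev/sdc set 1 raid on
# create array (mdadm --create needs the member count given explicitly)
mdadm --create -v --level=raid5 --raid-devices=4 /dev/md2 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1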

This has been running fine for a couple of months.

I just applied system updates and rebooted, and the raid5 – /dev/md2 – didn't come back on boot. When I re-assembled it with mdadm --assemble --scan, it seems to have come up with only 3 of the member drives – sdf1 is missing. Here's what I can find:

(Side-note: md0 & md1 are raid-1 built on a couple of drives, for / and swap respectively.)

root@dwight:~# mdadm --query --detail /dev/md2
        Version : 00.90
  Creation Time : Sun Feb 20 23:52:28 2011
     Raid Level : raid5
     Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
  Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Fri Apr  8 22:10:38 2011
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
         Events : 0.140

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       49        1      active sync   /dev/sdd1
       2       8       65        2      active sync   /dev/sde1
       3       0        0        3      removed

(Yes, the server's called Dwight; I'm a The Office fan 🙂 )

So it thinks one drive (partition really) is missing, /dev/sdf1.

root@dwight:~# mdadm --detail --scan
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=00.90 UUID=c7dbadaa:7762dbf7:beb6b904:6d3aed07
ARRAY /dev/md1 level=raid1 num-devices=2 metadata=00.90 UUID=1784e912:d84242db:3bf6c10c:6278edbc
mdadm: md device /dev/md/d2 does not appear to be active.
ARRAY /dev/md2 level=raid5 num-devices=4 metadata=00.90 UUID=1bb282b6:fe549071:3bf6c10c:6278edbc

What, what, /dev/md/d2? What's /dev/md/d2? I didn't create that.

root@dwight:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md2 : active raid5 sdc1[0] sde1[2] sdd1[1]
      5860540224 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

md_d2 : inactive sdf1[3](S)
      1953513408 blocks

md1 : active raid1 sdb2[1] sda2[0]
      18657728 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
      469725120 blocks [2/2] [UU]

unused devices: <none>

Ditto. md_d2? sd[cde]1 are in md2 as expected, but sdf1 is missing from it (and seems to think it should be an array of its own?)

root@dwight:~# mdadm -v --examine /dev/sdf1
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
  Creation Time : Sun Feb 20 23:52:28 2011
     Raid Level : raid5
  Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
     Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2

    Update Time : Fri Apr  8 21:40:42 2011
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 71136469 - correct
         Events : 114

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       81        3      active sync   /dev/sdf1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       8       81        3      active sync   /dev/sdf1

…so sdf1 thinks it's part of the md2 device. Is that right?

When I run that on /dev/sdc1, I get:

root@dwight:~# mdadm -v --examine /dev/sdc1
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
  Creation Time : Sun Feb 20 23:52:28 2011
     Raid Level : raid5
  Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
     Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 2

    Update Time : Fri Apr  8 22:50:03 2011
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 71137458 - correct
         Events : 144

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       33        0      active sync   /dev/sdc1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       0        0        3      faulty removed
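
Comparing the two superblocks, sdf1 is behind the rest of the array: its Events counter is 114 and its last update was 21:40:42, against 144 and 22:50:03 on sdc1. A minimal sketch for pulling that counter out of mdadm --examine output follows. The helper name events_of is made up for illustration, and the integer comparison assumes the plain-number form that --examine prints here (not the 0.140 form shown by --detail).

```shell
# Print the Events counter from `mdadm --examine` output read on stdin.
# events_of is a hypothetical helper name, not part of mdadm.
events_of() {
    awk -F' : ' '/^ *Events :/ { print $2 }'
}

# Real use (as root) would look like:
#   mdadm --examine /dev/sdf1 | events_of
#
# Demo on the two values quoted above, so it runs without mdadm:
stale=$(printf '         Events : 114\n' | events_of)
fresh=$(printf '         Events : 144\n' | events_of)
if [ "$stale" -lt "$fresh" ]; then
    echo "member is behind by $((fresh - stale)) events"
fi
```

A member whose counter lags the others has missed writes, so it would presumably need to be resynced rather than simply slotted back in.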

And when I try to add sdf1 back into the /dev/md2 array, I get a busy error:

root@dwight:~# mdadm --add /dev/md2 /dev/sdf1
mdadm: Cannot open /dev/sdf1: Device or resource busy
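
A busy error like this usually means something else already has the partition open. Going by the /proc/mdstat output above, that would be md_d2. The sketch below looks up which md array lists a given member; holder_of is a hypothetical helper, and the demo reads two lines copied from the mdstat output rather than the live file.

```shell
# Print the md array(s) whose member list in an mdstat-style file includes a device.
# holder_of is a hypothetical helper name, not an mdadm command.
#   $1 = member name as it appears in mdstat (e.g. sdf1)
#   $2 = file to read; defaults to /proc/mdstat
holder_of() {
    awk -v dev="$1" '
        $2 == ":" {
            # Member fields look like sdf1[3] or sdf1[3](S); match on "name[".
            for (i = 3; i <= NF; i++)
                if (index($i, dev "[") == 1) print $1
        }' "${2:-/proc/mdstat}"
}

# Demo on two lines copied from the /proc/mdstat output above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
md2 : active raid5 sdc1[0] sde1[2] sdd1[1]
md_d2 : inactive sdf1[3](S)
EOF
holder_of sdf1 "$sample"   # prints: md_d2
rm -f "$sample"
```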

Help! How can I add sdf1 back into the md2 array?


  • Ben

Best Answer

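Stop the stray array first with mdadm -S /dev/md_d2 (-S is short for --stop). md_d2 is holding sdf1 as an inactive spare, which is why --add reports the device as busy. Once md_d2 is stopped, try adding sdf1 to md2 again.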
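
As a rough sketch, the whole sequence could be wrapped like this. The run and readd_member helpers and the DRY_RUN switch are made up for illustration; the device names are the ones from the question. It defaults to printing the commands rather than running them, since the real thing needs root and changes array state.

```shell
# Sketch only: with DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical wrapper that either echoes or executes its arguments.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# readd_member: stop the stray array, add the member back, show array status.
#   $1 = stray array, $2 = target array, $3 = member partition
readd_member() {
    run mdadm --stop "$1"      # releases the member held by the stray array
    run mdadm --add "$2" "$3"  # same command that failed with "busy" before
    run cat /proc/mdstat       # md2 should now list the member again
}

readd_member /dev/md_d2 /dev/md2 /dev/sdf1
```

Setting DRY_RUN=0 and running as root would execute the three commands for real. Since sdf1's superblock is older than the other members', expect md2 to spend some time rebuilding onto it after the add.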