Linux – Recover mdadm 4-Disk RAID5 Array with 2 Out of Date Disks

Tags: linux, mdadm, raid, raid5, software-raid

Edit:

This wiki describes a scenario where one drive has a slightly lower and another a significantly lower event count than the rest of the array. It suggests assembling with --force while leaving out the oldest drive, then adding it (or a new one, in case the disk is actually bad) back once the array has assembled in a degraded state.

Would it make sense to do this in my situation, or is it more advisable to attempt a --force assemble with all 4 drives, given that the two out of date ones have the same event count?


Given my limited RAID knowledge I figured I'd ask about my specific situation before trying anything. Losing the data on these 4 drives wouldn't be the end of the world to me, but it'd still be nice to get it back.

I migrated a RAID5 array from an old machine to a new one without any problems at first. I used it for about 2 days until I noticed that 2 of the drives weren't listed in the BIOS boot screen. Since the array still assembled and worked fine after booting into Linux, I didn't think too much of it.

The next day the array stopped working, so I hooked up a PCI-e SATA card and replaced all my SATA cables. After that all 4 drives showed up in the BIOS boot screen so I'm assuming either my cables or SATA ports were causing the initial problem.

Now I'm left with a broken array though. mdadm --assemble lists two drives as (possibly out of date), and mdadm --examine shows 22717 events for the out of date drives and 23199 for the other two. This wiki entry suggests that an event count difference of <50 could be overcome by assembling with --force, but my 4 drives are separated by 482 events.

Below is all the relevant raid info. I was aware of all 4 drives having corrupt primary GPT tables before the array broke down, but since everything was working fine at the time I hadn't gotten around to fixing that yet.

mdadm --assemble --scan --verbose

mdadm: /dev/sde is identified as a member of /dev/md/guyyst-server:0, slot 2.
mdadm: /dev/sdd is identified as a member of /dev/md/guyyst-server:0, slot 3.
mdadm: /dev/sdc is identified as a member of /dev/md/guyyst-server:0, slot 1.
mdadm: /dev/sdb is identified as a member of /dev/md/guyyst-server:0, slot 0.
mdadm: added /dev/sdb to /dev/md/guyyst-server:0 as 0 (possibly out of date)
mdadm: added /dev/sdc to /dev/md/guyyst-server:0 as 1 (possibly out of date)
mdadm: added /dev/sdd to /dev/md/guyyst-server:0 as 3
mdadm: added /dev/sde to /dev/md/guyyst-server:0 as 2
mdadm: /dev/md/guyyst-server:0 assembled from 2 drives - not enough to start the array.

mdadm --examine /dev/sd[bcde]

/dev/sdb:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 356cd1df:3a5c992d:c9899cbc:4c01e6d9
           Name : guyyst-server:0
  Creation Time : Wed Mar 27 23:49:58 2019
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7813772976 (3725.90 GiB 4000.65 GB)
     Array Size : 11720658432 (11177.69 GiB 12001.95 GB)
  Used Dev Size : 7813772288 (3725.90 GiB 4000.65 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=688 sectors
          State : clean
    Device UUID : 7ea39918:2680d2f3:a6c3b0e6:0e815210

Internal Bitmap : 8 sectors from superblock
    Update Time : Fri May  1 03:53:45 2020
  Bad Block Log : 512 entries available at offset 24 sectors
       Checksum : 76a81505 - correct
         Events : 22717

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)



/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 356cd1df:3a5c992d:c9899cbc:4c01e6d9
           Name : guyyst-server:0
  Creation Time : Wed Mar 27 23:49:58 2019
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7813772976 (3725.90 GiB 4000.65 GB)
     Array Size : 11720658432 (11177.69 GiB 12001.95 GB)
  Used Dev Size : 7813772288 (3725.90 GiB 4000.65 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=688 sectors
          State : clean
    Device UUID : 119ed456:cbb187fa:096d15e1:e544db2c

Internal Bitmap : 8 sectors from superblock
    Update Time : Fri May  1 03:53:45 2020
  Bad Block Log : 512 entries available at offset 24 sectors
       Checksum : d285ae78 - correct
         Events : 22717

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)



/dev/sdd:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 356cd1df:3a5c992d:c9899cbc:4c01e6d9
           Name : guyyst-server:0
  Creation Time : Wed Mar 27 23:49:58 2019
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7813772976 (3725.90 GiB 4000.65 GB)
     Array Size : 11720658432 (11177.69 GiB 12001.95 GB)
  Used Dev Size : 7813772288 (3725.90 GiB 4000.65 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=688 sectors
          State : clean
    Device UUID : 2670e048:4ebf581d:bf9ea089:0eae56c3

Internal Bitmap : 8 sectors from superblock
    Update Time : Fri May  1 04:12:18 2020
  Bad Block Log : 512 entries available at offset 24 sectors
       Checksum : 26662f2e - correct
         Events : 23199

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : A.AA ('A' == active, '.' == missing, 'R' == replacing)



/dev/sde:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 356cd1df:3a5c992d:c9899cbc:4c01e6d9
           Name : guyyst-server:0
  Creation Time : Wed Mar 27 23:49:58 2019
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 7813772976 (3725.90 GiB 4000.65 GB)
     Array Size : 11720658432 (11177.69 GiB 12001.95 GB)
  Used Dev Size : 7813772288 (3725.90 GiB 4000.65 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=688 sectors
          State : clean
    Device UUID : 093856ae:bb19e552:102c9f77:86488154

Internal Bitmap : 8 sectors from superblock
    Update Time : Fri May  1 04:12:18 2020
  Bad Block Log : 512 entries available at offset 24 sectors
       Checksum : 40917946 - correct
         Events : 23199

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : A.AA ('A' == active, '.' == missing, 'R' == replacing)

mdadm --detail /dev/md0

/dev/md0:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 4
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 4

              Name : guyyst-server:0
              UUID : 356cd1df:3a5c992d:c9899cbc:4c01e6d9
            Events : 23199

    Number   Major   Minor   RaidDevice

       -       8       64        -        /dev/sde
       -       8       32        -        /dev/sdc
       -       8       48        -        /dev/sdd
       -       8       16        -        /dev/sdb

fdisk -l

The primary GPT table is corrupt, but the backup appears OK, so that will be used.
Disk /dev/sdb: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: WDC WD40EFRX-68N
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 79F4A900-C9B7-03A9-402A-7DDE6D72EA00

Device     Start        End    Sectors  Size Type
/dev/sdb1   2048 7814035455 7814033408  3.7T Microsoft basic data


The primary GPT table is corrupt, but the backup appears OK, so that will be used.
Disk /dev/sdc: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: WDC WD40EFRX-68N
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 43B95B20-C9B1-03A9-C856-EE506C72EA00

Device     Start        End    Sectors  Size Type
/dev/sdc1   2048 7814035455 7814033408  3.7T Microsoft basic data


The primary GPT table is corrupt, but the backup appears OK, so that will be used.
Disk /dev/sdd: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: WDC WD40EFRX-68N
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 1E276A80-99EA-03A7-A0DA-89877AE6E900


The primary GPT table is corrupt, but the backup appears OK, so that will be used.
Disk /dev/sde: 3.7 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: WDC WD40EFRX-68N
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 11BD8020-C9B5-03A9-0860-6F446D72EA00

Device     Start        End    Sectors  Size Type
/dev/sde1   2048 7814035455 7814033408  3.7T Microsoft basic data

smartctl -a -d ata /dev/sd[bcde]

As pastebin since it exceeded the character limit: https://pastebin.com/vMVCX9EH

Best Answer

Generally speaking, you must expect data loss in this situation. Two of your four disks were ejected from the RAID at roughly the same point in time. Once the array is assembled again, you will have a corrupt file system.

If possible, I would only experiment further after dd-ing all disks to a backup, so that you can start over if an attempt goes wrong.
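
A minimal sketch of such a backup, assuming there is enough free space for one image per disk. The image_disk helper and the /mnt/backup path are placeholders of mine, not something from the question. Plain dd stops at the first read error, which seems acceptable here since the disks themselves are not known to be bad; for a failing disk a tool built for rescue copies would be a better fit.

```shell
#!/bin/sh
# image_disk SRC DST: copy SRC to DST in 1 MiB blocks.
# Hypothetical helper; dd aborts on the first read error.
image_disk() {
    dd if="$1" of="$2" bs=1M
}

# Assumed usage, as root, with the array stopped
# (the /mnt/backup directory is a placeholder):
# for d in sdb sdc sdd sde; do
#     image_disk "/dev/$d" "/mnt/backup/$d.img"
# done
```

Restoring is the same call with source and destination swapped.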

Using all 4 disks will allow you to identify which blocks differ (since the parity will not match there), but it will not help you compute a correct state. You could start checkarray after a forced re-assembly of all 4 and afterwards find the number of inconsistent blocks in /sys/block/mdX/md/mismatch_cnt. This may or may not be useful for estimating the "degree of brokenness" of the file system.
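
Roughly, that could look like the sketch below. It assumes the array is /dev/md0 with the four whole disks from the question as members, and it triggers the check through the md sysfs file sync_action directly rather than through the checkarray script. The two helper functions and their optional sysfs-root argument are my own additions for illustration. Keep in mind that a forced assembly rewrites metadata on the member disks, which is one more reason to image them first.

```shell
#!/bin/sh
# start_check MD [SYSFS_ROOT]: request a "check" pass, which counts
# parity mismatches without trying to repair them.
start_check() {
    echo check > "${2:-/sys}/block/$1/md/sync_action"
}

# mismatch_count MD [SYSFS_ROOT]: print the mismatch counter.
# Only meaningful once the check has finished.
mismatch_count() {
    cat "${2:-/sys}/block/$1/md/mismatch_cnt"
}

# Assumed sequence, as root:
# mdadm --stop /dev/md0
# mdadm --assemble --force /dev/md0 /dev/sdb /dev/sdc /dev/sdd /dev/sde
# start_check md0
# cat /proc/mdstat        # repeat until the check is done
# mismatch_count md0
```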

Re-building the array can only use information from three disks to re-calculate parity. As the ejected disks have the same event count, using either of the ejected disks should result in the same (partially wrong) parity information being re-computed.
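
To illustrate the point with a toy example: RAID5 parity is the XOR of the data blocks in a stripe, so a mismatch shows that something is stale, but not which member. The byte values below are made up, and one "stripe" here is just three data bytes plus one parity byte.

```shell
#!/bin/sh
# parity A B C: XOR of three values, as RAID5 does per stripe.
parity() { echo $(( $1 ^ $2 ^ $3 )); }

# Consistent stripe (made-up values).
d0=165; d1=60; d2=15
p=$(parity "$d0" "$d1" "$d2")

# With consistent members, a missing one is the XOR of the other three.
rebuilt_d1=$(parity "$d0" "$d2" "$p")

# Now suppose d0 comes from an out-of-date disk.
stale_d0=164

# The XOR over all four members is no longer zero, so the stripe is
# known to be inconsistent, but nothing says which member is wrong.
check=$(( stale_d0 ^ d1 ^ d2 ^ p ))

# Rebuilding d1 from the stale d0 silently yields a wrong value.
wrong_d1=$(parity "$stale_d0" "$d2" "$p")

echo "parity=$p rebuilt_d1=$rebuilt_d1 check=$check wrong_d1=$wrong_d1"
# prints: parity=150 rebuilt_d1=60 check=1 wrong_d1=61
```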