RAID Data Recovery – Does One Failed Drive + One Single Bad Sector Destroy an Entire RAID 5?

data-recovery disaster-recovery raid

While planning my RAID setup on a Synology Disk Station, I did a lot of reading about the various RAID types. One particularly good read was: RAID levels and the importance of URE (Unrecoverable Read Error).

However, one thing remains unclear to me:

Consider two scenarios:

  1. An array is a RAID 1 of 2 drives
  2. An array is a RAID 5 of 3 drives

The same assumptions for both scenarios:

  • There are 100,000 files on the RAID array
  • One drive fails (needs replacement)
  • One bad sector (URE) turns up on a remaining drive while the array is rebuilding

What happens? Does the RAID rebuild with 99,999 files intact and 1 file lost? Or am I going to lose all 100,000 files?

If the answer depends on the filesystem type, assume it's BTRFS or ZFS.

Best Answer

The short answer is that it depends.

In the situation you describe (a faulty disk + some unreadable sectors on another disk) some enterprise RAID controllers will nuke the entire array on the grounds that its integrity is compromised and so the only safe action is to restore from backup.

Some other controllers (most notably from LSI) will instead puncture the array, marking some LBAs as unreadable but continuing with the rebuild. If the unreadable LBAs fall in free space, effectively no real data is lost, so this is the best-case scenario. If they fall on already written data, some information (hopefully of little value) is inevitably lost.
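To make the "puncture" behaviour concrete, here is a minimal Python sketch of a rebuild that skips only the stripe hit by a URE. It is a toy model, not any controller's actual firmware: the names `SECTOR`, `xor` and `rebuild` are made up for illustration, each stripe is a single tiny "sector", and parity lives on a fixed drive instead of rotating as it does in real RAID 5. The one fact it relies on is that the missing member of a stripe equals the XOR of the surviving members.

```python
from functools import reduce

SECTOR = 4  # bytes per "sector" in this toy model (hypothetical size)

def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized sectors."""
    return bytes(x ^ y for x, y in zip(a, b))

def rebuild(survivors, unreadable):
    """Rebuild the missing member stripe by stripe.

    survivors:  list of surviving drives, each a list of sectors (bytes)
    unreadable: set of (drive_index, stripe_index) pairs that return a URE
    Returns (rebuilt_drive, punctured_stripes).
    """
    rebuilt, punctured = [], []
    for stripe in range(len(survivors[0])):
        if any((d, stripe) in unreadable for d in range(len(survivors))):
            # One input is missing, so this stripe cannot be reconstructed.
            rebuilt.append(None)
            punctured.append(stripe)
            continue
        rebuilt.append(reduce(xor, (drive[stripe] for drive in survivors)))
    return rebuilt, punctured

# Three-drive array: two data members plus parity, five stripes.
d0 = [bytes([i]) * SECTOR for i in range(5)]
d1 = [bytes([i + 10]) * SECTOR for i in range(5)]
parity = [xor(a, b) for a, b in zip(d0, d1)]

# Drive d1 has failed; drive d0 returns a URE on stripe 2 during the rebuild.
rebuilt, punctured = rebuild([d0, parity], unreadable={(0, 2)})
print("punctured stripes:", punctured)
```

In this model four of the five stripes come back intact and only stripe 2 is lost. Whether a real controller continues like this or fails the whole array is a policy decision, which is why the answer is "it depends".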

Linux mdadm is very versatile, with the latest versions having a dedicated "remap area" for such a punctured array. Moreover, one can always use dd or ddrescue to first copy the drive with unreadable sectors to a new disk and then use that disk to re-assemble the array (with some data loss, of course).
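The idea behind copying the damaged drive first can be sketched in a few lines of Python. This is only an illustration of the principle (copy everything readable, zero-fill what is not, remember which sectors were bad), not how dd or ddrescue are implemented; ddrescue in particular makes several passes and keeps a map file. The names `rescue_copy`, `read_sector` and `flaky_read` are hypothetical.

```python
def rescue_copy(read_sector, n_sectors, sector_size=512):
    """Copy a device sector by sector, tolerating read errors.

    read_sector: callable taking an LBA and returning sector_size bytes,
                 raising OSError when the sector is unreadable
    Returns (image, bad_lbas): the copied image and the list of LBAs
    that had to be zero-filled.
    """
    image = bytearray()
    bad_lbas = []
    for lba in range(n_sectors):
        try:
            data = read_sector(lba)
        except OSError:
            data = bytes(sector_size)  # placeholder zeros for the lost sector
            bad_lbas.append(lba)
        image += data
    return bytes(image), bad_lbas

# Simulated source drive: sector 3 is unreadable.
def flaky_read(lba):
    if lba == 3:
        raise OSError("unrecoverable read error")
    return bytes([lba]) * 512

image, bad = rescue_copy(flaky_read, n_sectors=8)
print("unreadable LBAs:", bad)
```

The resulting copy is complete except for the recorded sectors, so an array assembled from it loses only whatever lived in those sectors.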

BTRFS and ZFS, by virtue of being more integrated with the block allocation layer, can detect whether the lost data is in empty or allocated space, and report in detail which files are affected.
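As a rough illustration of why knowing the block allocation helps, the following Python sketch maps unreadable blocks back to the files that own them. It assumes a made-up extent map (`extents`) and a hypothetical helper `classify_bad_blocks`; the real filesystems do this through their own checksummed metadata trees, not through a structure like this.

```python
def classify_bad_blocks(extents, bad_blocks):
    """Split bad blocks into damaged files and harmless free-space hits.

    extents:    dict mapping file name -> iterable of block numbers it occupies
    bad_blocks: iterable of unreadable block numbers
    Returns (damaged_files, free_space_blocks), both sorted.
    """
    owner = {blk: name for name, blocks in extents.items() for blk in blocks}
    damaged_files = sorted({owner[b] for b in bad_blocks if b in owner})
    free_space_blocks = sorted(b for b in bad_blocks if b not in owner)
    return damaged_files, free_space_blocks

# Hypothetical allocation: blocks 10 and 11 are unallocated.
extents = {
    "a.txt": range(0, 4),
    "b.jpg": range(4, 10),
    "c.log": range(12, 15),
}

damaged, harmless = classify_bad_blocks(extents, bad_blocks={5, 11})
print("damaged files:", damaged)
print("bad blocks in free space:", harmless)
```

Here one file is reported as damaged and the other bad block costs nothing because nothing was stored there, which corresponds to the "1 file lost, the rest fine" outcome from the question.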
