Does BTRFS self-healing work only with its own software RAID?

btrfs raid

AFAIK there are 3 types of RAID:

  • hardware – e.g. an expensive PCIe RAID card, or (rarely) an onboard RAID, which has a dedicated CPU, probably with a write cache and BBU
  • software – e.g. ZFS, MD, BTRFS software RAID
  • fake – e.g. a cheap PCIe RAID card, or most onboard RAIDs, which use the RAM and CPU on the motherboard

With BTRFS software RAID the filesystem knows about the individual drives, so when it finds a block with a wrong checksum it can use the other copy (RAID1/10) or the parity (RAID5/6) to restore the affected block and fix the error. So our files won't degrade over time.
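
As a toy sketch of the mechanism I mean (this is not BTRFS code; the class name `MirroredStore`, the two in-memory "devices" and the use of `zlib.crc32` as a stand-in checksum are assumptions made purely for illustration):

```python
import zlib


class MirroredStore:
    """Toy model of a two-copy (RAID1-like) store with per-block checksums."""

    def __init__(self):
        self.copies = [{}, {}]  # two hypothetical devices: block number -> bytes
        self.checksums = {}     # block number -> checksum recorded at write time

    def write(self, block_no, data):
        # Write the same data to both devices and remember its checksum.
        for device in self.copies:
            device[block_no] = data
        self.checksums[block_no] = zlib.crc32(data)

    def read(self, block_no):
        expected = self.checksums[block_no]
        good = None
        bad_devices = []
        # Check every copy against the recorded checksum.
        for device in self.copies:
            data = device[block_no]
            if zlib.crc32(data) == expected:
                good = data
            else:
                bad_devices.append(device)
        if good is None:
            raise OSError(f"block {block_no}: no copy matches its checksum")
        # "Self-healing": overwrite each bad copy with the verified one.
        for device in bad_devices:
            device[block_no] = good
        return good


store = MirroredStore()
store.write(0, b"important data")
store.copies[0][0] = b"important dat\x00"  # simulate silent corruption on one device
assert store.read(0) == b"important data"       # the read still returns good data
assert store.copies[0][0] == b"important data"  # and the bad copy has been repaired
```

The point is that the repair step needs direct access to the second copy, which is exactly what I am unsure about when the mirroring is done by a controller.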

The question is whether the same self-healing mechanism works when BTRFS runs on top of a hardware or fake RAID.

I guess there would have to be an API that BTRFS can use to access the copy or parity of the affected block and fix it. I think there is a higher chance of something like this existing for onboard RAID, but I don't know whether such a thing exists, or whether every RAID is implemented differently and there is no standard API that BTRFS could use.

Best Answer

Struggling to make much sense out of this question but I think I can answer it anyway.

BTRFS RAID is software RAID, handled by the BTRFS code built into Linux. To do any sort of maintenance on the array you will need to use the btrfs commands in the operating system.

As far as your hardware is concerned, including any RAID controller, the disks are just basic block devices. Nothing other than the BTRFS software in Linux* will have any idea that the disks are part of an array.

*It's possible for other operating systems to implement BTRFS but I'd say it's unlikely, at least at the moment.

Update: Using BTRFS on top of an existing RAID array.

The RAID and BTRFS are completely separate in this case. Take a hardware mirror for example:

  • BTRFS has no idea it is on a mirror and will write data to a single device (call this /dev/raid in this example). It's up to the RAID controller to mirror that across both disks.
  • If a device fails, BTRFS will just read data as normal and have no idea a disk has failed. It's up to the RAID controller to keep /dev/raid functional and read/write data to the remaining disk. It's also up to the RAID controller to rebuild the array when the disk is replaced.

The same is true regardless of how the RAID is provided (hardware, "fake" or software).

I don't think I've explained this very well, but it's a very simple concept: it's not up to BTRFS to fix RAID data if a separate device (or software) is handling the RAID. BTRFS sees a single file system on a single disk, just like any other file system running on a RAID array.
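
To make the layering concrete, here is a toy sketch, not BTRFS code. It assumes the filesystem keeps one copy of each block plus a checksum, and that the controller's mirroring is invisible underneath the single device. The class name `SingleDeviceStore` and the use of `zlib.crc32` as the checksum are hypothetical stand-ins. As far as I know, BTRFS still checksums data on a single device, so in this layout a bad block can be detected, but there is no second copy visible to repair it from.

```python
import zlib


class SingleDeviceStore:
    """Toy model of a filesystem that sees one block device (e.g. /dev/raid)."""

    def __init__(self):
        self.device = {}     # the only device the filesystem can see
        self.checksums = {}  # block number -> checksum recorded at write time

    def write(self, block_no, data):
        # Any mirroring happens below this layer and is not visible here.
        self.device[block_no] = data
        self.checksums[block_no] = zlib.crc32(data)

    def read(self, block_no):
        data = self.device[block_no]
        if zlib.crc32(data) != self.checksums[block_no]:
            # Corruption is detected, but there is no other copy to fall back on.
            raise OSError(f"block {block_no}: checksum mismatch, no copy to repair from")
        return data


store = SingleDeviceStore()
store.write(0, b"important data")
store.device[0] = b"important dat\x00"  # the device returns silently corrupted data
try:
    store.read(0)
except OSError as err:
    print("detected but not repaired:", err)
```

Whether the controller would have returned good data from its other disk is outside the filesystem's view in this model; it only sees what the single device hands back.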
