Linux – Q: MDADM mismatch_cnt > 0. Any way to identify which blocks are in disagreement?

linux, mdadm

Okay. After a routine scrub, my MDADM RAID5 array is reporting mismatch_cnt = 16. As I understand it, this means that while no device reported a read error, there are 16 blocks for which the data and parity do not agree.
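
(For reference, the count itself comes straight out of sysfs; this is what I'm looking at, with 'md0' standing in for my actual array name.)

    cat /proc/mdstat                       # overall array state and any running check
    cat /sys/block/md0/md/mismatch_cnt     # the count left behind by the last check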

Question #1: Can one obtain a list of these blocks?

Question #2: Assuming #1 is possible, given that the underlying filesystem is EXT4, is there a way to identify which files are associated with these blocks?
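
For what it's worth, if I could somehow get a block number out of md, I believe debugfs could map it back to a file - an untested sketch, where /dev/md0 and the block number 123456 are just placeholders, and a raw md sector would first need converting into an ext4 block number (sector * 512 / block size, usually 4096):

    # Which inode (if any) owns ext4 block 123456?
    debugfs -R "icheck 123456" /dev/md0
    # Then translate that inode number (say it came back as 98765) into path names:
    debugfs -R "ncheck 98765" /dev/md0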

I do have nearline backups and, in an ideal world, I could just diff the live array against the backup data to locate any files that have become silently corrupted. But the reality is that recalling 6TB of backup data would be both prohibitively expensive and time-consuming. Knowing where to look and what to recover would greatly simplify things.

(I should note that I only run the RAID scrub with the 'check' option. Running the scrub with the 'repair' option seems awfully dangerous: MDADM only knows that the data and parity disagree, not which of the two is wrong, so it seems there is a 50% chance that it guesses wrong and reconstructs incorrect data. Hence my desire to know which files are potentially affected, so that I can restore them from backup if necessary.)
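
(For completeness, this is how I drive the scrub, again assuming md0; I only ever echo 'check', never 'repair':)

    echo check > /sys/block/md0/md/sync_action    # read and compare only
    cat /proc/mdstat                              # watch progress until it finishes
    cat /sys/block/md0/md/mismatch_cnt            # result of the comparison
    # echo repair > /sys/block/md0/md/sync_action # would also write out md's idea of the 'correct' blocks - avoided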

Any suggestions greatly appreciated!

Best Answer

Sorry, 'check' does indeed write back to the array when it encounters a read error - see the README.checkarray shipped with mdadm (https://www.apt-browse.org/browse/ubuntu/trusty/main/amd64/mdadm/3.2.5-5ubuntu4/file/usr/share/doc/mdadm/README.checkarray), which says:

    'check' is a read-only operation, even though the kernel logs may suggest otherwise (e.g. /proc/mdstat and several kernel messages will mention "resync"). Please also see question 21 of the FAQ.

    If, however, while reading, a read error occurs, the check will trigger the normal response to read errors which is to generate the 'correct' data and try to write that out - so it is possible that a 'check' will trigger a write. However in the absence of read errors it is read-only.

... so it may already be too late to collect the data you're looking for, sorry.
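
What you can still check is whether any corrective writes actually happened during the last check - a rough sketch, again assuming the array is md0:

    dmesg | grep -Ei 'md/raid|md0'        # read errors and corrective rewrites are logged by the kernel
    grep . /sys/block/md0/md/dev-*/errors # per-member read-error counters (approximate)
    cat /sys/block/md0/md/mismatch_cnt    # mismatch count left by the last check/repair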

For the longer term, it's worth noting that RAID 5 (and 6, and 1) has no protection against bit-rot, which is likely what you have encountered. When data on one disc silently goes bad, md has no way of determining which copy - data or parity - is the good one. I'd suggest planning to migrate to a filesystem that checksums its data, such as btrfs or zfs.
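
For example, a ZFS scrub will name the damaged files directly - roughly (with 'tank' as a placeholder pool name):

    zpool scrub tank
    zpool status -v tank   # any damaged files are listed by path under
                           # "errors: Permanent errors have been detected in the following files:"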

(RAID-5 really shouldn't be used in new deployments - and really, really shouldn't be used where the individual discs are over 2TB each - see http://www.zdnet.com/article/why-raid-5-stops-working-in-2009/)