Linux – “atomically” swap a raid5 drive in Linux software raid


One of the drives in my 3-disk raid5 array is starting to show read errors and SMART warnings. Not enough to get it kicked out of the array as faulty, but enough to hurt performance, and it will probably get worse. I obviously want to replace this drive.

Now the question is: if I run this (sdc is the failing drive and sdd is the new one):

mdadm /dev/md0 -a /dev/sdd1 -f /dev/sdc1 -r /dev/sdc1

Will Linux first mark sdc1 as faulty, and never read from it again, and then sync up sdd1 from sda1 and sdb1 (the two other disks in the array)?

If so, then I'm vulnerable to the case where there is an unreadable block (even a single one!) on sda1 or sdb1, which would cause the rebuild to fail.
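
One way to gauge that risk before touching anything is to scrub the array first. A minimal sketch, assuming a kernel that exposes the md sysfs interface (the sync_action and mismatch_cnt files):

echo check > /sys/block/md0/md/sync_action   # read every sector of every member
cat /proc/mdstat                             # the scrub shows up like a resync
cat /sys/block/md0/md/mismatch_cnt           # inconsistencies found, once it finishes

If the check pass completes without read errors on sda1 and sdb1, a rebuild is much less likely to hit an unreadable block.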

What I want is for sdd1 to be synced as a copy of sdc1 before sdc1 is marked faulty. That way I'm never without redundancy (albeit with one redundancy stripe sitting on a disk that may well give read errors).

Is there a way to do this online? Offline I can (see the sketch after this list):

  • down the array (mdadm --stop)
  • dd sdc1 over to sdd1 (dd if=/dev/sdc1 of=/dev/sdd1)
  • take out sdc physically
  • bring the array up using the two old working ones and the new one (mdadm -A -s)
  • resync
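
Spelled out as commands, that offline procedure would look roughly like this (a sketch; it assumes the array is /dev/md0 and that sdd keeps its device name after sdc is removed):

mdadm --stop /dev/md0                                  # down the array
dd if=/dev/sdc1 of=/dev/sdd1 bs=1M conv=noerror,sync   # copy sdc1 to sdd1; conv=noerror,sync keeps dd going past read errors (unreadable blocks come back as zeroes)
# ...power down and physically remove sdc...
mdadm -A -s                                            # assemble from sda1, sdb1 and the copied sdd1
cat /proc/mdstat                                       # watch the resync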

Well, the problem with that method is that in the last step, if there is a mismatch, I want the new disk to be the one that gets rewritten, not the parity (whichever disk that happens to be on for that stripe).

So "rebuild sdd1 as a new sdc1, getting data from sda1 and sdb1, but if they fail, copy what's on sdc1".

Best Answer

Individually, those commands will not do what you desire.

mdadm /dev/md0 -a /dev/sdd1 
cat /proc/mdstat; #(you should now have a spare drive in raid5)
mdadm /dev/md0 -f /dev/sdc1
cat /proc/mdstat; #(you should now see a rebuild occurring to sdd1)

A test of the actual command does indeed cause a rebuild to occur.
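
While that rebuild runs you can keep an eye on it with nothing more than standard tools:

watch -n 5 cat /proc/mdstat   # refresh the rebuild progress every 5 seconds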

Alas, I don't believe you can do what you desire right now.

As an aside, I often refer to the Linux RAID wiki and experiment with what I see there using loopback files.

dd if=/dev/zero of=loopbackfile.0 bs=1024k count=100
losetup /dev/loop0 loopbackfile.0

That gives you a 100 MB file that is available as /dev/loop0. Create a couple more of them, and you can use mdadm (e.g. "mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/loop0 /dev/loop1 /dev/loop2") without affecting real drives or data.
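
Putting that together, here is a complete throwaway sandbox for playing with the fail/add sequence above (file names and loop device numbers are just examples; run as root, and pick an unused md device name if md0 already exists):

# create three 100 MB backing files and attach them to loop devices
for i in 0 1 2; do
    dd if=/dev/zero of=loopbackfile.$i bs=1024k count=100
    losetup /dev/loop$i loopbackfile.$i
done

# build a 3-disk raid5 out of them and watch it sync
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/loop0 /dev/loop1 /dev/loop2
cat /proc/mdstat

# tear everything down when finished
mdadm --stop /dev/md0
for i in 0 1 2; do losetup -d /dev/loop$i; done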


Note: I formerly said that

mdadm /dev/md0 -a /dev/sdd1
mdadm --grow /dev/md0 --raid-disks=4

would grow your array to a raid6. This is false. It will simply add a fourth disk to your array, which leaves you in no better a position than you are in now.
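
If you want to see what a given grow operation actually did, mdadm will tell you directly; after the two commands above you would still see a Raid Level of raid5, just with four raid devices:

mdadm --detail /dev/md0   # check the "Raid Level" and "Raid Devices" lines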