Mdadm – Remove disk from RAID0

So, I'd like to know: is it possible to do the following with mdadm?

I start with RAID0 configuration on 2 disks: sda and sdb.
I would like to add one more disk, sdc, to the array and move all data from sdb to it.
Disconnect sdb.

Right now I see only one option: stop the array, copy sdb to sdc with dd or another block-copy tool, and start the array again.

Am I missing something? Is it possible to do this with mdadm?

Best Answer

First of all, to those who still believe that "RAID0 has no hot spare": it can have a manual spare, operated by a human who understands RAID levels and mdadm. mdadm is software RAID, so it can do a lot of interesting things.

Credits to Zoredache for the idea!

So, the situation:

you have a RAID0 array of two disks
you would like to replace one of them without array downtime

If downtime is acceptable, you can always just make a block copy of the disk with dd and reassemble the array; mdadm handles that fine.
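
The offline route could look roughly like the sketch below. It uses the device names from the environment listed further down (sdb is replaced by sde, sdc stays); run and offline_replace are hypothetical helper names, and the script only prints the commands unless DRY_RUN=0 is set. Treat it as an outline under those assumptions, not a tested recovery procedure.

```shell
# Sketch only: offline replacement of one RAID0 member by block copy.
# DRY_RUN defaults to 1, so the commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# offline_replace ARRAY OLD_DISK NEW_DISK KEPT_DISK
offline_replace() {
    run mdadm --stop "$1"                 # array must be stopped (downtime starts)
    run dd if="$2" of="$3" bs=1M          # copy the old member, superblock included
    run mdadm --assemble "$1" "$4" "$3"   # reassemble from the kept and the new disk
}

offline_replace /dev/md0 /dev/sdb /dev/sde /dev/sdc
```

The new disk has to be at least as large as the old one. Because the copy carries the same md superblock as the original, the members are named explicitly when reassembling.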

Solution: use RAID4 as an intermediate step

RAID0 -> RAID4 -> RAID0

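If you don't remember RAID4, it is simple. It has parity blocks, but unlike RAID5 they are not distributed across the array: all parity resides on ONE disk. This is the important point, and the reason RAID5 will not work: with the parity on a dedicated disk, the data disks still hold exactly the RAID0 stripes, so the parity disk can be dropped at the end.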
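
To see why one parity disk is enough to rebuild a lost data disk, here is a tiny illustration of XOR parity using shell arithmetic. The one-byte "blocks" and the helper name xor2 are made up for the example; md works on whole chunks, but the arithmetic is the same idea.

```shell
# xor2 A B: print A XOR B (stands in for XOR-ing two equally sized blocks).
xor2() {
    echo $(( $1 ^ $2 ))
}

d0=$(( 0xA5 ))              # block on the first data disk
d1=$(( 0x3C ))              # block on the second data disk
parity="$(xor2 "$d0" "$d1")"     # what the dedicated parity disk stores

# Pretend the first data disk is gone: rebuild its block from the survivors.
rebuilt="$(xor2 "$parity" "$d1")"

printf 'parity=0x%02X rebuilt=0x%02X\n' "$parity" "$rebuilt"
# prints: parity=0x99 rebuilt=0xA5
```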

What you'll need: two more disks of the same size as the disk you would like to replace.

Environment:

Ubuntu 14.04 Trusty Tahr
mdadm - v3.2.5 - 18th May 2012
/dev/sdb - start with it, will replace it
/dev/sdc - start with it
/dev/sdd - will be used temporarily (it will hold the parity)
/dev/sde - will be used instead of sdb

The ultimate RAID0 hot-spare mdadm guide ;)

sudo mdadm -C /dev/md0 -l 0 -n 2 /dev/sd[bc]

md0 : active raid0 sdc[1] sdb[0]
      2096128 blocks super 1.2 512k chunks

We've created a RAID0 array; it looks sweet.

sudo md5sum /dev/md0

b422ba644a3c83cdf28adfa94cb658f3  /dev/md0

This is our checkpoint: if even one bit differs in the resulting /dev/md0, we've failed.
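
The before/after comparison can be scripted so that nobody has to compare hashes by eye. This is a minimal sketch: checksum_of is a hypothetical helper name, and the demo runs on a scratch file so it is harmless to try; on the real system the path would be /dev/md0, read as root. The comparison is only meaningful if nothing writes to the array between the two checksums.

```shell
# checksum_of PATH: print only the MD5 hex digest of PATH.
checksum_of() {
    md5sum "$1" | cut -d' ' -f1
}

# Scratch file standing in for /dev/md0 in this demo.
target="$(mktemp)"
printf 'example data\n' > "$target"

before="$(checksum_of "$target")"
# ... the RAID0 -> RAID4 -> RAID0 steps would happen here ...
after="$(checksum_of "$target")"

if [ "$before" = "$after" ]; then
    echo "checksum unchanged"
else
    echo "checksum CHANGED" >&2
fi
rm -f "$target"
```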

sudo mdadm /dev/md0 --grow --level=4

md0 : active raid4 sdc[1] sdb[0]
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/2] [UU_]

So, we've converted our array to RAID4. The conversion is instant, since there is nothing to recompute or recalculate yet. We haven't added the parity disk (note the [3/2] [UU_]: three slots, two members present), so let's do it.

sudo mdadm /dev/md0 -a /dev/sdd

md0 : active raid4 sdd[3] sdc[1] sdb[0]
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/2] [UU_]
      [===>.................]  recovery = 19.7% (207784/1048064) finish=0.2min speed=51946K/sec

We've added sdd as the parity disk. This is important to remember: the order of the disks in the first row does not match the order of the status flags in the second row ([UU_]).

sdd is displayed first, but in fact it is the last one, and it holds the parity, not data.

sudo mdadm /dev/md0 -f /dev/sdb

md0 : active raid4 sdd[3] sdc[1] sdb[0](F)
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/2] [_UU]

We've marked sdb as faulty so that we can remove it in the next steps.

sudo mdadm --detail /dev/md0

State : clean, degraded

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       32        1      active sync   /dev/sdc
       3       8       48        2      active sync   /dev/sdd

       0       8       16        -      faulty spare   /dev/sdb

The details show that the first disk has been removed, and here we can see the true order of the disks in the array. It's important to keep track of the disk that holds the parity: we must not leave it in the array when going back to RAID0.
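
A small helper can make the slot order explicit. The sketch below assumes the table layout shown above (RaidDevice in the fourth column, device path in the last one); slots_from_detail is a hypothetical name, not part of mdadm. The sample input is the table from this walkthrough.

```shell
# slots_from_detail: read `mdadm --detail` output on stdin and print
# "RaidDevice /dev/path" for every member that occupies a numbered slot.
slots_from_detail() {
    awk '$4 ~ /^[0-9]+$/ && $NF ~ /^\/dev\// { print $4, $NF }'
}

slots_from_detail <<'EOF'
    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       32        1      active sync   /dev/sdc
       3       8       48        2      active sync   /dev/sdd

       0       8       16        -      faulty spare   /dev/sdb
EOF
# prints:
# 1 /dev/sdc
# 2 /dev/sdd
```

On a live system the input would come from `sudo mdadm --detail /dev/md0`. In this three-slot RAID4 the highest slot (2) is the parity disk, which matches sdd here.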

sudo mdadm /dev/md0 -r /dev/sdb

md0 : active raid4 sdd[3] sdc[1]
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/2] [_UU]

sdb is now completely removed from the array and can be physically disconnected.

sudo mdadm /dev/md0 -a /dev/sde

md0 : active raid4 sde[4] sdd[3] sdc[1]
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/2] [_UU]
      [==>..................]  recovery = 14.8% (156648/1048064) finish=0.2min speed=52216K/sec

We have added the replacement for our sdb disk. And here we go: the data that was on sdb is now being rebuilt onto sde from the remaining data disk and the parity. Sweeeeet.

md0 : active raid4 sde[4] sdd[3] sdc[1]
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/3] [UUU]

Done. Right now we are completely safe: all data from sdb has been recovered onto sde. Now we have to remove sdd (remember, it holds the parity).
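
Before failing the parity disk it is worth confirming that the rebuild really finished, because from that point on there is no redundancy again. The following sketch reads /proc/mdstat-style text and extracts the status flags; md_status is a hypothetical helper name, and the sample text is the output shown above.

```shell
# md_status NAME: read /proc/mdstat-style text on stdin and print the
# member status flags (for example UUU or _UU) of array NAME.
md_status() {
    awk -v name="$1" '
        $1 == name { found = 1; next }
        found && match($0, /\[[U_]+\]/) {
            print substr($0, RSTART + 1, RLENGTH - 2)
            exit
        }
    '
}

sample='md0 : active raid4 sde[4] sdd[3] sdc[1]
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/3] [UUU]'

flags="$(printf '%s\n' "$sample" | md_status md0)"
case "$flags" in
    "")  echo "md0 not found" ;;
    *_*) echo "md0 still degraded or rebuilding: [$flags]" ;;
    *)   echo "md0 fully in sync: [$flags]" ;;
esac
# prints: md0 fully in sync: [UUU]
```

On a live system the input would be `md_status md0 < /proc/mdstat`. mdadm also has a --wait option that blocks until resync, recovery, or reshape activity on the given array has finished.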

sudo mdadm /dev/md0 -f /dev/sdd

md0 : active raid4 sde[4] sdd[3](F) sdc[1]
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/2] [UU_]

Made sdd faulty.

sudo mdadm /dev/md0 -r /dev/sdd

md0 : active raid4 sde[4] sdc[1]
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/2] [UU_]

Removed sdd from our array. We are ready to become RAID0 again.

sudo mdadm /dev/md0 --grow --level=0 --backup-file=backup

md0 : active raid4 sde[4] sdc[1]
      2096128 blocks super 1.2 level 4, 512k chunk, algorithm 5 [3/2] [UU_]
      [=>...................]  reshape =  7.0% (73728/1048064) finish=1.5min speed=10532K/sec

Aaaaaaand bang!

md0 : active raid0 sde[4] sdc[1]
      2096128 blocks super 1.2 512k chunks

Done. Let's look at the md5 checksum.

sudo md5sum /dev/md0

b422ba644a3c83cdf28adfa94cb658f3  /dev/md0

Any more questions? So RAID0 can have a hot spare. It's called "user" ;)

Related Solutions

RAID – mdadm Did Not Notice a Failed Disk in RAID0

The md(4) man page sheds some light on how the word "clean" is used (crucial bit italicized):

Unclean Shutdown

When changes are made to a RAID1, RAID4, RAID5, RAID6, or RAID10 array there is a possibility of inconsistency for short periods of time as each update requires at least two block to be written to different devices, and these writes probably won't happen at exactly the same time. Thus if a system with one of these arrays is shutdown in the middle of a write operation (e.g. due to power failure), the array may not be consistent.

To handle this situation, the md driver marks an array as "dirty" before writing any data to it, and marks it as "clean" when the array is being disabled, e.g. at shutdown. If the md driver finds an array to be dirty at startup, it proceeds to correct any possibly inconsistency. For RAID1, this involves copying the contents of the first drive onto all other drives. For RAID4, RAID5 and RAID6 this involves recalculating the parity for each stripe and making sure that the parity block has the correct data. For RAID10 it involves copying one of the replicas of each block onto all the others. This process, known as "resynchronising" or "resync" is performed in the background. The array can still be used, though possibly with reduced performance.

If a RAID4, RAID5 or RAID6 array is degraded (missing at least one drive, two for RAID6) when it is restarted after an unclean shutdown, it cannot recalculate parity, and so it is possible that data might be undetectably corrupted. The 2.4 md driver does not alert the operator to this condition. The 2.6 md driver will fail to start an array in this condition without manual intervention, though this behaviour can be overridden by a kernel parameter.

It's plausible that a disk in the RAID failed after the RAID was safely and normally disabled by the system (at, e.g., a shutdown). In other words, the disk failure happened with the RAID in a consistent, synchronized state. The RAID would then be flagged "clean", and the flag would remain when the array was next enabled with one of its disks failed.

Mdadm, RAID5 all disks marked as spare, won’t start

Try assembling it without the odd third disk sdc, i.e.

mdadm --assemble /dev/md0 /dev/sda /dev/sdb /dev/sdd --verbose

That sounds like it could work, because the remaining three appear to be in sync and, with RAID-5, N-1 disks are sufficient to restart the array in degraded mode.

It's possible that the device indices aren't right, so examine the mdadm -E output and see if you can identify the set of three working disks. From the error messages, it sounds like both sdc and sda had failed simultaneously at some point, which is something RAID-5 can't handle gracefully.

(Originally I had suggested to omit the third disk by replacing it with the string missing, but that is --create syntax as pointed out by S.Haran below.)

Afterwards, once you have verified that things are in order, you can try to re-add the third (fourth) disk with:

sudo mdadm /dev/md0 --add /dev/sdc