I suggest you look at glusterfs. I use it for 1) transparency - its backing store, if you will, is an ordinary file-system such as ext3; 2) data availability - glusterfs provides striping, replication, or any combination of the two; 3) performance and reliability; and 4) ease of use.
While you could use it in a (web-server) client / (file-server) server mode, depending on the speed of your network it may make more sense to enable it on each machine. In a sense the file-server becomes the definitive source. Each web-server reads and writes to its own local glusterfs server, or at the very least its own cache, at local I/O speeds, and syncs to the file-server at network speeds, making the system quite fast.
It can use TCP or InfiniBand, and it seems to work under Amazon Web Services. It also exports NFS and CIFS, so it can be rather portable. Install via yum under CentOS and you can be up and running in under 20 minutes. Compared to GNBD, it is much easier to set up and use. Glusterfs is configured in a highly modular way, so you can use only what you need.
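As a rough sketch of that kind of setup - using the newer gluster CLI rather than the glusterfs-volgen tool mentioned below, and with hostnames, brick paths, the volume name, and the service name as placeholders - a two-node replicated volume looks roughly like this:

    # On both nodes (CentOS):
    yum install glusterfs-server
    service glusterd start            # daemon/service name may vary by release

    # On node1: join the peers and create a 2-way replicated volume
    gluster peer probe node2
    gluster volume create webdata replica 2 node1:/export/brick1 node2:/export/brick1
    gluster volume start webdata

    # On each web-server: mount the volume locally
    mount -t glusterfs localhost:/webdata /var/www/shared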
The beauty of glusterfs is that it's very tolerant of network or host outages. At my business, whcreative.com, I use it for partially mobile laptops serving home directories as well as HTML and database file-systems (for the Drupal CMS) in a mixed environment with CentOS 5.5, Fedora 13, and other assorted Linux flavours. Home directories are served from every laptop as well as the server. When a laptop reconnects after being used off-network, a simple ls -Rl on the server syncs everything. If a machine crashes and the ext4 filesystem potentially has stale data, it's not a problem: syncing to the crashed machine once it's back up solves the problem rather quickly.
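For reference, that resync trigger is just a recursive walk of the mount; a sketch, with the mount point as a placeholder:

    # Stat every file so the client notices and repairs stale copies
    ls -Rl /mnt/gluster > /dev/null
    # A more thorough variant sometimes suggested in GlusterFS docs of that era
    find /mnt/gluster -noleaf -print0 | xargs --null stat > /dev/null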
The first drawback is that it is only tested on x86_64 (claimed to run on i386), though that is not a big issue for most. The bigger drawback is its documentation. For example, there is no man page describing one of the key commands, glusterfs-volgen, and the 'man-like' page on the website does not provide a working synopsis, although it does provide examples. Configuration options are not clearly documented and take a bit of hacking to figure out. The last drawback is that it essentially relies only on user permissions for security; but in the *nix tradition it is quite easy to run inside a VPN, so that's not really a big issue.
I cannot vouch for its reliability as I have only been using it a few months. However, it seems to handle our home directories just fine after disconnecting, using the laptop off-network, and reconnecting. Of course I don't trust it completely and do tar-based backups to an ext3 filesystem on a CentOS box.
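That fallback can be as simple as something like the following (paths are placeholders):

    # Nightly tar of the gluster-served trees onto a plain ext3 volume
    tar -czf /backup/homes-$(date +%F).tar.gz /mnt/gluster/home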
Best of luck,
Eric Chowanski
This is easier to answer...
It's all distilled here: "ZFS RAID recommendations: space, performance, and MTTDL" and "A Closer Look at ZFS, Vdevs and Performance".
- RAIDZ with one parity drive will give you a single disk's IOPS performance, but (n-1) times the aggregate bandwidth of a single disk.
So if you need to scale, you scale with the number of RAIDZ vdevs... E.g. with 16 disks, 4 groups of 4-disk RAIDZ would have greater IOPS potential than 2 groups of 8-disk RAIDZ.
Surprising, right?
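To make the 16-disk example concrete, here is a sketch (pool and device names are placeholders):

    # Four 4-disk RAIDZ vdevs striped into one pool; IOPS scale with the vdev count
    zpool create tank \
        raidz sda sdb sdc sdd \
        raidz sde sdf sdg sdh \
        raidz sdi sdj sdk sdl \
        raidz sdm sdn sdo sdp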
I typically go with striped mirrors (RAID 1+0) on my ZFS installations. The same concept applies. More mirrored pairs == better performance.
In ZFS, you can only expand in units of a full vdev. So while expansion of a RAID 1+0 set means adding more pairs, doing the same for RAIDZ sets means adding more RAIDZ groups of equal composition.
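A minimal sketch of that mirrored layout and of vdev-at-a-time expansion (names are placeholders again):

    # Striped mirrors (RAID 1+0): each mirrored pair is one vdev
    zpool create tank mirror sda sdb mirror sdc sdd mirror sde sdf
    # Expansion means adding another whole vdev, i.e. another mirrored pair
    zpool add tank mirror sdg sdh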
Best Answer
The question is a bit limited on information as to how the tests were done, so I am assuming that you were just using dd to read a single file sequentially.
BTRFS:
In BTRFS terms, RAID1 means TWO COPIES regardless of how many disks you have in the pool. So for simplicity let's assume that both data and metadata are stored as RAID1 on two disks. Each disk is then a copy of the content of the other disk (although in BTRFS the on-disk layout may not be identical).
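For illustration, creating such a two-disk RAID1 filesystem could look like this (device names and mount point are placeholders):

    # Keep both data (-d) and metadata (-m) in two copies across the two devices
    mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
    mount /dev/sdb /mnt/btrfs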
When BTRFS executes reads, it (the last time I checked) relies on the PID of the reading process to determine which disk to read from first. That means that a single process will only read from one of the disks, unless there is an error and a good copy needs to be retrieved from the other disk.
The next time you run that process it may have a different PID, and BTRFS will read the data from the other disk first.
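One way to observe this, as a sketch (device names, mount point, and file name are placeholders), is to watch per-disk activity while a single reader runs:

    # Watch both member disks while one sequential reader runs
    iostat -dx 1 sdb sdc &
    dd if=/mnt/btrfs/bigfile of=/dev/null bs=1M iflag=direct
    # Run the dd again: a process with a different PID may be steered to the other disk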
MD (mdadm):
For sequential reads you would not gain much, if anything, since the same data is on both devices and both disks would therefore need to skip (seek over) the same amount of data before starting a read. E.g. disk A would need to skip over the first 10 bytes before reading the next 10 bytes, and even if disk B could read the first 10 bytes while disk A is skipping them, it would still need to wait until disk A has finished in order to assemble in memory the 20 bytes it was supposed to read.
From the MD manual page (man md):
"Note that the read balancing done by the driver does not make the RAID1 performance profile be the same as for RAID0; a single stream of sequential input will not be accelerated (e.g. a single dd), but multiple sequential streams or a random workload will use more than one spindle. In theory, having an N-disk RAID1 will allow N sequential threads to read from all disks."
ZFS:
I have no knowledge of ZFS, but I expect it to work roughly the same as BTRFS/MDADM.
CONCLUSION:
For single sequential read operations like you probably do with dd, there is not much to be gained performance-wise from a RAID1 setup on either BTRFS or MDADM.
If you would like to see the improved read speed (which does exist) on both BTRFS and MDADM, you would need to run multiple different reads in parallel on the array. BTRFS would likely distribute the reads to different disks based on PID, and MDADM should reduce the number of seeks significantly. Remember that RAID1 is not the same as RAID0, especially not on BTRFS arrays.
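A simple way to check this yourself, sketched with placeholder paths: compare one sequential reader against two running concurrently.

    # Single reader: expect roughly one disk's throughput on RAID1
    dd if=/mnt/raid/file1 of=/dev/null bs=1M iflag=direct
    # Two concurrent readers: each can be served from a different member disk
    dd if=/mnt/raid/file1 of=/dev/null bs=1M iflag=direct &
    dd if=/mnt/raid/file2 of=/dev/null bs=1M iflag=direct &
    wait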