I suggest you look at glusterfs. I use it for 1) transparency - its backing store, if you will, is an ordinary file-system such as ext3; 2) data availability - glusterfs provides striping, replication, or any combination of the two; 3) performance and reliability; and 4) ease of use.
While you could use it in a (web-server) client / (file-server) server mode, depending on the speed of your network it may make more sense to enable it on each machine. In a sense the file-server becomes the definitive source. Each web-server reads and writes to its own local glusterfs server, or at the very least its own cache, at local I/O speeds, and syncs to the file-server at network speeds, making the system quite fast.
It can use TCP or InfiniBand, and it seems to work under Amazon Web Services. It also exports NFS and CIFS, so it can be rather portable. Install via yum under CentOS and you can be up and running in under 20 minutes. Compared to GNBD, it is much easier to set up and use. Glusterfs is configured in a highly modular way, so you can use only what you need.
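As a rough sketch of that kind of setup - using the newer gluster CLI rather than the glusterfs-volgen tool mentioned below, and with hostnames, brick paths, the volume name, and the service name as placeholders - a two-node replicated volume looks roughly like this:

    # On both nodes (CentOS):
    yum install glusterfs-server
    service glusterd start            # daemon/service name may vary by release

    # On node1: join the peers and create a 2-way replicated volume
    gluster peer probe node2
    gluster volume create webdata replica 2 node1:/export/brick1 node2:/export/brick1
    gluster volume start webdata

    # On each web-server: mount the volume locally
    mount -t glusterfs localhost:/webdata /var/www/shared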
The beauty of glusterfs is that it's very tolerant of network or host outages. At my business, whcreative.com, I use it for partially mobile laptops serving home directories as well as HTML and database file-systems (for the Drupal CMS) in a mixed environment with CentOS 5.5, Fedora 13, and other assorted Linux flavours. Home directories are served from every laptop as well as the server. When a laptop reconnects after being used off-network, a simple ls -Rl on the server syncs everything. If a machine crashes and the ext4 filesystem potentially has stale data, it's not a problem: syncing to the crashed machine once it's back up solves the problem rather quickly.
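For reference, that resync trigger is just a recursive walk of the mount; a sketch, with the mount point as a placeholder:

    # Stat every file so the client notices and repairs stale copies
    ls -Rl /mnt/gluster > /dev/null
    # A more thorough variant sometimes suggested in GlusterFS docs of that era
    find /mnt/gluster -noleaf -print0 | xargs --null stat > /dev/null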
The first drawback is that it is only tested on x86_64 (claimed to run on i386), though that is not a big issue for most. The bigger drawback is its documentation. For example, there is no man page describing one of the key commands, glusterfs-volgen, and the 'man-like' page on the website does not provide a working synopsis, although it does provide examples. Configuration options are not clearly documented and take a bit of hacking to figure out. The last drawback is that it essentially relies only on user permissions for security; but in the *nix tradition it is quite easy to run inside a VPN, so that's not really a big issue.
I cannot vouch for its reliability as I have only been using it a few months. However, it seems to handle our home directories just fine after disconnecting, using the laptop off-network, and reconnecting. Of course I don't trust it completely and do tar-based backups to an ext3 filesystem on a CentOS box.
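That fallback can be as simple as something like the following (paths are placeholders):

    # Nightly tar of the gluster-served trees onto a plain ext3 volume
    tar -czf /backup/homes-$(date +%F).tar.gz /mnt/gluster/home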
Best of luck,
Eric Chowanski
This is easier to answer...
It's all distilled here: "ZFS RAID recommendations: space, performance, and MTTDL" and "A Closer Look at ZFS, Vdevs and Performance".
- RAIDZ with one parity drive will give you a single disk's IOPS performance, but (n-1) times the aggregate bandwidth of a single disk.
So if you need to scale, you scale with the number of RAIDZ vdevs... E.g. with 16 disks, 4 groups of 4-disk RAIDZ would have greater IOPS potential than 2 groups of 8-disk RAIDZ.
Surprising, right?
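To make the 16-disk example concrete, here is a sketch (pool and device names are placeholders):

    # Four 4-disk RAIDZ vdevs striped into one pool; IOPS scale with the vdev count
    zpool create tank \
        raidz sda sdb sdc sdd \
        raidz sde sdf sdg sdh \
        raidz sdi sdj sdk sdl \
        raidz sdm sdn sdo sdp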
I typically go with striped mirrors (RAID 1+0) on my ZFS installations. The same concept applies. More mirrored pairs == better performance.
In ZFS, you can only expand in units of a full vdev. So while expansion of a RAID 1+0 set means adding more pairs, doing the same for RAIDZ sets means adding more RAIDZ groups of equal composition.
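A minimal sketch of that mirrored layout and of vdev-at-a-time expansion (names are placeholders again):

    # Striped mirrors (RAID 1+0): each mirrored pair is one vdev
    zpool create tank mirror sda sdb mirror sdc sdd mirror sde sdf
    # Expansion means adding another whole vdev, i.e. another mirrored pair
    zpool add tank mirror sdg sdh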
Best Answer
The question is a bit limited on information as to how the tests were done, so I am assuming that you were just using dd to read a single file sequentially.
BTRFS:
In BTRFS terms, RAID1 means TWO COPIES regardless of how many disks you have in the pool. So for simplicity let's assume that both data and metadata are stored as RAID1 on two disks. Each disk is then a copy of the content of the other disk (although in BTRFS the on-disk layout may not be identical).
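For illustration, creating such a two-disk RAID1 filesystem could look like this (device names and mount point are placeholders):

    # Keep both data (-d) and metadata (-m) in two copies across the two devices
    mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
    mount /dev/sdb /mnt/btrfs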
When BTRFS executes reads, it (the last time I checked) relies on the PID of the reading process to determine which disk to read from first. That means that a single process will only read from one of the disks, unless there is an error and a good copy needs to be retrieved from the other disk.
The next time you run that process it may have a different PID, and BTRFS will read the data from the other disk first.
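One way to observe this, as a sketch (device names, mount point, and file name are placeholders), is to watch per-disk activity while a single reader runs:

    # Watch both member disks while one sequential reader runs
    iostat -dx 1 sdb sdc &
    dd if=/mnt/btrfs/bigfile of=/dev/null bs=1M iflag=direct
    # Run the dd again: a process with a different PID may be steered to the other disk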
MD (mdadm):
For sequential reads you would not gain much, if anything, since the same data is on both devices and both disks would therefore need to skip (seek over) the same amount of data before starting a read. E.g. disk A would need to skip over the first 10 bytes before reading the next 10 bytes, and even if disk B could read the first 10 bytes while disk A is skipping them, it would still need to wait until disk A has finished in order to assemble in memory the 20 bytes it was supposed to read.
From the MD manual page (man md):
"Note that the read balancing done by the driver does not make the RAID1 performance profile be the same as for RAID0; a single stream of sequential input will not be accelerated (e.g. a single dd), but multiple sequential streams or a random workload will use more than one spindle. In theory, having an N-disk RAID1 will allow N sequential threads to read from all disks."
ZFS:
I have no knowledge of ZFS, but I expect it to work roughly the same as BTRFS/MDADM.
CONCLUSION:
For single sequential read operations like you probably do with dd, there is not much to be gained performance-wise from a RAID1 setup on either BTRFS or MDADM.
If you would like to see the improved read speed (which does exist) on both BTRFS and MDADM, you would need to run multiple different reads in parallel on the array. BTRFS would likely distribute the reads to different disks based on PID, and MDADM should reduce the number of seeks significantly. Remember that RAID1 is not the same as RAID0, especially not on BTRFS arrays.
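A simple way to check this yourself, sketched with placeholder paths: compare one sequential reader against two running concurrently.

    # Single reader: expect roughly one disk's throughput on RAID1
    dd if=/mnt/raid/file1 of=/dev/null bs=1M iflag=direct
    # Two concurrent readers: each can be served from a different member disk
    dd if=/mnt/raid/file1 of=/dev/null bs=1M iflag=direct &
    dd if=/mnt/raid/file2 of=/dev/null bs=1M iflag=direct &
    wait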