In theory, GlusterFS is an answer to your need.
Using GlusterFS, you can easily create RAID0 (type cluster/distribute) and RAID1-like volumes (type cluster/replicate), distributed across many machines.
GlusterFS's architecture lets you stack translators so that you can create two distributed volumes, replicate them, and then access your distributed/replicated data through a single mount point.
However, there are reports of bugs appearing when users stack translators this way (see the GlusterFS mailing lists).
That's why I don't trust GlusterFS enough to set up RAID10-like volumes. (I haven't tested this setup enough, so it's only an impression.)
Of course, simple RAID0-like and RAID1-like volumes seem production-ready.
Let's say you have two machines, A and B. On each machine, you export /opt/files as a Gluster brick and set up client-side replication. You then mount the resulting volume as /mnt/gluster-files on both machines. This is important!
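Assuming a reasonably recent GlusterFS with the gluster CLI (the hostnames "machineA"/"machineB" and the volume name "gv0" are placeholders, not anything from the original setup), the arrangement above might be sketched roughly like this:

```shell
# On machine A: introduce machine B to the trusted pool.
gluster peer probe machineB

# Create a 2-way replicated volume from the /opt/files brick on each machine.
gluster volume create gv0 replica 2 machineA:/opt/files machineB:/opt/files
gluster volume start gv0

# On both machines: mount the replicated volume at /mnt/gluster-files.
mkdir -p /mnt/gluster-files
mount -t glusterfs localhost:/gv0 /mnt/gluster-files
```

On older GlusterFS releases the same thing was done with hand-written volfiles (a cluster/replicate volume over two protocol/client subvolumes) instead of the CLI.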
Using that mount point, we now have a highly available file system across the two machines.
When you write a file - let's say /mnt/gluster-files/example - on machine A, two things happen:
- A copy is written to /opt/files on machine A.
- A copy is sent over the network to be written to /opt/files on machine B.
This is good, because we want redundancy, which means keeping more than one copy of the data.
Next up, let's say we want to read the same file. Again on machine A:
- You issue a read for /mnt/gluster-files/example
- GlusterFS says "I need to check all the replica nodes to find out who has the most recent version of this file"
- GlusterFS checks every node
- It turns out that all copies are the same, because replication is working nicely
- You are returned the file from your local disk. §
(§ There is a read-subvolume client option, and it is sensible to set it to the local volume on any machine that is both a Gluster client and server, as in this case. Otherwise, step 5 could be 'you are sent the file from a random node'.)
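As a sketch of where that option lives: on CLI-managed volumes there are replicate-translator knobs for preferring the local copy, while old volfile-based setups carried the option inside the client volfile. The volume name "gv0" and the subvolume names below are assumptions:

```shell
# Newer GlusterFS (CLI-managed): prefer the local brick for reads when
# the client is also a server. "gv0" is a placeholder volume name.
gluster volume set gv0 cluster.choose-local on

# Older, volfile-based setups instead set read-subvolume in the client
# volfile's replicate section, e.g.:
#   volume replicated
#     type cluster/replicate
#     option read-subvolume brick-local
#     subvolumes brick-local brick-remote
#   end-volume
```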
Behind the scenes, GlusterFS keeps /opt/files on both machines in sync. Checking every node, especially for a large number of small files, adds a not-insignificant performance penalty.
The question is therefore raised: if I am running a process on one of these two machines, and I know the files are in sync, why can't I just read the files from the local share?
It's not recommended, but you can do this: read the files directly from /opt/files. Manually keep track of whether you get out of sync, and if you do, run something like an ls -laR in /mnt/gluster-files, which will trigger a synchronization.
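The full-tree heal trigger recommended in the Gluster documentation of that era works the same way as the ls -laR trick - stat every file through the replicated mount so the replicate translator notices and repairs stale copies:

```shell
# Walk the replicated mount and stat every file. The access itself is
# what forces GlusterFS to compare replicas and self-heal; the output
# is discarded.
find /mnt/gluster-files -print0 | xargs -0 stat > /dev/null
```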
So, what happens if you write to /opt/files on machine A?
The file sits there, unnoticed by GlusterFS. Gluster doesn't work that way: the file doesn't get onto machine B unless you happen to do something that makes Gluster notice it on machine A.
Therefore, you can't just tell Apache to read and write to /opt/files. What seems like a good compromise is telling it to read from /opt/files but write to /mnt/gluster-files. This is only possible if your application lets you specify different paths for reading and writing files, which not many do.
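For an application that does support split paths, the pattern is simply: writes go through the replicated mount, reads come off the local brick. A minimal sketch (temporary directories stand in for the brick and the mount, since the real paths only exist on a Gluster node):

```shell
# Stand-ins for the real paths: BRICK ~ /opt/files, MOUNT ~ /mnt/gluster-files.
BRICK=$(mktemp -d)
MOUNT=$(mktemp -d)

# Writes always go through the replicated mount, so GlusterFS sees them
# and propagates them to the other machine.
printf 'hello\n' > "$MOUNT/example"

# On a real node the brick and the mount expose the same data; here we
# copy once to simulate replication having done its job.
cp "$MOUNT/example" "$BRICK/example"

# Reads come straight from the local brick, skipping the replica check.
cat "$BRICK/example"
```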
You might be able to use Ceph as your cluster file system and export it over SMB, with LDAP authentication (well, I assume Samba can authenticate against LDAP).
http://ceph.newdream.net/
Of course, that means running btrfs (beta) and Ceph (is it even beta yet?).
Have you checked on the Gluster mailing list? http://www.gluster.org/interact/mailinglists/
Ceph should do the job, assuming it's mature enough for your needs.