Install Gluster inside a VM, or set up a VM on top of Gluster

glusterfs, virtual-machines, virtualbox

I'm setting up a high-availability shared-nothing geographically-distributed web server, using multiple A-records for its domain.
Right now I'm more concerned with high availability — "When I unplug any one power cord, every browser can still see my web site" — than speed.

The web server software runs inside one virtual machine per physical box.
(Does it really matter which web server and hypervisor I'm using?
If it does, I'm currently using Apache and VirtualBox.)

Someone recommended I dump the current horrifically complicated home-grown system that I was planning to use to keep the web servers synchronized, and replace it with Gluster.

Which one of these alternatives is better?

  1. Have the host OS run only the hypervisor and store only the VM disk image. Inside each virtual machine, install the Gluster software, set up a GlusterFS mount point pointed at some folder (brick) inside the VM disk image, and use that mount point (or a folder inside it) as the web root.

  2. Have the host OS run only the hypervisor and store both the VM image and separately a folder (brick) that I allow the VM to access. Inside each virtual machine, install the Gluster software, set up a GlusterFS mount point pointed at the brick outside the VM disk image, and use that mount point (or a folder inside it) as the web root.

  3. Have the host OS run the hypervisor and Gluster. On the host OS, set up a GlusterFS mount point pointed at some folder (brick) elsewhere on the actual physical disk. Allow the VM to access the GlusterFS mount point as the web root. (No need to install Gluster software inside the virtual machine).

  4. Have the host OS run the hypervisor and Gluster. On the host OS, set up a GlusterFS mount point pointed at some folder (brick) elsewhere on the actual physical disk. Since both web servers should be identical, tell the hypervisor to store the virtual disk image inside the GlusterFS mount point.

  5. Something else?

I suspect someone who knows more than I do about Gluster can immediately say "If you do #4, (some horrible thing will happen), and (some other number) causes (some other horrible thing) … so the only option that actually works is (the only remaining number)".
(I.e., I don't think this is a subjective question).

(The "optimization" mentioned at
Can someone explain this GlusterFS setup?
could apply to any of these alternatives).

Best Answer

Unless things have changed since last year, I advise against using GlusterFS AFR (replication) on geographically dispersed hardware. It does not handle latency well, and if you fail to specify a read-subvolume it will apparently at random (it isn't really random) try to read from a remote brick. As a test, set up your two highest-latency nodes in Gluster replication, then run touch testfile && time stat testfile on that filesystem to see how long every filesystem operation is going to take, at a minimum. Even though you said your application is shared-nothing, Gluster will still do lock checking and coherency checks for you, which means polling metadata from all the other replicas.
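
A rough sketch of that test, assuming a replicated volume already exists and is reachable from one node (the node name, volume name, and mount point below are just placeholders):

    # Mount the replicated volume with the native client (placeholder names)
    mount -t glusterfs node1:/webvol /mnt/webvol
    cd /mnt/webvol

    # Create a file, then time a metadata operation; with AFR this stat
    # triggers coherency checks against every replica, so the wall-clock
    # time is roughly the floor for every filesystem op on this mount.
    touch testfile && time stat testfile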

After running said test, if you still want to use gluster, this is what you asked:

If you use #4 and try to mount the same VM image on multiple VMs at the same time, you're going to get consistency problems in the filesystem; normal Linux filesystems do not handle backing storage that changes underneath them at all.

Filesystem passthrough like in #2 has poor performance in every virtualization system I've tried. You're much better off using NFS as the transport instead of something like virtio 9p (KVM) or whatever the equivalent is for VirtualBox; the Windows-host filesystem passthrough for VirtualBox is fairly awful (read: slow, fragile), and I imagine the Linux-hosted equivalent is similar. Even if you do get filesystem passthrough working, odds are it won't support extended attributes (xattrs), which GlusterFS replication requires. Historically, getting NFS configured for extended attributes has been something of a pain, though you might have better luck with NFSv4. This approach is fraught with pitfalls; I would avoid it.
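
If you do go down this road anyway, a quick way to see whether a given mount supports the extended attributes Gluster needs is a check like the following (the mount path is a placeholder for wherever the passthrough or NFS share lands inside the VM; requires the attr package):

    # Try to set and read back a user xattr on the candidate mount point.
    touch /mnt/shared/xattr-test
    setfattr -n user.test -v works /mnt/shared/xattr-test && \
        getfattr -n user.test /mnt/shared/xattr-test
    # An "Operation not supported" error here means GlusterFS replication,
    # which keeps its metadata in xattrs on the bricks, won't work on it.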

#3 has some promise. Set up Gluster on the host, then run the Gluster NFS listener and have the VM connect to that. It is also the option that is most easily converted if you later find out that (a) you don't want to use VMs and/or (b) you don't want to use GlusterFS.
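
A minimal sketch of #3, assuming two hosts and a replicated volume; the host names, volume name, brick paths, and web root are placeholders, not a prescription:

    # On the hosts (run once, from host1)
    gluster peer probe host2
    gluster volume create webvol replica 2 \
        host1:/srv/gluster/webvol-brick host2:/srv/gluster/webvol-brick
    gluster volume start webvol

    # Inside each VM: mount the volume over NFS (Gluster's built-in NFS
    # listener speaks NFSv3 only) and use it as the web root.
    mount -t nfs -o vers=3,nolock host1:/webvol /var/www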

#1 is equally workable, but it has no advantage over bare metal; any security gains are effectively negated since you say it's a single-application system (for that matter, the same is true of #3). The one difference with this option is that you can use the Linux-native GlusterFS client inside the VM. You may want to give the brick its own virtual disk so you can detach it and reimage the VM without repopulating the brick.
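
For illustration, what #1 might look like inside each VM, assuming the peers and a replicated volume named webvol have been created as in the #3 sketch, and that the brick's dedicated virtual disk shows up as /dev/sdb (all names are placeholders):

    # Put the brick on its own virtual disk so it survives a reimage.
    mkfs.xfs /dev/sdb
    mkdir -p /srv/gluster/webvol-brick
    mount /dev/sdb /srv/gluster/webvol-brick

    # Mount the volume with the native FUSE client and serve from it
    # (glusterd runs inside this VM in option #1, hence localhost).
    mkdir -p /var/www
    mount -t glusterfs localhost:/webvol /var/www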
