Storage Solutions – How to Grow Beyond 150 TB with ZFS and GlusterFS

glusterfs, storage, storage-area-network, zfs

My group currently has two largish storage servers, both NAS boxes running Debian Linux. The first is an all-in-one 24-disk (SATA) server that is several years old; it has two hardware RAID sets with LVM on top of those. The second server has 64 disks divided over 4 enclosures, each a hardware RAID 6, connected via external SAS; we run XFS over LVM on top of that to provide 100 TB of usable storage.

All of this works pretty well, but we are outgrowing these systems. Having built two such servers and still growing, we want something that gives us more flexibility for future growth and backup options, behaves better under disk failure (checking the larger filesystem can take a day or more), and can stand up in a heavily concurrent environment (think small compute cluster). We do not have system administration support, so we administer all of this ourselves (we are a genomics lab).
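For reference, the stack on the second server looks roughly like this (a sketch only; device names, volume-group names, and mount points are placeholders, not our actual configuration):

    # Each hardware RAID 6 enclosure shows up as a single block device (example names)
    pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde          # one PV per enclosure
    vgcreate vg_storage /dev/sdb /dev/sdc /dev/sdd /dev/sde
    lvcreate -l 100%FREE -n lv_data vg_storage            # one big logical volume
    mkfs.xfs /dev/vg_storage/lv_data                      # XFS on top, ~100 TB usable
    mount /dev/vg_storage/lv_data /data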

So, what we seek is a relatively low-cost, acceptable-performance storage solution that will allow future growth and flexible configuration (think ZFS with different pools having different operating characteristics). We are probably outside the realm of a single NAS. If we do it ourselves, we have been thinking about a combination of ZFS (on OpenIndiana, for example) or btrfs per server, with GlusterFS running on top of that. What we are weighing that against is simply biting the bullet and investing in Isilon or 3Par storage solutions.
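To make the do-it-yourself option concrete, the layering we have in mind would look roughly like this (a sketch only, assuming something like ZFS on Linux plus the Gluster packages; pool, dataset, server, and disk names are all made up):

    # On each storage server: one ZFS pool built from raidz2 vdevs (example disks)
    zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
    zfs create tank/brick1                        # dataset that backs a Gluster brick

    # Aggregate the per-server bricks into a single namespace with GlusterFS
    gluster peer probe server2
    gluster volume create labdata replica 2 server1:/tank/brick1 server2:/tank/brick1
    gluster volume start labdata

    # Clients mount the Gluster volume rather than the individual servers
    mount -t glusterfs server1:/labdata /mnt/labdata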

Any suggestions or experiences are appreciated.

Best Answer

I hope this is gonna help a little. I tried to not let it turn into a full wall of text. :)

3Par/Isilon

If you can and will dedicate a fixed amount of man-hours to someone who takes on the SAN admin role, and you want to enjoy a painless life with sleep at night instead of work at night, then this is the way I'd go.

A SAN lets you do all the things where a single storage box would limit you (e.g. connect a Pure Storage flash array and a big 3Par SATA monster to the same server), but you also have to pay for it and keep it well maintained the whole time if you want to make use of that flexibility.

Alternatives

Amplidata

Pros: Scale-out, cheap, designed around a nice concept with dedicated read/write cache layers. This might actually be the best fit for you.

RisingTideOS

Their target software is used in almost all Linux storage appliances now, and it allows for somewhat better management than plain Linux / Gluster tooling would (IMHO). The commercial version might be worth a look.

Gluster/btrfs

Pro: Scales out, and "bricks" give you an abstraction layer that is very good for management (see the sketch after the con below).

Con: Gluster has been a total PITA for me. It was not robust, and failures could be either local to one brick or take out everything. Now, with Red Hat in control, it might actually turn into something that works, and I've even met people who can tame it so that it runs for years. Btrfs is still half-experimental; normally a filesystem needs 3-4 years after it's "done" before it's proven and robust. If you care about the data, why would you ever consider it? Speaking of experimental: commercial support for Ceph is almost here, but you'd need to stick to the RBD layer; the filesystem on top (CephFS) is just not well-tested enough yet. I want to make it clear, though, that Ceph is much more attractive in the long run. :)
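To show what the brick abstraction buys you in practice: growing a Gluster volume is mostly just adding bricks and rebalancing (a sketch only; server and path names are hypothetical, and it assumes a replica-2 volume named labdata already exists):

    # Bring two new servers into the trusted pool
    gluster peer probe server3
    gluster peer probe server4

    # Add their bricks (in multiples of the replica count), then spread existing data
    gluster volume add-brick labdata server3:/tank/brick1 server4:/tank/brick1
    gluster volume rebalance labdata start
    gluster volume rebalance labdata status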

ZFS

Pro: Features that definitely put a nail in other solutions' coffins. Those features are well designed (think L2ARC), and compression/dedup is fun (see the sketch after the con below). Having several smaller "storage clusters" also means you get small, contained failures instead of one large consolidated boom.

Con: You end up maintaining many small software boxes instead of one real storage system. You need to integrate them yourself and spend $$$ and hours to get a robust setup.
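As a rough illustration of those ZFS features (a sketch only; pool, dataset, and device names are examples, and dedup in particular wants a lot of RAM):

    # Add an SSD as L2ARC (read cache) and a mirrored pair as a log device (sync-write cache)
    zpool add tank cache /dev/sdw
    zpool add tank log mirror /dev/sdx /dev/sdy

    # Per-dataset tuning: compression is cheap, dedup is RAM-hungry
    zfs set compression=lz4 tank/genomes        # or compression=on for older ZFS versions
    zfs set dedup=on tank/scratch
    zfs get compressratio tank/genomes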