How should we serve files in a small bioinformatics cluster?


We have a small cluster of six Ubuntu servers, which we use to run bioinformatics analyses. Each analysis takes about 24 hours to complete, takes about 5 GB of input data, and produces 10-25 GB of output; each Core i7 server can handle two at a time. We run dozens of these a week. The software is a hodgepodge of custom Perl scripts and third-party sequence alignment software written in C/C++.

Currently, files are served from two of the compute nodes (yes, we're using compute nodes as file servers). Each of these nodes has five 1 TB SATA drives, mounted separately (no RAID) and pooled via GlusterFS 2.0.1. Each also has three bonded Intel PCI gigabit Ethernet cards, attached to a D-Link DGS-1224T switch (a $300 24-port consumer-level model). We are not currently using jumbo frames (not sure why, actually). The two file-serving compute nodes are then mirrored via GlusterFS.
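
If it helps: as far as I understand, enabling jumbo frames would just mean raising the MTU on the bonded interface of each server and on each client NIC (and the switch would need to support ~9000-byte frames). Something along these lines, with placeholder interface names:

    # Illustrative only - raise the MTU on the bond (server side) and on a client NIC
    ip link set dev bond0 mtu 9000
    ip link set dev eth0 mtu 9000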

Each of the other four nodes mounts the files via GlusterFS.

The files are all large (4 GB+) and are stored as bare files (no database, etc.), if that matters.

As you can imagine, this is a bit of a mess that grew organically without forethought, and we want to improve it now that we're running out of space. Our analyses are I/O intensive and storage is a bottleneck: we're only getting about 140 MB/s between the two file servers, and maybe 50 MB/s from the clients (which only have single NICs). We have a flexible budget, which I can probably get up to $5k or so.
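
(For context, numbers like these can be reproduced with simple sequential tests, e.g. iperf for the network path and dd against the Gluster mount; the hostname and file path below are just placeholders.)

    # network throughput between a compute node and a file server
    iperf -s                       # on the file server
    iperf -c fileserver1 -t 30     # on a compute node
    # rough sequential read of a large file from the Gluster mount
    dd if=/mnt/gluster/sample.fastq of=/dev/null bs=1M count=4096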

How should we spend our budget?

We need at least 10 TB of storage fast enough to serve all the nodes. How fast/big do the CPU and memory of such a file server need to be? Should we use NFS, ATA over Ethernet, iSCSI, GlusterFS, or something else? Should we buy two or more servers and create some sort of storage cluster, or is one server enough for such a small number of nodes? Should we invest in faster NICs (say, PCI Express cards with multiple ports)? The switch? Should we use RAID, and if so, hardware or software? And which RAID level (5, 6, 10, etc.)?

Any ideas appreciated. We're biologists, not IT gurus.

Best Answer

I'm in the field of computer science and I do research in bioinformatics. Currently 746 reputation on Biostars :)

I have been operating the bioinformatics compute facilities at a university for 3 years (about 40 Linux servers, 300 CPUs, 100 TB of disk space plus backups, and about 1 TB of RAM in total, with servers ranging from 16 to 256 GB of RAM). Our cluster has 32 eight-core compute nodes and 2 head nodes, and we are expanding it with 2 more 48-core compute nodes. We serve the files to the compute nodes over NFS.

I would recommend switching to NFS for your situation.

We considered switching to Gluster, Lustre, and Samba but decided not to use those.

NFS

I have a few main tips about NFS:

  1. Have a dedicated NFS server. Give it 4 cores and 16 GB of RAM. A dedicated server is more secure and easier to maintain, and it makes for a much more stable setup. For example, sometimes you need to reboot the NFS server; with a dedicated server, your disk-accessing computations will not fail - they will simply freeze and proceed once the NFS server is back.
  2. Serve to your compute and head nodes only. No workstations. No public network.
  3. Use NFS version 3. In my experience NFSv4 was more fragile - more crashes, harder to debug. We switched the cluster from NFSv3 to NFSv4 and back several times before settling. It's a local network, so you don't need the security (integrity and/or privacy) of NFSv4. (A minimal export/mount sketch follows this list.)
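
A minimal sketch of the export/mount side, assuming a Linux NFS server; the export path, subnet, and hostname are placeholders, and the exact mount options worth using depend on your kernel and workload:

    # /etc/exports on the dedicated NFS server
    /export/data  192.168.1.0/24(rw,async,no_subtree_check)   # async trades crash safety for speed

    # apply the export table
    exportfs -ra

    # /etc/fstab line on each compute node, pinned to NFSv3 over TCP
    fileserver:/export/data  /data  nfs  vers=3,proto=tcp,hard,intr,rsize=32768,wsize=32768  0  0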

Storage Hardware

Our current cluster was bought 3 years ago, so it's not using SAS; instead it has expensive Fibre Channel drives and SAN controllers. This is changing: all the new storage we are buying is SAS.

I would suggest considering SAS storage. SAS is replacing Fibre Channel as a cheaper, faster, and better solution. Recently I did research on the different solutions offered. Conveniently, the options we looked at are documented on Server Fault: What are SAS external storage options (Promise, Infortrend, SuperMircro, ...)?

We recently ordered a 24 TB 6 Gb SAS to 6 Gb SAS storage system from RAID Incorporated. Just for the storage we paid $12k. The order should arrive in a couple of weeks. This is a no-single-point-of-failure system: all components are redundant and automatically fail over if any component fails. It's attached to 2 servers, each using a different partition of the array. It is a turn-key solution, so once it's shipped we just need to connect it and power it on, and it will work (the RAID6 partitions will be mounted on Linux). The order also included servers, and RAID Incorporated is setting up Debian Linux on them at no extra cost.

Other considerations

Unfortunately, if you do bioinformatics infrastructure operations you probably need to become a storage guru.

For your 10 TB partition, pick RAID6 - 2 drives can fail without you losing data. Rebuilding a 2 TB drive onto a hot spare takes about 24 hours, and other drives can fail during that time; I once had 2 drives fail simultaneously in a 16-drive array.

Consider dedicating one drive in the array as a hot spare. Once you have more than 16 drives, I would say a hot spare is a must.
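
To make that concrete, here is one hypothetical software-RAID layout (device names are placeholders, and hardware RAID or ZFS raidz2 would do the same job): eight 2 TB drives as RAID6 with one hot spare gives 10 TB usable.

    # mdadm sketch: 7 active drives (RAID6 -> 5 x 2 TB = 10 TB usable) + 1 hot spare
    mdadm --create /dev/md0 --level=6 --raid-devices=7 --spare-devices=1 /dev/sd[b-i]
    cat /proc/mdstat      # watch the initial sync / any rebuild
    mkfs.xfs /dev/md0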

Think through a plan of action for the case where hardware fails on the dedicated NFS server. I would keep a twin machine running as a compute node, so it can stand in as a replacement for the original NFS server.

Finally, I have to mention that our file server is running OpenSolaris (sounds unusual, I know). OpenSolaris (as it turned out for us) has excellent server hardware support (Fibre Channel, InfiniBand, ...). Setting up an NFS server from the ground up takes 1 hour - all the steps are completely straightforward: install the OS, update through a NAT, set up the network, create a ZFS pool, create ZFS filesystems, share over NFS (a minimal sketch of these commands follows the list below). Sun developed NFS in 1984, so not surprisingly OpenSolaris is very good at serving NFS. The main reason to use OpenSolaris was ZFS - a good filesystem for bioinformatics. Some features that I like:

  • Integrity (all writes are checksummed)
  • Pooled storage, snapshots
  • NFS exports are configured on the served filesystem
  • Online compression
  • Reservations (space guarantees)
  • Block-level deduplication
  • Efficient backups (see zfs send).
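
For what it's worth, the pool / filesystem / NFS part of that setup boils down to a handful of commands; the pool name, disk names, and filesystem below are placeholders, not our exact configuration:

    # create a double-parity (raidz2) pool and a filesystem, then share it over NFS
    zpool create tank raidz2 c7t0d0 c7t1d0 c7t2d0 c7t3d0 c7t4d0 c7t5d0
    zfs create tank/data
    zfs set compression=on tank/data
    zfs set sharenfs=rw tank/data      # the NFS export is a property of the filesystem
    # snapshots and replication for backups
    zfs snapshot tank/data@nightly
    zfs send tank/data@nightly | ssh backuphost zfs receive backup/data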

Using Linux for your NFS server would be fine - in that case stick to XFS or Ext4.
