NFS – DRBD vs. GlusterFS for replication

drbd git glusterfs nfs shared-storage

I need to build a solution to host internal git repositories. It needs to support hundreds of thousands (or more) of repositories.

I plan on using multiple "dumb" servers with shared storage, so when a client tries to access a repository, the load balancer will direct it to any of the available servers. Any change to a repository will be replicated across all nodes.

My first thought was to use GlusterFS for that, but I've read that it doesn't handle small files well. I'm also thinking of replicating everything myself using DRBD, but this requires more setup and seems more complicated compared to GlusterFS.

Which of the two provides better performance? Basically, the problem I'm trying to solve is that when any of the servers goes down, I want the others to still be able to serve the data.

Best Answer

This is a classic scale-out use case, and IMO GlusterFS should fit the bill. You can give it a try - just bring a few VMs up, set up a few bricks to be used for repository storage and run a stress test.
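As a rough idea of what such a stress test could look like, here is a minimal Python sketch that writes and then reads back many small files, which is the access pattern git tends to produce. The function name `small_file_benchmark`, the 256-way directory fan-out, and the default file count and size are all assumptions made for illustration, not anything GlusterFS prescribes. Point it at a directory on the mounted volume and compare against a local disk.

```python
import os
import tempfile
import time


def small_file_benchmark(root, num_files=1000, size_bytes=4096):
    """Write, then read back, many small files under `root`.

    Returns a dict with the file count, total bytes read, and the
    elapsed write/read times in seconds.
    """
    payload = os.urandom(size_bytes)
    os.makedirs(root, exist_ok=True)

    # Write phase: fsync each file so the storage layer has to persist it
    # before the next file is written.
    start = time.perf_counter()
    for i in range(num_files):
        # Spread files over up to 256 subdirectories (an assumption that
        # loosely mimics the fan-out of a git object store).
        subdir = os.path.join(root, f"{i % 256:02x}")
        os.makedirs(subdir, exist_ok=True)
        with open(os.path.join(subdir, f"obj_{i}"), "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
    write_s = time.perf_counter() - start

    # Read phase: read every file back and count the bytes.
    start = time.perf_counter()
    bytes_read = 0
    for i in range(num_files):
        path = os.path.join(root, f"{i % 256:02x}", f"obj_{i}")
        with open(path, "rb") as fh:
            bytes_read += len(fh.read())
    read_s = time.perf_counter() - start

    return {
        "files": num_files,
        "bytes_read": bytes_read,
        "write_s": write_s,
        "read_s": read_s,
    }


if __name__ == "__main__":
    # Placeholder target: a local temp directory. Replace it with a
    # directory on the GlusterFS mount to test the volume itself.
    target = tempfile.mkdtemp(prefix="smallfile_bench_")
    print(small_file_benchmark(target, num_files=200, size_bytes=4096))
```

Running it from several clients at once, and against real `git clone` / `git push` traffic afterwards, would give a more realistic picture than a single synthetic run.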

DRBD is not an option here anyway: it doesn't scale out. If Gluster doesn't work well enough, I'd look at other object storage projects (Swift, for example), but none of them are particularly performance-oriented.
