Is rsync a good candidate for failover implementation (very large dataset)

failover · large-data · rsync

I have a large set of data (100+ GB) which can be stored in files. Most of the files would be in the 5k–50k range (80%), then 50k–500k (15%) and >500k (5%). The maximum expected size of a file is 50 MB. If necessary, large files can be split into smaller pieces. Files can be organized in a directory structure too.

If some data must be modified, my application makes a copy, modifies it and, if successful, flags it as the latest version. Then the old version is removed. It is crash safe (so to speak).

I need to implement a failover system to keep this data available. One solution is to use a master-slave database system, but such systems are fragile and create a dependency on the database technology.

I am no sysadmin, but I read about the rsync utility. It looks very interesting. I am wondering if setting up some failover nodes and using rsync from my master is a responsible option. Has anyone tried this before successfully?

i) If yes, should I split my large files? Is rsync smart/efficient at detecting which files to copy/delete? Should I implement a specific directory structure to make this system efficient?

ii) If the master crashes and a slave takes over for an hour (for example), is making the master up-to-date again as simple as running rsync the other way round (slave to master)?

iii) Bonus question: Is there any possibility of implementing multi-master systems with rsync? Or is only master slave possible?

I am looking for advice, tips, experience, etc. Thanks!

Best Answer

Is rsync smart/efficient at detecting which files to copy/delete?

Rsync is extremely efficient at detecting and updating files. Depending on how your files change, you might find a smaller number of large files far easier to sync than lots of small files. Depending on what options you choose, on each run it is going to stat() every file on both sides, and then transfer the changes if the files are different. If only a small fraction of your files are changing, this scan for changed files can be quite expensive. A lot of factors come into play in how long rsync takes. If you are serious about trying this, you should do a lot of testing on real data to see how things work.

If the master crashes and a slave takes over for an hour (for example), is making the master up-to-date again as simple as running rsync the other way round (slave to master)?

Should be.

Is there any possibility of implementing multi-master systems with rsync?

Unison, which uses the rsync libraries, allows a bi-directional sync. It should permit updates on either side. With the correct options it can identify conflicts and save backups of any files where a change was made on both ends.
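A hypothetical Unison profile for a two-node setup might look like the following; the roots, host name, and file pattern are placeholders, not a verified production configuration:

```
# ~/.unison/data.prf (hypothetical example)
root = /var/data
root = ssh://nodeB//var/data

batch = true          # run without interactive prompting
backup = Name *       # keep a backup of any file Unison overwrites
```

Files changed on both sides between runs are reported as conflicts and, in batch mode, skipped rather than silently merged, which is why the backup preference above matters.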

Without knowing more about the specifics I can't tell you with any confidence this is the way to go. You may need to look at DRBD, or some other clustered device/filesystem approach which will sync things at a lower level.