Linux – sync two huge filesystems

linux synchronization

I need to sync two huge file systems regularly in one direction. Both sides run Linux, and I have full root access on both.

My preferred solution: read a list of changed files and directories and sync only those. But how can I get the list of changes? inotify needs a watch for every directory, and there are too many directories. Could the list come from the file system's journal?

Here are some solutions and why they don't fit:

  • rsync: Needs to check all files recursively. There are several million files and only a few changes, so the check takes too long.
  • inotify: I need a watch for every directory, and there are too many. inotify was not built for "watch all files" scenarios.
  • DRBD: Both sides should run independently. It can happen that the hosts can't connect for some days.

Both machines need to be synced about every 15 minutes. The initial sync is no problem; this question is only about syncing the changes.
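To make the preferred approach concrete: if a list of changed paths were available from some source, it could be handed to rsync through its `--files-from` option so that rsync does not have to walk the whole tree. The following is only a minimal sketch under that assumption. The helper names (`write_change_list`, `build_rsync_command`, `sync_changes`), the `/data` source root and the `backup:/data/` destination are hypothetical, and the sketch does not handle deletions or recurse into listed directories.

```python
import os
import subprocess
import tempfile


def write_change_list(changed_paths, src_root):
    """Write the changed paths, relative to src_root and NUL-separated,
    to a temporary file and return its name (hypothetical helper)."""
    fd, name = tempfile.mkstemp(prefix="changes-", suffix=".lst")
    with os.fdopen(fd, "wb") as fh:
        for path in changed_paths:
            rel = os.path.relpath(path, src_root)
            fh.write(os.fsencode(rel) + b"\0")
    return name


def build_rsync_command(list_file, src_root, dest):
    """Build an rsync argv that transfers only the paths in list_file."""
    return [
        "rsync",
        "-a",                         # archive mode: keep permissions, times, symlinks
        "--from0",                    # entries in list_file are NUL-separated
        f"--files-from={list_file}",  # paths are relative to the source root
        src_root.rstrip("/") + "/",
        dest,
    ]


def sync_changes(changed_paths, src_root, dest):
    """Transfer only the listed paths; assumes rsync is installed on both hosts."""
    list_file = write_change_list(changed_paths, src_root)
    try:
        subprocess.run(build_rsync_command(list_file, src_root, dest), check=True)
    finally:
        os.unlink(list_file)


if __name__ == "__main__":
    # Assumed example values; in practice the list comes from the change source.
    sync_changes(["/data/a/b.txt"], "/data", "backup:/data/")
```

NUL separators are used so that file names containing newlines or spaces survive the round trip. Where the list of changed paths comes from remains the open part of the question.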

Best Answer

How about GlusterFS? I have found that the traffic it generates is considerably less than DRBD's.