Options to efficiently synchronize 1 million files with remote servers

optimization, rsync, synchronization

At the company I work for we have things called "playlists", which are small files of ~100-300 bytes each. There are about a million of them, and about 100,000 of them change every hour. These playlists need to be uploaded to 10 remote servers on different continents every hour, and it needs to happen quickly, ideally in under 2 minutes. It's very important that files deleted on the master are also deleted on all the replicas. We currently use Linux for our infrastructure.

I was thinking about trying rsync with the -W option to copy whole files without comparing contents. I haven't tried it yet, but maybe people with more rsync experience could tell me whether it's a viable option?
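
Roughly what I had in mind for a single replica, just as a sketch (the paths and hostname are placeholders):

    # -a: archive mode (recursive, preserves times/permissions)
    # -W: copy whole files, skipping rsync's delta-transfer algorithm
    # --delete: remove files on the replica that no longer exist on the master
    rsync -aW --delete /srv/playlists/ replica1.example.com:/srv/playlists/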

What other options are worth considering?

Update: I have chosen the lsyncd option as the answer, but only because it was the most popular. The other suggested alternatives are also valid in their own way.

Best Answer

Since instant updates are also acceptable, you could use lsyncd.
It watches directories (via inotify) and rsyncs changes to the slaves.
At startup it does a full rsync, so that will take some time, but after that only changes are transmitted.
Recursive watching of directories is possible; if a slave server is down, the sync is retried until it comes back.
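
As a minimal sketch of a configuration (the paths and hostname are placeholders, and option names can differ slightly between lsyncd versions, so check the documentation for your release):

    cat > /etc/lsyncd.conf.lua <<'EOF'
    -- one sync{} block per replica; add nine more for the other servers
    sync {
        default.rsync,
        source = "/srv/playlists",
        target = "replica1.example.com:/srv/playlists",
        delete = true,                                 -- propagate deletions to the replica
        rsync  = { archive = true, whole_file = true },
    }
    EOF
    lsyncd /etc/lsyncd.conf.lua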

If everything lives in a single directory (or a static list of directories), you could also use incron.
The drawback is that it does not allow recursive watching of folders, and you need to implement the sync functionality yourself.
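
For completeness, a single incrontab entry (added with incrontab -e) could look something like the line below, where push-playlist.sh is a hypothetical script you would write to rsync the changed file to the replicas; $@ expands to the watched path, $# to the file name, and $% to the event name:

    /srv/playlists IN_CLOSE_WRITE,IN_MOVED_TO,IN_DELETE /usr/local/bin/push-playlist.sh $@/$# $%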