Client-server synchronization pattern / algorithm

client-server data-synchronization

I have a feeling that there must be client-server synchronization patterns out there, but I completely failed to find one by searching.

The situation is quite simple: the server is the central node that multiple clients connect to, and they all manipulate the same data. The data can be split into atoms. In case of conflict, whatever is on the server has priority (to avoid dragging the user into conflict resolution). Partial synchronization is preferred because of the potentially large amounts of data.

Are there any patterns or good practices for such a situation? If you don't know of any, what would be your approach?

Below is how I currently plan to solve it:
In parallel with the data, a modification journal is kept, with every transaction timestamped.
When a client connects, it receives all changes since its last check in consolidated form (the server goes through the journal, removes additions that are followed by deletions, merges the updates for each atom, etc.).
Et voilà, we are up to date.
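
To make the consolidation step concrete, here is a minimal sketch in Python. The `Change` record, the operation names ("add", "update", "delete") and the `consolidate` function are hypothetical names chosen for illustration, and the merge rules are one plausible reading of the description above, not a definitive implementation.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Change:
    timestamp: float          # when the server recorded the change
    op: str                   # "add", "update", or "delete" (assumed op names)
    atom_id: str
    value: Optional[Any] = None


def consolidate(journal: List[Change], since: float) -> List[Change]:
    """Collapse all changes newer than `since` into at most one per atom."""
    merged = {}
    for change in sorted(journal, key=lambda c: c.timestamp):
        if change.timestamp <= since:
            continue  # the client has already seen this change
        prev = merged.get(change.atom_id)
        if prev is None:
            merged[change.atom_id] = change
        elif change.op == "delete":
            if prev.op == "add":
                # Added and deleted inside the window: nothing to send.
                del merged[change.atom_id]
            else:
                merged[change.atom_id] = change
        elif prev.op == "add":
            # An add followed by updates is still an add, with the latest value.
            merged[change.atom_id] = Change(
                change.timestamp, "add", change.atom_id, change.value)
        elif prev.op == "delete":
            # Deleted and re-created: the client still holds the old atom,
            # so from its point of view this is an update.
            merged[change.atom_id] = Change(
                change.timestamp, "update", change.atom_id, change.value)
        else:
            # Update followed by update: keep only the latest.
            merged[change.atom_id] = change
    return sorted(merged.values(), key=lambda c: c.timestamp)
```

Note that a deletion is only dropped when the matching addition falls inside the same window; a client that already saw the addition still receives the deletion.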

An alternative would be to keep a modification date for each record and, instead of physically deleting data, just mark records as deleted.
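
A rough sketch of this alternative, assuming an in-memory store: `Record`, `Store`, and `changed_since` are hypothetical names, and a server-side counter stands in for the modification date so the example stays deterministic.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Record:
    value: Any
    modified: int            # stand-in for the modification date
    deleted: bool = False    # tombstone flag instead of a physical delete


class Store:
    def __init__(self) -> None:
        self.records: Dict[str, Record] = {}
        self.clock = 0       # assumed logical clock; a real timestamp also works

    def _tick(self) -> int:
        self.clock += 1
        return self.clock

    def put(self, key: str, value: Any) -> None:
        self.records[key] = Record(value, self._tick())

    def delete(self, key: str) -> None:
        # Keep the record so clients can learn about the deletion.
        rec = self.records[key]
        rec.value = None
        rec.deleted = True
        rec.modified = self._tick()

    def changed_since(self, since: int) -> Dict[str, Record]:
        """Everything a client that last synced at `since` still needs."""
        return {k: r for k, r in self.records.items() if r.modified > since}
```

Because each record carries only its latest state, the result is already consolidated; the cost is that tombstones accumulate until something purges them.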

Any thoughts?

Best Answer

You should look at how distributed change management works. Look at how SVN, CVS, and other repositories that manage deltas work.

You have several use cases.

  • Synchronize changes. Your change-log (or delta history) approach looks good for this. Clients send their deltas to the server; server consolidates and distributes the deltas to the clients. This is the typical case. Databases call this "transaction replication".

  • Client has lost synchronization. Either through a backup/restore or because of a bug. In this case, the client needs to get the current state from the server without going through the deltas. This is a copy from master to detail, deltas and performance be damned. It's a one-time thing; the client is broken; don't try to optimize this, just implement a reliable copy.

  • Client is suspicious. In this case, you need to compare the client against the server to determine whether the client is up to date or needs any deltas.
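
The second and third cases can be sketched together. The answer does not say how the comparison should be done; comparing a digest of the whole state is an assumption made here for illustration, and `state_digest` and `verify_or_reset` are hypothetical names.

```python
import hashlib
import json
from typing import Any, Dict, Tuple


def state_digest(state: Dict[str, Any]) -> str:
    """Digest of a JSON-serializable state, independent of key order."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_or_reset(client_state: Dict[str, Any],
                    server_state: Dict[str, Any]) -> Tuple[Dict[str, Any], bool]:
    """Return (state the client should hold, whether a full copy was needed)."""
    if state_digest(client_state) == state_digest(server_state):
        return client_state, False          # suspicious client was fine
    # Lost synchronization: plain, unoptimized copy of the server state.
    return dict(server_state), True
```

In a real deployment only the digest would travel over the network for the check, and the full copy would be requested only on a mismatch.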

You should follow the database (and SVN) design pattern of sequentially numbering every change. That way a client can make a trivial request ("What revision should I have?") before attempting to synchronize. And even then, the query ("All deltas since 2149") is delightfully simple for the client and server to process.
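
As a minimal sketch of sequential numbering, assuming everything lives in memory: `SyncServer`, `SyncClient`, and their methods are hypothetical names, and a value of `None` is assumed to mean "atom deleted".

```python
from typing import Any, Dict, List, Optional, Tuple


class SyncServer:
    def __init__(self) -> None:
        # deltas[i] holds revision i + 1, so revisions are dense and ordered.
        self.deltas: List[Tuple[str, Optional[Any]]] = []

    def commit(self, atom_id: str, value: Optional[Any]) -> int:
        """Record a change and return its revision number."""
        self.deltas.append((atom_id, value))
        return len(self.deltas)

    def head(self) -> int:
        """Answer to "What revision should I have?"."""
        return len(self.deltas)

    def deltas_since(self, revision: int) -> List[Tuple[int, str, Optional[Any]]]:
        """Answer to "All deltas since <revision>"."""
        return [(rev, atom, value)
                for rev, (atom, value)
                in enumerate(self.deltas[revision:], start=revision + 1)]


class SyncClient:
    def __init__(self) -> None:
        self.revision = 0
        self.data: Dict[str, Any] = {}

    def sync(self, server: SyncServer) -> int:
        """Apply missing deltas in order; return how many were applied."""
        if server.head() == self.revision:
            return 0                      # cheap check: already up to date
        applied = 0
        for rev, atom, value in server.deltas_since(self.revision):
            if value is None:
                self.data.pop(atom, None)
            else:
                self.data[atom] = value
            self.revision = rev
            applied += 1
        return applied
```

Unlike timestamps, revision numbers cannot collide or be skewed by clocks, which is what keeps the "since N" query unambiguous.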
