Best practice/Patterns for two way data synchronisation

algorithmsdatabase-developmentdesign-patternssynchronization

Quite often in my work the idea of 2-way data synchronisation between database systems crops up. The classic example is two slightly different CRM systems (say, Raiser's Edge and Salesforce) and the need to have a two-way sync of Contact data between them.

API considerations aside, assuming you have a shared key to sync on, and purely thinking of the algorithm/pattern to be employed, this is a task that is often underestimated by non-techies.

For example, you have to watch out for:

  • Can you easily detect which records have changed in both systems (or will you have to compare all records between the systems to detect changes)
  • If you're going for a once-every-N-hours sync, how to deal with conflicts where the same record changes at more-or-less the same time in both systems
  • If you're going for a real-time sync (ie an update in one system immediately triggers an update to the other system) how to handle divergence over time due to bugs or system crashes

Personally I can think of ways to tackle all this but I'm wondering if there are any well known patterns, literature or best practices that I could refer to.

Best Answer

Yes, a hard problem, easily underestimated. And could be a lot of work. If you are on Microsoft technologies, you may want to have a look at Microsoft Sync Framework here and here.