Database Replication – How to Decide Between Database Replication, Custom REST Applications, and Message Brokers

data-replicationdatabaseenterprise-architecturemessage-passing

We have a distributed solution that's currently under design. There are a few points of integration where some application needs data from someone else and vice versa. We could solve by either writing REST interfaces and provide CRUD features so that apps can share data between themselves. Or we can use something like nanomsg or zeroMQ to send messages back and forth.

We also have H/A requirements where we want to do master to master replication across primary and secondary db servers to ensure that if one goes down, the other one can kick in.

Here's the question.
Some on the team feel that we should choose one method of sharing data like master to master replication… and just use it across the board to "share" data.
It's true that we can do that, but I think each case should be analyzed and treated separately depending on specific criteria such as:

  1. are the databases in question identical? if they are, then master to master is appropriate such as in our High Availability scenario.
  2. in the case of disparate systems, how much data are we shuttling back and forth? if it's a lot of data… maybe a message bus is better. just send a notification to say "there's a change… here's the change id. Go get it".
  3. How often does data change? Does this matter? I *think both a REST api and a message bus will be able to handle many transactions.

I'm not sure what else to consider. It feels like creating a message bus solution would be a lot more work than a REST api into each database. But I could be wrong.
What other things should we take into consideration?

Thanks.

Best Answer

If you need to share (read and write) all the data and use transactions, when just use shared access to the database. If you need high availability, consider using master-slave replication. Don't just blindly go with master-master, think carefully about disadvantages:

  • Most multi-master replication systems are only loosely consistent, i.e. lazy and asynchronous, violating ACID properties.
  • Eager replication systems are complex and increase communication latency.
  • Issues such as conflict resolution can become intractable as the number of nodes involved rises and latency increases.

REST cons:

  • You actually need to implement, test, and support it.
  • No transactions on multiple resource, unless you'll reinvent transactions on API layer, which is the worst possible idea.

Message queues cons:

  • Not suited for sharing data. MQs are good for asynchronous inter-process communication: invoking commands, firing events, etc.
  • Again, you need to implement, test, and support message handlers, since the data is not magically shared by just being put into queue.

However, if you need just some parts of your data to be shared, and multi-resource transactions are not needed, when REST (or maybe JSON-RPC) might be a better solution.

Still, MQ will be a bad solution, since it is, as I said, solves another set of problems.


The answer to your question depends on what you actually need:

  • Sharing all the data as it is, being able to use it in transactions - shared DB
  • Sharing some data in granular uniform fashion (CRUD) - REST
  • Operations which involve various parts of data and/or non-trivial logic - JSON-RPC
  • Asynchronous inter-process communication - Message queues
Related Topic