SOA Best Practices – Sharing Database in Services

cqrs, domain-driven-design, eai, integration, soa

I have recently been reading Hohpe and Woolf's Enterprise Integration Patterns and some of Thomas Erl's books on SOA, and watching various videos and podcasts by Udi Dahan et al. on CQRS and event-driven systems.

Systems in my place of work suffer from high coupling. Although each system theoretically has its own database, there is a lot of joining between them. In practice this means there is one huge database that all systems use. For example, there is one table of customer data.

Much of what I've read seems to suggest denormalising data so that each system uses only its database, and any updates to one system are propagated to all the others using messaging.
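To make that idea concrete, here is a minimal sketch of what I understand by it. Everything in it is an assumption for illustration: the `MessageBus` class is an in-process stand-in for a real broker, and the service names and the `customer.updated` topic are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MessageBus:
    """Tiny in-process stand-in for a real message broker (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver the message to every handler registered for the topic.
        for handler in self._subscribers[topic]:
            handler(message)


class CustomerService:
    """Owns the customer data and announces every change."""

    def __init__(self, bus: MessageBus) -> None:
        self._bus = bus
        self.customers: Dict[int, dict] = {}

    def update_customer(self, customer_id: int, name: str, email: str) -> None:
        self.customers[customer_id] = {"name": name, "email": email}
        self._bus.publish(
            "customer.updated",
            {"customer_id": customer_id, "name": name, "email": email},
        )


class OrderingService:
    """Keeps a denormalised local copy of only the fields it needs."""

    def __init__(self, bus: MessageBus) -> None:
        self.customer_names: Dict[int, str] = {}
        bus.subscribe("customer.updated", self._on_customer_updated)

    def _on_customer_updated(self, message: dict) -> None:
        self.customer_names[message["customer_id"]] = message["name"]


bus = MessageBus()
customers = CustomerService(bus)
ordering = OrderingService(bus)
customers.update_customer(1, "Ada", "ada@example.com")
print(ordering.customer_names)  # {1: 'Ada'}
```

The ordering side never queries the customer store directly; it only reacts to messages, which is where the decoupling is supposed to come from.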

I thought this was one of the ways of enforcing the boundaries in SOA – each service should have its own database, but then I read this:

https://stackoverflow.com/questions/4019902/soa-joining-data-across-multiple-services

and it suggests this is the wrong thing to do.

Segregating the databases does seem like a good way of decoupling systems, but now I'm a bit confused. Is this a good route to take? Is it ever recommended to segregate a database per, say, an SOA service, a DDD bounded context, an application, etc.?

Best Answer

Decoupling only works if there really is separation. Consider if you have an ordering system:

  • Table: CUSTOMER
  • Table: ORDER

If that's all you've got, there's no reason to decouple them. On the other hand, if you have this:

  • Table: CUSTOMER
  • Table: ORDER
  • Table: CUSTOMER_NEWSLETTER

Then you could argue that ORDER and CUSTOMER_NEWSLETTER are part of two totally separate modules (ordering and marketing). Perhaps it makes sense to move these into separate databases (one for each table), and have both modules share access to the common CUSTOMER table in its own database.
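As a rough sketch of that layout, the snippet below uses SQLite files as stand-ins for the separate databases. The file names, the column names, and the `orders` table name (used instead of ORDER, which is a reserved word in SQL) are assumptions for illustration only.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
customer_db = os.path.join(workdir, "customer.db")
ordering_db = os.path.join(workdir, "ordering.db")

# The shared database holds only the cross-cutting CUSTOMER table.
conn = sqlite3.connect(customer_db)
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.commit()
conn.close()

# The ordering module owns its own database file.
conn = sqlite3.connect(ordering_db)
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")
conn.commit()

# Attach the shared customer database so the module can join against it.
conn.execute("ATTACH DATABASE ? AS shared", (customer_db,))
rows = conn.execute(
    "SELECT c.name, o.total "
    "FROM orders AS o JOIN shared.customer AS c ON c.id = o.customer_id"
).fetchall()
print(rows)  # [('Ada', 25.0)]
conn.close()
```

A marketing module would do the same with its own file for CUSTOMER_NEWSLETTER. Note that the join still reaches across a database boundary, which is the extra data-layer complexity in question.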

By doing that you're simplifying each module, but you're increasing the complexity of your data layer. As your application grows larger and larger, I can see an advantage to separating. There will be more and more "data islands" that really have no relation to each other. However, there will always be some data that cross-cuts all modules.

The decision to put them in different physical databases would typically be based on real-world constraints such as frequency of backups, security restrictions, replication to different geographic locations, etc. I wouldn't separate tables into different physical databases just to separate concerns. That can be handled more simply with different schemas or views.
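For illustration, here is a minimal sketch of the view-based approach in a single database, using SQLite through Python. The `email` and `credit_limit` columns and the view names are hypothetical. SQLite has no permission system, so this only shows the shape of the idea; in a server database you would also grant each module access to its own views or schema only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY, name TEXT, email TEXT, credit_limit REAL
    );
    CREATE TABLE customer_newsletter (customer_id INTEGER, topic TEXT);
    INSERT INTO customer VALUES (1, 'Ada', 'ada@example.com', 500.0);
    INSERT INTO customer_newsletter VALUES (1, 'offers');

    -- Marketing sees contact details only, not the credit limit.
    CREATE VIEW marketing_customer AS
        SELECT id, name, email FROM customer;

    -- Ordering sees the credit limit but not the email address.
    CREATE VIEW ordering_customer AS
        SELECT id, name, credit_limit FROM customer;
    """
)

# The marketing module queries its view, never the base table.
rows = conn.execute(
    "SELECT m.email, n.topic "
    "FROM marketing_customer AS m "
    "JOIN customer_newsletter AS n ON n.customer_id = m.id"
).fetchall()
print(rows)  # [('ada@example.com', 'offers')]
```

Each module gets its own narrow window onto the shared CUSTOMER data, while backups, joins, and transactions stay in one physical database.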
