Database – Breaking up a monolith into SOA, and breaking referential integrity

Tags: database, microservices, relational-database, soa

When breaking up a large monolithic application with a monolithic RDBMS into a service-oriented architecture with many databases, how do you deal with the breaking of data integrity?

I have a large monolithic application. My team has agreed that it's beneficial to break up this monolith into domain-specific services and embrace more of a service-oriented architecture. With our monolith, we have a huge RDBMS database. The team is split as to whether it's appropriate to carve out domain-specific tables into their own databases. Some folks are concerned about breaking referential integrity and the risk that that brings. However, others are OK with this risk.

Does anyone have experience with this problem and can share some advice? One of my goals is to facilitate the ability to scale horizontally (add more DB servers) as opposed to vertically (beefing up a single huge DB server). At some point in the future, I could envision that certain services may be better suited to a document DB (a.k.a. a NoSQL data store) than an RDBMS, and certainly, traditional RDBMS-based integrity would be challenging there. (So approaches like storing the tables in separate schemas on the same database probably won't be appropriate.)

Best Answer

That referential integrity is a concern at all suggests that there are implicit dependencies between the domains. One thing that I think people who have drunk the microservices kool-aid often misunderstand is that these kinds of dependencies, and the issues related to them, do not magically disappear once you break down the monolith. It's good that people are concerned, and if you are committed to this path, you will want to make sure those concerns are addressed.

If there is really tight coupling between two or more domains from a business perspective, you might want to consider whether they should really be separated into separate DBs. It might not be worth the trouble.

If you do need to maintain integrity between two entities, a couple of things might help:

  1. Always use surrogate keys. There should be no reason to change meaningless keys, and if your keys never change, you avoid a huge swath of RI problems.
  2. Never delete keys; use 'soft' deletes instead. Only adding rows and never updating them is also helpful. At the very least, you will know what things meant when they were linked.
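
As a rough illustration of the two points above, here is a minimal sketch using Python's built-in `sqlite3` module. Two in-memory databases stand in for two services' separate stores. The table names, column names, and helper functions (`create_customer`, `soft_delete_customer`, and so on) are all made up for this example, and the soft delete is done by setting a `deleted_at` column; a strictly append-only design would insert a tombstone row instead.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Two independent in-memory databases stand in for two services' stores.
# No foreign key can span them, so integrity is a matter of convention.
customers_db = sqlite3.connect(":memory:")
orders_db = sqlite3.connect(":memory:")

customers_db.execute(
    """
    CREATE TABLE customer (
        id         TEXT PRIMARY KEY,  -- surrogate key, never changes
        email      TEXT NOT NULL,     -- natural attribute, free to change
        deleted_at TEXT               -- NULL means the row is still active
    )
    """
)
orders_db.execute(
    """
    CREATE TABLE customer_order (
        id          TEXT PRIMARY KEY,  -- surrogate key
        customer_id TEXT NOT NULL,     -- cross-database reference, no FK
        total_cents INTEGER NOT NULL
    )
    """
)


def create_customer(email):
    """Insert a customer under a freshly generated, meaningless key."""
    customer_id = str(uuid.uuid4())
    customers_db.execute(
        "INSERT INTO customer (id, email) VALUES (?, ?)", (customer_id, email)
    )
    return customer_id


def soft_delete_customer(customer_id):
    """Mark the customer as deleted; the row and its key stay in place."""
    customers_db.execute(
        "UPDATE customer SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL",
        (datetime.now(timezone.utc).isoformat(), customer_id),
    )


def find_customer(customer_id, include_deleted=False):
    """Look up a customer; soft-deleted rows are hidden unless requested."""
    row = customers_db.execute(
        "SELECT id, email, deleted_at FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return None
    if row[2] is not None and not include_deleted:
        return None
    return {"id": row[0], "email": row[1], "deleted_at": row[2]}


def create_order(customer_id, total_cents):
    """Insert an order that refers to a customer by surrogate key only."""
    order_id = str(uuid.uuid4())
    orders_db.execute(
        "INSERT INTO customer_order (id, customer_id, total_cents) VALUES (?, ?, ?)",
        (order_id, customer_id, total_cents),
    )
    return order_id


def customer_for_order(order_id):
    """Resolve an order's customer, including soft-deleted ones."""
    row = orders_db.execute(
        "SELECT customer_id FROM customer_order WHERE id = ?", (order_id,)
    ).fetchone()
    if row is None:
        return None
    return find_customer(row[0], include_deleted=True)
```

With this layout, an order created before its customer was soft-deleted still resolves through `customer_for_order`, while `find_customer` hides the customer from normal lookups. Changing the customer's `email` never touches the orders database, because the reference is to the surrogate `id`.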