Microservices Architecture – Approaches to Splitting a Monolithic Application

Tags: architecture, microservices, web services

We have a huge project that desperately needs to be broken apart into multiple databases and applications. I can think of 2 possible approaches to this problem:

  1. Use a REST endpoint on each service to fetch/update data. Services that are dependent on each other simply call each other's REST endpoints to fetch/update data. (this makes the most sense to me, and it seems to have a good separation of concerns in terms of what data each service is supposed to "own")
  2. Use queues to pass "entity updated" messages between services and replicate data that was updated between all the services that need the data.

My lead really, really wants to go with the 2nd option because he thinks it will be more cost-effective and resilient: with this approach, we don't need to scale up all the services that a particular microservice depends on, because the data is replicated, and we can still fetch the necessary data even if a service we depend on is down (again, because the data is replicated). This kind of makes sense to me. However, my gut is telling me that this is a bad idea: replication bugs make me nervous, and it seems complicated to implement properly (there are lots of ways data could get out of sync, and manually re-syncing all the data used by multiple services doesn't seem trivial). I've never worked with an architecture like the one in #2, so I could be completely off base here, which is why I'm posting here and asking for advice.
What would you all recommend? #1 or #2? Something else entirely?
Thanks in advance!

Best Answer

The preferred approach will depend on several factors. At my company, we use both approaches (using Hermes rather than a queue for asynchronous communication) since they both have their advantages in certain situations.

The main factor of interest is what requirements you have on data freshness for a given pair of services. Basically, asynchronous communication (such as using a queue) has the benefit of causing less coupling between services and potentially allowing better throughput, since you can schedule data transfer at a convenient time, retry failed transmissions as many times as you like, and so on. The downside is that things happen asynchronously, so there is always some lag: one service sees stale data until it receives the update from the other. Considerable disk and CPU resources may also be required if you want to duplicate a large database in each service.
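To make the freshness trade-off concrete, here is a minimal in-process sketch. The class names (`ProductService`, `RestClient`, `ReplicaClient`) are made up for illustration, a plain `queue.Queue` stands in for the message broker, and a direct method call stands in for the REST request, so treat it as a toy model rather than a real implementation.

```python
from dataclasses import dataclass, field
from queue import Queue


@dataclass
class ProductService:
    """Owns the product data (hypothetical service, for illustration only)."""
    products: dict = field(default_factory=dict)
    events: Queue = field(default_factory=Queue)

    def get(self, product_id):
        # Stands in for handling a REST request such as GET /products/<id>.
        return self.products[product_id]

    def update(self, product_id, data):
        self.products[product_id] = data
        # Publish an "entity updated" message for any replicas.
        self.events.put({"id": product_id, "data": dict(data)})


class RestClient:
    """Synchronous approach: ask the owning service on every read."""

    def __init__(self, product_service):
        self.product_service = product_service

    def product_name(self, product_id):
        return self.product_service.get(product_id)["name"]


class ReplicaClient:
    """Asynchronous approach: keep a local copy fed by update messages."""

    def __init__(self, events):
        self.events = events
        self.replica = {}

    def drain_events(self):
        # Apply every message that has arrived so far.
        while not self.events.empty():
            event = self.events.get()
            self.replica[event["id"]] = event["data"]

    def product_name(self, product_id):
        return self.replica[product_id]["name"]


products = ProductService()
rest_client = RestClient(products)
replica_client = ReplicaClient(products.events)

products.update(1, {"name": "Kettle"})
replica_client.drain_events()
products.update(1, {"name": "Electric kettle"})  # not yet consumed by the replica

fresh = rest_client.product_name(1)     # always reflects the owner's state
stale = replica_client.product_name(1)  # lags until the next drain_events()
```

The replica keeps answering even if `ProductService` is unavailable, but it returns the old name until it processes the second message. The REST client is always current, but it cannot answer at all without the owning service.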

Also, take into account that in a distributed system, things as simple as "making a copy of data in another place" get tricky. If you have one instance of application A and one instance of application B, and everything happens synchronously, you can make a true copy of all data from service A in service B and keep it in sync. But when you have N instances of A talking to M instances of B and, to make matters worse, communication is asynchronous, it's really hard to keep an up-to-date copy. For example, instance A1 may update a document, then instance A2 updates the same document, and now you have two update events racing to reach instances B1 and B2, which may receive and apply them in any order. It's a big topic, but the takeaway is that complexity grows a lot.
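One common way to cope with the A1/A2 race is to attach a version number to every update event and have each replica ignore events that are older than what it already holds. The sketch below assumes that service A can assign a strictly increasing version per document (for example from its database), which is itself a coordination problem between A's instances. `DocumentUpdated` and `Replica` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentUpdated:
    doc_id: str
    version: int  # assumption: assigned by the owning service on each write
    body: str


class Replica:
    """Local copy held by one instance of service B."""

    def __init__(self):
        self.docs = {}  # doc_id -> (version, body)

    def apply(self, event):
        current = self.docs.get(event.doc_id)
        if current is not None and current[0] >= event.version:
            return False  # stale or duplicate event: ignore it
        self.docs[event.doc_id] = (event.version, event.body)
        return True


first = DocumentUpdated("doc-1", 1, "edit from A1")
second = DocumentUpdated("doc-1", 2, "edit from A2")

b1 = Replica()
b2 = Replica()

# B1 receives the events in order, B2 receives them reversed.
for event in (first, second):
    b1.apply(event)
for event in (second, first):
    b2.apply(event)
```

Both replicas end up holding version 2, regardless of arrival order. This only covers reordering and duplicates. Lost messages, deletes, and re-syncing a replica from scratch need separate handling.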

So, here are a few concrete examples of when using each of the approaches may be the better choice:

  • You have an authorization service with information about users and their access rights. Here, REST is preferred: you want any change (such as removing a user's permissions) to be active immediately. For security reasons you also don't want to make any copies of users' personal information or credentials.
  • A service displays advertising for certain products your web shop is selling. Each product's ad contains a name, a short description and a link URL. Assuming the number of advertised products is not huge, making a copy via asynchronous communication may be preferable. Products are not updated very often and if they are, adding a delay of a few minutes before the change is visible in the ads is acceptable. Your ad server may index ads in some special format tuned for performance of ad serving, and services are loosely coupled: your ad service is not increasing the load on your main product database as much as if it were using REST requests.
  • Your shop's listing pages show the products you sell. Here, you probably want to query the product database service directly via REST. When you edit a product, you want the change to be visible on your shop's page as soon as possible. The whole product database is probably quite big so duplicating it in the frontend service would be a large hardware cost.
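For the advertising example, the "special format tuned for performance" could be as simple as a keyword index built from the product update messages. This is only a sketch: `Ad`, `AdIndex`, the keyword scheme and the `shop.example` URLs are all invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    product_id: int
    name: str
    description: str
    url: str


class AdIndex:
    """Local copy of advertised products, indexed by keyword in the name."""

    def __init__(self):
        self.ads = {}                        # product_id -> Ad
        self.by_keyword = defaultdict(set)   # keyword -> {product_id}

    @staticmethod
    def _keywords(ad):
        return set(ad.name.lower().split())

    def on_product_updated(self, ad):
        # Called for every "product updated" message taken from the queue.
        old = self.ads.get(ad.product_id)
        if old is not None:
            for word in self._keywords(old):
                self.by_keyword[word].discard(old.product_id)
        self.ads[ad.product_id] = ad
        for word in self._keywords(ad):
            self.by_keyword[word].add(ad.product_id)

    def lookup(self, word):
        ids = self.by_keyword.get(word.lower(), ())
        return sorted(self.ads[i].name for i in ids)


index = AdIndex()
index.on_product_updated(Ad(1, "Steel kettle", "Boils water", "https://shop.example/p/1"))
index.on_product_updated(Ad(2, "Glass kettle", "Boils water", "https://shop.example/p/2"))
index.on_product_updated(Ad(1, "Steel teapot", "Brews tea", "https://shop.example/p/1"))

kettles = index.lookup("kettle")
steel = index.lookup("steel")
```

Serving an ad never touches the main product database. The price is that a renamed product shows its old name until the corresponding message has been consumed.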

As you can see, there are good use cases for both approaches, and in a large system you will probably end up with a hybrid, choosing #1 or #2 for each pair of services depending on the situation.