Architecture – What are the potential problems with operational circular dependency between microservices


I am relatively new to microservice architecture, and I have never before been involved in a project where the architect insisted on having a circular dependency between services.

I am in a position where I have no choice but to design two microservices with a circular dependency. Is my instinctive objection to this justified, or am I incorrectly carrying over intuitions from other areas of software development?

One obvious issue I can see is bootstrapping: no startup order exists in which every service's dependencies are already up and running, so the services have to keep retrying to connect to each other. This does not seem so bad, though, since I need that retry logic anyway for fault tolerance.
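The retry behaviour described above could look roughly like this. It is a minimal sketch, not a prescribed implementation: `connect_with_retry`, its parameters, and the idea of passing the connection attempt in as a `connect` callable are all assumptions made for illustration.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.1, max_delay=2.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially.

    `connect` is a hypothetical callable that raises ConnectionError
    while the peer service is not yet reachable.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: let the caller or supervisor decide
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the backoff
```

With two services that depend on each other, both sides would run something like this at startup, so whichever comes up second unblocks the first.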

It also creates some issues with testing, but it seems I will be able to resolve them with test doubles.

What real risks and dangers are there in such an architecture (if any) that I need to consider?
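To illustrate the test-double approach, here is a small sketch. The names `ServiceA`, `FakeServiceBClient`, `fetch`, and `handle` are hypothetical; they only stand in for one service in the cycle and the client it uses to call the other.

```python
class FakeServiceBClient:
    """Test double standing in for the real client of service B."""

    def __init__(self, canned_response):
        self.canned_response = canned_response
        self.requests = []  # record calls so the test can inspect them

    def fetch(self, key):
        self.requests.append(key)
        return self.canned_response


class ServiceA:
    """Hypothetical service whose logic depends on service B."""

    def __init__(self, b_client):
        self.b_client = b_client  # injected, so a fake can replace it

    def handle(self, key):
        return f"A({self.b_client.fetch(key)})"


# Service A is exercised without service B running at all.
fake_b = FakeServiceBClient("b-data")
service_a = ServiceA(fake_b)
result = service_a.handle("k")
```

The same pattern applies in the other direction, with a fake client for service A injected into service B, so each half of the cycle can be tested in isolation.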

Best Answer

In another answer, I recommend against this.

The main issue I can see, besides the related bootstrapping issue, is that if one of the services in the cycle fails, they all fail. It's bad enough debugging and restoring service when one service fails because one of its upstream services failed. In that acyclic case, though, you can at least backtrack through the dependencies until you find the failing upstream dependency whose own dependencies are all still working. You can fix that dependency and then, if the original service is still not working, repeat the process; you now know that the next failing service is downstream of the one you just fixed. Keep repeating until the original service is working.
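The backtracking procedure for the acyclic case can be sketched as follows. The function name, the `depends_on` mapping, and the `is_healthy` callback are assumptions for illustration; in practice the health information would come from your monitoring.

```python
def next_service_to_fix(service, depends_on, is_healthy):
    """Walk upstream from a failing service to a failing service
    whose own dependencies are all healthy.

    Assumes the dependency graph is acyclic; with a cycle of failing
    services this loop would never terminate.
    """
    current = service
    while True:
        failing = [d for d in depends_on.get(current, []) if not is_healthy(d)]
        if not failing:
            return current  # nothing upstream is broken: fix this one
        current = failing[0]  # keep backtracking upstream


# Hypothetical topology: web -> api -> (db, cache)
depends_on = {"web": ["api"], "api": ["db", "cache"], "db": [], "cache": []}
unhealthy = {"web", "api", "db"}
first = next_service_to_fix("web", depends_on, lambda s: s not in unhealthy)

unhealthy.discard("db")  # db has been fixed, but web is still down
second = next_service_to_fix("web", depends_on, lambda s: s not in unhealthy)
```

Each call returns a well-defined place to start, and each fix moves the next candidate strictly downstream.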

With cyclic dependencies, though, this backtracking has no starting point: every service in the cycle is upstream of every other, so the dependency graph cannot tell you which service is causing the problem. If two or more services are having problems, you are in a Christmas lights (series circuit) situation: when two services have failed but you don't know which, "fixing" one gives you no information about whether it actually helped. Only once you have fixed both failing services will you know which they were.

If the issue is an operational one that can be resolved by restarting a service, say, then the quickest way to restore service is probably to restart the whole cycle. This avoids the need to isolate the problem and effectively treats the cycle as a single service, at least from a failure-management perspective. If the issue is a software defect, you are in a much worse position, because fixing it requires isolating the defect, and that is exactly what cyclic dependencies make difficult. Good monitoring and logging help significantly here, and the smaller the cycle, the less severe the problem.
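To make "restart the whole cycle" concrete, here is a rough sketch that works out which services sit in a cycle with a given one and would therefore be restarted together. `reachable`, `restart_group`, and the `depends_on` mapping are hypothetical names, and the simple reachability check is only meant for small graphs.

```python
def reachable(start, depends_on):
    """All services reachable from `start` by following dependencies."""
    seen = set()
    stack = list(depends_on.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(depends_on.get(node, []))
    return seen


def restart_group(service, depends_on):
    """Services in a dependency cycle with `service`, including itself."""
    upstream = reachable(service, depends_on)
    # A service shares a cycle with `service` if each can reach the other.
    return {service} | {s for s in upstream if service in reachable(s, depends_on)}


# Hypothetical topology: a -> b -> c -> a, and c also depends on d
depends_on = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
cycle_group = restart_group("a", depends_on)
leaf_group = restart_group("d", depends_on)
```

Here `cycle_group` contains `a`, `b`, and `c`, which is the sense in which the cycle behaves as one service for failure management, while `d` stays independently restartable.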
