Handling error messages from other services in a microservice architecture

api, error handling, microservices

Our company runs applications on a Micro Service architecture that includes thousands of services. I am working on a backend application "X" that talks to 50+ services. Frontend services call my service "X" to execute requests on other services.

Problem:

The front end wants to show user-friendly messages when something fails in other services.

  1. Other services do not return user-friendly messages. It is not possible for me to request changes from other teams as there are several.
  2. There are no agreed error codes as such. Other services return a string error message. Currently, it is passed back to the UI. Sometimes the error messages are pointer references (bad code :/)

Possible Solution:

Check for the error message string and have a mapping in my service to a user-friendly message. But things can break if the callee service changes its error message. Fall back to a default error message when a custom error mapping is not found.
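For illustration, the mapping-with-fallback idea could be sketched roughly like this. The patterns and messages are made up for the example and do not come from any real service; the point is only to show where the brittleness lives (in the patterns).

```python
import re

# Hypothetical patterns and friendly messages, invented for this sketch.
FRIENDLY_MESSAGES = [
    (re.compile(r"insufficient funds", re.IGNORECASE),
     "Your balance is too low to complete this payment."),
    (re.compile(r"timed? ?out", re.IGNORECASE),
     "The request took too long. Please try again."),
]

DEFAULT_MESSAGE = "Something went wrong. Please try again later."


def to_friendly_message(upstream_error: str) -> str:
    """Return the first matching friendly message, or the default one."""
    for pattern, friendly in FRIENDLY_MESSAGES:
        if pattern.search(upstream_error):
            return friendly
    # No mapping found (e.g. a pointer reference): fall back to the default.
    return DEFAULT_MESSAGE
```

If the callee rewords "insufficient funds", the first pattern silently stops matching and the user gets the default message, which is exactly the fragility described above.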

Any more ideas on a scalable and sustainable solution? Thanks!

Best Answer

Disclaimers

Our company runs applications on a Micro Service architecture that includes thousands of services. I am working on a backend application "X" that talks to 50+ services. Frontend services call my service "X" to execute requests on other services.

First of all, thousands of random services don't make a microservices architecture. It still takes a certain sense of a "whole" and some arrangement among services: guidelines or rules of thumb.

Contextualize the backend within the 'whole'

I assume this backend is neither a gateway nor a proxy. It has its own business and a well-defined domain. So, as far as other services are concerned, 'X' is a facade that eases access to this domain.

As a facade, hiding implementation details (integrations, for instance) is among its responsibilities. No implementation detail should reach other services, and this includes integration errors. Whatever happens inside 'X' is nobody else's business.

That doesn't mean we cannot tell the user that something went wrong. We can, but we do it by abstracting the details. We don't give the impression that something remote is failing. Quite the opposite: something in 'X' failed, and that's it.
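As a rough sketch of what "abstracting the details" could look like in code: the facade logs the upstream failure internally and raises its own error with a neutral message. The names `XError`, `UpstreamError` and `call_through_facade` are hypothetical, chosen only for this example.

```python
import logging

logger = logging.getLogger("service_x")


class UpstreamError(Exception):
    """Hypothetical error raised by X's client code for a downstream service."""


class XError(Exception):
    """Error exposed by X to its callers; carries no upstream detail."""

    def __init__(self, public_message: str):
        super().__init__(public_message)
        self.public_message = public_message


def call_through_facade(operation, *args):
    """Run an integration call and hide any upstream failure behind XError."""
    try:
        return operation(*args)
    except UpstreamError as exc:
        # The detail stays in X's own logs for whoever has to debug it.
        logger.error("upstream call failed: %s", exc)
        # 'from None' keeps the upstream exception out of the raised error.
        raise XError("X could not complete the request.") from None
```

Callers of X only ever see `XError`; which downstream service failed, and how, is visible in X's logs and nowhere else.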

Since we are speaking about thousands of possible integrations (50+ at the moment), the number of possible and different errors is significant. If we map every single one to a custom message, the end user is going to be overwhelmed by so much (and uncontextualized) information. If we map all the errors to a small set of custom errors, we distort the information, making it hard for us to track down the problem and solve it.

In my opinion, error messages should give the user the sense that there's something we can do to fix the problem.

Nevertheless, if end-users still want to know what's going on under the hood, there are better ways. For example, logs.

Accountability

  1. Other services do not return user-friendly messages. It is not possible for me to request changes from other teams as there are several. There are no agreed error codes as such.

  2. Other services return a string error message. Currently, it is passed back to the UI. Sometimes the error messages are pointer references (bad code :/)

As a developer, your responsibility is to raise these points with the stakeholders. It's a matter of accountability. In my opinion, there's a lack of technical leadership, and that's a real problem when it comes to distributed systems.

There's no technical vision. If there were, services would be built on rules of thumb aimed at making the system scalable and easing integration among services. Right now it looks like services just spring up at random.

If I were asked to do what you have been asked to do (and I have been, sometimes), I would argue that turning the current anarchy into user-friendly messages is beyond the scope of X.

At the very least, "raise your hand": state your concerns, present your alternatives, and let whoever is accountable decide.

Make your solutions valuable for the company

Check for the error message string and have a mapping in my service to a user-friendly message. But things can break if the callee service changes its error message. Fall back to a default error message when a custom error mapping is not found.

You are right. That's a weak solution. It's brittle and inefficient in the medium to long run.

I also think it causes coupling, since changes in these strings might force you to refactor the mappings. Not much of an improvement.

Any more ideas on a scalable and sustainable solution?

Reporting. Handle the errors, assign each one a code/ticket/ID, and report it. Then let the front end display the report, for instance by sharing a link to the reporting service.

Error. < A user-friendly and very default error message >. Follow the link for further information
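A minimal sketch of that flow, under stated assumptions: the report store is an in-memory stand-in for a real reporting service, and `REPORT_BASE_URL`, `ErrorReport`, `InMemoryReportStore` and `report_error` are hypothetical names invented for the example.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical base URL of the reporting service.
REPORT_BASE_URL = "https://reports.example.internal/errors"
DEFAULT_MESSAGE = "Something went wrong while processing your request."


@dataclass
class ErrorReport:
    report_id: str
    source_service: str
    raw_error: str          # stored as received, never shown in the UI payload
    created_at: datetime


class InMemoryReportStore:
    """Stand-in for the reporting service's storage."""

    def __init__(self):
        self._reports = {}

    def save(self, report: ErrorReport) -> None:
        self._reports[report.report_id] = report

    def get(self, report_id: str) -> ErrorReport:
        return self._reports[report_id]


def report_error(store, source_service: str, raw_error: str) -> dict:
    """Store the raw error and return what the front end is allowed to see."""
    report = ErrorReport(
        report_id=uuid.uuid4().hex,
        source_service=source_service,
        raw_error=raw_error,
        created_at=datetime.now(timezone.utc),
    )
    store.save(report)
    return {
        "message": DEFAULT_MESSAGE,
        "report_id": report.report_id,
        "details_url": f"{REPORT_BASE_URL}/{report.report_id}",
    }
```

The front end renders `message` and offers `details_url` as the "further information" link; the raw upstream string never leaves the report.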

This way, you can integrate as many services as you need. You also free yourself from the overhead of handling and translating random strings into new random, but user-friendly, strings.

The reporting service can be reused by the rest of the services, so if you have correlation IDs, you should be able to give users a panoramic view of the errors and their causes. In distributed architectures, traceability is quite important.

Later, the reporting service can be enhanced with as many mappings as you need to give readable and useful instructions about what to do if error X happens. If the strings change, it doesn't matter at all here: what we have (store) is the final state of the report.
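To make the "panoramic view" a bit more concrete, here is a small sketch that assumes every report carries a correlation ID propagated across services. The `ErrorReport` fields and the helper name are assumptions for the example, not an existing API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorReport:
    correlation_id: str   # assumed to be propagated with every request
    service: str
    raw_error: str


def group_by_correlation_id(reports):
    """Group reports so one user request shows all related failures together."""
    grouped = defaultdict(list)
    for report in reports:
        grouped[report.correlation_id].append(report)
    return dict(grouped)


reports = [
    ErrorReport("req-1", "X", "could not complete order"),
    ErrorReport("req-1", "billing", "0x7ffee4c0"),
    ErrorReport("req-2", "inventory", "item not found"),
]
view = group_by_correlation_id(reports)
# view["req-1"] now holds both the error seen in X and its downstream cause.
```

With this grouping, a report page for a single request can list the failure in X next to the downstream error that caused it.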

The reporting service will open the door to a possible normalization of the errors within the organization since the service will expose a public API (hence a contract).
