Rest – Microservices and data storage

Tags: api-design, architecture, microservices, rest, sharding

I'm considering moving a monolithic REST API to a microservice architecture, and I'm getting a bit confused about data storage.
As I see it, some of the benefits of microservices would be:

Horizontally scalable – I can run multiple redundant copies of a microservice to cope with load and/or a server going down.
Loosely coupled – I can change the internal implementation of one microservice without having to change the others, and I can deploy and change each one independently.

My problem is with data storage. As I see it there are several options:

1. A single database service shared by all microservices – this would seem to completely eliminate any benefit of loose coupling.
2. A locally installed database instance on each microservice – I can't see a way of horizontally scaling this, so I don't think it would be an option.
3. Each microservice has its own database service – this seems the most promising, as it preserves the benefits of loose coupling and horizontal scaling (using redundant database copies and/or sharding across several).

To me, the third option seems to be the only viable one, but it also seems incredibly heavyweight and very overengineered. If I'm understanding it right, then for a simple application with 4-5 microservices I'd have to run 16-20 servers: two instances of each microservice (in case of server failure, and for deploying without downtime) and two database service instances per microservice (again in case of server failure).

This, quite frankly, seems slightly ridiculous. 16-20 servers to run a simple API, bearing in mind that a realistic project will probably have more than 4-5 services? Is there some fundamental concept that I'm missing that will explain this?

Some things that may help while answering:

I'm the sole developer on this project, and will be for the foreseeable future.
I'm using Node.js and MongoDB, but I'd be interested in language-agnostic answers – an answer might even be that I'm just using the wrong technologies!

Best Answer

Of your three options, the first (a single, shared database) and the third (a "database service") are the most common.

The first is called an integration database. This is generally not seen as a good solution in a microservice architecture. It does add coupling to your services. It also makes it very easy for one service to simply bypass the other services and query the database directly. You could lose any data integrity or validation that is provided at the application level but not enforced at the database level.

Your third idea is called an application database. And you're right: it allows you to enforce loose coupling at the API level between services and makes it easier to scale services at the database level. It also makes it easier to replace the underlying database technology with something appropriate for each service, just as you can change the technology or other implementation details of each service. Very flexible.

However, I'd propose an intermediate solution.

Instead of standing up a database service for every microservice, stand up a schema for every service. If you are using multiple database technologies, you may need to split slightly differently. The idea is to minimize the number of database servers you run while making it very easy to split a service out onto its own database server if and when that becomes necessary. As long as you only allow a service to access its own schema, you have the advantages of an application database without the overhead of a database server for every application or service.
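
To make the schema-per-service idea a bit more concrete, here is a minimal TypeScript sketch (TypeScript only because you mentioned Node.js). The host names, the `_db` suffix and the helper names `dbTargetFor` and `connectionUri` are assumptions made up for illustration, not a prescribed layout. With MongoDB the "schema" would be a separate logical database per service on the same server.

```typescript
// Hypothetical sketch: each service owns one logical database on a shared
// server until it needs a dedicated one. All names here are illustrative.

interface DbTarget {
  host: string;     // database server the service connects to
  database: string; // logical database (schema) owned by exactly one service
}

const SHARED_HOST = "db-shared.internal"; // assumed host name

// Services that have outgrown the shared server get an explicit override.
const dedicatedHosts: Record<string, string> = {
  // orders: "db-orders.internal",
};

function dbTargetFor(
  service: string,
  overrides: Record<string, string> = dedicatedHosts
): DbTarget {
  return {
    host: overrides[service] ?? SHARED_HOST,
    database: `${service}_db`, // one database per service, never shared
  };
}

// Builds a connection string in the standard MongoDB URI form.
function connectionUri(target: DbTarget): string {
  return `mongodb://${target.host}/${target.database}`;
}
```

Moving a service to its own server is then a one-line configuration change (adding an entry to the overrides), provided no other service ever reads that database directly.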

However, as a solo developer, I would challenge the entire notion of microservices at this point in time. Martin Fowler writes about Monolith First and the Microservice Premium, Simon Brown talks about modular monoliths, and DHH talks about the Majestic Monolith. I'm not sure how well your monolith is organized, but start by refactoring and organizing it. Identify components and make clean separations between them so that pieces can be extracted into a service easily. The same goes for your database structure. Focus on a good, clean, component-based architecture that can support refactoring into services. Microservices add a lot of overhead for a single developer to build and support in operations. However, once you actually have a need to scale part of the system, use your monitoring and reporting systems to identify the bottlenecks, extract to a service, and scale as necessary.
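
A clean separation inside a monolith can be as simple as one component depending on another only through an interface. The following is a minimal TypeScript sketch of that idea; the component names (`UserDirectory`, `InvoiceMailer`) are invented for illustration and are not taken from your code.

```typescript
// Hypothetical component boundary inside a monolith. Names are illustrative.

interface User {
  id: string;
  email: string;
}

// The only surface other components are allowed to depend on.
// It is async so that the call could later become a network call.
interface UserDirectory {
  findUser(id: string): Promise<User | undefined>;
}

// In-process implementation used while everything is one deployable.
class InMemoryUserDirectory implements UserDirectory {
  private users = new Map<string, User>();

  add(user: User): void {
    this.users.set(user.id, user);
  }

  async findUser(id: string): Promise<User | undefined> {
    return this.users.get(id);
  }
}

// A second component that knows the interface, not the storage behind it.
class InvoiceMailer {
  constructor(private readonly directory: UserDirectory) {}

  async recipientFor(userId: string): Promise<string> {
    const user = await this.directory.findUser(userId);
    if (!user) throw new Error(`unknown user ${userId}`);
    return user.email;
  }
}
```

If the user component is ever extracted, only the implementation behind `UserDirectory` has to change; `InvoiceMailer` stays as it is.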

Related Solutions

Error Handling Microservices – Upstreaming Microservices Errors

Exceptions should be treated just like domain models. Each service works with its own domain models and should have its own set of exception models as well. When communicating with external systems, the service should convert external exceptions to its domain exceptions as soon as possible. Basically I'm saying go with solution #2.

Let's consider the communication from service A -> B. Service A should first of all have an interface defined to decouple the business logic from the implementation of requests to B. In your example A is an account service and B is a user service. So let's call the interface UserService. This interface would have a set of (ideally) compiler-checked exceptions.

interface UserService
    def getUser(id): User throws UserNotFoundException, UserServiceException

You should implement an HTTP client for service B so that any service that needs to depend on service B imports the common HTTP client. The error responses from requests to service B will be defined in this HTTP client component. That way they're only defined once.

class BHttpClient
    def getUserById(id) = 
        response = http.get("/users/${id}").send
        if (response.status == 404) throw new UnknownUserException
        else if (response.status == 500) throw new InternalServerException
        else return json.parse[User](response.content)

The implementation of UserService, HttpUserService, will use that HTTP client to communicate with B. It should catch HTTP and transport exceptions from the client and wrap them in the appropriate "domain" exception.

class HttpUserService(client: BHttpClient) implements UserService
    def getUser(id) = 
        try {
            client.getUserById(id)
        } catch {
            case e: UnknownUserException => throw new UserNotFoundException(e)
            case e: InternalServerException => throw new UserServiceException(e)
        }

"Cons: Service A will need to catch and expect that Server B is capable of returning a bunch of errors, adding coupling between Service A and B."

Service A will catch errors from the HTTP client in HttpUserService and wrap them in meaningful errors for service A. The business logic in service A is decoupled from service B through the UserService interface. HttpUserService is coupled to BHttpClient, but decoupled from service B because you can mock service B at the transport level.
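
For readers who prefer something runnable, here is one way the pseudocode above could look in TypeScript. It is a sketch under assumptions: the `HttpGet` transport type, the `AppError` base class, and the shape of `User` are invented for illustration, and it treats any 5xx status as a server error where the pseudocode checks only 500. TypeScript has no checked exceptions, so the domain errors are documented by convention only.

```typescript
interface User { id: string; name: string; }

// Assumed minimal transport abstraction so service B can be faked in tests.
interface HttpResponse { status: number; body: string; }
type HttpGet = (path: string) => Promise<HttpResponse>;

// Base class that keeps instanceof working even on ES5 compile targets.
class AppError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Client-level errors, defined once in the shared client for service B.
class UnknownUserException extends AppError {}
class InternalServerException extends AppError {}

// Domain-level errors owned by service A; they wrap the original error.
class UserNotFoundException extends AppError {
  constructor(readonly original: Error) { super("user not found"); }
}
class UserServiceException extends AppError {
  constructor(readonly original: Error) { super("user service failed"); }
}

class BHttpClient {
  constructor(private readonly get: HttpGet) {}

  async getUserById(id: string): Promise<User> {
    const response = await this.get(`/users/${encodeURIComponent(id)}`);
    if (response.status === 404) throw new UnknownUserException(id);
    if (response.status >= 500) {
      throw new InternalServerException(String(response.status));
    }
    return JSON.parse(response.body) as User;
  }
}

interface UserService {
  getUser(id: string): Promise<User>;
}

class HttpUserService implements UserService {
  constructor(private readonly client: BHttpClient) {}

  async getUser(id: string): Promise<User> {
    try {
      return await this.client.getUserById(id);
    } catch (e) {
      if (e instanceof UnknownUserException) throw new UserNotFoundException(e);
      // Anything else (5xx, network failure, bad JSON) becomes one domain error.
      throw new UserServiceException(e instanceof Error ? e : new Error(String(e)));
    }
  }
}
```

Because the transport is injected, a test can pass a fake `HttpGet` that returns a 404 and check that the business logic only ever sees `UserNotFoundException`.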

Even if you choose to use a different architecture like @Laiv describes in the comments, you'll still want to decouple yourself from the messages and events you receive by converting the message models and exceptions into domain models and domain exceptions in each service. I don't agree with @Laiv that it's as cut and dried as "use an asynchronous messaging architecture or you might as well build a monolith". There are still big gains to be made with a synchronous, distributed, service-oriented architecture like the one you've described. The first and hardest step in getting the architecture right is to decouple the components. By dividing into microservices early, you can more easily adopt an asynchronous approach later if you need it.

Microservices – Managing Many-to-Many Associations

First of all, I'd start with a domain description. You haven't mentioned what it is about (I can guess, but it would only be a guess). After that I would try to decompose it using value-chain analysis or business-capability mapping. And only after that would I think about implementation.

Considering your problem, the first thing that comes to mind is that you identified your service boundaries wrongly, simply because the services need each other's data. You don't want to end up with a distributed monolith, do you?

The second thing is that you probably haven't worked through your domain well enough. What concept is represented by the users table? Is it a registered user, with all the information and behavior required for registration? Are you sure it's the right concept for communicating with trackers (whatever they are)? So if I got it right, your option 2 is exactly about that: introducing the owner concept, which is much closer to your domain. If that really is so, I'm for option 2 as well.

However, it seems out of scope for microservice B. Why should microservice B care that microservice A wants to create an association?

It's all about boundaries. I guess you want to form microservices around entities. That's where SOA failed with its layered service architecture. The better approach is to create services that represent some business function, so they encapsulate both data and behavior. From a more practical point of view, it's about creating services around business processes or use cases. For example, you could have one service for user registration. It contains the user's data and the behavior required to register a user. Thus the concept of a user is formed naturally, and it belongs only to service A. And this brings me to the next point: the other way to think about services is as bounded contexts. It's good practice to align services and bounded contexts.

When a user is registered, a UserCreated event could be emitted. I guess your second service is interested in it, so upon receiving it a completely different entity could be created, say, an Owner entity (whatever that is, too). I'm pretty sure there are a lot of interesting collaborations between it and the tracker entity, so keep them in a single service.
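
As a rough illustration of that flow, here is a small TypeScript sketch. The in-memory bus stands in for whatever message broker you would actually use, and the event shape, the class names and the `assignTracker` method are assumptions made up for the example.

```typescript
// Illustrative only: an in-memory stand-in for a message broker.

interface UserCreated {
  type: "UserCreated";
  userId: string;
}

type Handler = (event: UserCreated) => void;

class InMemoryBus {
  private handlers: Handler[] = [];

  subscribe(handler: Handler): void {
    this.handlers.push(handler);
  }

  publish(event: UserCreated): void {
    this.handlers.forEach((handle) => handle(event));
  }
}

// Registration service: owns the user concept and announces new users.
class RegistrationService {
  constructor(private readonly bus: InMemoryBus) {}

  register(userId: string): void {
    // ...persist the user in this service's own storage...
    this.bus.publish({ type: "UserCreated", userId });
  }
}

// Tracking service: keeps its own Owner entity together with its trackers.
interface Owner {
  ownerId: string;
  trackerIds: string[];
}

class TrackingService {
  readonly owners = new Map<string, Owner>();

  constructor(bus: InMemoryBus) {
    bus.subscribe((event) => this.onUserCreated(event));
  }

  private onUserCreated(event: UserCreated): void {
    // Keep only the identifier; do not mirror the whole user record.
    this.owners.set(event.userId, { ownerId: event.userId, trackerIds: [] });
  }

  assignTracker(ownerId: string, trackerId: string): void {
    const owner = this.owners.get(ownerId);
    if (!owner) throw new Error(`unknown owner ${ownerId}`);
    owner.trackerIds.push(trackerId);
  }
}
```

The association between owners and trackers lives entirely inside the tracking service, so the registration service never needs to know that trackers exist.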

Be extremely cautious with option 3. If you copy data, functionality follows, and that results in tight coupling. And don't dress it up with the term CQRS: CQRS is not about data synchronization between services via events.
