I recently started moving a monolithic application to a microservices architecture using Docker containers.
The general idea of the app is:
scraping data -> format the data -> save the data to MySQL -> serve data via REST API.
I want to split each of the steps into a separate service. I think I have two options; which is the best practice in a microservices architecture here?
Option one
Scraper service – scrapes and publishes to Kafka
Formatter service – consumes messages from Kafka and formats them
API service – consumes Kafka messages, updates MySQL and exposes a REST API
Drawback: If I'm not mistaken, Docker containers should preferably run only one process per container, and the API service here would need to run both a Kafka consumer and a web server.
Option two
Scraper service – scrapes and publishes to Kafka
Formatter service – consumes messages from Kafka and formats them
Saving-to-DB service – receives the formatted data and just updates MySQL (runs as a Python process)
API service – exposes a REST API that serves requests with Python Flask.
Drawback: Two services connect to the same DB, which is supposedly not recommended because they would not be decoupled.
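For context, the formatter stage in either option boils down to a small consume–transform–produce loop. Here is a minimal sketch; the record schema, topic names, and broker address are all made up for illustration, and the Kafka wiring (which would use a library such as kafka-python) is shown only in comments so the formatting logic stays independently testable:

```python
def format_record(raw: dict) -> dict:
    """Normalize one scraped record (illustrative schema: title + price)."""
    return {
        "title": raw.get("title", "").strip(),
        "price": float(raw.get("price", 0)),
    }

# In the formatter service itself, the loop would look roughly like this
# (assumes kafka-python; topic and broker names are placeholders):
#
# import json
# from kafka import KafkaConsumer, KafkaProducer
#
# consumer = KafkaConsumer("raw-data", bootstrap_servers="kafka:9092")
# producer = KafkaProducer(bootstrap_servers="kafka:9092")
# for msg in consumer:
#     record = format_record(json.loads(msg.value))
#     producer.send("formatted-data", json.dumps(record).encode())
```

Keeping the transformation pure like this also makes it easy to unit-test the formatter without a running Kafka broker.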
What is the best practice here? Should I go with option one and run the Flask server and the Kafka listener in the same container?
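If option one is chosen, the usual pattern is to run the Kafka consumer in a background thread of the same process as the web server, so the container still runs a single process. A stdlib-only sketch of that pattern follows; a `queue.Queue` stands in for the Kafka topic and a list for MySQL, and in the real service the main thread would run Flask's `app.run()` instead of the demonstration code at the bottom:

```python
import queue
import threading

messages = queue.Queue()  # stands in for the Kafka topic
store = []                # stands in for MySQL

def consume_loop():
    # In the real service this would be `for msg in KafkaConsumer(...)`.
    while True:
        msg = messages.get()
        if msg is None:       # sentinel used here to stop the demo cleanly
            break
        store.append(msg)     # real code would INSERT/UPDATE MySQL instead

# Start the consumer as a background thread; the main thread is then
# free to serve the REST API (e.g. Flask's app.run()) in the same process.
worker = threading.Thread(target=consume_loop, daemon=True)
worker.start()

messages.put({"id": 1, "title": "Widget"})
messages.put(None)
worker.join()
```

One process, two responsibilities: this keeps the one-process-per-container guideline intact while letting the API service both consume and serve.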
Thanks!
Best Answer
I would suggest something along the following lines.
The concept of eventual consistency comes into play here. You can spin up as many read replicas and API containers as you need to meet demand, at the cost of them sometimes returning different (stale) data. At some point the replica DBs are refreshed and the API starts serving the newest data. This way, writing new data doesn't bottleneck the response times of your reads.
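Concretely, this means the writer service talks only to the primary while the API containers spread their reads across replicas. A minimal routing sketch, where all DSNs and names are placeholders for illustration:

```python
import itertools

PRIMARY_DSN = "mysql://primary:3306/app"   # only the writer service uses this
REPLICA_DSNS = [                            # API containers read from these
    "mysql://replica-1:3306/app",
    "mysql://replica-2:3306/app",
]

_replica_cycle = itertools.cycle(REPLICA_DSNS)

def dsn_for(operation: str) -> str:
    """Route writes to the primary; round-robin reads across replicas."""
    if operation == "write":
        return PRIMARY_DSN
    return next(_replica_cycle)
```

Reads served from replicas may briefly lag the primary, which is exactly the eventual-consistency trade-off described above.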