Database Agnostic DAO – Combining NoSQL and SQL

dao database java8 MySQL nosql

Background

While writing a new component, I am in the middle of choosing between an SQL and a NoSQL database (MySQL vs. MongoDB) for my storage layer. As of today, MySQL seems to be a perfect fit for my use case (6-7 domain entities, closely related to each other). Still, I want to keep my integration with the data layer abstract enough to switch to a NoSQL store (MongoDB) in the future.

While trying to build this abstract data access layer, I feel I am giving up much of what an RDBMS offers: since NoSQL doesn't support joins as a first-class construct, I cannot afford to expose joins and other prominent RDBMS features as part of this abstraction.

Question:

Is it overkill to build this level of abstraction in the first place? Is it even possible to build it without compromising on what the RDBMS offers? If so, what are the recommended patterns?

Best Answer

The best way to stay reasonably decoupled from the database while remaining free to use any of its features is to not create an abstraction layer for the database at all. (Unless you have an explicit requirement right now to support multiple databases. Otherwise, YAGNI.)

The worst thing one can do is to try to stay "database agnostic". This almost automatically results in "common denominator" interfaces, usually trivial CRUD operations. Then you either can't use any specific feature of your storage backend (which is stupid considering what awesome features databases have nowadays, not to mention completely different paradigms), or you have to constantly introduce new methods for specific features or queries. Even worse, because you don't want this abstraction to "explode", you will be more or less forced to reuse existing methods for new requirements, which will be ill-fitting and painful.

The alternative is to model your domain and provide database-specific implementations where it makes sense. One example I came across: we had a requirement to freeze all credit cards of a customer (banking domain). This was initially implemented with an ORM that had multiple connected entities (data objects with the usual 1-1/1-n relations). We had to issue a query for accounts, then for cards, set flags on the cards, and let the ORM deal with persisting.

Instead of all that, I introduced a method Customer.freezeCreditCards(), which fired an "update" statement directly at the database. While that's not a particularly exciting operation, it shows that if the business method lives where it makes sense (where its data is), it is trivial to use any optimization or extra feature you require. And you don't have to abstract or generalize features.
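A minimal sketch of that idea, not the original project's code. It assumes a hypothetical `cards` table with `customer_id` and `frozen` columns; the names `Customer`, `JdbcCustomer` and `InMemoryCustomer` are made up for illustration. The domain exposes one business method, and each backend implements it in whatever way suits that backend.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

// Domain-level contract: callers see a business operation, not CRUD.
interface Customer {
    /** Freezes all cards of this customer and returns how many were changed. */
    int freezeCreditCards();
}

// Relational implementation: one UPDATE statement, no entities loaded.
// Table and column names are assumptions for illustration.
class JdbcCustomer implements Customer {
    private final Connection connection;
    private final long customerId;

    JdbcCustomer(Connection connection, long customerId) {
        this.connection = connection;
        this.customerId = customerId;
    }

    @Override
    public int freezeCreditCards() {
        String sql = "UPDATE cards SET frozen = TRUE WHERE customer_id = ? AND frozen = FALSE";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setLong(1, customerId);
            return statement.executeUpdate(); // number of rows updated
        } catch (SQLException e) {
            throw new IllegalStateException("Could not freeze cards of customer " + customerId, e);
        }
    }
}

// In-memory implementation, useful for tests; same contract, different storage.
class InMemoryCustomer implements Customer {
    private final Map<String, Boolean> frozenByCardNumber = new HashMap<>();

    void addCard(String cardNumber) { frozenByCardNumber.put(cardNumber, false); }

    boolean isFrozen(String cardNumber) {
        return frozenByCardNumber.getOrDefault(cardNumber, false);
    }

    @Override
    public int freezeCreditCards() {
        int changed = 0;
        for (Map.Entry<String, Boolean> entry : frozenByCardNumber.entrySet()) {
            if (!entry.getValue()) {
                entry.setValue(true);
                changed++;
            }
        }
        return changed;
    }
}
```

Calling code only ever sees `Customer.freezeCreditCards()`. If the storage changes later, a new implementation of the same method replaces the old one, and nothing had to be generalized up front.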

Related Solutions

Java – Implementing a NoSQL and RDBMS compatible DAO

Your architecture should look something like this:

Data source <--> Data source driver <--> CRUD interface <--> Application

You'll need one data source driver implementation for each type of data source. The CRUD interface will be the same regardless of data source.

Note that you can have more than just CRUD methods. You might want to return collections, for example. Just make sure that whatever you do on the Data Source side is translatable to your API on the CRUD side for all possible data sources.
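As a rough sketch of that layering, assuming nothing beyond the Java standard library: `CrudStore`, `InMemoryStore` and `UserDirectory` are hypothetical names, and the in-memory driver stands in for what would be a MySQL or MongoDB driver in a real system.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// The CRUD interface the application codes against.
interface CrudStore<K, V> {
    void save(K id, V value);       // create or update
    Optional<V> findById(K id);     // read one
    List<V> findAll();              // more than plain CRUD: return a collection
    boolean delete(K id);           // true if something was removed
}

// One driver per data source. This one keeps everything in memory;
// a relational or document-store driver would implement the same interface.
class InMemoryStore<K, V> implements CrudStore<K, V> {
    private final Map<K, V> rows = new LinkedHashMap<>();

    @Override public void save(K id, V value) { rows.put(id, value); }
    @Override public Optional<V> findById(K id) { return Optional.ofNullable(rows.get(id)); }
    @Override public List<V> findAll() { return new ArrayList<>(rows.values()); }
    @Override public boolean delete(K id) { return rows.remove(id) != null; }
}

// Application code depends on the interface only, never on a concrete driver.
class UserDirectory {
    private final CrudStore<String, String> store;

    UserDirectory(CrudStore<String, String> store) { this.store = store; }

    void register(String login, String displayName) { store.save(login, displayName); }

    String displayNameOf(String login) { return store.findById(login).orElse("unknown"); }

    int size() { return store.findAll().size(); }
}
```

Every method on `CrudStore` has to be implementable by every driver, which is exactly the translatability constraint described above.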

When to use a NoSQL database such as MongoDB over MySQL

General Uses

If you have data structures that are not clearly defined at the time you build the system. I tend to keep user settings in NoSQL, for example. Another example was a system where the users needed to be able to add fields at runtime: very painful in an RDBMS and a breeze in NoSQL.
If your model structure is largely centered around one or a few model objects and most relationships are actually child objects of the main model objects. In this case you will find that you have fairly little need for actual joins. I found that a contact management system can be implemented quite nicely in NoSQL, for example. A person can have multiple addresses, phones and e-mails. Instead of putting each of them into a separate table, they all become part of the same model and you have one person object.
If you want to benefit from clustering your data across multiple servers rather than having one monolithic server, which is what an RDBMS commonly requires.
Caching. Even if you want to stick with an RDBMS as your main database, it can be useful to use a NoSQL database for caching query results or keeping data such as counters.
Storing documents. If you want to store coherent documents in a database, some of the NoSQL databases (such as MongoDB) actually specialize in storing those.
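The contact management case from the list above could be modeled roughly like this. It is a sketch only: `Person` and `toDocument()` are hypothetical names, and the nested map merely stands in for the document a store such as MongoDB would persist. No real driver API is used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One aggregate instead of four tables: the child collections live inside the person.
class Person {
    final String name;
    final List<String> addresses = new ArrayList<>();
    final List<String> phones = new ArrayList<>();
    final List<String> emails = new ArrayList<>();

    Person(String name) { this.name = name; }

    // Builds a nested, document-shaped structure; child lists are copied
    // so later changes to the person do not alter an already built document.
    Map<String, Object> toDocument() {
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("name", name);
        document.put("addresses", new ArrayList<>(addresses));
        document.put("phones", new ArrayList<>(phones));
        document.put("emails", new ArrayList<>(emails));
        return document;
    }
}
```

Reading a person back is then a single lookup by key, with no joins against address, phone or e-mail tables.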

What about joins?

Honestly, the no-join thing sounded quite scary to me too in the beginning. But the trick is to stop thinking in SQL. You have to think in terms of the objects you have in memory when your application is running. These should more or less just be saved into the NoSQL database as they are.

Because you can store your full object graph, with child objects, most of the need for joins is eliminated. And if you find you need one, you will have to bite the bullet and fetch both objects and join in your application code.
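Joining in application code can look like the following sketch. The `Author` and `Post` types are made up for illustration, and the two lists stand in for the results of two separate fetches from the store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Author {
    final String id;
    final String name;
    Author(String id, String name) { this.id = id; this.name = name; }
}

class Post {
    final String title;
    final String authorId; // reference to Author.id, resolved by the application
    Post(String title, String authorId) { this.title = title; this.authorId = authorId; }
}

class ApplicationJoin {
    // A simple hash join: index one side by key, then look up while scanning the other.
    static List<String> titlesWithAuthorNames(List<Author> authors, List<Post> posts) {
        Map<String, String> nameById = new HashMap<>();
        for (Author author : authors) {
            nameById.put(author.id, author.name);
        }
        List<String> joined = new ArrayList<>();
        for (Post post : posts) {
            String authorName = nameById.get(post.authorId);
            if (authorName != null) { // inner-join semantics: skip posts without a known author
                joined.add(post.title + " by " + authorName);
            }
        }
        return joined;
    }
}
```

This costs two round trips to the store instead of one query, which is why it is the fallback rather than the default.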

Luckily, most drivers can do the joining for you, if you set up your schema right.

For further reading I actually recommend Martin Fowler.
