Database – Normalisation Vs Join , trade-off

databasemongodbMySQLnormalizationpostgres

In RDBMS, goals of Normalization:

  • Free the database of modification anomalies

  • Minimize re-design when extending

  • Avoid bias toward any particular access pattern

First step involves avoiding redundancy etc…

As data is normalised and divided in multiple tables, join operations are required to satisfy queries placed by application. join operation is costly.

With relational model, there is an issue of object mismatch i.e., gap between real world(application access) and relational world.


On the contrast, Document databases(like MongoDB) do not have object mismatch issue.

The single most important factor in designing the database schema with document database, is about, matching the data access patterns of your application. So, database design is pretty easy.

They also support transactions, now

De-normalisation issue still gets carried in Document database, unless you link data among collections and map/reduce on application access


Question:

Why document oriented database is yet to replace relational model database?

Best Answer

Well, first of all, normalization in a relational database is not an absolute goal. There are certainly situations where maintaining a denormalized table or column does make sense from a performance perspective.

Document databases, for the most part, don't have joins at all. Instead, you must write code that simulates a join, if you need one. The performance cost of a join in a relational database is more than made up for by the convenience it provides, especially if the fields that participate in the join are properly indexed. If you're going to need a join anyway, you might as well do it in a system that is specifically designed to work with joins.

Relational databases benefit from decades of research and refinement, and are supported by a well-developed body of knowledge from a robust community of users. Despite the object-relational mismatch, relational databases are nevertheless well-suited for business applications, which comprise at least 90 percent of the active software development in industry.

The so-called scalability limitations of relational databases are frequently overstated; such databases can be terabytes in size and support billions of records. You only need a big-data solution when your data sizes get into the hundreds of terabytes or petabytes range. A relational database can still handle the job 95 percent of the time.

Related Topic