Java – Best practices for retrieving data scattered over multiple tables

hibernatejavajpa

In the company I work with, we have a 3-layer architecture in our micro-services and the flow is like this:

Repository/DAO (entity) => Service (entity) => Controller (dto)

At the Controller level we map our entities to DTOs,

Now I am working on a search feature, and I need to perform a query with Spring Data JPA / QueryDSL that spans (joins) multiple entities (tables) in the database and must return only the fields needed to the UI.

For this purpose in the repository I create a query that return a DTO representing data needed by the UI.

And the query works perfectly and return the correct DTO,

However, some guys in my team told me that the DTOs are the Controller responsibility and they must be sent directly over the network and not through the app layers.

If I go with a different name like FooView or FooProjection should I :

  1. make a extra mapping View => DTO in the controller layer ?
  2. send it directly as FooView?

Or I shoud stick with my DTO at DAO layer ?

Best Answer

There are two questions in your question. Question 1: How to fetch scattered data accross tables?

Question 2: "some guys in my team told me that the DTOs are the Controller responsibility and they must be sent directly over the network and not through the app layers"

Question 1 Answer: Since you are using hibernate I would focus on the pitfalls this creates. When using ORM very often people forget they are implicitly using SQL. A good aprach is to think backwards , what SQL you would like to have and then map it accordingly via ORM.

When collecting scattered data one major issue that may arrise from ORM usage would be effectivly joining two many tables which may have 2 effects:

  • In case of Inner join, joining 4-5 tables especialy if nested like table A join table B join table C where you have one to many A to B and one to many B to C would cause significant data duplication
  • In case of MULTIPLE LEFT JOIN you have the potential of creating a problem known as Cartesian product.

So you more or less are left with couple of appriaches which can be used in different scenarios:

  1. Lets say that you have a complex business process that may reuse the same data again and again within the same transaction. In this particular case you should not worry you are fetching to much as the overal gain you would have via Level 1 cache by far will surpass the negative effects of fetching too much. Level 1 caching within a transaction is very effective when dealing with repeatable reads.

  2. Alternative aproach would be to write queries in order to fetch the scattered data accross the tables. There are some PROS and some CONS with this aproach.

PROS:

  • You fetch only what you need.
  • Performance is OK

CONS:

-The level of granularity is very fine grained. You may end up supporting a lot of queries

  • The queries avoid Level 1 caching hibernate presents.
  • Each query may cause flushing which may be negative when dealing with complex process / business transaction
  1. Resolving the cartesian product issue if you are using activly your entities instead of your queries. You can use Collection Batching, or you can define how many levels your hibernate is allowed to fetch the collections. Another aproach here would be to use Fetch Strategy SELECT instead of JOIN. Such fetch strategy would be also more effective when it comes to hitting the cache because internaly hibernate will fetch aways by ID.

Now when you decide your aproach if it will be more ORM or if it will be more Query oriented. On top you can model your DTOs but this is secondary.

Hibernate also have a feature called projections so it may be possible to directly use the projections in your DTO later.

Question 2 Answer:

This answer presumes that you have dealt with fetching the scattered data via answer 1. In a big enough system there are well defined benefits to split your persistence model from your domain model even from your DTO model. Not every application is big and complex enough though to justify having all the three layers and all the boilerplate code that comes with it. Very often when doing a simple app a person measures the time and the resource you need to invest in order to keep all these layers. For a small enough application it may be justified to return your entities and hibernate projections as DTOs.

So the answer to your question is:

It depends.

And by the way there is no such thing as DTO in the DAO layer :) But for small application it may be justified to re-use the objects accross the logical layers.