As you see, the solution might not be very satisfying. The problem of data access at all is, that we have to consider many different aspects which make code not 100% reusable. It's about transactions and locking, and the way e.g. ORM maps a relational database to objects and vice versa.
Mostly if you try to write architecture components, trying to solve the problem you'll soon notice, that you just move the problem to another place. The reason is simply that a bottom-top approach as your question is, is a bad idea.
What we want is an agile top-down strategy. You will see, that most of the accesses are read-only, which (nearly) means no locking and no transactions. Some writes are field updates which must not be synchronized and a few inserts into different tables might need locking and should be in a transaction.
Now to your variants:
Move all methods, even slightly different, as a separate methods into DAO class. This makes DAO complex, their method names will be very long, it would be hard to find correct one.
violates OCP. there should be one entity/table-definition and querybuilding components around it but only depending on the entity itself. That means no large accessor classes.
Forget about fetch joins. Setup application to perform loading when needed (lazy loading). This will reduce number of methods, but database access will be much less effective and transaction must span across entire request, not just controller code.
lazy loading is cool for code which should be more readable focused than performant, e.g. backends with small lists and a few administrators. Not a good idea for uncached permanently accessed frontend pages.
Put complex logic into method when needed. E.g. supply boolean parameters fetchSomething. And method will tweak query according to those parameters. Method implementation will be very complex as I would have to either build query string from pieces or use very cumbersome Criteria API. Method will be hard to read and understand.
parameters like offset and limit are okay, but boolean parameters whether to load sparse or full is too much overhead in code, which might leads to magic behavior.
Leave as is. I generally like this approach. It leads to some code duplication, but not very much and queries are declarative so I don't see much problems with that. The main problem is lack of testability. To test controller I would need to prepare a database, because there's no (or I didn't found) complex JPA mock implementation with supplied in-memory data.
For integration tests yes, you need to save some data to your database (you can use e.g. sqlite). For unit tests no, you mock the service layer according to your controller input and test how controllers behave.
I wonder what approach used in complex projects? I saw 4-th approach and I saw once very complex in-house ORM based on Hibernate.
Big projects actually use service orientated setups on the highest level and use ORM or plain queries according to their needs. There is no one and only solution.
I'm talking now about so-called "repository pattern". I don't really understand what people mean by "service layer" and never saw any need with that, so currently the question is about data repository classes.
IO-Applications are nearly equal. There is a request and a response. The controller is responsible for collecting data from the request to call a service which produces data which will be collected and returned. E.g. business logic and all the use-cases of the software are 100% part of the service layer. Unit-testing the controller by mocking the services is much easier.
"Repository pattern":
I've spent a lot of time on Doctrine and PHP and over the last years I dropped the repositories, because 1% are simply repository operations like a list of countries ordered by name, 80% are more complex DQL or querybuilder queries, also with flat results and the rest are SQL queries I cannot write with DQL.
Result of most of the queries are table with columns from different joined tables. ORM would not makes sense at all, thus I don't use repositories anymore, but a clean service layer and a couple of data access services.
Best Answer
There are two questions in your question. Question 1: How to fetch scattered data accross tables?
Question 2: "some guys in my team told me that the DTOs are the Controller responsibility and they must be sent directly over the network and not through the app layers"
Question 1 Answer: Since you are using hibernate I would focus on the pitfalls this creates. When using ORM very often people forget they are implicitly using SQL. A good aprach is to think backwards , what SQL you would like to have and then map it accordingly via ORM.
When collecting scattered data one major issue that may arrise from ORM usage would be effectivly joining two many tables which may have 2 effects:
So you more or less are left with couple of appriaches which can be used in different scenarios:
Lets say that you have a complex business process that may reuse the same data again and again within the same transaction. In this particular case you should not worry you are fetching to much as the overal gain you would have via Level 1 cache by far will surpass the negative effects of fetching too much. Level 1 caching within a transaction is very effective when dealing with repeatable reads.
Alternative aproach would be to write queries in order to fetch the scattered data accross the tables. There are some PROS and some CONS with this aproach.
PROS:
CONS:
-The level of granularity is very fine grained. You may end up supporting a lot of queries
Now when you decide your aproach if it will be more ORM or if it will be more Query oriented. On top you can model your DTOs but this is secondary.
Hibernate also have a feature called projections so it may be possible to directly use the projections in your DTO later.
Question 2 Answer:
This answer presumes that you have dealt with fetching the scattered data via answer 1. In a big enough system there are well defined benefits to split your persistence model from your domain model even from your DTO model. Not every application is big and complex enough though to justify having all the three layers and all the boilerplate code that comes with it. Very often when doing a simple app a person measures the time and the resource you need to invest in order to keep all these layers. For a small enough application it may be justified to return your entities and hibernate projections as DTOs.
So the answer to your question is:
And by the way there is no such thing as DTO in the DAO layer :) But for small application it may be justified to re-use the objects accross the logical layers.