Java – SQL RDBMS : one query or multiple calls

daojavasql

After looking around the internet, I decided to create DAOs that returned objects (POJOs) to the calling business logic function/method.

For example: a Customer object with a Address reference would be split in the RDBMS into two tables; Customer and ADDRESS. The CustomerDAO would be in charge of joining the data from the two tables and create both an Address POJO and Customer POJO adding the address to the customer object. Finally return the fulll Customer POJO.

Simple, however, now i am at a point where i need to join three or four tables and each representing an attribute or list of attributes for the resulting POJO. The sql will include a group by but i will still result with multiple rows for the same pojo, because some of the tables are joining a one to many relationship. My app code will now have to loop through all the rows trying to figure out if the rows are the same with different attributes or if the record should be a new POJO.

Should I continue to create my daos using this technique or break up my Pojo creation into multiple db calls to make the code easier to understand and maintain?

Best Answer

You should put correctness first. Create your data structures so that they model the domain in question in a correct and effective way that makes it easy for your code to work with.

Beyond that, try to minimize database calls, especially if the database is not local (residing on the same machine as the program calling it). Network latency is a real consideration here, and it can be non-trivial.

Let's say you have an operation that requires 10 database calls. If your network latency is 100 ms, this operation will take 1 second of pure overhead just communicating with the server, in addition to whatever amount of time it takes to actually do the work involved. If your latency is 1 second, it will waste 10 seconds on network latency alone. But if you get that down from 10 calls to 1, suddenly even in really ugly latency conditions, you're not wasting much time on network overhead.

As a general rule of thumb, if you're just retrieving data simply (and not doing heavy processing of the data inside the database server or on the client), the biggest bottleneck by far in a system with a non-local database will be network latency. So if you can reduce the number of calls, even if it means you need to do extra work on the client side once you've retrieved it, you'll still probably come out ahead.

As always, remember the most important rule of optimization: measure first! Optimize by hard data, not by rules of thumb like the one I just described, or you could easily end up doing a lot of hard work that slows things down! But in general, keeping the number of queries down is usually the best route.

Related Topic