Java – SQL RDBMS : one query or multiple calls

dao java sql

After looking around the internet, I decided to create DAOs that returned objects (POJOs) to the calling business logic function/method.

For example: a Customer object with an Address reference would be split in the RDBMS into two tables, Customer and Address. The CustomerDAO would be in charge of joining the data from the two tables, creating both an Address POJO and a Customer POJO, and adding the address to the customer object. Finally, it returns the full Customer POJO.

Simple. However, I am now at a point where I need to join three or four tables, each representing an attribute or list of attributes of the resulting POJO. The SQL will include a GROUP BY, but I will still end up with multiple rows for the same POJO, because some of the joins follow a one-to-many relationship. My app code will now have to loop through all the rows, trying to figure out whether a row is the same POJO with different attributes or should become a new POJO.

Should I continue to create my DAOs using this technique, or break up my POJO creation into multiple DB calls to make the code easier to understand and maintain?

Best Answer

You should put correctness first. Create your data structures so that they model the domain in question in a correct and effective way that makes it easy for your code to work with.

Beyond that, try to minimize database calls, especially if the database is not local (residing on the same machine as the program calling it). Network latency is a real consideration here, and it can be non-trivial.

Let's say you have an operation that requires 10 database calls. If your network latency is 100 ms, this operation will take 1 second of pure overhead just communicating with the server, in addition to whatever amount of time it takes to actually do the work involved. If your latency is 1 second, it will waste 10 seconds on network latency alone. But if you get that down from 10 calls to 1, suddenly even in really ugly latency conditions, you're not wasting much time on network overhead.
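The arithmetic above can be written as a tiny helper. This is only a rough model that assumes each call costs one sequential round trip and ignores query execution time; the class and method names are made up for illustration.

```java
// Rough model: pure network overhead = number of round trips x latency per trip.
// Assumes calls are issued one after another (no pipelining or batching).
final class LatencyOverhead {
    static long overheadMillis(int calls, long latencyMillis) {
        return (long) calls * latencyMillis;
    }

    public static void main(String[] args) {
        System.out.println(overheadMillis(10, 100));   // 10 calls at 100 ms each
        System.out.println(overheadMillis(10, 1000));  // 10 calls at 1 s each
        System.out.println(overheadMillis(1, 1000));   // 1 call at 1 s
    }
}
```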

As a general rule of thumb, if you're just retrieving data simply (and not doing heavy processing of the data inside the database server or on the client), the biggest bottleneck by far in a system with a non-local database will be network latency. So if you can reduce the number of calls, even if it means you need to do extra work on the client side once you've retrieved it, you'll still probably come out ahead.
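The "extra work on the client side" is often small. Here is a minimal sketch of folding the rows of a single one-to-many join into one POJO per customer, keyed by ID. The JoinedRow shape and the order IDs are assumptions for illustration (the question does not say which one-to-many tables are involved); in a real DAO the rows would come from a JDBC ResultSet rather than a list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One flattened row of a hypothetical "customer LEFT JOIN customer_order" result.
final class JoinedRow {
    final long customerId;
    final String customerName;
    final Long orderId; // null when the customer has no orders

    JoinedRow(long customerId, String customerName, Long orderId) {
        this.customerId = customerId;
        this.customerName = customerName;
        this.orderId = orderId;
    }
}

final class Customer {
    final long id;
    final String name;
    final List<Long> orderIds = new ArrayList<>();

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class CustomerRowFolder {
    // Builds one Customer per distinct customerId, in first-seen order.
    // Repeated rows for the same customer only contribute their order ID.
    static List<Customer> fold(List<JoinedRow> rows) {
        Map<Long, Customer> byId = new LinkedHashMap<>();
        for (JoinedRow row : rows) {
            Customer customer = byId.computeIfAbsent(
                    row.customerId, id -> new Customer(id, row.customerName));
            if (row.orderId != null) {
                customer.orderIds.add(row.orderId);
            }
        }
        return new ArrayList<>(byId.values());
    }
}
```

Keying on the primary key means the loop never has to compare rows with each other to decide whether a new POJO is needed; the map lookup answers that question.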

As always, remember the most important rule of optimization: measure first! Optimize by hard data, not by rules of thumb like the one I just described, or you could easily end up doing a lot of hard work that slows things down! But in general, keeping the number of queries down is usually the best route.

Related Solutions

Database – Using a single table for identity and metadata

You are designing your system in the incorrect order. You need to develop your business objects first. I know you are developing your DB first because you are trying to impose patterns on it that belong in your business objects.

You are also making the common mistake of thinking that there can only be one ID because you are thinking in terms of primary keys.

Also, it is not clear to me that those things are as related as you think they are.

Assuming they are a related pattern, you should define interfaces to express that: IHasBusinessEntityId, ICreatedDate, IModifiedDate. Please do not forget the interface segregation principle. Then perhaps you can aggregate those interfaces into an IAuditable interface?
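In Java, that idea might look roughly like the sketch below. The interface names are taken from the answer; the getter names, the Invoice class, and the AuditReport helper are hypothetical and only there to show the shape.

```java
import java.time.Instant;
import java.util.UUID;

// Small, single-purpose interfaces (interface segregation).
interface IHasBusinessEntityId { UUID getBusinessEntityId(); }
interface ICreatedDate { Instant getCreatedDate(); }
interface IModifiedDate { Instant getModifiedDate(); }

// Aggregate interface for objects that carry all three pieces of metadata.
interface IAuditable extends IHasBusinessEntityId, ICreatedDate, IModifiedDate { }

// Hypothetical business object implementing the aggregate.
final class Invoice implements IAuditable {
    private final UUID businessEntityId;
    private final Instant createdDate;
    private final Instant modifiedDate;

    Invoice(UUID businessEntityId, Instant createdDate, Instant modifiedDate) {
        this.businessEntityId = businessEntityId;
        this.createdDate = createdDate;
        this.modifiedDate = modifiedDate;
    }

    @Override public UUID getBusinessEntityId() { return businessEntityId; }
    @Override public Instant getCreatedDate() { return createdDate; }
    @Override public Instant getModifiedDate() { return modifiedDate; }
}

final class AuditReport {
    // Needs only the creation date, so it asks for ICreatedDate, not IAuditable.
    static boolean createdBefore(ICreatedDate item, Instant cutoff) {
        return item.getCreatedDate().isBefore(cutoff);
    }
}
```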

Explore the patterns in your business objects first, then you can concentrate on making your table structure fast and efficient.

Each of your tables can have the BusinessEntityId (perhaps a guid, maybe ints created by a SEQUENCE) and, in addition, its own sequential int identifier to use as the clustered index (guids are not great for clustered indexes). You can then use a SQL UNION to bring back all your IAuditable objects from disparate tables.

I would very much recommend against marrying all these to one table, as it will cost you flexibility and cause performance problems.

This approach will result in clustered index fragmentation, which can drastically slow queries.

Database – How would you design a user database with custom fields

Please consider this as an alternative. The previous two examples will both require schema changes as the application's scope grows. In addition, the "custom_column" solution is difficult to extend and maintain. Eventually you'll end up with Custom_510, and then just imagine how awful this table will be to work with.

First let's use your Companies schema.

[Companies] CompanyId, COMPANY_NAME, CREATED_ON

Next we'll also use your Users schema for top level required attributes that will be used/shared by all companies.

[Users] UserId, COMPANY_ID, FIRST_NAME, LAST_NAME, EMAIL, CREATED_ON

Next we build a table that defines the dynamic attributes specific to each company's custom user attributes. For example, a value of the Attribute column might be "LikeMusic":

[UserAttributeDefinition] UserAttributeDefinitionId, CompanyId, Attribute

Next we define a UserAttributes table that will hold the user attribute values:

[UserAttributes] UserAttributeDefinitionId, UserId, Value

This can be modified in many ways to improve performance. You can use multiple tables for UserAttributes, making each one specific to the data type stored in Value, or just leave Value as a VarChar and work with it as a key-value store.

You may also want to move CompanyId off the UserAttributeDefinition table and into a cross-reference table for future-proofing.
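To show how the two attribute tables fit together on the application side, here is a minimal in-memory sketch of the VarChar key-value variant. The class and field names mirror the table and column names above; the lookup helper is hypothetical, and in practice the two lists would be loaded by a query joining the tables.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Mirrors a row of [UserAttributeDefinition].
final class UserAttributeDefinition {
    final long userAttributeDefinitionId;
    final long companyId;
    final String attribute;

    UserAttributeDefinition(long userAttributeDefinitionId, long companyId, String attribute) {
        this.userAttributeDefinitionId = userAttributeDefinitionId;
        this.companyId = companyId;
        this.attribute = attribute;
    }
}

// Mirrors a row of [UserAttributes]; Value is kept as a string (VarChar).
final class UserAttribute {
    final long userAttributeDefinitionId;
    final long userId;
    final String value;

    UserAttribute(long userAttributeDefinitionId, long userId, String value) {
        this.userAttributeDefinitionId = userAttributeDefinitionId;
        this.userId = userId;
        this.value = value;
    }
}

final class UserAttributeLookup {
    // Returns attribute name -> value for one user, skipping values whose
    // definition is unknown.
    static Map<String, String> attributesFor(long userId,
                                             List<UserAttributeDefinition> definitions,
                                             List<UserAttribute> values) {
        Map<Long, String> nameByDefinitionId = new HashMap<>();
        for (UserAttributeDefinition d : definitions) {
            nameByDefinitionId.put(d.userAttributeDefinitionId, d.attribute);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (UserAttribute v : values) {
            if (v.userId != userId) {
                continue;
            }
            String name = nameByDefinitionId.get(v.userAttributeDefinitionId);
            if (name != null) {
                result.put(name, v.value);
            }
        }
        return result;
    }
}
```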
