Java application – How to optimize database calls and space & time which is more important

databasedesignjavaperformance

I am writing a simple Java app to run weekly. The app need call database to get data, check it and update.

The flow I need is little as following:

select configure,orgID where status=true from orgs;
- orgs has thousands of rows, configure is blob
Check blob object configure, and filtered the orgID list
for filtered orgID list, select * from users where status=true and orgID in (orglist)
- users is a huge table. for each orgID, there can be as much as 400k users.
for users information, we update one column of all these users

I have couple questions:

when we select configure this field is blob and it can be as large as 1k. Is it good practice to get thousands of rows at one time? or it is better to make multiple db call? time and space, which is more important?
users table has millions rows, we need update perhaps 1 column for 1 million rows. what is a good practice to make this update? is it better to make 1 million db call? or as little db calls as possible?

Best Answer

when we select configure this field is blob and it can be as large as 1k. Is it good practice to get thousands of rows at one time? or it is better to make multiple db call? time and space, which is more important?

This is totally dependent on the application. The question is: does it meet your performance requirements?

For a simple app that runs once a week, use of time and memory is probably not very important at all. I would write the simple version (get all at once in a single query) first. As long as it performs fine, I wouldn't worry about it. And 1K times several thousand is probably not going to cause a problem.

If you experience any performance problems, only then should you bother to modify it to use multiple DB calls.

users table has millions rows, we need update perhaps 1 column for 1 million rows. what is a good practice to make this update? is it better to make 1 million db call? or as little db calls as possible?

As above, your performance requirements should be driving such decisions. But in this case, making millions of DB calls is more likely to be problematic. And in addition, a single update that modifies many rows at once is typically easy to write.

So, I would aim for a single update statement that updates everything, or at most one update statement per orgID.

Related Solutions

Database Security – How to Handle Database Security from a Desktop Application

I'm afraid adding a Web Service layer is probably the correct solution to your problem.

Separating the client from the underlying database implementation will probably help you in the long run too.

Adding a web service layer doesn't necessarily have to hurt performance...

Indeed, with an appropriate API, a web service can actually improve performance, by batching together multiple database queries within the data center LAN, rather than requiring multiple round trips over the WAN.

And of course a web service layer can often be scaled horizontally, and add appropriate caching to your database queries, perhaps even a change notification mechanism.

A server layer adds security that you cannot possibly ensure with apps running on a remote client. Anything that runs on a client can be "hacked" and should not really be considered in any way trusted. You should only really put presentation logic in the client, and host anything important on hardware you have complete control of.

I don't know about your apps, but my web apps are naturally split into several layers, with the presentation code separated from the persistence layer by at least one level of business logic that keeps the two apart. I find this makes it much easier to reason about my app, and so much faster to add or modify functionality. If the layers are separated anyway, it is relatively easy to keep the presentation layer in the client, and the rest on a server under my control.

So while you can solve your problems without introducing a "web service" layer, by the time you have written all the stored procedures (or equivalent) necessary to fill in the holes in the standard database security implementation, you would probably be better off writing a server-side application that you can write proper unit tests for.

Database – how to compare two large sets of data using PL/SQL

If there are no security restrictions that prevent that, probably the easiest way is to create a database link, so those two databases see each other; then use the minus clause to find what is in one table but not the other.

CREATE DATABASE LINK otherdb
  CONNECT TO otheruser IDENTIFIED BY otherpwd USING 'otherdb.wherever';

SELECT mycolumn FROM mytable MINUS SELECT mycolumn FROM mytable@otherdb;

SELECT mycolumn FROM mytable@otherdb MINUS SELECT mycolumn FROM mytable;

Copying the table from one machine to the other (using 'CREATE TABLE xxx AS SELECT * FROM mytable@otherdb') and using that for the comparison might be faster, though.

Best Answer

Related Solutions

Database Security – How to Handle Database Security from a Desktop Application

Database – how to compare two large sets of data using PL/SQL

Related Topic