I would create two (sets of) test cases: one that runs often, is fast, and validates only statistically, and another that runs less often (weekly or monthly, depending on your project schedule) but checks all the data.
The fast test would import all the data then check:
- the number of records
- take a sample of records and check their integrity (pick every K-th record, where K could be something like number of records / 100)
- if you have columns that support fast aggregate operations, check those too in the fast tests.
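The fast test above can be sketched as follows. This is a minimal illustration in Python using the stdlib sqlite3 module (the ideas are language-agnostic); the `records` table, its columns, and the "value is present" integrity rule are all hypothetical stand-ins for your real schema and checks.

```python
import sqlite3

def fast_check(conn, expected_count):
    """Cheap statistical validation: count, sampled integrity, aggregate."""
    cur = conn.cursor()
    # 1. Check the number of records.
    (count,) = cur.execute("SELECT COUNT(*) FROM records").fetchone()
    assert count == expected_count, f"expected {expected_count}, got {count}"
    # 2. Sample every K-th record (K ~= number of records / 100)
    #    and run an integrity check on each sampled row.
    k = max(count // 100, 1)
    sampled = cur.execute(
        "SELECT id, value FROM records WHERE id % ? = 0", (k,)
    ).fetchall()
    for rec_id, value in sampled:
        # Placeholder integrity rule; substitute your real invariants.
        assert value is not None, f"record {rec_id} failed integrity check"
    # 3. A cheap aggregate over a numeric column.
    (total,) = cur.execute("SELECT SUM(value) FROM records").fetchone()
    return count, total
```

A real test would compare `total` against a known expected aggregate from the source data.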
For the test that is executed less often, do a full integrity check, either by doing one-to-one comparisons or by computing hashes.
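The hash-based variant of the full check can be sketched like this, again in Python with stdlib modules for illustration. The table layout (an `id` primary key to give a stable ordering) is an assumption; in practice you might also hash in chunks so a mismatch narrows down the failing range.

```python
import hashlib
import sqlite3

def table_digest(conn, table):
    """Hash every row of `table` in a deterministic order, so two copies
    of the same data produce the same digest."""
    h = hashlib.sha256()
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY id"):
        h.update(repr(row).encode("utf-8"))
    return h.hexdigest()
```

Comparing `table_digest` of the source and the imported copy replaces millions of one-to-one comparisons with a single string comparison.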
I have been asked to write the same type of interfaces, and this is what I have been told/learned.
I believe what you are referring to is the "database layer of an application".
I have seen this called the "data access layer" of an application also:
http://en.wikipedia.org/wiki/Data_access_layer
http://www.paulgrenyer.com/Integration_Testing_A_Java_Enterprise_Application_Data_Access_Layer.pdf
1. In order to have efficient data access, the underlying database objects themselves must not have performance-related issues.
E.g., if you are using tables, the tables should have appropriate indexes, the statistics on the tables have to be up to date, and if you don't need the most recent data in a table, it is a good idea to consider using materialized views.
This seems obvious, but make sure you have access to the database itself and that the database doesn't have any problems of its own, e.g. too many outstanding sessions, I/O problems, etc.
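The effect of an appropriate index (point 1) can be demonstrated even in a tiny stdlib example. This sketch uses Python's sqlite3 and its EXPLAIN QUERY PLAN facility; the `orders` table and index name are hypothetical, and real engines expose similar EXPLAIN tools for the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")

def plan(conn, sql):
    """Return the query plan as one string (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[-1] for r in rows)

query = "SELECT * FROM orders WHERE customer = 'alice'"
before = plan(conn, query)  # without an index: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
after = plan(conn, query)   # with the index: an index search
```

The plan text switches from a scan of the whole table to a search using `idx_orders_customer`, which is exactly the difference that matters once the table is large.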
2. In the code for the database access, it is a good idea to avoid hard parsing; using bind variables will help avoid it.
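Point 2 can be sketched with stdlib sqlite3; the parameterized form below is the moral equivalent of a JDBC PreparedStatement with bind variables. Because the statement text stays constant, the engine can reuse the parsed statement instead of hard-parsing a new one per value, and it also protects against SQL injection. The `users` table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Bad: building a new statement text per value forces a hard parse each time:
#   conn.execute("INSERT INTO users (name) VALUES ('" + name + "')")

# Good: one statement text, values supplied as bound parameters.
stmt = "INSERT INTO users (name) VALUES (?)"
for name in ["alice", "bob", "carol"]:
    conn.execute(stmt, (name,))

rows = conn.execute("SELECT name FROM users ORDER BY id").fetchall()
```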
3. If the work that is done in your data access layer consists mostly of short-lived transactions, it may be a good idea to use connection pooling. I found this overview of JDBC connection pooling helpful:
http://www.javaranch.com/journal/200601/JDBCConnectionPooling.html
I am not sure whether the class you are using will work with, or is designed for, connection pools, but if you are using a Session object (the usage of getDefaultInstance leads me to believe this might be a Session object) and you are using it with Hibernate, then this URL does state that it will work with a connection pool:
http://docs.jboss.org/hibernate/orm/3.5/api/org/hibernate/Session.html
I have a feeling you could be using a different object; the object's documentation should state whether it uses, or is compatible with, connection pooling.
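The mechanics behind point 3 can be sketched in a few lines of stdlib Python. Real projects would use a pooling library (e.g. a JDBC pool on the Java side), but the core idea is the same: reuse a fixed set of open connections for short-lived transactions instead of paying the cost of opening a new connection per request. Note that with SQLite each `:memory:` connection is its own separate database, so a real pool would point at one shared database.

```python
import queue
import sqlite3

class ConnectionPool:
    """A minimal FIFO connection pool built on queue.Queue."""

    def __init__(self, factory, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())  # open all connections up front

    def acquire(self):
        return self._pool.get()        # blocks if the pool is exhausted

    def release(self, conn):
        self._pool.put(conn)           # hand the connection back for reuse
```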
4. This URL states that getDefaultInstance will always create a new object and that there are alternatives. I am not sure if that is what you want; it is something to consider:
https://stackoverflow.com/questions/4184204/what-is-the-difference-between-getdefaultinstance-and-getinstance-in-session
Again, I apologize: I am not sure whether this document is actually for the class you are using; this is just a best guess.
Best Answer
The problem with using a 'real' database for unit testing is the setup, teardown, and isolation of the tests. You don't want to have to spin up an entirely new MySQL database and create tables and data just for one unit test. The trouble stems from the external nature of the database: if your test database is down, your unit tests fail. There are also issues with making sure you have a database unique to the test run. These can be overcome, but there is a simpler answer.
Mocking the database is one option; however, it doesn't test the actual queries that are run. It can be used as a much simpler solution when you want to make sure the data from the DAO goes through the system properly. But for testing the DAO itself, you need something behind the DAO that holds the data and against which the queries actually run.
The first thing to do is use an in-memory database. HyperSQL is an excellent choice for this because it can emulate the dialect of another database, so the minor differences between databases (data types, functions, and the like) are preserved. HSQLDB also has some nice features for unit testing.
This loads the state of the database (the tables, initial data) from the testData file; a connection URL along the lines of jdbc:hsqldb:file:testData;shutdown=true does this, where shutdown=true automatically shuts down the database when the last connection closes.
Using dependency injection, have the unit tests select a different database than the one the production (or test, or local) builds use. Your DAO then uses the injected database, and you can launch tests against it.
The unit tests will then look something like this (with a bunch of boring stuff omitted for brevity):
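As a sketch of what such a test can look like, here is the same pattern transposed to Python's stdlib: an sqlite3 in-memory database stands in for HSQLDB, and unittest stands in for JUnit. The `WidgetDao` class and `widgets` table are hypothetical; the point is that the DAO receives its connection by injection, so tests hand it a throwaway in-memory database.

```python
import sqlite3
import unittest

class WidgetDao:
    def __init__(self, conn):  # the connection is dependency-injected
        self.conn = conn

    def find_name(self, widget_id):
        row = self.conn.execute(
            "SELECT name FROM widgets WHERE id = ?", (widget_id,)
        ).fetchone()
        return row[0] if row else None

class WidgetDaoTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test: tables plus initial data.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE widgets (id INTEGER PRIMARY KEY, name TEXT)")
        self.conn.execute("INSERT INTO widgets VALUES (1, 'sprocket')")
        self.dao = WidgetDao(self.conn)

    def tearDown(self):
        self.conn.close()  # the in-memory database vanishes with it

    def test_find_name(self):
        self.assertEqual(self.dao.find_name(1), "sprocket")
        self.assertIsNone(self.dao.find_name(99))

if __name__ == "__main__":
    unittest.main()
```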
And thus, you've got a unit test that calls the DAO and uses data that was set up in an on-the-fly database that exists only for the duration of the test. You don't have to worry about external resources or the state of the database before the run, or about restoring it to a known state (well, the 'known state' is 'does not exist', which is trivial to revert to).
DBUnit can make much of what I described a simpler process in setting up the database, creating the tables and loading the data. If you were going to need to use the actual database for some reason, this is by far the better tool to use.
The above code is part of a proof-of-concept Maven project I wrote, TestingWithHsqldb, on GitHub.