Web Development – Databases and Unit/Integration Testing

Tags: database, integration-tests, unit-testing, web-development

I have had a discussion with someone about unit/integration testing of web applications, and we disagree about one core idea. The person I am talking to thinks that the database the tests run against should have pre-populated data in it, whereas I think it should be completely empty before and after the tests are executed.

My concern with pre-populated data in the database is that there is no way to make sure that data is maintained in a good state. The tests themselves are going to be creating, deleting, and modifying data in the database so I really don't see how having data in the database before you start the tests is a good thing.

It seems that the best way of testing database functionality would be the following setup:

  1. In a "setup" phase before the tests actually run, you first truncate all the tables in the database
  2. Then you insert all the data needed for the test cases you are about to run
  3. Then you run and validate the test cases
  4. Then in a "teardown" phase you once again truncate all the tables in the database
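
The four steps above can be sketched roughly as follows. This is only an illustration, assuming Python's standard `unittest` and `sqlite3` modules and an in-memory database; the `users` table, the `TABLES` list, and the `truncate_all` helper are hypothetical names, not part of any real project. SQLite has no `TRUNCATE` statement, so an unqualified `DELETE` stands in for it here.

```python
import sqlite3
import unittest

# Hypothetical schema and table list, for illustration only.
TABLES = ["users"]

# One shared connection stands in for a long-lived test database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.commit()


def truncate_all(connection):
    """Empty every known table (SQLite has no TRUNCATE, so DELETE is used)."""
    for table in TABLES:
        connection.execute(f"DELETE FROM {table}")
    connection.commit()


class UserTableTest(unittest.TestCase):
    def setUp(self):
        truncate_all(conn)  # step 1: start from empty tables
        conn.execute("INSERT INTO users (name) VALUES ('alice')")  # step 2: seed
        conn.commit()

    def tearDown(self):
        truncate_all(conn)  # step 4: leave nothing behind

    def test_rename_user(self):
        # step 3: run the operation and validate the result
        conn.execute("UPDATE users SET name = 'bob' WHERE name = 'alice'")
        conn.commit()
        names = [row[0] for row in conn.execute("SELECT name FROM users")]
        self.assertEqual(names, ["bob"])
```

Because every test seeds its own rows, no test depends on what an earlier test left behind, which is the property I am arguing for.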

I don't see a better way of ensuring that the data you are testing against is in a known, testable state.

Am I missing something here? Is this not the best way to test database-related functionality? Is there some benefit to having pre-populated data that always exists in the database (even before the tests start and after they are done)? Any ideas for explaining my process differently, to better get my point across, would also be great (that is, if my point has merit).

Best Answer

For me, unit tests should not deal with the database; integration tests are what deal with the database.

Integration tests that deal with the database should in practice start from an empty database with a setup and teardown approach. A transaction-based approach is quite a good way to go (i.e. open a transaction on setup and roll it back on teardown).
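
As a rough sketch of the transaction-based approach, assuming Python's standard `unittest` and `sqlite3` modules (the `orders` table and the test itself are made up for illustration): the connection is opened with `isolation_level=None` so that the test controls `BEGIN` and `ROLLBACK` explicitly.

```python
import sqlite3
import unittest

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# only transactions are the ones we open explicitly with BEGIN.
db = sqlite3.connect(":memory:", isolation_level=None)
# Hypothetical table, for illustration only.
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER NOT NULL)")


class OrderTotalTest(unittest.TestCase):
    def setUp(self):
        db.execute("BEGIN")  # open a transaction for this test
        db.execute("INSERT INTO orders (total) VALUES (100)")  # seed test data

    def tearDown(self):
        db.execute("ROLLBACK")  # discard everything the test wrote

    def test_double_total(self):
        db.execute("UPDATE orders SET total = total * 2")
        total = db.execute("SELECT total FROM orders").fetchone()[0]
        self.assertEqual(total, 200)
```

This only works if the code under test does not commit on its own. Anything that commits inside the test escapes the rollback and has to be cleaned up some other way.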

It sounds like what your friend wants is to test from a 'regression' point of view, i.e. have real data there and see how the system reacts. After all, no system is perfect, and there is usually some bad data lying around somewhere that exposes quirks in your domain model.

Your best practices are the way to go. What I tend to do, if I find a scenario involving bad data, is write an integration test whose setup and teardown reproduce that exact scenario.
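
For example, a bad-data scenario could be pinned down like this. It is a sketch only, again assuming Python's `unittest` and `sqlite3`; the `customers` table, the sample rows, and the `display_name` function are hypothetical stand-ins for whatever bad data and code you actually ran into.

```python
import sqlite3
import unittest


def display_name(row):
    """Hypothetical code under test: use a placeholder for NULL or blank names."""
    name = row[0]
    return name.strip() if name and name.strip() else "(unknown)"


class BlankNameRegressionTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test, seeded with the bad rows only.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE customers (name TEXT)")
        self.db.executemany(
            "INSERT INTO customers (name) VALUES (?)",
            [(None,), ("   ",), (" Ann ",)],  # NULL, whitespace-only, padded
        )

    def tearDown(self):
        self.db.close()  # the in-memory database disappears with the connection

    def test_blank_names_get_placeholder(self):
        rows = self.db.execute(
            "SELECT name FROM customers ORDER BY rowid"
        ).fetchall()
        self.assertEqual(
            [display_name(r) for r in rows],
            ["(unknown)", "(unknown)", "Ann"],
        )
```

The bad data then lives in the test itself rather than in a shared database, so the scenario stays reproducible no matter what other tests do.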
