SQL Testing – How to Test Data Based on SQL Queries

dataMySQLsqlunit testing

I'm trying to set up a test framework for (SQL) datasets that get updated daily. I want to write assertions that involve comparisons between two tables–for example, the sum of column visitors in the customer table, grouped by day, should be within 5% of the total in the transactions table (but might not line up precisely due to noise in how each table is populated.)

These tests are hard to specify via constraints, but easy to run queries against. For example, one option would be to write queries that insert failing cases into a logging table, and have unit tests that run the queries then check whether the logging table is empty.

How do you approach tests like this? Do you use specialized tools, or standard unit testing frameworks?

Best Answer

Been there, done that.

First let's make something clear:

  • That's not unit testing. Unit testing is about code. You are not runnning the tests after a code change to test if code alterations introduced a bug or unwanted behavior. Instead you want to run some routines at the end of a business day to see if some business performance indicator has met some predefined goal or is between some predefined threshold. If the data doesn't comply with the 5% rule, that's not because a bug in the software.
  • What you are really doing is monitoring your business performance based on the data records.
  • That could also be made to detect suspected deviations.

What I have made is this:

  • I've programmed some routines (the language is not important) and scheduled them in a crontab to run every date at a certain time.
  • That routines populate a table.
  • That table is used by a web app to render a report.

I have not used any specialized software so I cannot recommend you one.

You can integrate those reports into any portal or any Balanced Scorecard software that allows ad-hoc conectors to customize dashboards.