Unit Testing – Does Scientific Model Software Require Unit Tests?


I work in a field where lots of code is written, but hardly ever tested. This is because we are foremost scientists who try to solve problems with code. The few coding courses we had focused on the basics, and many graduate without ever having heard of git, unit testing, or clean code. Many haven't even heard of those during their PhD…

Maybe it's better now, but five to ten years ago we did not have any mandatory courses covering those areas.

Often the software solves differential equations numerically, in many cases PDEs with many interacting feedbacks.

Think of weather predictions, chemical reactions, atmospheric models and so on.

So now my question: would you trust the results of a complex piece of software with many hundreds or thousands of functions, without a single unit test? If there are tests, they are rather high level, e.g. checking that the results stay the same for the same input, or that the results of a very simple case match an analytical solution.

Even if you know that the numerical solution of the equation is sound, based on a publication from some years ago, would you trust the model to make predictions?
Would you trust it if it could cause billions in damage or even loss of life?

On a side note, often these models are compared against each other with the same simplified inputs.

Best Answer

A few aspects I would like to touch on.

I work in a field where lots of code is written, but hardly ever tested. This is because we are foremost scientists who try to solve problems with code

I think this is common in science. And I think it's only partly due to a lack of courses or motivation.

I think the main reason is that a lot of scientific code is more prototyping than application development. A lot of it is used for a few analyses and then abandoned. It's small, so you can test it by hand.

One of the main benefits of unit tests is for long-term maintenance and refactoring. If your code won't be maintained long, and you won't refactor it, it's reasonable to prioritize unit tests less.

But a part of the software does get reused a lot (and unfortunately it's usually not clear beforehand which part). And then...

Would you trust it if it could cause billions in damage or even loss of life?

At this point we've left 'prototyping' and entered application development. I'd assume the code is maintained a long time by multiple people. It'll likely be refactored if it keeps growing. It has probably long ago stopped being possible to test everything by hand for most changes.

And, of course, risk tolerance would be much lower if the possible damage is greater.

Unit tests become much more valuable because of all that. I think it pays to follow better software engineering practices such as unit testing at this point, and honestly a while before this point.

Often the software solves differential equations numerically, in many cases PDEs with many interacting feedbacks.

I think the more important quality is scale (lifetime, collaboration, change frequency, complexity...), not so much whether scientific models are involved.

But I'll say that such things are actually quite easy to test automatically (whether or not you'd still call that a 'unit' test): there is no UI and there are no external dependencies that need to be mocked.
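
To make that concrete, here is a minimal sketch in Python (pytest style). The integrator `euler_decay` is a hypothetical stand-in for real model code, not anything from the question; the test compares it against the known analytical solution of dy/dt = -k*y.

```python
import math

def euler_decay(y0, k, dt, n_steps):
    """Integrate dy/dt = -k*y with forward Euler (illustrative stand-in for real model code)."""
    y = y0
    for _ in range(n_steps):
        y += dt * (-k * y)
    return y

def test_decay_matches_analytical_solution():
    y0, k, t_end = 1.0, 0.5, 2.0
    n_steps = 10_000
    dt = t_end / n_steps
    numeric = euler_decay(y0, k, dt, n_steps)
    exact = y0 * math.exp(-k * t_end)
    # Forward Euler is first-order accurate, so the tolerance scales with dt.
    assert abs(numeric - exact) < 1e-3
```

No mocking and no fixtures, just a function call and a tolerance; the real work is choosing a case with a known answer and a defensible tolerance.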

The more examples and edge cases are covered, the more one would trust it. It probably takes some scientific insight into how 'well behaved' the model is, and some knowledge of the risks, to know how much is enough.
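
One relatively cheap way to broaden coverage is to check qualitative properties the model should always satisfy across a whole grid of inputs. A sketch along those lines, again with made-up names and numbers rather than anything from the question:

```python
import pytest

def euler_decay_trajectory(y0, k, dt, n_steps):
    """Same forward-Euler update as above, but returning every intermediate value."""
    values = [y0]
    y = y0
    for _ in range(n_steps):
        y += dt * (-k * y)
        values.append(y)
    return values

@pytest.mark.parametrize("y0", [0.0, 1e-6, 1.0, 1e6])
@pytest.mark.parametrize("k", [0.01, 0.5, 2.0])
def test_decay_stays_nonnegative_and_monotone(y0, k):
    traj = euler_decay_trajectory(y0, k, dt=1e-3, n_steps=1_000)
    # Exponential decay must never go negative and must never increase.
    for previous, current in zip(traj, traj[1:]):
        assert 0.0 <= current <= previous
```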

often these models are compared against each other with the same simplified inputs.

That would actually give me quite a bit of confidence. I think it's a good method of validation and bug detection.

It doesn't help much with localizing problems, though: you might not even know which of the models is wrong, let alone what is wrong with it. Unit tests could help with that.
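
A simple pattern that helps with localization is a regression ('golden output') test per component: run one routine on a tiny fixed input and compare against output stored from a previously validated version. A hedged sketch, with a hypothetical upwind advection step standing in for a real model component:

```python
def advect(field, velocity, dt, dx):
    """One first-order upwind advection step (velocity > 0), periodic boundary via
    Python's negative indexing. Illustrative stand-in for a real model component."""
    c = velocity * dt / dx
    return [field[i] - c * (field[i] - field[i - 1]) for i in range(len(field))]

def test_advect_matches_reference_run():
    field = [0.0, 1.0, 0.0, 0.0]
    result = advect(field, velocity=1.0, dt=0.1, dx=1.0)
    # Reference values stored from an earlier, trusted version of the code.
    reference = [0.0, 0.9, 0.1, 0.0]
    assert all(abs(a - b) < 1e-12 for a, b in zip(result, reference))
```

When a model intercomparison flags a discrepancy, a suite of such per-component tests narrows down which routine changed its behaviour, instead of leaving you to bisect an entire coupled model.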