Unit Testing – Is Reading a File Before Testing a Method an Integration Test or Unit Test?

integration-tests, unit testing, unit-test-data

Let's say I write a parser. It takes a String argument and does something with it. This String can be very long, so keeping it in a test class would clutter my code. I think it would be better to keep the String in a file and read it during the unit test.

I am not sure whether this is still a unit test or has become an integration test. I think it is a unit test, because I still test only my parser, not the reading of data from a file. But I cannot find a source that confirms my assumption. Can you tell me what type of test it is? It would also be great to back this up with a reference.

Best Answer

What makes a test an integration or a system test is the fact that the method under test relies on other parts of the system.

Imagine the following process:

  • The data is loaded from the database as an enumeration,
  • A method walks through the enumeration and aggregates the elements some way,
  • The result of the aggregation is stored back in the database.

When unit testing the method which aggregates the elements, you can't simply make it rely on the database calls: this wouldn't be a unit test because it has external dependencies; a mistake in the database code would fail the test, even if the method itself is correct. In other words, you can't correlate the test failure with the incorrectness of the method:

  • A test may start failing even if the method hasn't changed since the last pass.
  • A regression may not fail the test because it is compensated by a change in the external dependency.

So you'll create stubs and mocks for the data access logic, in order to isolate the method. You pass a given enumeration to the method, and you check the result. No database involved.
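As a minimal sketch of that isolation, assuming the aggregation is a simple sum: `StubRepository`, `aggregate`, and `run_process` below are hypothetical names chosen for illustration, not part of any real library.

```python
from typing import Iterable, Iterator, List


def aggregate(values: Iterable[int]) -> int:
    """Hypothetical method under test: sums the elements it is given."""
    return sum(values)


class StubRepository:
    """Hypothetical in-memory stand-in for the data access layer."""

    def __init__(self, rows: Iterable[int]) -> None:
        self.rows: List[int] = list(rows)
        self.saved: List[int] = []

    def load(self) -> Iterator[int]:
        # Returns canned data instead of querying a database.
        return iter(self.rows)

    def save(self, result: int) -> None:
        # Records the result instead of writing to a database.
        self.saved.append(result)


def run_process(repo: StubRepository) -> None:
    """Load, aggregate, store: the three steps of the process."""
    repo.save(aggregate(repo.load()))


def test_aggregate_directly() -> None:
    # The enumeration is built in the test; no database is involved.
    assert aggregate([1, 2, 3]) == 6
    assert aggregate([]) == 0


def test_process_with_stub() -> None:
    repo = StubRepository([10, 20])
    run_process(repo)
    assert repo.saved == [30]
```

If either test fails here, the cause can only be `aggregate` or the test itself, which is the correlation a unit test is supposed to give you.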

Now, where you get the data for the enumeration doesn't really matter. It may come from code, from a database, or from a flat file. Obtaining it is part of the test, not of the code under test. Even if the test calls an API to get the elements, it's still a unit test.
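For the parser from the question, this could look roughly like the sketch below. The `parse` function, the `load_fixture` helper, and the file name `long_input.txt` are assumptions made for illustration. The fixture is written to a temporary directory only to keep the example self-contained; in a real project the file would presumably be checked in next to the test.

```python
import tempfile
from pathlib import Path
from typing import List


def parse(text: str) -> List[str]:
    """Hypothetical parser under test: returns the non-empty, stripped lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_fixture(path: Path) -> str:
    # Test-side helper: reading the file belongs to the test, not to the parser.
    return path.read_text(encoding="utf-8")


def test_parse_from_fixture_file() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        fixture = Path(tmp) / "long_input.txt"
        fixture.write_text("alpha\n\n  beta  \ngamma\n", encoding="utf-8")

        # The parser only ever sees a string; it has no idea a file was involved.
        assert parse(load_fixture(fixture)) == ["alpha", "beta", "gamma"]
```

The parser's signature stays `str -> list`, so the code under test has no dependency on the file system; only the test does.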

However, unit tests are usually short and simple, because short and simple tests are easier to reason about. Moving data to a file, a database, or an API makes the test [unnecessarily] complicated: when you need to check the unit test, you have to look in two locations instead of one. So while those are still unit tests, they can quickly become a maintenance nightmare. The second issue is that a failure in the external data source may unexpectedly fail the unit test.

  • Using a flat file may be OK (after all, source code is nothing more than a flat file too).

    However, I would find it strange that you need so much data for your unit test that you actually need to put it in a file.

    You must also be sure file system permissions don't get in the way of unit tests, and that the file won't be altered by mistake.

  • Using a database would be weird. How do you ensure the data is correct (if you fill this data in the test setup, then it doesn't make sense: you could pass it directly to the tested method, bypassing the database)? What about methods executing in parallel?

    Moreover, ensuring that the database is installed correctly and works is not an easy task. Unit tests probably don't need that complexity.

  • Using an API is even weirder. You don't normally control the API, which means that it could fail at random, failing your unit tests for no reason.
