How to Write Tests Without Mocking or Stubbing

bdd · integration-tests · tdd · testing · unit-testing

I have been using TDD when developing some of my side projects and have been loving it.

The issue, however, is that stubbing classes for unit tests is a pain and makes you afraid of refactoring.

I started researching and I see that there is a group of people that advocates for TDD without mocking: the classicists, if I am not mistaken.

However, how would I go about writing unit tests for a piece of code that uses one or more dependencies? For instance, if I am testing a UserService class that needs UserRepository (talks to the database) and UserValidator (validates the user), then the only way would be… to stub them?

Otherwise, if I use a real UserRepository and UserValidator, wouldn't that be an integration test and also defeat the purpose of testing only the behavior of UserService?

Should I be writing only integration tests when there is dependency, and unit tests for pieces of code without any dependency?

And if so, how would I test the behavior of UserService? ("If UserRepository returns null, then UserService should return false", etc.)

Thank you.

Best Answer

This answer presents two separate views on the same issue, as this isn't a "right vs wrong" question but rather a broad spectrum; you can approach it in the way that's most appropriate for your situation.

Also note that I'm not focusing on the distinction between a fake, mock and stub. That's a test implementation detail unrelated to the purpose of your testing strategy.


My company's view

Otherwise, if I use a real UserRepository and UserValidator, wouldn't that be an integration test and also defeat the purpose of testing only the behavior of UserService?

I want to answer this from the point of view of the company I currently work at. This isn't actually something I agree with, but I understand their reasoning.

They don't unit test single classes, instead they test single layers. I call that an integration test, but to be honest it's somewhere in the middle, since it still mocks/stubs classes, just not all of a class' dependencies.

For example, if UserService (BLL) has a GetUsers method, which:

  • Checks with the UserAuthorizationService (BLL) if the current user is allowed to fetch lists of users.
    • The UserAuthorizationService (BLL) in turn depends on the AuthorizationRepository (DAL) to find the configured rights for this user.
  • Fetches the users from the UserRepository (DAL)
  • Checks with the UserPrivacyService (BLL) if some of these users have asked to not be included in search results - if they have, they will be filtered out
    • The UserPrivacyService (BLL) in turn depends on the PrivacyRepository (DAL) to find out if a user asked for privacy

This is just a basic example. When unit testing the BLL, my company builds its tests in a way that all (BLL) objects are real and all others (DAL in this case) are mocked/stubbed. During a test, they set up particular data states as mocks, and then expect the entirety of the BLL (all referenced/dependent BLL classes, at least) to work together in returning the correct result.
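To make that concrete, here is a minimal sketch of such a "layer integration" test. All class and method names are invented for illustration; the point is that every BLL class is the real implementation, while every DAL dependency is a hand-rolled stub that represents a particular data state.

```python
# Stubs stand in for the DAL; they describe the data state for this test.
class StubAuthorizationRepository:
    def rights_for(self, user_id):
        return {"list_users"}          # the current user may list users

class StubUserRepository:
    def all_users(self):
        return ["alice", "bob", "carol"]

class StubPrivacyRepository:
    def wants_privacy(self, name):
        return name == "bob"           # bob opted out of search results

# The BLL classes below are the "real" implementations, used as-is in the test.
class UserAuthorizationService:
    def __init__(self, auth_repo):
        self.auth_repo = auth_repo
    def can_list_users(self, user_id):
        return "list_users" in self.auth_repo.rights_for(user_id)

class UserPrivacyService:
    def __init__(self, privacy_repo):
        self.privacy_repo = privacy_repo
    def visible(self, name):
        return not self.privacy_repo.wants_privacy(name)

class UserService:
    def __init__(self, auth, users, privacy):
        self.auth, self.users, self.privacy = auth, users, privacy
    def get_users(self, current_user_id):
        if not self.auth.can_list_users(current_user_id):
            return []
        return [u for u in self.users.all_users() if self.privacy.visible(u)]

# The test wires real BLL objects over stubbed DAL objects:
service = UserService(
    UserAuthorizationService(StubAuthorizationRepository()),
    StubUserRepository(),
    UserPrivacyService(StubPrivacyRepository()),
)
print(service.get_users(current_user_id=42))   # ['alice', 'carol']
```

A failure here tells you the BLL layer as a whole no longer honors its contract for this data state, without pinning the test to any one class's internal dependency graph.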

I didn't quite agree with this, so I asked around to figure out how they came to that conclusion. There were a few understandable reasons behind that decision:

  • The problem domain of the application is liable to constant business refactoring, where the business layer itself may subdivide into more niche classes without changing the public contract. By not testing every BLL class individually, tests need to be rewritten much less often since a test doesn't need to know the exact dependency graph of the class it's testing.
  • Access logic is very pervasive over the domain, but its implementation and structure change over time. By not having to rewrite tests whenever the access logic changes, the company intends to lower the threshold for developers being open to innovating the access logic. No one wants to take on a rewrite of >25000 tests.
  • Setting up a mocked situation is quite complex (cognitively), and it's easier for developers to understand how to set the data state (which is just an event store) instead of mocking all manner of complex BLL dependencies that essentially just extract information from that data store in their own unique way.
  • Since the interface between the BLL classes is so specific, you often don't need to know exactly which BLL class failed, since the odds are reasonably high that the contract between the failed class and its dependency (or vice versa) is part of the problem that needs to be adjusted. Almost always, the BLL call stack needs to be investigated in its entirety, as some responsibilities may shift due to uncovered bugs (cf. the first bullet point).

I wanted to add this viewpoint because this company is quite large, and in my opinion is one of the healthiest development environments I've encountered (and as a consultant, I've encountered many).

While I still dislike the lack of true unit testing, I do also see that there are few to no problems arising from doing this kind of "layer integration" test for the business logic.

I can't delve into the specifics of what kind of software this company writes, but suffice it to say that they work in a field that is rife with arbitrarily decided business logic (from customers) who are unwilling to change their arbitrary rules even when proven to be wrong. My company's codebase accommodates a shared code library between tenanted endpoints with wildly different business rules.

In other words, this is a high pressure, high stakes environment, and the test suite holds up as well as any "true unit test" suite that I've encountered.


One thing to mention though: the testing fixture of the mocked data store is quite big and bulky. It's actually quite comfortable to use but it's custom built so it took some time to get it up and running.
This complicated fixture only started paying dividends when the domain grew large enough that custom-defining stubs/mocks for each individual class unit test would cost more effort than having one admittedly giant but reusable fixture with all mocked data stores in it.
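A rough sketch of the idea behind such a fixture follows; every name is invented. The test describes one data state (here, an event store) a single time, and every stubbed repository reads from that shared state in its own way, so no test has to mock individual dependencies by hand.

```python
class FakeDataStore:
    """Shared in-memory stand-in for the mocked event store."""
    def __init__(self):
        self.events = []
    def append(self, event):
        self.events.append(event)
    def of_type(self, kind):
        return [e for e in self.events if e["kind"] == kind]

class FakeUserRepository:
    """One of many stubs; each derives its answers from the same store."""
    def __init__(self, store):
        self.store = store
    def all_users(self):
        return [e["name"] for e in self.store.of_type("user_created")]

# A test sets up the data state once, and any number of stubs reuse it:
store = FakeDataStore()
store.append({"kind": "user_created", "name": "alice"})
store.append({"kind": "user_created", "name": "bob"})
store.append({"kind": "privacy_requested", "name": "bob"})

print(FakeUserRepository(store).all_users())   # ['alice', 'bob']
```

The up-front cost is building the store and the family of stubs; the payoff is that every new test only describes events, not mocks.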


My view

Should I be writing only integration tests when there is dependency, and unit tests for pieces of code without any dependency?

That's not what separates unit tests from integration tests. A simple example is this:

  • Can Timmy throw a ball when he has one?
  • Can Tommy catch a ball when it approaches him?

These are unit tests. They test a single class' ability to perform a task in the way you expect it to be performed.

  • Can Timmy throw a ball to Tommy and have him catch it?

This is an integration test. It focuses on the interaction between several classes and catches any issues that happen between these classes (in the interaction), not in them.
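The distinction can be sketched in a few lines of code. The classes and method names here are invented stand-ins for the example; the unit tests each exercise a single class, while the integration test exercises the composed interaction.

```python
class Ball:
    def __init__(self, inflated=True):
        self.inflated = inflated

class Timmy:
    def throw(self, ball):
        # A deflated ball cannot travel in a throwing arc.
        return ball if ball.inflated else None

class Tommy:
    def catch(self, incoming):
        return incoming is not None

# Unit tests: one class per test.
assert Timmy().throw(Ball()) is not None    # Can Timmy throw a ball?
assert Tommy().catch(Ball())                # Can Tommy catch a ball?

# Integration test: the interaction between the two.
assert Tommy().catch(Timmy().throw(Ball()))
```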

So why would we do both? Let's look at the alternatives:

If you only do integration tests, then a test failure doesn't really tell you much. Suppose our test tells us that Timmy can't throw a ball at Tommy and have him catch it. There are many possible reasons for that:

  • Timmy's arms are broken. (= Timmy is defective)
  • Tommy's arms are broken. (= Tommy is defective)
  • The ball cannot travel in a throwing arc, e.g. because it is not inflated. (= Timmy and Tommy are fine but a third dependency is broken)

But the test doesn't help you narrow your search down. Therefore, you're still going to have to go on a bug hunt in multiple classes, and you need to keep track of the interaction between them to understand what is going on and what might be going wrong.

This is still better than not having any tests, but it's not as helpful as it could be.

Suppose we only had unit tests; then these defective classes would've been pointed out to us. For each of the listed reasons, a unit test of that defective class would've raised a flag during your test run, giving you the precise information on which class is failing to do its job properly.

This narrows down your bug hunt significantly. You only have to look in one class, and you don't even care about their interaction with other classes since the faulty class already can't satisfy its own public contract.

However, I've been a bit sneaky here. I've only mentioned ways in which the integration test can fail that can be answered better by a unit test. There are also other possible failures that a unit test could never catch:

  • Timmy refuses to throw a ball at Tommy because he (quote) "hates his stupid face". Timmy can (and is willing to) throw balls at anyone else.
  • Timmy is in Australia, Tommy is in Canada (= Timmy and Tommy and the ball are fine, but their relative distance is the problem).
  • We're in the middle of a hurricane (= temporary environmental "outage" similar to a network failure)

In all of these situations, Timmy, Tommy and the ball are all individually operational. Timmy could be the best pitcher in the world, Tommy could be the best catcher.

But the environment they find themselves in is causing issues. If we don't have an integration test, we would never catch these issues until we'd encounter them in production, which is the antithesis of TDD.
But without a unit test, we wouldn't have been able to distinguish individual component failures from environmental failures, which leaves us guessing as to what is actually going wrong.

So we come to the final conclusion:

  • Unit tests uncover issues that render a specific component defective.
  • Integration tests uncover issues with individually operational components that fail to work together in a particular composition.
  • Integration tests can usually catch all of the unit test failures, but they cannot accurately pinpoint the failure, which significantly detracts from the developer's quality of life.
  • When an integration test fails but all dependent unit tests pass, you know that it's an environmental issue.
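This conclusion can be written out as a small diagnosis rule, which is what you effectively apply when reading a test report (a sketch, not a real tool):

```python
def diagnose(unit_tests_pass, integration_test_passes):
    """Where to start looking, given which tests passed."""
    if integration_test_passes:
        return "all good"
    if unit_tests_pass:
        # Every component works on its own, so the composition or the
        # environment is to blame.
        return "environmental issue"
    # A failing unit test names the defective component directly.
    return "defective component"

print(diagnose(True, False))    # environmental issue
print(diagnose(False, False))   # defective component
```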

And if so, how would I test the behavior of UserService? ("If UserRepository returns null, then UserService should return false")

Be very careful of being overly specific. "returning null" is an implementation detail. Suppose your repository were a networked microservice, then you'd be getting a 404 response, not null.

What matters is that the user doesn't exist in the repository. How the repository communicates that non-existence to you (null, exception, 404, result class) is irrelevant to describing the purpose of your test.

Of course, when you mock your repository, you're going to have to implement its mocked behavior, which requires you to know exactly how to do it (null, exception, 404, result class) but that doesn't mean that the test's purpose needs to contain that implementation detail as well.
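A small sketch of that separation, with invented names: the test's name states the contract ("the user does not exist"), while the specific signal for absence (null, here `None`) stays buried inside the stub where it belongs.

```python
class MissingUserRepository:
    """Stub: this particular repository happens to signal absence with None."""
    def find(self, user_id):
        return None

class UserService:
    def __init__(self, repo):
        self.repo = repo
    def user_exists(self, user_id):
        return self.repo.find(user_id) is not None

# The test name describes the contract, not the implementation detail:
def test_returns_false_when_user_does_not_exist():
    service = UserService(MissingUserRepository())
    assert service.user_exists(7) is False

test_returns_false_when_user_does_not_exist()
```

If the repository later signals absence with an exception or a 404 instead, only the stub changes; the test's stated purpose stays the same.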

In general, you really need to separate the contract from the implementation, and the same principle applies to describing your test versus implementing it.