I'm integration-testing a system using only its public APIs. I have a test that looks something like this:
```
def testAllTheThings():
    email = create_random_email()
    password = create_random_password()
    ok = account_signup(email, password)
    assert ok
    url = wait_for_confirmation_email()
    assert url
    ok = account_verify(url)
    assert ok
    token = get_auth_token(email, password)
    a = do_A(token)
    assert a
    b = do_B(token, a)
    assert b
    c = do_C(token, b)
    # ...and so on...
```
Basically, I'm attempting to test the entire "flow" of a single transaction. Each step in the flow depends on the previous step succeeding. Because I'm restricting myself to the external API, I can't just go poking values into the database.
So either I have one really long test method that does `A; assert; B; assert; C; assert…`, or I break it up into separate test methods, where each test method needs the results of the previous one before it can do its thing:
```
def testAccountSignup():
    # etc.
    return email, password

def testAuthToken():
    email, password = testAccountSignup()
    token = get_auth_token(email, password)
    assert token
    return token

def testA():
    token = testAuthToken()
    a = do_A(token)
    # etc.
```
I think this smells. Is there a better way to write these tests?
Best Answer
If this test is intended to run frequently, your main concern should be how to present the test results in a way that is convenient for the people who have to work with them.
From this perspective, `testAllTheThings` raises a huge red flag. Imagine someone running this test every hour or even more frequently (against a buggy codebase, of course, otherwise there would be no point in re-running it) and seeing the same `FAIL` every time, with no indication of which stage failed.

Separate methods look much more appealing, because the results of the re-runs (assuming steady progress in fixing bugs in the code) could look like the following.
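For example (a hypothetical report format; the exact layout depends on your test runner), successive runs might progress like this:

```
run 1:                        run 2 (auth bug fixed):
testAccountSignup  PASS       testAccountSignup  PASS
testAuthToken      FAIL       testAuthToken      PASS
testA              FAIL       testA              FAIL
testB              FAIL       testB              FAIL
```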
Side note: in one of my past projects there were so many re-runs of dependent tests that users began complaining that they did not want to see the repeated, expected failures at later stages "triggered" by a failure at an earlier one. They said this garbage made it harder for them to analyze the test results: "we already know the rest will fail by test design, don't bother us by repeating it".
As a result, the test developers were eventually forced to extend their framework with an additional `SKIP` status, add a feature to the test manager code to abort execution of dependent tests, and add an option to drop `SKIP`ped test results from the report, so that it looked like the following.
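For example (again a hypothetical report), the skipped dependent stages simply disappear:

```
testAccountSignup  PASS
testAuthToken      PASS
testA              FAIL
```

If your tests are in Python, here is a minimal sketch of that skip-on-earlier-failure idea using pytest. The stage-tracking helpers (`_completed`, `_require`) and the import path are assumptions for illustration, not part of the question's API; the test bodies reuse the question's own calls.

```python
import pytest

# Hypothetical import of the public-API helpers from the question.
from myapp.client import (
    create_random_email, create_random_password,
    account_signup, get_auth_token, do_A,
)

# Results of the stages that have completed successfully so far.
_completed = {}

def _require(stage):
    """Skip the current test if an earlier stage did not complete."""
    if stage not in _completed:
        pytest.skip(f"dependent stage {stage!r} did not complete")
    return _completed[stage]

def test_account_signup():
    email, password = create_random_email(), create_random_password()
    assert account_signup(email, password)
    _completed["signup"] = (email, password)

def test_auth_token():
    email, password = _require("signup")
    token = get_auth_token(email, password)
    assert token
    _completed["token"] = token

def test_a():
    token = _require("token")
    a = do_A(token)
    assert a
    _completed["a"] = a
```

With this layout a failure in signup is reported once as `FAIL`, the dependent stages show up as `SKIP` (and can be filtered out of the report), and no misleading cascade of failures is produced. Note that it relies on pytest's default behaviour of running tests in the order they are defined in the file.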