How to structure tests where one test is another test’s setup

testing

I'm integration testing a system, by using only the public APIs. I have a test that looks something like this:

def testAllTheThings():
  email = create_random_email()
  password = create_random_password()

  ok = account_signup(email, password)
  assert ok
  url = wait_for_confirmation_email()
  assert url
  ok = account_verify(url)
  assert ok

  token = get_auth_token(email, password)
  a = do_A(token)
  assert a
  b = do_B(token, a)
  assert b
  c = do_C(token, b)

  # ...and so on...

Basically, I'm attempting to test the entire "flow" of a single transaction. Each step in the flow depends on the previous step succeeding. Because I'm restricting myself to the external API, I can't just go poking values into the database.

So either I have one really long test method that does `A; assert; B; assert; C; assert; …`, or I break it up into separate test methods, where each test method needs the results of the previous one before it can do its thing:

def testAccountSignup():
  # etc.
  return email, password

def testAuthToken():
  email, password = testAccountSignup()
  token = get_auth_token(email, password)
  assert token
  return token

def testA():
  token = testAuthToken()
  a = do_A(token)
  # etc.

I think this smells. Is there a better way to write these tests?

Best Answer

If this test is intended to run frequently, your main concern should be how to present the test results in a way that is convenient for the people who have to work with them.

From this perspective, testAllTheThings raises a huge red flag. Imagine someone running this test every hour or even more often (against a buggy codebase, of course; otherwise there would be no point in re-running it) and seeing the same FAIL every time, without any indication of which stage failed.

Separate methods look much more appealing, because the results of repeated runs (assuming steady progress in fixing the bugs) could look like:

    FAIL FAIL FAIL FAIL
    PASS FAIL FAIL FAIL -- 1st stage fixed
    PASS FAIL FAIL FAIL
    PASS PASS FAIL FAIL -- 2nd stage fixed
    ....
    PASS PASS PASS PASS -- we're done
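
A minimal sketch of that "separate methods" layout with pytest: methods of a test class run in definition order, and class attributes can carry state from one stage to the next, so each stage gets its own PASS/FAIL in the report. The helper functions are the ones from the question; the import module name and the class layout are assumptions of this sketch, not a prescribed API.

    # test_signup_flow.py -- sketch; "api_client" is a hypothetical module
    # exposing the question's helper functions.
    from api_client import (account_signup, account_verify, create_random_email,
                            create_random_password, do_A, get_auth_token,
                            wait_for_confirmation_email)

    class TestSignupFlow:
        # State produced by earlier stages; filled in as the flow progresses.
        email = password = token = a = None

        def test_account_signup(self):
            cls = type(self)
            cls.email = create_random_email()
            cls.password = create_random_password()
            assert account_signup(cls.email, cls.password)
            url = wait_for_confirmation_email()
            assert url
            assert account_verify(url)

        def test_auth_token(self):
            cls = type(self)
            cls.token = get_auth_token(cls.email, cls.password)
            assert cls.token

        def test_do_a(self):
            cls = type(self)
            cls.a = do_A(cls.token)
            assert cls.a

With this layout, a later stage still runs (and fails on a None value) after an earlier one has failed; the skipping described below is what removes that noise.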

As a side note: in one of my past projects there were so many re-runs of dependent tests that users began complaining that they did not want to see repeated, expected failures at later stages "triggered" by a failure at an earlier one. They said this noise made it harder for them to analyze the results: "we already know the rest will fail by test design, don't bother repeating it".

As a result, the test developers were eventually forced to extend their framework with an additional SKIP status, add a feature to the test manager to abort execution of dependent tests, and add an option to drop SKIPped results from the report, so that it looked like:

    FAIL -- the rest is skipped
    PASS FAIL -- 1st stage fixed, abort after 2nd test
    PASS FAIL
    PASS PASS FAIL -- 2nd stage fixed, abort after 3rd test
    ....
    PASS PASS PASS PASS -- we're done
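
With pytest, that SKIP-the-rest behaviour can be approximated by a small pair of hooks in conftest.py, along the lines of the well-known "incremental testing" recipe: remember the first failing stage of a marked class and skip every later stage. The `incremental` marker name and the details below are a sketch under those assumptions, not a built-in pytest feature.

    # conftest.py -- sketch: once one stage of an @pytest.mark.incremental
    # class fails, skip the remaining stages instead of re-failing them.
    import pytest

    def pytest_runtest_makereport(item, call):
        # Remember the first failure within an incremental class.
        if "incremental" in item.keywords and call.excinfo is not None:
            item.parent._previousfailed = item

    def pytest_runtest_setup(item):
        # Before each later stage, skip if an earlier stage already failed.
        if "incremental" in item.keywords:
            previousfailed = getattr(item.parent, "_previousfailed", None)
            if previousfailed is not None:
                pytest.skip(f"previous stage failed: {previousfailed.name}")

Decorating the class from the earlier sketch with `@pytest.mark.incremental` (and registering the marker in the pytest configuration) then collapses the report to one FAIL followed by SKIPs, much like the output above.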