Python – How to Test Code in Python Other Than by Doing It by Hand

automation, python, test-automation, testing

I am used to user testing in Java, and I also manually test each section of code I write, but now I want to automate this.

There is no GUI for this project, so user testing is not required and I can focus on code testing. The fact is, I don't know how to test code other than by calling the methods by hand, passing different parameters to them, and checking the return values.

I suppose automation is the solution, but I don't have a clue how to move from manual testing to automated testing. Are there tools which can create the relevant tests for me, or do I need to write tests manually? What can be automated?

Here is an example of code I want to test. Do I need to transform it somehow in order for it to be unit tested?

import struct
import time

# peerAddress and ownAddress come from the enclosing module (global state).

def netInfoToSend(tm, our_or_ip_version, our_or_addr_len, our_op_ip, version_their_ips,
                  num_their_ips, len_their_ips, their_ips):

    # Timestamp    [4 bytes]
    cellNetInfopkt = struct.pack(">I", int(time.time()))

    #         Number of addresses    [1 byte]
    # cellNetInfopkt += struct.pack("B", num_their_ips)
    #         Their OR's addresses    [variable]
    print("peerAddress", peerAddress)
    print("peerAddress type", type(peerAddress))

    cellNetInfopkt += struct.pack(">B", 4)
    cellNetInfopkt += struct.pack(">B", 4)
    cellNetInfopkt += struct.pack("B" * len(peerAddress), *peerAddress)

    cellNetInfopkt += struct.pack(">B", 1)

    # Address format is a type/length/value
    #         This OR's address     [variable]
    cellNetInfopkt += struct.pack(">B", our_or_ip_version)  # IPv4
    cellNetInfopkt += struct.pack(">B", our_or_addr_len)
    cellNetInfopkt += struct.pack("B" * len(ownAddress), *ownAddress)

    return cellNetInfopkt

Best Answer

Method to test

The method takes several arguments and returns a value, and it does one and only one thing. Aside from the debugging prints and the peerAddress and ownAddress globals (which could be passed as parameters), it doesn't rely on global state or call other parts of the application.

This is a good start.

  • If the method used global variables, testing would be more difficult.

  • If the method called other parts of the application, you could be forced to use mocks or stubs.

  • If the method did several things, testing would quickly become a mess. You would have to refactor the code into multiple methods, each in charge of a single task, and test them separately.

Testing framework

Running unit tests can be as simple as writing a Python script which calls a bunch of methods and compares their output with the expected one. But don't do that. Instead, use a unit-testing framework. While not mandatory, it will simplify your life later. Even the most basic framework will:

  • Provide you with methods like assertEqual or assertNotIn.

    Some will have more of such methods; others fewer. The only one you strictly need is an ordinary assert of a condition, but the additional methods can be very handy. For example, .NET has an impressive range of assertions related to collections, which helps you avoid reinventing the wheel every time you work with collections in tests.

  • Run the tests for you.

    This becomes very handy in some situations, such as when one of your tests throws an exception. A testing framework handles that automatically, records the test as failed (unless the exception is expected) and runs the other tests.

  • Show the results in a form which helps you understand how many tests passed and which ones failed.

    Some integrate well into an IDE, making them visually attractive to use. If you work mostly in a terminal, every testing framework can display its results there.
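To make this concrete, here is a minimal sketch of a test module using the standard unittest framework; my_abs is a stand-in I'm inventing for whatever code you actually want to test:

```python
import unittest

def my_abs(x):
    """Stand-in for the code under test."""
    return -x if x < 0 else x

class TestMyAbs(unittest.TestCase):
    def test_positive(self):
        self.assertEqual(my_abs(3), 3)

    def test_negative(self):
        self.assertEqual(my_abs(-7), 7)

# Load and run the tests; "python -m unittest" would do the same from a shell.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMyAbs)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

The framework discovers every test_* method by itself, runs each one in isolation, and reports the passes and failures.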

The benefits of a testing framework don't stop there. For example:

  • Some frameworks can easily be extended to include code coverage.

  • Others integrate well with your Continuous Integration environment.

  • Every serious framework will include features such as code to run before or after every test (setup and teardown code) or specific statements which indicate that a test is expected to throw an exception.
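In unittest, for instance, setup and teardown code and expected exceptions look like this (the class and its data here are purely illustrative):

```python
import unittest

class TestWithFixture(unittest.TestCase):
    def setUp(self):
        # Runs before every single test method.
        self.data = [1, 2, 3]

    def tearDown(self):
        # Runs after every test method, even when the test failed.
        del self.data

    def test_pop_returns_last_element(self):
        self.assertEqual(self.data.pop(), 3)

    def test_pop_on_empty_list_raises(self):
        # Declares that the test passes only if the exception is thrown.
        with self.assertRaises(IndexError):
            [].pop()

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestWithFixture)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```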

I hope you're now convinced that you need a framework. But which of the many unit-testing frameworks for Python should you choose? It's up to you: some have features which others don't; some integrate better with particular environments. unittest is a good start for two reasons:

  • It's already installed on your machine.

  • It has everything you need at the beginning, and you may never need anything else.

But feel free to pick a different framework if you discover it has a feature you think you will need later.

Testing ordinary cases

Now that you have a framework, start by testing ordinary cases of your method.

What are those ordinary cases? For example, if somebody were implementing an abs method which returns the absolute value of a number, ordinary test cases would be:

def test_positive(self):
    actual = demo.abs(3)
    expected = 3
    self.assertEqual(expected, actual)

def test_negative(self):
    actual = demo.abs(-7)
    expected = 7
    self.assertEqual(expected, actual)

def test_float(self):
    actual = demo.abs(-2.91)
    expected = 2.91
    self.assertEqual(expected, actual)

When writing the very first test, you may get stuck: how do you know whether the value returned by the method is correct?

  • First, if you don't want to do binary comparison and prefer to deal with more human-readable values, you can use binascii.hexlify.

  • As for determining the expected value, you can either use a third-party application which does the same thing as your method, or compute the value by hand. In either case, if a test fails, you should ask yourself whether your code is wrong or the expected value is.
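For instance, assuming a type/length/value entry like the ones your method builds (the address values here are ones I picked for illustration), the comparison can read like this:

```python
import binascii
import struct

# Pack one type/length/value address entry: type 4 (IPv4), length 4, 127.0.0.1.
packed = struct.pack(">BB", 4, 4) + struct.pack("BBBB", 127, 0, 0, 1)

# hexlify turns the raw bytes into a readable hex string, so a failing
# assertEqual shows something a human can compare by eye.
assert binascii.hexlify(packed) == b"04047f000001"
```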

Remember, you don't need to test every possible input value. Not only can you not in most cases, but there is also practically no benefit in doing so. For example, once I have tested abs with a value of 3, I don't need to test it with 4, then 5, then 6. I'm confident that abs will either fail with 3 (for example by throwing an exception or returning -3 or 0) or work for any other positive integer as well.

Looking at the code, i.e. performing white-box testing, is helpful here. If you see that the method contains if x > 100 somewhere, you can be confident that you should test it with at least one x below 100 and one above it. But don't rely too much on the actual implementation of the code under test: black-box testing is often more effective and reduces the risk of reproducing the code's bug in the tests. If you find yourself relying on the implementation too much, test-driven development can help.
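For the x > 100 example, a sketch with a hypothetical method of my own invention:

```python
def shipping_cost(weight):
    """Hypothetical method with a boundary at 100."""
    if weight > 100:
        return 20
    return 10

# White-box testing: one value on each side of the boundary,
# plus the boundary value itself.
assert shipping_cost(99) == 10
assert shipping_cost(100) == 10
assert shipping_cost(101) == 20
```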

Testing edge cases

Once those tests pass, you need to think about edge cases: the sort of cases which are not necessarily frequent and may correspond to situations the author of the code under test missed.

Example:

def test_zero(self):
    actual = demo.abs(0)
    expected = 0
    self.assertEqual(expected, actual)

def test_nan(self):
    with self.assertRaises(ValueError):
        demo.abs(float("nan"))

def test_wrong_type(self):
    with self.assertRaises(TypeError):
        demo.abs("I'm not really a number.")

def test_none(self):
    with self.assertRaises(ValueError):
        demo.abs(None)

When you search for edge cases, act like a hacker: imagine you want to break the method and find cases which the original author didn't expect.

If the method expects a string, what if you pass a 1,000,000,000-character string? What if you pass an empty one? What if you use a different Unicode normalization form, or the string you pass contains no printable characters?

If the method expects a float, floating-point rules can be tricky in some languages and should be tested as well.
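Python inherits the usual IEEE 754 behavior, which unittest accounts for with assertAlmostEqual; a small sketch:

```python
import unittest

class TestFloats(unittest.TestCase):
    def test_sum(self):
        # 0.1 + 0.2 is not exactly 0.3 in IEEE 754 arithmetic...
        self.assertNotEqual(0.1 + 0.2, 0.3)
        # ...so compare floats up to a precision (7 decimal places by default).
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFloats)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```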

Automation

You may have noticed that I haven't used the term "automated testing". That's because there is no such thing as manual testing. When you play with the application, hunting by hand for things which may fail, that's experimentation, not testing. Both tasks aim to discover bugs, but they are very different.

The major benefit of real tests (compared to experimentation) is especially noticeable when it comes to regression testing, i.e. when you need to run the same tests again and again over time to ensure that new changes haven't broken anything.

A year ago, I was contributing to a project for a multinational company. The company working on the project used no testing, because they considered it a waste of time. The multinational company had "testers" who struggled with the web interface of the project, filling in forms by hand and watching the results.

This was a five-page project and they had two "testers". At the beginning, everything was fine. Quickly, the "testers" were unable to keep up: with more than ten deployments per day and more than a hundred bugs (some taking a long time to reproduce by hand), it was absolutely impossible to test every case.

When they randomly found that old bugs were reappearing, those were often bugs reintroduced days or weeks earlier. Such regressions are extremely difficult and slow to resolve, and they delayed the final release for months.

This story clearly shows that "testing by hand" doesn't work. It might work for tiny projects developed in a short period of time and not maintained later, but those projects can usually stay untested anyway.

Once you have your tests, your project is already partially automated. To run every one of a few thousand tests, the only thing you need to do is run a command or click a button, then wait.

But even clicking a button takes discipline. One day you forget to run the tests before or after a commit. The next day, you run them only once. A week later, you notice that you haven't run the tests for a few days, and when you finally do, the number of failed tests is overwhelming: instead of fixing them, you simply remove all the tests and continue the project without testing.

To avoid this testing decay, a few hints:

  • Use full automation, i.e. integrate tests into Continuous Integration.

  • If tests fail, fixing them should be your top priority.

  • Commit frequently. I often see programmers who barely make one commit per day, when they leave their desk in the evening (and who sometimes disappear from the version control log for a few days). They are doing it wrong.

  • Ensure tests don't take hours to run (or that you can run them selectively while being confident that all tests potentially concerned by the recent changes are run).

Those hints help make testing as painless as possible by turning it into a continuous process. With short iterations, you are notified quickly about failing tests and can solve them quickly: the longer you wait, the more expensive and difficult the fix becomes.

Using regression testing also has an important benefit: if you know that mistakes in the code will be easy to find and fix, you will be able to refactor much more aggressively. You can try it yourself: your netInfoToSend contains code duplication; try to remove it.
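As a hedged sketch of that refactoring (pack_address is a helper name I'm inventing; the point is that the repeated type/length/value packing moves into one place):

```python
import struct

def pack_address(ip_version, address):
    """Pack one type/length/value address entry (illustrative helper)."""
    return (struct.pack(">BB", ip_version, len(address))
            + struct.pack("B" * len(address), *address))

# The peer's address and our own address now share one code path,
# and the helper is trivially unit-testable on its own.
peer_entry = pack_address(4, (127, 0, 0, 1))
own_entry = pack_address(4, (10, 0, 0, 2))
```

With tests covering pack_address, such a change becomes safe to make.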

The cake is a lie

This last part is more of a response to "create the tests for me" in your question. Researching tests, you'll find lots of products advertised as THE solution to unit testing. The advertisements will assert that the product will write the tests for you, improve your productivity a hundredfold, or make you happy in some other way.

Don't trust them.

Some tools are helpful, but their importance is relative.

  • Unit-testing frameworks help, but they do nothing you couldn't do manually.

  • Code coverage tools help you find parts which are not covered enough, but they are stupid enough to put a red flag on boilerplate code where unit tests are useless anyway, or to make you feel safe when you reach 98% coverage on business-critical code which actually requires 99.95%.

  • Tools that create unit tests for you may help in some cases, but they tend to focus on code which you don't necessarily need to test and leave aside the three lines you would most like to test. I'm not saying tools such as Pex are useless: they are excellent at what they do, but they shouldn't replace you as a tester.
