Until recently, I would simply have pointed you at the many existing questions which say that you generally shouldn't need access to any private members in order to write unit tests - the whole point of tests is to exercise the public (and protected) interface.
For example, in the question how to unit-test private methods in jquery plugins?, there is this answer:
The same applies here as with any other language and testing privates: To test private methods, you should exercise them via the public interface. In other words, by calling your public methods, the private methods get tested in the process because the public methods rely on the privates.
Generally, private methods are not tested separately from the public interface - the entire point is that they are implementation details, and tests should generally not know too much about the specifics of the implementation.
In your specific case, you say:
Since the data is private, I can only add things via the public interface of the object. This runs code that need not be run during a unit test and in some cases is just a copy and paste from another test.
I would be cautious about having any tests that avoid code that normally needs to run, for fear that your tests would spuriously pass due to some difference between the test behaviour and the real world.
Having said all that, I recently came across a really excellent article on writing test suites, called The Way of Testivus. It's a PDF only 12 pages long, and a really easy and enjoyable read.
What's relevant here is the section entitled "Sometimes, the test justifies the means" - in the context of violating encapsulation to enable testing. I highly recommend it.
(Another favourite of mine is "An ugly test is better than no test.")
So, in your context, if you still decide you really want a different way to set up data purely for tests, it might be reasonable to add a public setter method, specifically documented as only existing to aid writing of tests.
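As a minimal sketch of that idea in Python (the `Inventory` class and its methods are hypothetical, not taken from your code):

```python
class Inventory:
    def __init__(self):
        self._items = {}  # private data, normally populated via add_item()

    def add_item(self, name, qty):
        # ... validation, logging, and other production-only work
        # that a unit test may not want to exercise ...
        self._items[name] = qty

    def count(self, name):
        return self._items.get(name, 0)

    def set_items_for_testing(self, items):
        """Exists solely so tests can seed state directly.

        Documented as test-only; do not call from production code.
        """
        self._items = dict(items)


# A test can now arrange its fixture without running add_item()'s
# production-only side effects:
inv = Inventory()
inv.set_items_for_testing({"bolt": 7})
assert inv.count("bolt") == 7
```

The docstring carries the "only exists to aid writing of tests" warning; some teams also enforce this with a naming convention or a linter rule.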
The standard flow for TDD is:
- Write a failing test. (Red)
- Make the smallest code change that makes it pass (Green)
- Refactor (Keeping it green)
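The cycle above can be sketched with Python's built-in `unittest` (the `add` function is a made-up example):

```python
import unittest

# Step 1 (Red): the test is written first, against an `add` function
# that does not exist yet, so the first run fails.
class TestAdd(unittest.TestCase):
    def test_adds_two_numbers(self):
        self.assertEqual(add(2, 3), 5)

# Step 2 (Green): the smallest change that makes the test pass.
def add(a, b):
    return a + b

# Step 3 (Refactor): tidy up while re-running the test to keep it green.
result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestAdd)
)
assert result.wasSuccessful()
```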
The test for your tests in this case is step 1 - making sure that the test fails before you make any code changes.
Another test that I like is whether you can delete some code and re-implement it a different way, and your tests fail after deletion but work with a different algorithm in place.
As with all things, there is no magic bullet. Forgetting to write a required test is just as easy for a developer to do as forgetting to write the code. At least if you're doing both, you have twice as many opportunities to discover your omission.
Best Answer
There are two issues we have to look at here.
The first is that you seem to be looking at all of your tests from the unit test perspective. Unit tests are extremely valuable, but are not the only kinds of tests. Tests can actually be divided into several different layers, from very fast unit tests to less fast integration tests to even slower acceptance tests. (There can be even more layers broken out, like functional tests.)
The second is that you are mixing together calls to third-party code with your business logic, creating testing challenges and possibly making your code more brittle.
Unit tests should be fast and should be run often. Mocking dependencies helps to keep these tests running fast, but can potentially introduce holes in coverage if the dependency changes and the mock doesn't. Your code could be broken while your tests still run green. Some mocking libraries will alert you if the dependency's interface changes, others cannot.
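As an illustration, a sketch using Python's `unittest.mock`; the `WelcomeService` class and its mailer dependency are invented for the example:

```python
from unittest.mock import Mock

class WelcomeService:
    """Hypothetical service depending on an email gateway we don't
    want to hit in a unit test."""
    def __init__(self, mailer):
        self.mailer = mailer  # injected, so a mock can stand in

    def greet(self, address):
        self.mailer.send(address, "Welcome!")
        return True

# The mock replaces the slow external dependency, so the test stays
# fast and offline, and still verifies the interaction.
mailer = Mock()
service = WelcomeService(mailer)
assert service.greet("user@example.com") is True
mailer.send.assert_called_once_with("user@example.com", "Welcome!")
```

In this library, `unittest.mock.create_autospec` (or `Mock(spec=...)`) is one way to get the "alert when the dependency's interface changes" behaviour mentioned above: calls to attributes the real class doesn't have will raise instead of silently passing.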
Integration tests, on the other hand, are designed to test the interactions between components, including third-party libraries. Mocks should not be used at this level of testing, because we want to see how the actual objects interact. Because we are using real objects, these tests will be slower, and we will not run them nearly as often as our unit tests.
Acceptance tests look at an even higher level, testing that the requirements for the software are met. These tests run against the entire, complete system that would get deployed. Once again, no mocking should be used.
One guideline people have found valuable regarding mocks is to not mock types you don't own. Amazon owns the API to S3 so they can make sure it doesn't change beneath them. You, on the other hand, do not have these assurances. Therefore, if you mock out the S3 API in your tests, it could change and break your code, while your tests all show green. So how do we unit test code that uses third-party libraries?
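To see why mocking a type you don't own is risky, here is a sketch in Python; `VendorClient` is a stand-in for any third-party class, not a real SDK:

```python
from unittest.mock import Mock, create_autospec

class VendorClient:
    """Hypothetical third-party class we don't own. Imagine the vendor
    renamed upload() to put_object() in a new release."""
    def put_object(self, key, data):
        pass

def save(client, key, data):
    # Our code still calls the old name, so it is broken against the
    # real, current VendorClient.
    client.upload(key, data)

# A plain Mock accepts any attribute, so this test stays green even
# though production is broken:
loose = Mock()
save(loose, "k", b"v")
loose.upload.assert_called_once_with("k", b"v")

# An autospec'd mock mirrors the real interface and catches the drift:
strict = create_autospec(VendorClient)
try:
    save(strict, "k", b"v")
    caught = False
except AttributeError:
    caught = True
assert caught
```

Autospeccing narrows the hole but does not close it: it only checks against the version of the library installed in your test environment, not the one running in production.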
Well, we don't. If we follow the guideline, we can't mock objects we don't own. But… if we own our direct dependencies, we can mock them out. But how? We create our own wrapper for the S3 API. We can make it look a lot like the S3 API, or we can make it fit our needs more closely (preferred). We can even make it a little more abstract, say a `PersistenceService` rather than an `AmazonS3Bucket`. `PersistenceService` would be an interface with methods like `#save(Thing)` and `#fetch(ThingId)` - the types of methods we might like to see (these are examples; you might actually want different methods). We can now implement a `PersistenceService` around the S3 API (say an `S3PersistenceService`), encapsulating it away from our calling code.

Now to the code that calls the S3 API. We need to replace those calls with calls to a `PersistenceService` object. We use dependency injection to pass our `PersistenceService` into the object. It's important not to ask for an `S3PersistenceService`, but to ask for a `PersistenceService`. This allows us to swap out the implementation during our tests.

All the code that used to use the S3 API directly now uses our `PersistenceService`, and our `S3PersistenceService` now makes all the calls to the S3 API. In our tests, we can mock out `PersistenceService`, since we own it, and use the mock to make sure that our code makes the correct calls. But that leaves the question of how to test `S3PersistenceService`. It has the same problem as before: we can't unit test it without calling the external service. So… we don't unit test it. We could mock out the S3 API dependencies, but this would give us little-to-no additional confidence. Instead, we test it at a higher level, with integration tests.

It may sound a little troubling to say that we shouldn't unit test a part of our code, but let's look at what we accomplished. We had a bunch of code all over the place that we couldn't unit test; it can now be unit tested through the `PersistenceService`. We have our third-party library mess confined to a single implementation class. That class should provide just the functionality needed to use the API, with no business logic attached to it. Therefore, once it is written, it should be very stable and should not change very much, so we can rely on slower tests that we don't run as often.

The next step is to write the integration tests for `S3PersistenceService`. These should be separated out by name or folder so we can run them independently of our fast unit tests. Integration tests can often use the same testing frameworks as unit tests, so we don't need to learn a new tool. The actual code of the integration test is what you would write for your Option 1.
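Putting the pieces together, here is a minimal Python sketch of the shape described in this answer; all the names are illustrative, and `s3_client` stands in for a real SDK client rather than any actual AWS API:

```python
from abc import ABC, abstractmethod
from unittest.mock import Mock

class PersistenceService(ABC):
    """The abstraction our code depends on. We own this interface."""
    @abstractmethod
    def save(self, thing_id, thing): ...
    @abstractmethod
    def fetch(self, thing_id): ...

class S3PersistenceService(PersistenceService):
    """Thin wrapper confining the third-party client to one class.
    Covered by integration tests, not unit tests."""
    def __init__(self, s3_client, bucket):
        self.s3 = s3_client
        self.bucket = bucket

    def save(self, thing_id, thing):
        self.s3.put_object(self.bucket, thing_id, thing)

    def fetch(self, thing_id):
        return self.s3.get_object(self.bucket, thing_id)

class ThingRepository:
    """Business code asks for the interface, never the S3 class,
    so tests can inject a substitute."""
    def __init__(self, persistence: PersistenceService):
        self.persistence = persistence

    def store(self, thing_id, thing):
        self.persistence.save(thing_id, thing)

# Unit test: mock the interface we own. No network, runs fast, and
# verifies our code makes the correct calls.
mock_persistence = Mock(spec=PersistenceService)
repo = ThingRepository(mock_persistence)
repo.store("id-1", {"name": "widget"})
mock_persistence.save.assert_called_once_with("id-1", {"name": "widget"})
```

The integration test for `S3PersistenceService` would instead construct it with a real client against a real (or test) bucket, and live in a separately-run test folder.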