Let's look at the options for where we can place the validation code:
1. Inside the setters in the builder.
2. Inside the build() method.
3. Inside the constructed entity: it will be invoked in the build() method when the entity is being created.
Option 1 allows us to detect problems earlier, but there can be complicated cases where input can be validated only with the full context, so at least part of the validation has to happen in the build() method anyway. Choosing option 1 therefore leads to inconsistent code, with part of the validation done in one place and the rest in another.
Option 2 isn't significantly worse than option 1 because setters in a builder are usually invoked right before build(), especially in fluent interfaces. Thus, it's still possible to detect a problem early enough in most cases. However, if the builder is not the only way to create an object, it will lead to duplication of validation code, because you'll need it everywhere you create the object. The most logical solution in this case is to put validation as close to the created object as possible, that is, inside it. This is option 3.
From a SOLID point of view, putting validation in the builder also violates SRP: the builder class already has the responsibility of aggregating the data to construct an object. Validation means enforcing contracts on an object's own internal state; checking the state of another object would be a new responsibility for the builder.
Thus, from my point of view, not only is it better to fail late from a design perspective, it's also better to fail inside the constructed entity rather than in the builder itself.
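As a minimal sketch of option 3, assuming a hypothetical Booking entity with two fields: the builder below only collects values, while the entity's constructor enforces the invariants, including a cross-field check that needs the full context. All names are illustrative.

```java
// Hypothetical entity: every invariant is checked in its own constructor,
// so each creation path (builder or direct call) goes through the same checks.
final class Booking {
    private final int checkInDay;
    private final int checkOutDay;

    Booking(int checkInDay, int checkOutDay) {
        if (checkInDay < 0) {
            throw new IllegalArgumentException("checkInDay must not be negative");
        }
        // Cross-field check: only possible once both values are known.
        if (checkOutDay < checkInDay) {
            throw new IllegalArgumentException("checkOutDay must not precede checkInDay");
        }
        this.checkInDay = checkInDay;
        this.checkOutDay = checkOutDay;
    }

    int nights() {
        return checkOutDay - checkInDay;
    }

    static Builder builder() {
        return new Builder();
    }

    // The builder only aggregates data; it performs no validation itself.
    static final class Builder {
        private int checkInDay;
        private int checkOutDay;

        Builder checkInDay(int value) {
            this.checkInDay = value;
            return this;
        }

        Builder checkOutDay(int value) {
            this.checkOutDay = value;
            return this;
        }

        // Validation runs here indirectly, inside the Booking constructor.
        Booking build() {
            return new Booking(checkInDay, checkOutDay);
        }
    }
}
```

Calling new Booking(5, 1) directly fails in exactly the same way as going through the builder, which is the point of keeping the checks in the entity.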
UPD: this comment reminded me of one more case in which validation inside the builder (option 1 or 2) makes sense: when the builder has its own contracts on the objects it is creating. For example, assume that we have a builder that constructs a string with specific content, say, a list of number ranges such as 1-2,3-4,5-6. This builder may have a method like addRange(int min, int max). The resulting string does not know anything about these numbers, nor should it have to. The builder itself defines the format of the string and the constraints on the numbers. Thus, the method addRange(int,int) must validate the input numbers and throw an exception if max is less than min.
That said, the general rule will be to validate only the contracts defined by the builder itself.
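The range example can be sketched roughly as follows. The class name RangeListBuilder and the choice of IllegalArgumentException are assumptions made for illustration; only addRange(int min, int max) and the output format come from the description above.

```java
// Hypothetical builder for strings such as "1-2,3-4,5-6".
final class RangeListBuilder {
    private final StringBuilder out = new StringBuilder();

    // The constraint min <= max is part of the builder's own contract:
    // the resulting String cannot know about it, so it is checked here.
    RangeListBuilder addRange(int min, int max) {
        if (max < min) {
            throw new IllegalArgumentException(
                    "max (" + max + ") is less than min (" + min + ")");
        }
        if (out.length() > 0) {
            out.append(',');
        }
        out.append(min).append('-').append(max);
        return this;
    }

    String build() {
        return out.toString();
    }
}
```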
I believe neither of your approaches violates anything, and both can be used just fine.
Passing parameters to the builder can be done using either the constructor or setter methods. I do not see any problem with it.
I tend to pass parameters via the constructor if there are not many of them. If I have more than 3-5 configuration parameters, I switch to using methods to configure the builder.
Best Answer
The Builder Pattern does not solve the “problem” of many arguments. But why are many arguments problematic?
… true for some reason.
Faking named parameters
The Builder Pattern addresses only one of these problems, namely the maintainability concerns of function calls with many arguments∗. So a constructor call with a long list of positional arguments might become a chain of named builder method calls that ends in build().
∗ The Builder pattern was originally intended as a representation-agnostic approach to assemble composite objects, which is a far greater aspiration than just named arguments for parameters. In particular, the builder pattern does not require a fluent interface.
This offers a bit of extra safety since it will blow up if you invoke a builder method that doesn't exist, but it otherwise does not bring you anything that a comment in the constructor call wouldn't have. Also, manually creating a builder requires code, and more code can always contain more bugs.
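The contrast between a positional call and the builder-as-named-arguments idiom can be sketched like this. The Connection class and its four parameters are hypothetical and chosen only to make the call sites readable.

```java
// Hypothetical class with several positional constructor parameters.
final class Connection {
    final String host;
    final int port;
    final boolean useTls;
    final int timeoutMillis;

    Connection(String host, int port, boolean useTls, int timeoutMillis) {
        this.host = host;
        this.port = port;
        this.useTls = useTls;
        this.timeoutMillis = timeoutMillis;
    }

    static Builder builder() {
        return new Builder();
    }

    // Each builder method plays the role of a named argument.
    static final class Builder {
        private String host = "localhost";
        private int port = 80;
        private boolean useTls = false;
        private int timeoutMillis = 1000;

        Builder host(String value) { this.host = value; return this; }
        Builder port(int value) { this.port = value; return this; }
        Builder useTls(boolean value) { this.useTls = value; return this; }
        Builder timeoutMillis(int value) { this.timeoutMillis = value; return this; }

        Connection build() {
            return new Connection(host, port, useTls, timeoutMillis);
        }
    }
}

class NamedArgumentsDemo {
    public static void main(String[] args) {
        // Positional call: the reader must remember what 443, true and 5000 mean.
        Connection positional = new Connection("example.org", 443, true, 5000);

        // Builder call: every value is labelled at the call site.
        Connection named = Connection.builder()
                .host("example.org")
                .port(443)
                .useTls(true)
                .timeoutMillis(5000)
                .build();

        System.out.println(positional.port == named.port);
    }
}
```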
In languages where it is easy to define a new value type, I've found that it's way better to use microtyping/tiny types to simulate named arguments. It is named so because the types are really small, but you end up typing a lot more ;-)
Obviously, the type names A, B, C, … should be self-documenting names that illustrate the meaning of the parameter, often the same name as you'd give the parameter variable. Compared with the builder-for-named-arguments idiom, the required implementation is a lot simpler, and thus less likely to contain bugs.
The compiler helps you guarantee that all arguments were provided; with a Builder you'd have to manually check for missing arguments, or encode a state machine into the host language type system; both would likely contain bugs.
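A rough sketch of the tiny-types idea, using plain final classes so it works on older Java versions. Host, Port, TimeoutMillis and Endpoint are hypothetical names standing in for A, B, C and the class being constructed.

```java
// Hypothetical tiny types: each wraps a single value and gives it a name
// that the compiler checks at every call site.
final class Host {
    final String value;
    Host(String value) { this.value = value; }
}

final class Port {
    final int value;
    Port(int value) { this.value = value; }
}

final class TimeoutMillis {
    final int value;
    TimeoutMillis(int value) { this.value = value; }
}

final class Endpoint {
    final Host host;
    final Port port;
    final TimeoutMillis timeout;

    // All three arguments are required; omitting one is a compile error.
    Endpoint(Host host, Port port, TimeoutMillis timeout) {
        this.host = host;
        this.port = port;
        this.timeout = timeout;
    }
}

class TinyTypesDemo {
    public static void main(String[] args) {
        Endpoint e = new Endpoint(
                new Host("example.org"),
                new Port(443),
                new TimeoutMillis(5000));

        // Swapping two arguments no longer compiles, even though both wrap an int:
        // new Endpoint(new Host("example.org"), new TimeoutMillis(5000), new Port(443));

        System.out.println(e.host.value + ":" + e.port.value);
    }
}
```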
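There is another common approach to simulate named arguments: a single abstract parameter object that uses an inline class syntax to initialize all fields. In Java, this can be done with an anonymous subclass and an instance initializer block (so-called double-brace initialization).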
However, it is possible to forget fields, and this is a quite language-specific solution (I've seen uses in JavaScript, C#, and C).
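A minimal sketch of the parameter-object approach in Java, assuming a hypothetical Report class with a nested Arguments type. The double braces create an anonymous subclass of Arguments whose instance initializer assigns the fields.

```java
final class Report {
    // Hypothetical parameter object: public mutable fields, some with defaults.
    static class Arguments {
        public String title;
        public int pageCount = 1;
        public boolean landscape;
    }

    final String title;
    final int pageCount;
    final boolean landscape;

    Report(Arguments args) {
        // The constructor sees all arguments at once, so it can still validate
        // them, including fields the caller forgot to set.
        if (args.title == null) {
            throw new IllegalArgumentException("title is required");
        }
        this.title = args.title;
        this.pageCount = args.pageCount;
        this.landscape = args.landscape;
    }
}

class ParameterObjectDemo {
    public static void main(String[] args) {
        // Anonymous subclass of Arguments with an instance initializer block.
        Report report = new Report(new Report.Arguments() {{
            title = "Quarterly";
            pageCount = 12;
        }});
        System.out.println(report.title + ", " + report.pageCount + " pages");
    }
}
```

Nothing forces the caller to assign landscape here, which illustrates how easily a field can be forgotten.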
Fortunately, the constructor can still validate all arguments, which is not the case when your objects are created in a partially-constructed state and require the user to provide further arguments via setters or an init() method. Those require the least coding effort, but make it more difficult to write correct programs.
So while there are many approaches to address the “many unnamed parameters make code difficult to maintain” problem, other problems remain.
Approaching the root problem
For example the testability problem. When I write unit tests, I need the ability to inject test data, and to provide test implementations to mock out dependencies and operations that have external side effects. I can't do that when you instantiate any classes within your constructor. Unless the responsibility of your class is the creation of other objects, it shouldn't instantiate any non-trivial classes. This goes hand in hand with the single responsibility problem. The more focussed the responsibility of a class, the easier it is to test (and often easier to use).
The easiest and often best approach is for the constructor to take fully-constructed dependencies as parameter, though this shoves the responsibility of managing dependencies to the caller – not ideal either, unless the dependencies are independent entities in your domain model.
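To illustrate passing fully-constructed dependencies through the constructor, here is a small sketch. TimeSource, SystemTimeSource and SessionTracker are hypothetical names; the point is only that the class never instantiates its own dependency, so a test can supply a fake.

```java
// Hypothetical dependency with an external side effect (reading the clock).
interface TimeSource {
    long nowMillis();
}

final class SystemTimeSource implements TimeSource {
    @Override
    public long nowMillis() {
        return System.currentTimeMillis();
    }
}

final class SessionTracker {
    private final TimeSource timeSource;
    private final long startedAt;

    // The dependency arrives fully constructed; there is no
    // "new SystemTimeSource()" hidden inside this constructor.
    SessionTracker(TimeSource timeSource) {
        this.timeSource = timeSource;
        this.startedAt = timeSource.nowMillis();
    }

    long elapsedMillis() {
        return timeSource.nowMillis() - startedAt;
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        // Production wiring: the caller decides which implementation to use.
        SessionTracker tracker = new SessionTracker(new SystemTimeSource());
        System.out.println(tracker.elapsedMillis() >= 0);
    }
}
```

In a unit test, a lambda such as () -> fixedValue can stand in for the real TimeSource, which is not possible when the class creates the dependency itself.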
Sometimes (abstract) factories or full dependency injection frameworks are used instead, though these might be overkill in the majority of use cases. In particular, these only reduce the number of arguments if many of these arguments are quasi-global objects or configuration values that don't change between object instantiations. E.g. if parameters a and d were global-ish, we'd get a factory that holds a and d and whose creation method takes only the remaining arguments.
Depending on the application, this might be a game-changer where the factory methods end up having nearly no arguments because all can be provided by the dependency manager, or it might be a large amount of code that complicates instantiation for no apparent benefit. Such factories are way more useful for mapping interfaces to concrete types than they are for managing parameters. However, this approach tries to address the root problem of too many parameters rather than just hiding it with a pretty fluent interface.
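A hedged sketch of such a factory, assuming a hypothetical Widget class with four parameters of which a and d are quasi-global. The Dependencies name and the field types are illustrative only.

```java
// Hypothetical class that needs four values, two of which rarely change.
final class Widget {
    final String a;
    final int b;
    final int c;
    final String d;

    Widget(String a, int b, int c, String d) {
        this.a = a;
        this.b = b;
        this.c = c;
        this.d = d;
    }
}

// Holds the quasi-global values once and supplies them to every new object.
final class Dependencies {
    private final String a;
    private final String d;

    Dependencies(String a, String d) {
        this.a = a;
        this.d = d;
    }

    // Callers pass only the arguments that vary between instances.
    Widget newWidget(int b, int c) {
        return new Widget(a, b, c, d);
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        Dependencies deps = new Dependencies("shared-a", "shared-d");
        Widget first = deps.newWidget(1, 2);
        Widget second = deps.newWidget(3, 4);
        System.out.println(first.a.equals(second.a));
    }
}
```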