When Should a Method Return the Same Instance After Modifying Itself?

I have a class with three methods: A(), B() and C(). These methods modify the instance they are called on.

A method has to return an instance when that instance is a separate copy (as with Clone()). But when a method modifies the same instance and has no other value to return, I am free to choose between returning void and returning the same instance (return this;).

If I decide to return the same modified instance, I can write neat method chains like obj.A().B().C();.

Would this be the only reason for doing so?

Is it even okay to modify the instance and also return it? Or should the method only return a copy and leave the original object unchanged? When the same modified instance is returned, the user might assume the returned value is a copy, since otherwise why would it be returned? If it is okay, what is the best way to make this clear on the method?

Best Answer

When to use chaining

Function chaining is most popular in languages where an IDE with auto-complete is commonplace. For example, almost all C# developers use Visual Studio. Therefore, if you're developing in C#, adding chaining to your methods can be a time saver for users of that class, because Visual Studio will assist them in building the chain.

On the other hand, languages like PHP, which are highly dynamic and often lack auto-complete support in IDEs, tend to have fewer classes that support chaining. There, chaining is only appropriate when correct phpDocs are employed to expose the chainable methods.

What is chaining?

Given a class named Foo the following two methods are both chainable.

function what() { return this; }
function when() { return new Foo(this); }

The fact that one is a reference to the current instance, and one creates a new instance doesn't change that these are chainable methods.
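As a rough illustration of that difference, here is a minimal Python sketch. Python is used only for illustration, and the value field and the method bodies are assumptions added to make the effect visible; they are not part of the pseudocode above. what() mutates and returns the same instance, while when() returns a modified copy, and both can be chained.

```python
import copy


class Foo:
    """Hypothetical class; 'value' exists only to make the effect visible."""

    def __init__(self, value=0):
        self.value = value

    def what(self):
        # Mutates this instance and returns it, so the chain stays on one object.
        self.value += 1
        return self

    def when(self):
        # Leaves this instance untouched and returns a modified copy.
        clone = copy.copy(self)
        clone.value += 10
        return clone


foo = Foo()
same = foo.what().what()    # both calls act on foo itself
other = foo.when().when()   # each call produces a new Foo

print(same is foo, foo.value)      # True 2
print(other is foo, other.value)   # False 22
```

Nothing in the call syntax tells the two styles apart, which is why the naming and consistency of the methods matter.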

There is no golden rule that a chainable method must only reference the current object. In fact, a chain can span two different classes. For example:

class B { function When() { return true; } };
class A { function What() { return new B(); } };

var a = new A();
var x = a.What().When();

There is no reference to this anywhere in the example above. The code a.What().When() is an example of chaining. What's interesting is that the instance of B is never assigned to a variable.

A method is chained when its return value is used as the next component of an expression.

Here are some more examples:

// return value never assigned.
 myFile.Open("something.txt").Write("stuff").Close();

// two chains used in expression
int x = a.X().Y() * b.X().Y();

// a chain that creates new strings
string name = str.Substring(1, 10).Trim().ToUpper();

When to use `this` and `new(this)`

Strings in most languages are immutable, so chaining method calls on a string always results in new strings being created, whereas an object like StringBuilder can be modified in place.
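The contrast can be sketched in Python, whose built-in str is immutable. TextBuilder below is a hypothetical class written for this example and only loosely modeled on a StringBuilder; it is not a standard library type.

```python
class TextBuilder:
    """Hypothetical mutable counterpart to str (an assumption for this sketch)."""

    def __init__(self):
        self._parts = []

    def append(self, text):
        # Modifies the builder in place and returns it for chaining.
        self._parts.append(text)
        return self

    def build(self):
        return "".join(self._parts)


original = "  hello world  "
shouted = original.strip().upper()   # each call returns a new str

builder = TextBuilder()
result = builder.append("a").append("b").append("c")   # same object throughout

print(repr(original))      # '  hello world  '
print(shouted)             # HELLO WORLD
print(result is builder)   # True
print(builder.build())     # abc
```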

Consistency is best practice.

If you have methods that modify the state of an object and return this, then don't mix in methods that return new instances. Instead, create a specific method called Clone() that will do this explicitly.

var x = a.Foo().Boo().Clone().Foo();

That is a lot clearer as to what is going on inside a.
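A small Python sketch of that convention, assuming a hypothetical Counter class: every chainable method mutates and returns the same object, and clone() is the only method that hands back a new instance.

```python
import copy


class Counter:
    """Hypothetical class used only to illustrate the convention."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self

    def double(self):
        self.count *= 2
        return self

    def clone(self):
        # The only method that returns a new instance, and its name says so.
        return copy.copy(self)


a = Counter()
x = a.increment().double().clone().increment()

print(a.count)   # 2: only the calls before clone() touched a
print(x.count)   # 3: the call after clone() touched the copy
print(x is a)    # False
```

A reader of the chain can see exactly where the original object stops being modified.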

The Step Outside And Back Trick

I call this the step out and back trick, because it solves a lot of common problems related to chaining. It basically means that you step out of the original class into a new temporary class and then back to the original class.

The temporary class exists only to provide special features to the original class, but only under special conditions.

There are often times when a chain needs to change state, but class A cannot represent all of the possible states. So during the chain a new class is introduced that holds a reference back to A. This allows the programmer to step into a state and then back to A.

Here is my example; let the special state be represented by a class B.

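class A {
    function Foo() { return this; }
    function Boo() { return this; }
    // step out: hand the chain over to a temporary B
    function Change() { return new B(this); }
}

class B {
    var _a;
    // constructor: remember the A that created this B
    function B(a) { _a = a; }
    function What() { return this; }
    function When() { return this; }
    // step back: return to the original A
    function End() { return _a; }
}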

var a = new A();
a.Foo().Change().What().When().End().Boo();

Now that is a very simple example. If you wanted to have more control, then B could return to a new super-type of A that has different methods.
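For readers who want something runnable, the same trick can be sketched in Python. The log list is an assumption added here so that the order of calls can be observed; the class and method names mirror the pseudocode above.

```python
class A:
    """Hypothetical main class; 'log' records which methods ran."""

    def __init__(self):
        self.log = []

    def foo(self):
        self.log.append("A.foo")
        return self

    def boo(self):
        self.log.append("A.boo")
        return self

    def change(self):
        # Step out: hand the chain over to a temporary B that remembers this A.
        return B(self)


class B:
    """Hypothetical temporary state, reachable only through A.change()."""

    def __init__(self, a):
        self._a = a

    def what(self):
        self._a.log.append("B.what")
        return self

    def when(self):
        self._a.log.append("B.when")
        return self

    def end(self):
        # Step back: return the original A so its methods are available again.
        return self._a


a = A()
result = a.foo().change().what().when().end().boo()

print(result is a)   # True
print(a.log)         # ['A.foo', 'B.what', 'B.when', 'A.boo']
```

Because what() and when() exist only on B, they cannot be called outside the change() ... end() section of the chain.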

Related Solutions

Java – Object-Oriented Class Design

Example 2 is quite bad for testing... and I don't mean that you can't test the internals. You also can't replace your XmlReader object by a mock object as you have no object at all.

Example 1 is needlessly hard to use. What about

XmlReader reader = new XmlReader(url);
Document result = reader.getDocument();

which is not any harder to use than your static method.

Things like opening the URL, reading XML, converting bytes to strings, parsing, closing sockets, and whatever, are uninteresting. Creating an object and using it is important.

So IMHO the proper OO Design is to make just the two things public (unless you really need the intermediate steps for some reason). Static is evil.

Java Methods – Naming Methods with Different Return Types

Why do you need to extend this class? And why do you need to name your own method the same as showDialog?

In reality your method does something entirely different than what showDialog does. A better name for your method would be showDialogAtLocationAndReturnSelectedFile as your method does more or less these things. Naming it showDialog will only confuse your code users.

Also, without knowing anything else, I'd say you're trying to shove too much into a single method. How do you react to a cancel press? How about an error? Do you return null? If so, you're forcing the user of the code to check the return value yet again. This has the potential of being just another "Leaky Abstraction", and Java APIs already have enough of these.

An important part of API design is making sure that the name of a function/class/method matches what it really does. And that is why in JFileChooser the method's name is showDialog. It just shows the dialog. It doesn't open the file for reading, it doesn't perform a check whether the filename is valid, and honestly, why would it? The user of the code just asked the class to show the dialog.

The creator of Ruby calls this the 'Principle of Least Surprise'*, and while I don't really know Ruby, this is a great line to learn from. Your code should be in the service of its user, and a part of this service is embedding the contract of the method/class in its name.

You might think you're not designing an API, but I doubt you work alone: there's probably someone else in the team, and they will appreciate this. Also, I heartily recommend this lecture on API Design: How To Design A Good API and Why it Matters. It was presented to Googlers by a Java designer, so it kinda matters.

Maybe this is more than you asked for, but I feel you seem to be somewhat missing the point of naming methods.

UPDATE: * I seem to be mistaken, the creator of Ruby has actually stated that he designed Ruby with the "Principle of Least Astonishment", not "Principle of Least Surprise". In any case, what I said still holds.