How Far to Push Refactoring Without Changing External Behavior


Martin Fowler defines code refactoring as follows (emphasis mine):

Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. Its heart is a series of small behavior preserving transformations. Each transformation (called a 'refactoring') does little, but a sequence of transformations can produce a significant restructuring. Since each refactoring is small, it's less likely to go wrong. The system is also kept fully working after each small refactoring, reducing the chances that a system can get seriously broken during the restructuring.

What is "external behaviour" in this context? For example, if I apply the Move Method refactoring and move a method to another class, it looks like I am changing external behaviour, doesn't it?

So I'm interested in figuring out at what point a change stops being a refactoring and becomes something more. The term "refactoring" may be misused for larger changes: is there a different word for those?

Update: A lot of interesting answers mention the interface, but wouldn't the Move Method refactoring change the interface?

Best Answer

"External" in this context means "observable to users". Users may be humans in case of an application, or other programs in case of a public API.

So if you move method M from class A to class B, and both classes are deep inside an application, and no user can observe any change in the behaviour of the app due to the change, then you can rightly call it refactoring.
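A minimal sketch of this case in Java may help. All names here (ClassA, ClassB, AppBefore, AppAfter, sum, totalInCents) are hypothetical and chosen only for illustration. The method M is sum, and the only thing "users" can observe is the result of totalInCents:

```java
import java.util.List;

// All class and method names are hypothetical, for illustration only.
class MoveMethodDemo {
    public static void main(String[] args) {
        List<Integer> chargesInCents = List.of(1200, 350, 99);
        // The externally visible entry point returns the same value in both versions.
        System.out.println(new AppBefore().totalInCents(chargesInCents));
        System.out.println(new AppAfter().totalInCents(chargesInCents));
    }
}

// Before: method M (here, sum) lives in internal class A.
class ClassA {
    int sum(List<Integer> amounts) {
        int total = 0;
        for (int amount : amounts) {
            total += amount;
        }
        return total;
    }
}

class AppBefore {
    private final ClassA a = new ClassA();

    int totalInCents(List<Integer> chargesInCents) {
        return a.sum(chargesInCents);
    }
}

// After: M has moved to internal class B; only the internal wiring changed.
class ClassB {
    int sum(List<Integer> amounts) {
        int total = 0;
        for (int amount : amounts) {
            total += amount;
        }
        return total;
    }
}

class AppAfter {
    private final ClassB b = new ClassB();

    int totalInCents(List<Integer> chargesInCents) {
        return b.sum(chargesInCents);
    }
}
```

Callers of totalInCents cannot tell which internal class does the summing, so under the definition above the move counts as a refactoring.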

If, on the other hand, some other higher level subsystem/component changes its behaviour or breaks due to the change, that is indeed (usually) observable to users (or at least to sysadmins checking logs). Or if your classes were part of a public API, there may be 3rd party code out there which depends on M being part of class A, not B. So neither of these cases is refactoring in the strict sense.

There is a tendency to call any code rework refactoring, which is, I guess, incorrect.

Indeed, it is a sad but expected consequence of refactoring becoming fashionable. Developers have been doing code rework in an ad hoc manner for ages, and it is certainly easier to learn a new buzzword than to analyse and change ingrained habits.

So what is the right word for reworks which change external behaviour?

I would call it redesign.

Update

A lot of interesting answers mention the interface, but wouldn't the Move Method refactoring change the interface?

Of what? The specific classes, yes. But are these classes directly visible to the outside world in any way? If not - because they are inside your program, and not part of the external interface (API / GUI) of the program - no change made there is observable by external parties (unless the change breaks something, of course).

I feel that there is a deeper question beyond this: does a specific class exist as an independent entity by itself? In most cases, the answer is no: the class only exists as part of a larger component, an ecosystem of classes and objects, without which it can't be instantiated and/or is unusable. This ecosystem includes not only its (direct/indirect) dependencies, but also the other classes / objects which depend on it. This is because without these higher level classes, the responsibility associated with our class may be meaningless/useless to the users of the system.

E.g. in our project, which deals with car rentals, there is a Charge class. This class has no use to the users of the system by itself, because rental station agents and customers can't do much with an individual charge: they deal with rental agreement contracts as a whole (which include a bunch of different kinds of charges). Customers are mostly interested in the sum total of these charges, which they have to pay in the end; the agent is interested in the different contract options selected (the length of the rental, the vehicle group, insurance package, extra items etc.), which (via sophisticated business rules) govern what charges are present and how the final payment is calculated out of these. And country representatives / business analysts care about the specific business rules, their synergy and effects (on the income of the company, etc.). A single charge by itself has no meaning without the bigger picture.

Recently I refactored this class, renaming most of its fields and methods (to follow the standard Java naming convention, which was totally neglected by our predecessors). I also plan further refactorings to replace String and char fields with more appropriate enum and boolean types. All this will certainly change the interface of the class, but (if I do my job correctly) none of it will become visible to the users of our app. None of them cares about how individual charges are represented, even though they surely know the concept of a charge.

I could have picked as an example any of a hundred other classes that do not represent any domain concept and so are invisible to the end users even conceptually, but I thought it more interesting to pick one where there is at least some visibility at the concept level. This shows nicely that class interfaces are only representations of domain concepts (at best), not the real thing*. The representation can be changed without affecting the concept. And users only have and understand the concept; it is our task to do the mapping between concept and representation.
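To make this concrete, here is a rough Java sketch of that kind of change. The real Charge class is not shown in this answer, so every field name, the ChargeType enum, the LegacyCharge / LegacyAgreement / Agreement classes and the flat 20% tax rule are assumptions invented for the example. The point is only that the total on the agreement, which is what the customer sees, stays the same while the representation of a charge changes:

```java
import java.util.List;

// Hypothetical sketch: field names, types and the 20% tax rule are assumptions.
class ChargeRefactoringDemo {
    public static void main(String[] args) {
        int before = LegacyAgreement.totalDueCents(List.of(
                new LegacyCharge("INS", 'Y', 1000),
                new LegacyCharge("EXTRA", 'N', 500)));
        int after = Agreement.totalDueCents(List.of(
                new Charge(ChargeType.INSURANCE, true, 1000),
                new Charge(ChargeType.EXTRA_ITEM, false, 500)));
        // Prints whether the user-visible total is unchanged by the refactoring.
        System.out.println(before == after);
    }
}

// Before: non-standard names, a String type code and a char flag.
class LegacyCharge {
    String Chrg_Type;
    char Tax_Flg; // 'Y' or 'N'
    int Amt_Cents;

    LegacyCharge(String type, char taxFlag, int amountCents) {
        Chrg_Type = type;
        Tax_Flg = taxFlag;
        Amt_Cents = amountCents;
    }
}

class LegacyAgreement {
    static int totalDueCents(List<LegacyCharge> charges) {
        int total = 0;
        for (LegacyCharge c : charges) {
            total += (c.Tax_Flg == 'Y') ? c.Amt_Cents + c.Amt_Cents * 20 / 100 : c.Amt_Cents;
        }
        return total;
    }
}

// After: conventional names, an enum for the type and a boolean for the flag.
enum ChargeType { INSURANCE, EXTRA_ITEM }

class Charge {
    private final ChargeType type;
    private final boolean taxable;
    private final int amountCents;

    Charge(ChargeType type, boolean taxable, int amountCents) {
        this.type = type;
        this.taxable = taxable;
        this.amountCents = amountCents;
    }

    ChargeType getType() {
        return type;
    }

    int amountDueCents() {
        return taxable ? amountCents + amountCents * 20 / 100 : amountCents;
    }
}

class Agreement {
    static int totalDueCents(List<Charge> charges) {
        int total = 0;
        for (Charge c : charges) {
            total += c.amountDueCents();
        }
        return total;
    }
}
```

Everything that uses Charge directly has to change along with it, but as long as the total on the agreement comes out the same, the change is not observable from outside.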

*And one can easily add that the domain model, which our class represents, is itself only an approximate representation of some "real thing"...
