When decomposing a large function, how can I avoid the complexity from the extra subfunctions?

design-patterns, language-agnostic, programming-practices

Say I have a large function like the following:

function do_lots_of_stuff(){

    { //subpart 1
      ...
    }

    ...

    { //subpart N
      ...
    }
}

A common pattern is to decompose it into subfunctions:

function do_lots_of_stuff(){
    subpart_1(...)
    subpart_2(...)
    ...
    subpart_N(...)
}

I usually find that decomposition has two main advantages:

  1. The decomposed function becomes much smaller. This can help people read it without getting lost in the details.
  2. Parameters have to be explicitly passed to the underlying subfunctions, instead of being implicitly available by just being in scope. This can help readability and modularity in some situations.
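To make the second point concrete, here is a minimal Python sketch. The order-processing domain and every name in it other than do_lots_of_stuff are invented for illustration. Each subpart sees only what it is handed, so its dependencies are visible in its signature.

```python
# Hypothetical example: every name except do_lots_of_stuff is invented.

def parse_order(raw_line):
    """Subpart 1: turn 'name, quantity, unit_price' into a dict."""
    name, quantity, unit_price = raw_line.split(",")
    return {
        "name": name.strip(),
        "quantity": int(quantity),
        "unit_price": float(unit_price),
    }


def compute_total(order, tax_rate):
    """Subpart 2: depends only on its arguments, not on enclosing scope."""
    subtotal = order["quantity"] * order["unit_price"]
    return round(subtotal * (1 + tax_rate), 2)


def format_receipt(order, total):
    """Subpart 3: builds the output string from explicit inputs."""
    return f"{order['quantity']} x {order['name']}: {total:.2f}"


def do_lots_of_stuff(raw_line, tax_rate):
    # The top-level function now reads as a summary of the steps.
    order = parse_order(raw_line)
    total = compute_total(order, tax_rate)
    return format_receipt(order, total)
```

For instance, compute_total can be exercised on its own with a hand-built dict, which would not be possible if it read tax_rate from the enclosing scope.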

However, I also find that decomposition has some disadvantages:

  1. Nothing guarantees that the subfunctions "belong" to do_lots_of_stuff, so nothing stops someone from accidentally calling them from the wrong place.
  2. A module's complexity grows roughly quadratically with the number of functions we add to it, because every new function adds more possible ways for things to call each other.
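One way to read the second point: if any function in a module may call any other, then n functions allow n(n-1) distinct caller/callee pairs. The small Python sketch below only counts potential calls under that assumption; it is not a measurement of real complexity, and possible_calls is a name made up for this example.

```python
from itertools import permutations


def possible_calls(function_names):
    """All ordered (caller, callee) pairs between distinct functions."""
    return list(permutations(function_names, 2))


# Adding functions grows the number of potential call pairs as n * (n - 1).
for n in (2, 4, 8):
    names = [f"subpart_{i}" for i in range(1, n + 1)]
    print(n, len(possible_calls(names)))
```

Restricting which functions are visible to which callers shrinks this set, which is the motivation behind the first point as well.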

Therefore:

Are there useful conventions or coding styles that help me balance the pros and cons of function decomposition, or should I just use an editor with code folding and call it a day?


EDIT: This problem also applies to functional code (although less pressingly). For example, in a functional setting the subparts would return values that are combined at the end, and the decomposition problem remains: there are still lots of subfunctions that are able to call each other.

We can't always assume that the problem domain can be modeled with a few small, simple types and a handful of highly orthogonal functions. There will always be complicated algorithms or long lists of business rules that we still need to handle correctly.

function do_lots_of_stuff(){
   p1 = subpart_1()
   p2 = subpart_2()
   pN = subpart_N()
   return assembleStuff(p1, p2, ..., pN)
}

Best Answer

Keep each function as simple as possible.

Think of it in simple terms, the way a function is meant to be:

  • gets 0 to N inputs,
  • returns 0 or 1 result (possibly a composite or a collection),
  • and isn't tied to state.

    Programming functions should aim to be equivalent to mathematical functions

When you stick to functional programming idioms, most of the questions you're asking yourself go away. Sure, your class will get bigger in terms of number of functions. But if the methods are not tied to each other by internal state changes, they become easier to understand, manage, and compose to achieve an end result.
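A rough Python sketch of the difference, using an invented shopping-cart example (none of these names come from the question). In the first version the methods communicate through shared state, so the result depends on call order; in the second, each function depends only on its arguments and the pieces compose directly.

```python
class StatefulCart:
    """Methods are coupled through self.total, so call order matters."""

    def __init__(self):
        self.total = 0.0

    def add_items(self, prices):
        self.total += sum(prices)

    def apply_discount(self, rate):
        self.total *= (1 - rate)


# Stateless alternative: each function is a plain input -> output mapping.
def subtotal(prices):
    return sum(prices)


def apply_discount(amount, rate):
    return amount * (1 - rate)


def checkout(prices, rate):
    # Composition: the output of one function feeds the next.
    return apply_discount(subtotal(prices), rate)
```

With StatefulCart, calling apply_discount before add_items silently gives a different total, whereas checkout has only one possible reading.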

[Image: function composition]

Also, try to give them appropriate access levels based on the above design decisions. Helpers can usually be declared static (and if they don't need to be private, or can be reused, they could be extracted to a helper class), which gives a strong hint to other developers: this thing is meant to be independent and side-effect free.
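As a sketch of what this can look like in Python (the InvoicePrinter class and its methods are hypothetical): helpers that need no instance state are marked with @staticmethod, and a leading underscore signals that they are internal. Note that in Python the underscore is only a convention; languages with a private keyword can enforce it.

```python
class InvoicePrinter:
    """Hypothetical class; only render() is meant to be called from outside."""

    def __init__(self, currency):
        self.currency = currency

    def render(self, amounts):
        total = self._total(amounts)
        return f"{self._format_amount(total)} {self.currency}"

    @staticmethod
    def _total(amounts):
        # No access to self: cannot read or modify instance state.
        return sum(amounts)

    @staticmethod
    def _format_amount(amount):
        return f"{amount:.2f}"
```

Because the static helpers cannot touch self, a reader knows at a glance that they are independent of the object's state.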

Repeat the following mantras to aim for purity:

  • My function shall be:
    • short,
    • side-effect free,
    • doing one thing and one thing only.
  • My function shall be strict on output. [1]
  • My function shall be testable, and tested.
  • My function shall be readable and read like a natural language expression.
  • My function shall be documented. [2]
  • My function shall be null-hostile.

[1] Whether it shall be strict or lenient on input depends on whether it's consumer code or library code.
[2] Self-documentation counts, comments for tricky parts count as well.
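A small Python sketch of a function that tries to follow several of these mantras at once. The function normalize_tags is invented for illustration, and reading "strict on output" as "always returns the same well-defined shape, never None" is an assumption on my part.

```python
def normalize_tags(tags):
    """Return the distinct, non-empty tags, lowercased and sorted.

    Null-hostile: rejects None instead of guessing what it means.
    Strict on output: always a sorted list of strings, never None.
    """
    if tags is None:
        raise ValueError("tags must not be None")
    cleaned = set()
    for tag in tags:
        if tag is None:
            raise ValueError("tags must not contain None")
        stripped = tag.strip().lower()
        if stripped:
            cleaned.add(stripped)
    return sorted(cleaned)
```

The function is short, has no side effects (the input is not modified), and is easy to test because its result depends only on its argument.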


Of course, if you are in a generally non-FP-oriented code base, you won't manage to avoid shared mutable state forever, but it's a very good, sensible, no-BS guideline to follow. Even if you do get it wrong by over-modularizing and overcomplicating your class, it will still be easier to pick up from there and refactor again than from a giant dump of code with high complexity and tight coupling.

As for the rules governing how your functions compose: these are your business rules. They are dictated by what you want to achieve; there's no automagical way of determining them for you.
