When decomposing a large function, how can I avoid the complexity from the extra subfunctions?

Tags: design-patterns, language-agnostic, programming-practices

Say I have a large function like the following:

function do_lots_of_stuff(){

    { //subpart 1
      ...
    }

    ...

    { //subpart N
      ...
    }
}

A common pattern is to decompose it into subfunctions:

function do_lots_of_stuff(){
    subpart_1(...)
    subpart_2(...)
    ...
    subpart_N(...)
}

I usually find that decomposition has two main advantages:

- The decomposed function becomes much smaller. This can help people read it without getting lost in the details.
- Parameters have to be explicitly passed to the underlying subfunctions, instead of being implicitly available by just being in scope. This can help readability and modularity in some situations.
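
As a minimal sketch of the second point, assume a hypothetical order-processing task (the names `computeSubtotal`, `applyDiscount`, and `formatReceipt` are illustrative and not part of the question). Once the subparts are separate functions, every dependency shows up in a call signature instead of hiding in the enclosing scope:

```javascript
// Hypothetical subparts: each receives everything it needs as a parameter.
function computeSubtotal(items) {
  return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}

function applyDiscount(subtotal, discountRate) {
  return subtotal * (1 - discountRate);
}

function formatReceipt(customerName, total) {
  return `${customerName}: ${total.toFixed(2)}`;
}

// The top-level function now reads as a summary of the steps.
function doLotsOfStuff(order) {
  const subtotal = computeSubtotal(order.items);
  const total = applyDiscount(subtotal, order.discountRate);
  return formatReceipt(order.customerName, total);
}

const receipt = doLotsOfStuff({
  customerName: "Ada",
  discountRate: 0.5,
  items: [{ price: 10, quantity: 2 }, { price: 5, quantity: 4 }],
});
console.log(receipt);
```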

However, I also find that decomposition has some disadvantages:

- There are no guarantees that the subfunctions "belong" to do_lots_of_stuff, so there is nothing stopping someone from accidentally calling them from the wrong place.
- A module's potential complexity grows roughly quadratically with the number of functions we add to it: with n functions there are up to n² possible caller/callee pairs, so there are more ways for things to call each other.

Therefore:

Are there useful conventions or coding styles that help me balance the pros and cons of function decomposition, or should I just use an editor with code folding and call it a day?

EDIT: This problem also applies to functional code (although less pressingly). For example, in a functional setting the subparts would return values that are combined at the end, and the decomposition problem of many subfunctions being able to call each other is still present.

We can't always assume that the problem domain can be modeled with a few small, simple types and a handful of highly orthogonal functions. There will always be complicated algorithms or long lists of business rules that we still need to handle correctly.

function do_lots_of_stuff(){
   p1 = subpart_1()
   p2 = subpart_2()
   ...
   pN = subpart_N()
   return assembleStuff(p1, p2, ..., pN)
}
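
A runnable sketch of that functional shape might look like the following. The concrete subparts (trimming, counting words, upper-casing) are placeholders chosen only for illustration; the point is that each one returns a value and the parent merely assembles the results:

```javascript
// Hypothetical pure subparts: each returns a value and touches no shared state.
const subpart1 = (text) => text.trim();
const subpart2 = (text) => text.split(/\s+/).length;
const subpartN = (text) => text.toUpperCase();

// Combines the partial results into one value.
const assembleStuff = (trimmed, wordCount, shouted) => ({ trimmed, wordCount, shouted });

function doLotsOfStuff(input) {
  const p1 = subpart1(input);
  const p2 = subpart2(p1);
  const pN = subpartN(p1);
  return assembleStuff(p1, p2, pN);
}

const result = doLotsOfStuff("  hello big world ");
console.log(result);
```

Note that nothing here prevents `subpart2` from calling `subpartN` directly, which is the concern raised in the edit above.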

Best Answer

Keep each function as simple as possible.

Think of a function in simple terms, the way it is meant to be:

- it gets 0 to N inputs,
- it returns 0 or 1 result (possibly a composite or a collection),
- and it isn't tied to state.

When you stick to functional programming idioms, most of the questions you're asking yourself go away. Sure, your class will get bigger in terms of the number of functions. But if the methods are not tied to each other by internal state changes, they become easier to understand, manage, and compose to achieve an end result.

Function composition
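
To illustrate composing small, stateless functions, here is a minimal sketch. The `pipe` helper and the three steps are hypothetical names invented for this example:

```javascript
// Left-to-right composition helper (hypothetical name `pipe`).
const pipe = (...fns) => (input) => fns.reduce((value, fn) => fn(value), input);

// Three small functions with no shared state.
const parseNumbers = (csv) => csv.split(",").map(Number);
const keepPositive = (numbers) => numbers.filter((n) => n > 0);
const sum = (numbers) => numbers.reduce((total, n) => total + n, 0);

// The end result is built by composing them, not by sharing variables.
const sumOfPositives = pipe(parseNumbers, keepPositive, sum);
const total = sumOfPositives("3,-1,4,-5,9");
console.log(total);
```

Because each step depends only on its argument, any of them can be tested or reused on its own.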

Also, try to give them appropriate access levels based on the above design decisions. Helpers can usually be declared static (and if they don't need to be private, or can be reused, they could be extracted to a helper class), which gives a strong hint to other developers: this thing is meant to be independent and side-effect free.
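
One way this could look, sketched with a hypothetical `InvoiceCalculator` class (the class and method names are assumptions made for illustration):

```javascript
class InvoiceCalculator {
  // Private static helper: no `this`, no instance state, hidden from outside.
  static #roundToCents(amount) {
    return Math.round(amount * 100) / 100;
  }

  // Public static helper: pure and reusable, so it is a candidate for
  // extraction into a shared helper class later.
  static addTax(net, taxRate) {
    return InvoiceCalculator.#roundToCents(net * (1 + taxRate));
  }

  constructor(taxRate) {
    this.taxRate = taxRate;
  }

  // Instance method: the only place that reads instance state.
  total(netAmounts) {
    const net = netAmounts.reduce((sum, amount) => sum + amount, 0);
    return InvoiceCalculator.addTax(net, this.taxRate);
  }
}

const invoiceTotal = new InvoiceCalculator(0.25).total([10, 30]);
console.log(invoiceTotal);
```

The private helper also addresses part of the "called from the wrong place" worry, since code outside the class cannot reach it.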

Repeat the following mantras to aim for purity:

My function shall be:
- short,
- side-effect free,
- doing one thing and one thing only.
My function shall be strict on output. [1]
My function shall be testable, and tested.
My function shall be readable and read like a natural language expression.
My function shall be documented. [2]
My function shall be null-hostile.

[1] Whether it shall be strict or lenient on input depends on whether it's consumer code or library code.
[2] Self-documentation counts; comments for tricky parts count as well.
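
As a rough sketch of a function written with these mantras in mind, consider a hypothetical `average` helper. The name and the choice to throw on bad input are assumptions for this example, not a prescription:

```javascript
/**
 * Returns the arithmetic mean of a non-empty array of finite numbers.
 * Hypothetical example; the error policy shown here is an assumption.
 */
function average(values) {
  // Null-hostile: reject missing or malformed input up front.
  if (!Array.isArray(values) || values.length === 0) {
    throw new TypeError("average expects a non-empty array");
  }
  if (!values.every((value) => Number.isFinite(value))) {
    throw new TypeError("average expects only finite numbers");
  }
  // Strict on output: always a number, never null or undefined.
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

console.log(average([1, 2, 3]));
```

It is short, has no side effects, does one thing, and is trivial to test in isolation.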

Of course, if you are in a generally non-FP-oriented code base, you won't manage to avoid shared mutable state forever, but it's a very good, sensible, no-BS guideline to follow. Even if you do get it wrong by over-modularizing and overcomplicating your class, it'll still be easier to pick up from there and refactor again than from a giant dump of code with high complexity and tight coupling.

As for the rules governing how your functions compose, these are your business rules. They are dictated by what you want to achieve; there's no automagical way of determining them for you.

A continuation is the type of its inputs and outputs

The closest thing you will find to a non-procedure-based continuation is likely the continuation monad in Haskell, which is expressed as a type; many functions can then interact with that type to interrupt, resume, backtrack, and so on.

You can encapsulate that closure in a type such as the Cont type in Haskell, where you get the monad abstraction as a "higher-level abstraction". Other forms of abstraction over continuations become available when you look at the continuation as a type instead of simply a procedure. For instance:

- You can take two continuations and choose between them as alternatives if the type obeys the monoid laws.
- You can abstract over the type to change the input or output types of the continuation if you encapsulate the closure in a type that abides by the functor laws.
- You can arbitrarily and partially apply or decorate your continuation with functionality such as input validation or input conversion if you encapsulate the closure in a type that follows the applicative functor laws.
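
To make the "continuation as a type" idea more concrete outside Haskell, here is a small sketch of a hypothetical `Cont` class. It is loosely modelled on Haskell's Cont, is not a complete or law-checked implementation, and only shows functor-style `map` and monad-style `chain`:

```javascript
// A continuation-passing computation wrapped in a type (hypothetical class).
class Cont {
  // `run` takes a continuation `k` and eventually calls it with a result.
  constructor(run) {
    this.run = run;
  }

  // Wraps a plain value: the continuation is called with it immediately.
  static of(value) {
    return new Cont((k) => k(value));
  }

  // Functor-style map: transform the value handed to the continuation.
  map(fn) {
    return new Cont((k) => this.run((value) => k(fn(value))));
  }

  // Monad-style chain: sequence a step that itself returns a Cont.
  chain(fn) {
    return new Cont((k) => this.run((value) => fn(value).run(k)));
  }
}

const computation = Cont.of(20)
  .map((n) => n + 1)
  .chain((n) => Cont.of(n * 2));

const answer = computation.run((n) => `result: ${n}`);
console.log(answer);
```

Nothing runs until `run` is given a final continuation, so the wrapped value can be transformed and combined before it is ever executed.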

Closure vs. Procedure

At the end of the day you're basically right: a continuation is a "procedure", though I would rather refer to it as a closure. Continuations are often best expressed as first-class closures that have enclosed a bound environment. In a pure functional language you might say this is not particularly reasonable because you lack references. That is true, but you can enclose values, and single assignment makes enclosing the value and enclosing the reference exactly the same thing. In Haskell, this gives rise to:

\x -> \y -> insideYIcanAccess x y  -- the inner lambda can access x as well as y

A language that lacks the ability to enclose a binding environment may technically lack first-class closures, but even then there is some environment (generally the global one) which is available to the closure.

So I would say it's more accurate to describe a continuation as: A closure being used in a particular way.
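
A minimal sketch of "a closure being used in a particular way", written in continuation-passing style. The functions `divide` and `describeRatio` are hypothetical names for this illustration:

```javascript
// Continuation-passing style: instead of deciding what happens next itself,
// `divide` hands its result to one of two continuations supplied by the caller.
function divide(numerator, denominator, onSuccess, onError) {
  if (denominator === 0) {
    return onError("division by zero");
  }
  return onSuccess(numerator / denominator);
}

function describeRatio(label, numerator, denominator) {
  // Both continuations are closures: they capture `label` from this scope.
  return divide(
    numerator,
    denominator,
    (value) => `${label} = ${value}`,
    (reason) => `${label} failed: ${reason}`
  );
}

console.log(describeRatio("ratio", 6, 3));
console.log(describeRatio("ratio", 1, 0));
```

The arrow functions are ordinary closures; what makes them continuations is only that they represent "the rest of the computation" for `divide`.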

Conclusion

To the question "Is a continuation implementable in any way other than a procedure?": no. If you don't have first-class functions, you really can't have continuations as such (yes, function pointers count as first-class functions, so arbitrary memory access can suffice as an alternative).

Now to the question "Are there any ways to express a continuation in a more abstract way than a procedure?": expressing it as a type gives you a much greater abstraction, allowing you to treat the continuation in very general ways, so that you can interact with it in many more ways than just executing it.
