Code Quality – Does Simplicity Always Improve Readability?

code-quality readability

Recently, I was developing a set of coding standards for our company. (We're a new team branching out into a new language for the company.)

On my first draft, I set the purpose of our coding standards as improving Readability, Maintainability, Reliability, and Performance. (I ignored writability, portability, cost, compatibility with previous standards, etc.)

One of my goals while writing this document was to push through the idea of simplicity of code. The idea was that there should be only one function call or operation per line. My hope was that this would increase readability. It's an idea that I carried over from our previous language.

However, I've questioned the assumption behind this push:

Does simplicity always improve readability?

Is there a case where writing simpler code decreases readability?

To be clear, by "simpler" I don't mean "easier to write"; I mean less going on per line.
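To make the two styles concrete, here is a minimal sketch in C++ (chosen only for illustration, since the question does not name the language). The function names and the distance computation are hypothetical; both functions perform the same arithmetic, so the only difference is how much happens per line.

```cpp
#include <cmath>

// Style A: one operation or call per line, every intermediate value named.
double distance_one_op_per_line(double x1, double y1, double x2, double y2) {
    const double dx = x2 - x1;
    const double dy = y2 - y1;
    const double dx_squared = dx * dx;
    const double dy_squared = dy * dy;
    const double sum = dx_squared + dy_squared;
    const double result = std::sqrt(sum);
    return result;
}

// Style B: the same computation written as a single expression.
double distance_composed(double x1, double y1, double x2, double y2) {
    return std::sqrt((x2 - x1) * (x2 - x1) + (y2 - y1) * (y2 - y1));
}
```

Style A has less going on per line, but the reader has to track six names to see the formula; Style B shows the formula at a glance but packs several operations into one line. Which one reads better is exactly what the question is asking.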

Best Answer

"Simple" is an overused word. "Readable" can profitably be defined as "simple to understand", in which case increasing (this measure of) simplicity by definition increases readability, but I don't think this is what you mean. I've written about this elsewhere, but generally something can be called "simpler" either by being more abstract (in which case fewer concepts can express more phenomena) or by being more concrete (in which case a concept does not require as much background knowledge to understand in the first place). I'm arguing that, depending on perspective, a more abstract concept can reasonably be called simpler than a more concrete concept, or vice versa. This, even though "abstract" and "concrete" are antonyms.

I'll use as an example some Haskell code I wrote a while ago. I asked a question on Stack Overflow about using the List monad to calculate a counter in which each digit could have a different base. My eventual solution (not knowing much Haskell) looked like:

count :: [Integer] -> [[Integer]]
count [] = [[]]
count (x:xs) =
  -- get all possible sequences for the remaining digits
  let
    remDigits :: [[Integer]]
    remDigits = count xs
  in
  -- pull out a possible sequence for the remaining digits
  do nextDigits <- remDigits
     -- pull out all possible values for the current digit
     y <- [0..x]
     -- record that "current digit" : "remaining digits" is
     -- a valid output.
     return (y:nextDigits)

One of the answers reduced this to:

count = mapM (enumFromTo 0)

Which of these is "simpler" to understand (i.e. more readable) depends entirely on how comfortable the reader has become with (abstract) monadic operations (and, for that matter, point-free code). A reader who's very comfortable with these abstract concepts will prefer to read the (short) more abstract version, while one who is not comfortable with those operations will prefer to read the (long) more concrete version. There is no one answer about which version is more readable that will hold for everybody.

Related Solutions

S-expressions readability

How do you parse this?

if (a > b && foo(param)) {
  doSomething();
} else {
  doSomethingElse();
}

The parse tree probably looks something like

if:
  condition:
    and:
      gt:
        left: a
        right: b
      function:
        name: foo
        param: param
  true-block:
    function:
      name: doSomething
  false-block:
    function:
      name: doSomethingElse

hmm... let's serialize this tree into a list, prefix notation

if(and(>(a, b), function(foo, param)), function(doSomething), function(doSomethingElse))

This parse tree format is pretty easy to manipulate, but I have one problem. I hate separators. I like terminators. At the same time, I like sprinkling in whitespace.

if( and (>(a b) function(foo param)) function (doSomething) function ( doSomethingElse))

hmm... the additional whitespace makes certain things harder to parse... Maybe I could just make a rule that the tree is represented as (root leaf leaf leaf).

(if (and (> a b) (function foo param)) (function doSomething) (function doSomethingElse))

Now my serialization of a parse tree is lisp (rename function to apply, and this probably runs). If I want programs that write programs, it's kind of nice to just manipulate parse trees.

This isn't entirely how s-expressions came about, but it was identified early, and it is one feature that lisp programmers use. Our programs are pre-parsed in some sense, and writing programs to manipulate programs is fairly easy because of the format. That's why the lack of syntax is sometimes considered a strength.

But as David said, use an s-expression-aware editor. You are more likely to lose track of a closing paren in an s-expression than of a closing tag in XML (</foo> only closes <foo>, but a right paren closes ANY s-expression). In Racket, the use of square brackets for some expressions, coupled with good indenting style, fixes most problems.

The lisp version:

(if (and (> a b) (foo param))
  (doSomething)
  (doSomethingElse))

Not too bad.
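The serialization step described above can be sketched in a few lines of C++. `Node`, `to_sexpr`, and `example_tree` are hypothetical names invented for this illustration, and the tree keeps the intermediate `function` wrapper rather than the final Lisp form. The point is only that once every node is written as (root leaf leaf ...), printing the tree is a short recursive function.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal parse-tree node: a label plus zero or more children.
struct Node {
    std::string label;
    std::vector<Node> children;
};

// Serialize a tree as (root leaf leaf ...).
// A node without children is written as a bare atom.
std::string to_sexpr(const Node& node) {
    if (node.children.empty()) {
        return node.label;
    }
    std::string out = "(" + node.label;
    for (const Node& child : node.children) {
        out += " " + to_sexpr(child);
    }
    out += ")";
    return out;
}

// Tree for: if (a > b && foo(param)) doSomething(); else doSomethingElse();
Node example_tree() {
    return Node{"if", {
        Node{"and", {
            Node{">", {Node{"a", {}}, Node{"b", {}}}},
            Node{"function", {Node{"foo", {}}, Node{"param", {}}}},
        }},
        Node{"function", {Node{"doSomething", {}}}},
        Node{"function", {Node{"doSomethingElse", {}}}},
    }};
}
```

Calling `to_sexpr(example_tree())` yields `(if (and (> a b) (function foo param)) (function doSomething) (function doSomethingElse))`. Because the text form mirrors the tree one-to-one, a program that rewrites the tree is also a program that rewrites the source.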

C++ – Using Lambdas to Improve Function Readability

Does it improve readability?

Your way of using lambdas to break down a larger function into smaller parts is similar to the nested functions in Pascal, Ada and other languages.

It does improve the readability of the main part of your function body: there are fewer statements to read to understand what it does. This is the main purpose of nested functions. Of course, I assume that nowadays most programmers are familiar with the syntax of lambdas.

However, is it a good idea?

Scott Meyers, in his book Effective Modern C++, warns against the use of default capture modes in lambdas. His main worry is dangling references (e.g. if a lambda is defined in a block and used outside that block's scope, when a captured variable no longer exists), which seems not to be an issue in your case.

But he also underlines another problem: the illusion of having a self-contained function. And here lies the major weakness of your approach:

you have the impression that your lambda is self-contained, but in fact it depends completely on the rest of the code, and you cannot easily see in the lambda where the captured values come from, which assumptions you can make about them, etc.
as the link with the main body is based on the captured variables, which can be read or written, it is very difficult to guess all the side effects hidden in a lambda invocation that could influence the main part.
so it's very difficult to identify assumptions and invariants in the code, both of the lambda and of your mega function.
in addition, you could accidentally change a variable that you forgot to declare locally in your lambda, if one with the same name happens to exist in the enclosing function.

First advice: at least, enumerate explicitly the variables captured by your lambda, in order to better control the potential side effects.
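As a minimal sketch of what explicit enumeration buys you (the function `sum_above` and its variables are hypothetical, not taken from your code): the capture list documents which outer variables the lambda reads and which one it is allowed to modify.

```cpp
#include <vector>

// Hypothetical example: sum the values that exceed a threshold.
int sum_above(const std::vector<int>& values, int threshold) {
    int total = 0;

    // With a default capture, nothing in the header tells the reader which
    // outer variables the lambda touches:
    //   auto add_if_large = [&](int v) { if (v > threshold) total += v; };

    // With an explicit list, the dependencies are visible up front:
    // threshold is copied (read-only inside the lambda), and total is
    // captured by reference (the one intended side effect).
    auto add_if_large = [threshold, &total](int v) {
        if (v > threshold) {
            total += v;
        }
        // ++threshold;  // would not compile: by-value captures are const
        //               // unless the lambda is declared mutable
    };

    for (int v : values) {
        add_if_large(v);
    }
    return total;
}
```

For example, `sum_above({1, 5, 10}, 4)` returns 15. An accidental write to a variable captured by value becomes a compile error instead of a silent change to the enclosing function's state.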

Second advice: once this works, you could think of strengthening your structure further by evolving from capture to parameter passing. If there are too many parameters, you'd have to refactor. One approach could be to make your function a callable class, promoting your throwaway lambdas to member functions and making the variables used throughout the computation member variables. But it's difficult to say whether that's the best option from the elements you gave.
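Both steps might look roughly like this. It is only a sketch under assumptions: the names `add_if_large`, `sum_above_with_params`, and `ThresholdSummer` are hypothetical, and your real function will have far more state than one threshold and one running total.

```cpp
#include <vector>

// Step 1: parameters instead of captures. The helper is now a pure
// function; everything it depends on appears in its signature.
int add_if_large(int total, int value, int threshold) {
    return value > threshold ? total + value : total;
}

int sum_above_with_params(const std::vector<int>& values, int threshold) {
    int total = 0;
    for (int v : values) {
        total = add_if_large(total, v, threshold);
    }
    return total;
}

// Step 2: a callable class. The former locals become named members and
// the former lambda becomes a private member function.
class ThresholdSummer {
public:
    explicit ThresholdSummer(int threshold) : threshold_(threshold) {}

    // Entry point that replaces the original large function.
    int operator()(const std::vector<int>& values) {
        total_ = 0;  // reset state so the object can be reused
        for (int v : values) {
            accumulate(v);
        }
        return total_;
    }

private:
    // Formerly a throwaway lambda; it can only touch its parameter
    // and the members declared below.
    void accumulate(int v) {
        if (v > threshold_) {
            total_ += v;
        }
    }

    int threshold_;
    int total_ = 0;
};
```

Usage would be `ThresholdSummer summer(4); int result = summer({1, 5, 10});`, which gives the same 15 as `sum_above_with_params({1, 5, 10}, 4)`. The class version trades captures for shared members, so it only pays off when several helpers really need the same state.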

And why are you in such a situation?

The next thing to think about is why you have such a big function in the first place. If you follow Uncle Bob's advice given in his book Clean Code (summary of the function topic on this blog page), you should have:

small functions,
that do one thing (single responsibility),
and that do only things at one level of abstraction