When there’s no TCO, when to worry about blowing the stack

recursion, stack, stackoverflow, tail-call

Every single time there's a discussion about a new programming language targeting the JVM, there are inevitably people saying things like:

"The JVM doesn't support tail-call optimization, so I predict lots of exploding stacks"

There are thousands of variations on that theme.

Now I know that some languages, like Clojure for example, have a special recur construct that you can use.

What I don't understand is: how serious is the lack of tail-call optimization? When should I worry about it?

My main source of confusion probably comes from the fact that Java is one of the most successful languages ever, and quite a few of the JVM languages seem to be doing fairly well. How is that possible if the lack of TCO is really of any concern?

Best Answer

Consider this: let's say we got rid of all loops in Java (the compiler writers are on strike or something). Now we want to write factorial, so we might write something like this

int factorial(int i) { return factorial(i, 1); }

// Tail-recursive helper: the recursive call is the last thing the method does,
// but without TCO the JVM still pushes a new stack frame for every call.
int factorial(int i, int accum) {
  if (i == 0) return accum;
  return factorial(i - 1, accum * i);
}

Now we're feeling pretty clever: we've managed to write our factorial even without loops! But when we test it, we notice that with any reasonably large argument we get a StackOverflowError, since there's no TCO.

In real Java this isn't a problem. If we ever have a tail-recursive algorithm, we can transform it into a loop and be just fine, as in the sketch below. However, what about languages with no loops? Then you're just hosed. That's why Clojure has its recur form; without it, the language wouldn't even be Turing complete (no way to write an infinite loop).
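For instance, the manual transformation might look like this. This is my own loop version of the factorial above, not anything Java does for you:

// The same accumulator-passing factorial, hand-converted into a loop.
// The local variable plays the role the accumulator parameter played in the
// recursive version, so the stack stays one frame deep no matter how big i is.
int factorial(int i) {
  int accum = 1;
  while (i > 0) {
    accum *= i;
    i--;
  }
  return accum;
}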

The functional languages that target the JVM (Frege, Kawa (Scheme), Clojure) all have to deal with the lack of tail calls, because in these languages tail calls are the idiomatic way of writing loops! Translated to Scheme, the factorial above would be a perfectly good factorial. It would be awfully inconvenient if looping 5000 times made your program crash. This can be worked around, though, with recur special forms, annotations hinting at optimizing self calls, trampolining (sketched below), whatever. But these workarounds all impose either performance hits or unnecessary work on the programmer.
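To make the trampolining idea concrete, here is a minimal sketch in Java. The Bounce/run names are mine, not any library's API; the point is just that each step returns a description of the next tail call, and a plain loop drives the steps so the stack never grows.

import java.math.BigInteger;
import java.util.function.Supplier;

// Minimal trampoline: a step is either a finished result or a thunk for the next step.
final class Bounce<T> {
  final T result;                 // set when the computation is done
  final Supplier<Bounce<T>> next; // set when there is another step to run

  private Bounce(T result, Supplier<Bounce<T>> next) { this.result = result; this.next = next; }
  static <T> Bounce<T> done(T result) { return new Bounce<>(result, null); }
  static <T> Bounce<T> more(Supplier<Bounce<T>> next) { return new Bounce<>(null, next); }

  // The loop that stands in for the missing TCO: runs steps until one is done.
  static <T> T run(Bounce<T> b) {
    while (b.next != null) b = b.next.get();
    return b.result;
  }
}

class TrampolinedFactorial {
  // Instead of making the tail call, return a thunk describing it.
  static Bounce<BigInteger> fact(int i, BigInteger accum) {
    if (i == 0) return Bounce.done(accum);
    return Bounce.more(() -> fact(i - 1, accum.multiply(BigInteger.valueOf(i))));
  }

  public static void main(String[] args) {
    // A depth that would blow the stack with direct recursion runs fine here.
    System.out.println(Bounce.run(fact(100_000, BigInteger.ONE)).bitLength() + " bits");
  }
}

The cost is visible, too: every step allocates a thunk, which is exactly the kind of performance hit mentioned above.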

Now Java doesn't get off free either, since there's more to TCO than just self-recursion: what about mutually recursive functions? They can't be straightforwardly translated to loops, but they are still unoptimized by the JVM. This makes it spectacularly unpleasant to write algorithms using mutual recursion in Java, since if you want decent performance/range you have to do dark magic to fit the algorithm into loops.
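The textbook illustration (my example, not from the question) is the even/odd pair: each method's tail call is to the other, so neither can be turned into a loop on its own, and every call still costs a stack frame.

class MutualRecursion {
  // Each method tail-calls the other; the JVM keeps a frame for every call.
  static boolean isEven(long n) { return n == 0 || isOdd(n - 1); }
  static boolean isOdd(long n)  { return n != 0 && isEven(n - 1); }

  public static void main(String[] args) {
    System.out.println(isEven(10));        // fine
    System.out.println(isEven(1_000_000)); // StackOverflowError: roughly a million frames
  }
}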

So, in summary, this isn't a huge deal in many cases. Most tail calls either only go one stack frame deep, with things like

return foo(bar, baz); // foo is just a simple method

or are simple self-recursion, which can be rewritten as a loop. However, for the class of tail calls that doesn't fit either pattern, every JVM language feels the pain.

However, there is a decent reason why we don't yet have TCO. The JVM gives us stack traces. With TCO we systematically eliminate stack frames that we know are "doomed", but the JVM might actually want those frames later for a stack trace! Say we implement an FSM like the one below, where each state tail-calls the next. Eliminating the frames would erase all record of previous states, so a stack trace would show us the current state, but nothing about how we got there.
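Here is a toy illustration of that kind of state machine (my own sketch): each state method hands off to the next state as its last action.

class TrafficLight {
  // Each state finishes by tail-calling the next state. With TCO those
  // intermediate frames would be gone, so a stack trace taken inside red()
  // would no longer show that we came through green() and yellow().
  static void green(int steps)  { if (steps > 0) yellow(steps - 1); }
  static void yellow(int steps) { if (steps > 0) red(steps - 1); }
  static void red(int steps)    { if (steps > 0) green(steps - 1); }

  public static void main(String[] args) {
    green(30); // without TCO, every transition keeps its frame on the stack
  }
}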

Additionally, and more pressingly, much of bytecode verification is stack based, and eliminating the thing that lets us verify bytecode is not a pleasant prospect. Between this and the fact that Java has loops, TCO looks like a bit more trouble than it's worth to the JVM engineers.