Why don’t “multi-infinite” list comprehensions work with lazy evaluation

complexity, functional programming, haskell

As a simple demonstration of the efficiency of Haskell style, I thoughtlessly ran the following:

take 100 [(a, b, c) | a <- [1..], b <- [1..], c <- [1..], a^2 + b^2 == c^2]

This should be a way of deriving the first 100 Pythagorean triples, with duplicates. In practice, however, it never halts, because the order in which the comprehension enumerates candidates defeats lazy evaluation.

To see why in terms of the actual implementation, the following is roughly how the list comprehension is evaluated, written in an imperative style:

results = []
for (a = 1; a < ∞; a++) {
  for (b = 1; b < ∞; b++) {
    for (c = 1; c < ∞; c++) {
      if (a^2 + b^2 == c^2) {
        results.append((a, b, c))
      }
    }
  }
}

When written like this, it becomes obvious that the function can never yield a result: infinite time will be spent testing whether 1^2 + 1^2 == c^2, because only the innermost for loop ever advances, while a and b remain stuck at 1.
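For reference, this matches the standard desugaring of the comprehension into nested concatMaps. The sketch below follows the Haskell Report's translation scheme and is illustrative rather than GHC's literal output:

triples :: [(Integer, Integer, Integer)]
triples =
  concatMap (\a ->
    concatMap (\b ->
      concatMap (\c ->
        if a^2 + b^2 == c^2 then [(a, b, c)] else [])
        [1..])
      [1..])
    [1..]

-- The innermost concatMap over c never yields an element and never finishes,
-- so evaluation never moves on to b = 2, let alone a = 2.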

The common solution in this particular case is to constrain the values of the smallest two variables to that of the largest:

take 100 [(a, b, c) | c <- [1..], a <- [1..c], b <- [1..c], a^2 + b^2 == c^2]
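Because a and b now range over the finite list [1..c], each value of c is searched in finite time and results stream out lazily. As a quick check, a sketch of a GHCi session (result order follows the generator order above):

take 4 [(a, b, c) | c <- [1..], a <- [1..c], b <- [1..c], a^2 + b^2 == c^2]
-- [(3,4,5),(4,3,5),(6,8,10),(8,6,10)]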

However, this seems like an obvious oversight on the part of the language's implementers. When you think about it, any list comprehension with more than one infinite generator will fail to halt for the same reason, except in the cases where tuples of the form (1, 1, x) happen to be the useful ones. There are existing questions discussing this problem, but most of them address specific cases rather than the problem in general. Why isn't it trivial to fix this within the language by using a different iteration pattern?

Best Answer

It is certainly possible to define an iteration order for x1 <- [1..], x2 <- [1..], ..., xK <- [1..] that reaches any tuple (c1, c2, ..., cK) after a finite number of steps and tries each tuple only once. Take any computable bijection between the natural numbers N and the k-tuples of natural numbers, N^k. Such a bijection exists, because a finite Cartesian product of countable sets is again countable. One example is to repeatedly apply the (inverse of the) Cantor pairing function. The precise bijection used would have to be specified by the Haskell report, because it affects the order in which results are found.
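As an illustrative sketch of one such bijection, built from the inverse of the Cantor pairing function (the names unpair, isqrt, toTriple and allTriples are made up for this example, not part of any standard library):

-- Inverse of the Cantor pairing pi(x, y) = (x + y) * (x + y + 1) `div` 2 + y.
-- Every natural number decodes to exactly one pair of naturals.
unpair :: Integer -> (Integer, Integer)
unpair z = (w - y, y)
  where
    w = (isqrt (8 * z + 1) - 1) `div` 2  -- index of the anti-diagonal
    t = w * (w + 1) `div` 2              -- code of the first pair on that diagonal
    y = z - t

-- Integer square root via Newton's method, avoiding floating-point error
-- on large inputs.
isqrt :: Integer -> Integer
isqrt n = go n
  where
    go x
      | x * x <= n = x
      | otherwise  = go ((x + n `div` x) `div` 2)

-- Decode one natural number into a triple by unpairing twice, then shift
-- each component so it starts at 1, matching [1..].
toTriple :: Integer -> (Integer, Integer, Integer)
toTriple n = (a + 1, b + 1, c + 1)
  where
    (m, c) = unpair n
    (a, b) = unpair m

-- Every (a, b, c) with a, b, c >= 1 appears exactly once in this list.
allTriples :: [(Integer, Integer, Integer)]
allTriples = map toTriple [0..]

-- take 10 (filter (\(a, b, c) -> a^2 + b^2 == c^2) allTriples) halts,
-- because every triple sits at some finite index.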

However, this is of limited utility, because it would only cover the very specific pattern with [1..]. There are countless other ways to produce infinite lists, and not only is it far more involved to define a suitable iteration order for those cases, it is also impossible for the compiler to detect infinite lists in all cases (Rice's theorem). So you'd have to define some heuristic for when this transformation applies. In the interest of being feasible to describe and implement across multiple interpreters and compilers, this would likely be a very simplistic heuristic.

So what you are left with is a subtle change to the semantics of a program, with no indication in the program source code, that only happens when some arcane and limited heuristic fires. Thanks, I'll pass. Just explicitly write the program to do what you want.
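For the original example, explicitly writing the program to do what you want could look like the following sketch, which enumerates candidate triples by their sum so that all three variables grow together (the name pythagoreanTriples is made up here):

pythagoreanTriples :: [(Integer, Integer, Integer)]
pythagoreanTriples =
  [ (a, b, c)
  | s <- [3 ..]               -- enumerate candidate triples by their sum
  , a <- [1 .. s - 2]         -- leave room for b and c to be at least 1
  , b <- [1 .. s - a - 1]
  , let c = s - a - b
  , a ^ 2 + b ^ 2 == c ^ 2
  ]

-- take 100 pythagoreanTriples halts: each sum s is searched in finite time,
-- and there are infinitely many Pythagorean triples to be found.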
