So, I'm working to teach myself Scala, and one of the things I've been playing with is the Stream
class. I tried to use a naïve translation of the classic Haskell version of Dijkstra's solution to the Hamming number problem:
object LazyHammingBad {
private def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
(a, b) match {
case (x #:: xs, y #:: ys) =>
if (x < y) x #:: merge(xs, b)
else if (y < x) y #:: merge(a, ys)
else x #:: merge(xs, ys)
}
val numbers: Stream[BigInt] =
1 #:: merge(numbers map { _ * 2 },
merge(numbers map { _ * 3 }, numbers map { _ * 5 }))
}
Taking this for a spin in the interpreter led quickly to disappointment:
scala> LazyHammingBad.numbers.take(10).toList
java.lang.StackOverflowError
I decided to look to see if other people had solved the problem in Scala using the Haskell approach, and adapted this solution from Rosetta Code:
object LazyHammingGood {
private def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
if (a.head < b.head) a.head #:: merge(a.tail, b)
else if (b.head < a.head) b.head #:: merge(a, b.tail)
else a.head #:: merge(a.tail, b.tail)
val numbers: Stream[BigInt] =
1 #:: merge(numbers map {_ * 2},
merge(numbers map {_ * 3}, numbers map {_ * 5}))
}
This one worked nicely, but I still wonder how I went wrong in LazyHammingBad
. Does using #::
to destructure x #:: xs
force the evaluation of xs
for some reason? Is there any way to use pattern matching safely with infinite streams, or do you just have to use head
and tail
if you don't want things to blow up?
Best Answer
a match {case x#::xs =>...
is about the same asval (x, xs) = (a.head, a.tail)
. So the difference between the bad version and the good one, is that in that in the bad version, you're callinga.tail
andb.tail
right at the start, instead of just use them to build the tail of the resulting stream. Furthermore when you use them at the right of#::
(not pattern matching, but building the result, as in#:: merge(a.b.tail)
you are not actually calling merge, that will be done only later, when accessing the tail of the returned Stream. So in the good version, a call to merge does not calltail
at all. In the bad version, it calls it right at start.Now if you consider numbers, or even a simplified version, say
1 #:: merge(numbers, anotherStream)
, when you call you calltail
on that (astake(10)
will),merge
has to be evaluated. You calltail
onnumbers
, which callmerge
withnumbers
as parameters, which callstails
onnumbers
, which callsmerge
, which callstail
...By contrast, in super lazy Haskell, when you pattern match, it does barely any work. When you do
case l of x:xs
, it will evaluatel
just enough to know whether it is an empty list or a cons. If it is indeed a cons,x
andxs
will be available as two thunks, functions that will eventually give access, later, to content. The closest equivalent in Scala would be to just testempty
.Note also that in Scala Stream, while the
tail
is lazy, thehead
is not. When you have a (non empty) Stream, the head has to be known. Which means that when you get the tail of the stream, itself a stream, its head, that is the second element of the original stream, has to be computed. This is sometimes problematic, but in your example, you fail before even getting there.