Untyped Lambda Calculus – Why Call-by-Value is Strict

lambda

I'm currently reading Benjamin C. Pierce's “Types and Programming Languages”. Before really getting into type theory it explains lambda calculus and evaluation strategies.

I am a bit confused by the explanation of call by name vs call by value in this context.

The two strategies are explained in the following manner:

call by name

Like normal order in that it chooses the leftmost, outermost redex first, but more restrictive by not allowing reductions inside abstractions. An example:

  id (id (λz. id z))
→ id (λz. id z)
→ λz. id z

call by value

Only the outermost redexes are reduced and a redex is reduced only when its right-hand side has already been reduced to a value—a term that is finished computing and cannot be reduced any further. An example:

  id (id (λz. id z))
→ id (λz. id z)
→ λz. id z

(identical to the call by name evaluation)

Ok, so far so good. But this is followed by the following paragraph:

The call-by-value strategy is strict, in the sense that the arguments to functions are always evaluated, whether or not they are used by the body of the function. In contrast, non-strict (or lazy) strategies such as call-by-name and call-by-need evaluate only the arguments that are actually used.

I know what call-by-value and call-by-name means practically, from having used (among others) C and Haskell, but I cannot see why the evaluation strategy explained above leads to this in the lambda calculus. Is this an additional rule that always accompany call-by-value, or does if follow from the reduction strategy outlined above?

Especially since the reduction steps in the examples are identical, I fail to see the difference between the two strategies and would love if someone could help me gain some intuition.

Best Answer

Yes, the evaluation strategy as described leads to strict semantics, and the examples are spectacularly badly chosen to conceal the difference between the two semantics. I think it goes something like this:

id (id (λz. id z))  # strict means we evaluate the right hand side
→ id (λz. id z)     # RHS has been reduced (id (λz. id z)) → (λz. id z) by inner id 
→ λz. id z          # now we have called the outer id to obtain the final value

id (id (λz. id z))  # normal form means we call the outer id and pass RHS as a closure
→ id (λz. id z)     # outer id just returned its argument unevaluated → id (λz. id z) 
→ λz. id z          # now same thing is repeated with inner id

So the derivation steps look syntactically the same, but different things are happening. Under the lazy evaluation scheme, id doesn't actually force evaluation of its argument: it simply returns that argument itself. So not only do id x and x yield the same value, but they are actually equivalent: id x really yields x itself, and then x yields its value later when actually needed. So likewise, id (id (λz. id z)) simply yields the unevaluated right hand side (id (λz. id z)).

What's confusing in the example is that it's based on nesting the same function, which is just id, such that two different reductions both yield id (λz. id z). In one this is just a copy of the inner id expression, and in the other, it's the value of the inner expression, being passed an argument to the outer id.