Implementing Lazy Evaluation in if() Statements – Guide

compiler interpreters language-design

I am currently implementing an expression evaluator (single line expressions, like formulas) based on the following:

The entered expression is tokenized to separate literal booleans, integers, decimals, strings, functions, and identifiers (variables).
I implemented the shunting-yard algorithm (lightly modified to handle functions with a variable number of arguments) to get rid of parentheses and order the operators by precedence in postfix order.
My shunting-yard simply produces a (simulated) queue of tokens that I evaluate sequentially with a simple stack machine. The queue is an array: my language, PowerBuilder Classic, can define objects but only has dynamic arrays as native storage, with no true list and no dictionary.

My evaluator is working nicely, but I am still missing an if() and I am wondering how to proceed.

With my shunting-yard postfix output and stack-based evaluation, if I add if() as another function with true and false parts, a single if(true, msgbox("ok"), msgbox("not ok")) will show both messages, while I would like to show only one. This is because by the time I evaluate a function, all of its arguments have already been evaluated and placed on the stack.

Could you give me some way to implement if() in a lazy way?

I thought about processing these as a kind of macro, but at that early stage I do not yet have the result of the condition. Perhaps I need a structure other than a queue, so that the condition and the true/false expressions are kept separately? For now the expression is parsed just before evaluation, but I also plan to store the intermediate representation as a kind of precompiled expression for future evaluation.

Edit: after some thought on the problem, I think I could build a tree representation of my expression (an AST instead of a linear token stream), from which I could easily ignore one or the other branch of my if().
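A minimal sketch of that tree idea, written in Python rather than PowerBuilder for brevity. The tuple-based node layout, the FUNCTIONS table, and the list-recording msgbox stand-in are all assumptions made for illustration, not part of the evaluator described above. The point is only that an "if" node evaluates its condition first and then descends into exactly one branch.

```python
# Hypothetical node layout (an assumption for this sketch):
#   ("lit", value)
#   ("call", name, [arg_nodes])        eager: every argument is evaluated
#   ("if", cond, then_node, else_node) lazy: only one branch is evaluated

shown = []  # records msgbox calls so the effect can be observed


def msgbox(text):
    # Stand-in for a real message box.
    shown.append(text)
    return text


FUNCTIONS = {"msgbox": msgbox}


def evaluate(node):
    kind = node[0]
    if kind == "lit":
        return node[1]
    if kind == "call":
        _, name, arg_nodes = node
        args = [evaluate(arg) for arg in arg_nodes]
        return FUNCTIONS[name](*args)
    if kind == "if":
        _, cond, then_node, else_node = node
        # Evaluate the condition, then walk into exactly one branch.
        chosen = then_node if evaluate(cond) else else_node
        return evaluate(chosen)
    raise ValueError(f"unknown node kind: {kind!r}")


# if(true, msgbox("ok"), msgbox("not ok"))
tree = ("if", ("lit", True),
        ("call", "msgbox", [("lit", "ok")]),
        ("call", "msgbox", [("lit", "not ok")]))
result = evaluate(tree)
```

With this shape, the branch that is not chosen is never visited, so its msgbox call never runs.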

Best Answer

There are two options here.

1) Don't implement if as a function. Make it a language feature with special semantics. Easy to do, but less "pure" if you want everything to be a function.

2) Implement "call by name" semantics, which is much more complicated, but allows compiler magic to take care of the lazy evaluation problem while keeping if as a function instead of a language element. It goes like this:

if is a function whose two branch parameters (the true part and the false part) are both declared as "by name"; the condition is passed normally. When the compiler sees that it's passing something to a by-name parameter, it changes the code to be generated. Instead of evaluating the expression and passing the value, it creates a closure that evaluates the expression, and passes that instead. And when a by-name parameter is used inside the function, the compiler generates code to invoke the closure.
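The by-name idea can be sketched by hand in Python, where a zero-argument lambda plays the role of the closure a compiler would generate. The names lazy_if, msgbox, and calls are hypothetical and exist only for this illustration; Python itself has no by-name parameters, so the wrapping is written out explicitly.

```python
calls = []  # records msgbox calls so the effect can be observed


def msgbox(text):
    # Stand-in for a real message box.
    calls.append(text)
    return text


def lazy_if(condition, then_thunk, else_thunk):
    # The condition arrives as a value; the branches arrive as closures.
    # Only the selected closure is invoked.
    return then_thunk() if condition else else_thunk()


# What a compiler might emit for: if(true, msgbox("ok"), msgbox("not ok"))
value = lazy_if(True, lambda: msgbox("ok"), lambda: msgbox("not ok"))
```

The cost of this approach is that the evaluator must be able to represent an unevaluated expression as a value, which a flat postfix queue does not do on its own.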

Related Solutions

Python – How Python Interpreter Recognizes Code Blocks

The Python documentation, in the Lexical Analysis section, describes briefly how the indentation parsing works. In short, the tokeniser generates special INDENT and DEDENT tokens that are used by the parser when deciding where blocks of code start and end. These tokens (roughly) correspond to the { and } tokens in C-like languages.
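These tokens can be observed with the standard library's tokenize module. The snippet below is a small sketch; the source string is made up for the example, and only the INDENT and DEDENT tokens are kept.

```python
import io
import tokenize

# A made-up three-line program with one indented block.
source = "if ready:\n    launch()\ndone()\n"

tokens = tokenize.generate_tokens(io.StringIO(source).readline)

# Keep only the names of the block-delimiting tokens, in order.
block_tokens = [
    tokenize.tok_name[tok.type]
    for tok in tokens
    if tok.type in (tokenize.INDENT, tokenize.DEDENT)
]
```

The INDENT appears before launch() and the DEDENT before done(), much as { and } would bracket the block in a C-like language.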

How to alter the code at runtime in an interpreter

Side note: use the compiler/interpreter distinction carefully, as it has its caveats.

Some folks use this distinction to assert that their language is faster because it is compiled or interpreted. The real answer is that a language cannot be faster or slower than another one: is German faster than Japanese?
Many other assertions related to this distinction don't make sense either.
Many languages use more complicated approaches. For example, C# is compiled to IL, which is then “translated into native code or executed by a virtual machine”.
Some languages are “interpreted, but not really”.

Now to answer your question:

Altering the code:

Since the code is interpreted, it's easier to change it on the fly while executing it. One such example is eval(), used in some languages to dynamically inject source code available only as a string.
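As a small illustration, here is Python's built-in eval() running source code that exists only as a string. The expression and the ratio variable are made up for the example.

```python
# Source code that is only known at run time, as a string.
source_text = "2 * ratio"

# The second argument supplies the global names visible to the expression.
result = eval(source_text, {"ratio": 14})
```

Passing untrusted strings to eval() is dangerous, since the string can contain arbitrary code.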

That said, this point is both incomplete and confusing. C# is a compiled language (the DLR aside), and still, there are ways to inject custom code while the app is running.

Code optimization:

Often, compilers are not limited to simply converting code written in a source language to a target language. They also perform optimizations. Imagine you write:

const int ratio = 14;

function getFactor()
{
    int factor = 2 * ratio;
    string debug = "The factor is " + factor;
    return factor;
}

function computeSomething(int expenses)
{
    int factor = getFactor();
    float result = (float)(expenses + factor) / 2;
    return result;
}

A compiler can use a few basic tricks to optimize this code:

Inline

The constant can be inlined, as well as the first function. This gives:

function computeSomething(int expenses)
{
    int factor = 2 * 14;
    string debug = "The factor is " + factor;
    float result = (float)(expenses + factor) / 2;
    return result;
}

Remove unused code

The debug string is never read, so it can be removed. Here, this is a significant improvement, since string concatenation is often an expensive operation.

function computeSomething(int expenses)
{
    int factor = 2 * 14;
    float result = (float)(expenses + factor) / 2;
    return result;
}

Optimize mathematical operations

Floating-point multiplication is often cheaper than division, so the division by 2 can be replaced with a multiplication by 0.5.

function computeSomething(int expenses)
{
    int factor = 2 * 14;
    float result = (float)(expenses + factor) * 0.5;
    return result;
}

Remove intermediary variables

function computeSomething(int expenses)
{
    return (float)(expenses + (2 * 14)) * 0.5;
}

Note that many of today's interpreters also perform these optimizations, which means that when you use an interpreted language, you should write your code for humans, not for computers, and let the interpreter do the work.
