How to implement an ‘if’ statement in an interpreter

compilerinterpretersvirtual machine

If I were writing a compiler (say for a stack-based VM), the code for an if statement:

if (<some_expression>)
{
    <some_instructions>
}

Would be translated to the following psuedo-assembly:

<evaluate expression and push result on stack>
JUMP-IF-TRUE label
<some_instructions>
:label

This is easy to implement in a compiler, however I'm not sure how to implement this in an interpreter.

The difference between an interpreter and a compiler, is that a compiler outputs instructions to be performed later, and an interpreter performs right away.

So e.g. instead of outputting the instruction PUSH, an interpreter would simply PUSH on a stack it maintains. And so on.

So regarding JUMP instructions (such as the JUMP-IF-TRUE psuedo-instruction), I guess my question is how are these implemented in interpreters?

More accurately, how can I make an interpreter 'jump' over a piece of code, for example when interpreting an if statement?

Best Answer

As a picture is worth a thousand words, let's write an interpreter together! Of course, a very simple one. We use OCaml for this, but if you do not know OCaml, this is fine, because we will write very little code and comment it anyway.

We first define what is a program for us. We consider a very simple language, allowing to print text, add numbers, and test if a number is zero with an if-statement:

type statement =
| Print of string
| If of expr * statement * statement
| Sequence of statement list
and expr =
| Constant of int
| Plus of expr * expr

The code above is a type declaration and defines what our idea of a program is. It defines an abstract symbolic form for programs – as opposed to the concrete tetual form – which is easy to manipulate in OCaml. This type definition is a plain translation of the english description of what a program is, with the addition of a sequence, to execute more operations. (If we remove the Plus and the Sequence operations we pretty much obtain the smallest possible language demonstrating an If statement, but I felt it could be a bit too boring!)

Real life note In real life, a program contains a lot of annotations. An important one is that program elements are typically annotated with their location in termes of file, line, column, which is useful to report errors.

We can already write an entertaining program in our language:

let program = Sequence([
  Print("Hello, world!");
  If(Plus(Constant(1),Constant(0)),
    Print("One is the truth"),
    Print("One is a lie"));
  Print("Oh, that was a tough one!");
])

This should print the text Hello, world! to please Dennis, then compute 1 + 0 and compare the result to 0 – as for the definition of If in our language - if the result is distinct from 0, then the first branch is executed and the perlish message One is the truth is printed, otherwise the also perlish message One is a lie is printed. After such a tough challenge, our strained computer will share its feelings Oh, that was a tough one!.

Real life note In the real world, one needs to write a parser to translate plain text into an abstract program as the one held by the variable program. We can use lex and yacc for this, and their various derivatives.

Now let us write an interpreter for our language. (If you felt asleep, wake up now, the actual answer to the question will soon be presented!)

let rec interpreter = function
| Sequence(hd::tl) -> interpreter hd; interpreter(Sequence(tl))
| Sequence([]) -> ()
| Print(message) -> print_endline message
| If(condition, truebranch, falsebranch) ->
    (* The answer to the question is here! *)
    if (eval condition) <> 0 then
      interpreter truebranch
    else
      interpreter falsebranch
and eval = function
| Constant(c) -> c
| Plus(a,b) -> (eval a) + (eval b)

Most of us are likely not familiar with OCaml, so let us walk gently through this piece of code:

let rec interpreter = function
| Sequence(head::tail) -> interpreter head; interpreter(Sequence(tail))
   (* To interpret a list of statements, we interpret the first
      and then interpret the tail of the list. *)

| Sequence([]) -> ()
   (* To interpret an empty list of statements, we just do nothing.
      Fair enough, right? *)

| Print(message) -> print_endline message
    (* To print a message, we use the host-language printing facility. *)

| If(condition, truebranch, falsebranch) ->
    (* The answer to the question is here!

       We first evaluate the condition, with the eval function below
       and use the host-language if facility to take the decision
       of recursively calling the interpreter on the truebranch or
       the falsebranch. *)

    if (eval condition) <> 0 then
      interpreter truebranch
    else
      interpreter falsebranch

and eval = function
| Constant(c) -> c
    (* Constants evaluate to themselves *)
| Plus(a,b) -> (eval a) + (eval b)
    (* A Plus statement evaluates to the sum of its parts. *)

Now let's execute the program:

let () = interpreter program

which causes the output

Hello, world!
One is the truth
Oh, that was a tough one!

It you want to try it, just copy the program snippets in a file interpreter.ml and run with ocaml from your shell prompt:

% ocaml imterpreter.ml

Conclusion When we write an interpreter using an intermediate abstract representation of the program like in this small example. The If statement is easily implemented in terms of conditionals in the host language. Other strategies are possible, it is perfectly imaginable to compile the program “on the fly” and to feed it to a virtual machine. (Some people will argue that there is essentially no difference between the two approaches.)

Related Topic