Type Inference in Interpreters – How Should It Change the Parsed AST?

functional-programming, interpreters, type-systems

When writing an interpreter, how should the type inference algorithm change the parsed AST? Should it at all? Or must parsing and inference necessarily be done simultaneously?

I have implemented an interpreter for a strongly typed functional language, in which each function can be declared only once (i.e., there are no signature variants).
As soon as the untyped parser was finished and working, I wrote the type inference algorithm, fed by the parser's resulting AST.

Hindley-Damas-Milner is working like a charm, but the program uses it only as a check: if type checking passes, it forgets about the matter and passes the AST to the interpreter unchanged. So far, the "target language" (i.e., the one in which the interpreter is written) has satisfied the interpreter's type needs, but I am wondering whether this approach is usual: discarding all the fruits of type inference after an "ok" result.

Also, would writing a compiler instead of an interpreter make any difference in this sense?

Best Answer

No, it is definitely not usual to throw away good analysis results. For a striking example, check out the Scala compiler. It is organized into execution phases, and you can add plugins that run after certain phases. It is particularly interesting to place your plugin after the typing phase, because that gives you all of the inferred types in a very convenient form.

The AST, however, is not so much changed as annotated. The basic structure of the statements, expressions, function definitions, etc. held in your AST remains unchanged, of course. What you really do during type inference is annotate the tree nodes with their respective types.
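A minimal sketch of what this annotation looks like, using hypothetical node classes (the names `Expr`, `IntLit`, `Add`, and the `ty` field are illustrative assumptions, not from the question): inference walks the tree and fills in a type slot on each node without altering the tree's shape.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical minimal AST: node shapes are untouched by inference;
# each node merely gains an optional `ty` annotation that starts empty.
@dataclass
class Expr:
    ty: Optional[str] = None  # filled in by the inference pass

@dataclass
class IntLit(Expr):
    value: int = 0

@dataclass
class Add(Expr):
    left: Expr = None
    right: Expr = None

def annotate(node: Expr) -> str:
    """Walk the tree, store each node's inferred type on the node, return it."""
    if isinstance(node, IntLit):
        node.ty = "int"
    elif isinstance(node, Add):
        lt, rt = annotate(node.left), annotate(node.right)
        if lt != "int" or rt != "int":
            raise TypeError("operands of + must be int")
        node.ty = "int"
    return node.ty

tree = Add(left=IntLit(value=1), right=IntLit(value=2))
annotate(tree)
print(tree.ty, tree.left.ty)  # structure intact; types now attached
```

The point of the sketch: after the pass, every later stage (interpreter, compiler backend, tooling) can read `node.ty` directly instead of re-deriving it.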

For example, before type inference runs you have an AST with a node for the function f, located somewhere within the tree, with child nodes representing its arguments and statements/expressions. During type inference you determine the types of these child nodes, i.e. of the statements/expressions, and from them you determine the type of the function f itself. You could be satisfied merely detecting that f returns an int and is only ever used in places where an int is expected, but there is no reason to lose this information.
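The step above can be sketched like so, assuming a hypothetical `FuncDef` node (all names here are illustrative): once the children's types are known, the function's own type is assembled from them and stored on the node rather than discarded.

```python
# Hypothetical function-definition node in the AST.
class FuncDef:
    def __init__(self, name, params, body):
        self.name, self.params, self.body = name, params, body
        self.ty = None  # filled in by inference

def infer_func(node, param_types, body_type):
    # param_types and body_type come from inferring the child nodes;
    # the function's type is derived from them and kept on the node.
    node.ty = "(" + ", ".join(param_types) + ") -> " + body_type
    return node.ty

f = FuncDef("f", ["x"], body=None)
infer_func(f, ["int"], "int")
print(f.ty)  # (int) -> int
```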

As for interpreter vs. compiler: there are huge benefits to having types annotated on your AST that go far beyond type checking. Consider IDEs, for example: plugin authors are very thankful for a type-annotated AST. It is much more efficient to simply look up that f returns an int when implementing code completion than to run type inference yet again.
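To make the efficiency argument concrete, here is a hedged sketch (the `annotations` table and function names are assumptions for illustration): once inference has recorded its results, a completion feature reduces to a lookup rather than a second inference run.

```python
# Hypothetical cache filled once by the type inference pass:
# name -> inferred type.
annotations = {}
annotations["f"] = "int"  # assumption: inference recorded that f returns int

def completion_candidates(expected_type):
    """Suggest every known name whose cached type matches the expectation."""
    return [name for name, ty in annotations.items() if ty == expected_type]

print(completion_candidates("int"))  # ['f']
```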

You may also want to extend your language at some point. If the extension hooks in anywhere after type inference, you will typically need to rely on its results. Consider, for example, adding type-safe macros or some kind of semantic checks.

In summary: type information in a strongly typed language is so important that it borders on blasphemy to perform type inference only to throw away its results. You rob yourself and others of information that is both extremely powerful when available and rather costly to generate.
