Compiler Optimization – Code Generation for Expressions

Tags: compiler, optimization

I don't know if this question has a simple answer or not, but I just have to ask.

I'm reading this tutorial (I don't know if it is a well-known series, but its name is 'Let's Build a Compiler!' by Jack Crenshaw). It's a nice series where you gradually learn how to construct a compiler (it really is a learn-by-doing thing).

Already in the second chapter, I felt the need to optimize some things. The series makes some concessions in order to keep things simple: there is no separation between the lexical scanner (although I believe one is introduced later in the series) and code generation; both are done in one pass. Also, the author relies heavily on the stack in the code generated for expressions.
(Also, the output is assembly code; I don't know whether this is normal for compilers, so I just thought I'd mention it.)

For example, parsing the expression 5 + (7 * 3 - 21 / 3) (and storing the result in eax) results roughly in the following code (I use x86 assembly, unlike the series, so this is the code my compiler produced):

    push 5   ; These lines are generated by a simple number parser,
    push 7   ; which pushes its result onto the stack
    push 3   ; (like most other parsing routines)

    pop eax  ; Code for multiplication
    pop ebx
    mul ebx
    push eax

    push 21  ; Result of parsing these numbers
    push 3

    pop ebx       ; Division code
    pop eax
    xor edx, edx  ; div divides edx:eax, so edx must be cleared first
    div ebx
    push eax

    pop ebx  ; Subtraction code
    pop eax
    sub eax, ebx
    push eax

    pop eax  ; Addition code
    pop ebx
    add eax, ebx
    push eax

As you can see, the code that handles addition, subtraction, multiplication, and so on can stay the same every time the operation is used. This is very convenient, but of course the generated assembly is slow and clumsy. Writing by hand, one would rather produce something like:

    mov eax, 21
    mov ebx, 3
    xor edx, edx
    div ebx          ; eax = 21 / 3 (div takes a register, not an immediate)
    mov ebx, eax
    mov eax, 7
    imul eax, 3      ; eax = 7 * 3
    sub eax, ebx
    add eax, 5
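For concreteness, here is a rough sketch in Python of the one-pass scheme my compiler uses (the routine names and the tokenizer are my own invention, not Crenshaw's Pascal): every parsing routine emits code the moment it recognizes something, and the runtime stack carries all intermediate results.

```python
import re

def compile_expr(src):
    """One-pass parse-and-emit for +, -, *, / and parentheses.
    Returns the generated instructions as a list of strings."""
    toks = re.findall(r"\d+|[-+*/()]", src)
    pos = 0
    out = []

    def peek():
        return toks[pos] if pos < len(toks) else None

    def factor():
        nonlocal pos
        if peek() == "(":
            pos += 1          # skip '('
            expression()
            pos += 1          # skip ')'
        else:
            out.append(f"push {toks[pos]}")  # number parser: push result
            pos += 1

    def term():
        nonlocal pos
        factor()
        while peek() in ("*", "/"):
            op = toks[pos]; pos += 1
            factor()
            out.append("pop ebx")            # right operand
            out.append("pop eax")            # left operand
            out.append("mul ebx" if op == "*" else "div ebx")
            out.append("push eax")           # push result back

    def expression():
        nonlocal pos
        term()
        while peek() in ("+", "-"):
            op = toks[pos]; pos += 1
            term()
            out.append("pop ebx")
            out.append("pop eax")
            out.append("add eax, ebx" if op == "+" else "sub eax, ebx")
            out.append("push eax")

    expression()
    return out

code = compile_expr("5+(7*3-21/3)")
```

Every operator is handled by the same four-instruction pattern, which is exactly what makes the scheme so simple and the output so stack-heavy.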

My question is whether there is any way to optimize the compiler's output. Of course there is, but is there a specific way to make my code more like the second version, i.e., not using the stack so intensively, but instead passing values in registers?

I myself have been thinking about parsing the whole expression and representing it internally as a tree-like structure (I suppose this is the job of the parser building a syntax tree, rather than of the lexical scanner) before generating the code, but I'm not entirely sure how to proceed.
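As a rough sketch of that idea: once the expression is a tree, code generation can hand out registers instead of stack slots. Everything below is my own simplification (the two-operand div pseudo-instruction is not real x86, where division is one-operand and uses edx:eax), but the shape is roughly the classic tree-walking scheme.

```python
def gen(node, free, out):
    """Emit code for node and return the register holding its value.
    node is an int literal or a tuple (op, left, right);
    free is a list of currently unused register names.
    A real compiler would spill to the stack when free runs empty."""
    if isinstance(node, int):
        r = free.pop(0)
        out.append(f"mov {r}, {node}")
        return r
    op, left, right = node
    r1 = gen(left, free, out)
    r2 = gen(right, free, out)
    # NOTE: 'div r1, r2' is a simplification; real x86 div is one-operand.
    mnem = {"+": "add", "-": "sub", "*": "imul", "/": "div"}[op]
    out.append(f"{mnem} {r1}, {r2}")  # result stays in r1
    free.insert(0, r2)                # r2 is free again
    return r1

out = []
result = gen(("+", 5, ("-", ("*", 7, 3), ("/", 21, 3))),
             ["eax", "ebx", "ecx", "edx"], out)
```

For my example expression this produces nine mov/arithmetic instructions and no stack traffic at all, which is essentially the hand-written version above. Choosing which subtree to evaluate first so as to minimize the number of registers needed is the refinement known as the Sethi-Ullman algorithm.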
On the other hand, there is peephole optimization: the possibility of optimizing the assembly code after it has been generated. I have no idea how effective this would be; generating better code in the first place seems more logical and more feasible to me.

I know there are many techniques available to optimize my code (for example, constant folding, which would reduce this whole expression to 19; I have deliberately not used that kind of optimization, to bring my point across better). What I am hoping for is a single technique that is well suited to this kind of thing.
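For completeness, constant folding itself is a one-function tree walk once the expression is a tree; a minimal sketch (leaves are assumed to be integer literals here; a fuller version would leave variables untouched):

```python
def fold(node):
    """Collapse any operator whose operands fold to literals."""
    if isinstance(node, int):
        return node
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        if op == "+": return left + right
        if op == "-": return left - right
        if op == "*": return left * right
        if op == "/": return left // right  # integer division, like div
    return (op, left, right)
```

Applied to my example, fold(("+", 5, ("-", ("*", 7, 3), ("/", 21, 3)))) collapses the whole tree to the single literal 19, exactly the reduction mentioned above.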

Best Answer

Understand that Jack Crenshaw's tutorial is about the basics of compiler design. It is not intended to take you all the way to production quality in one swell foop. There are other basic texts available: the various dragon books are well-known. I personally like Wirth's text: it is almost as approachable as Crenshaw's.

For production quality code generation, you have to do a lot more. The canonical reference here is Wulf's "Design of an Optimizing Compiler". Surprisingly, Amazon lists several used copies available at reasonable prices.

Also note that generating optimal, or even sort-of-optimized, code for the x86 architecture is a lot like breeding elephants: it takes lots of time, lots of noise, lots of effort, lots of bellowing, and there is a real risk of getting crushed underfoot.