I think I understand the goal of an AST, and I've built a couple of tree structures before, but never an AST. I'm mostly confused because the nodes are text and not number, so I can't think of a nice way to input a token/string as I'm parsing some code.
For example, when I looked at diagrams of AST's, the variable and its value were leaf nodes to an equal sign. This makes perfect sense to me, but how would I go about implementing this? I guess I can do it case by case, so that when I stumble upon an "=" I use that as a node, and add the value parsed before the "=" as the leaf. It just seems wrong, because I'd probably have to make cases for tons and tons of things, depending on the syntax.
And then I came upon another problem, how is the tree traversed? Do I go all the way down the height, and go back up a node when I hit the bottom, and do the same for it's neighbor?
I've seen tons of diagrams on ASTs, but I couldn't find a fairly simple example of one in code, which would probably help.
Best Answer
The short answer is that you use stacks. This is a good example, but I'll apply it to an AST.
FYI, this is Edsger Dijkstra's Shunting-Yard Algorithm.
In this case, I will use an operator stack and an expression stack. Since numbers are considered expressions in most languages, I'll use the expression stack to store them.
(Please be nice about my code. I know it's not robust; it's just supposed to be pseudocode.)
Anyway, as you can see from the code, arbitrary expressions can be operands to other expressions. If you have the following input:
the code I wrote would produce this AST:
And then when you want to produce the code for that AST, you do a Post Order Tree Traversal. When you visit a leaf node (with a number), you generate a constant because the compiler needs to know the operand values. When you visit a node with an operator, you generate the appropriate instruction from the operator. For example, the '+' operator gives you an "add" instruction.