Visitor Pattern – Traversing an AST Using Visitors

cdata structuresdesign-patternsparsingvisitor-pattern

I'm writing a compiler for a C-like language, and I'm looking for an elegant way to traverse my abstract syntax tree. I'm trying to implement the Visitor pattern, although I'm not convinced that I'm doing it correctly.

struct Visitor {   
    // Expressions
    virtual void visit(AsgnExpression&);
    virtual void visit(ConstantExpression&);
    ...
    virtual void visit(Statement&);
    ...

    virtual void finished(ASTNode&);

protected:
    virtual void visit(ASTNode&) = 0;
};

visit is overloaded for each type, and by default each overload will call visit(ASTNode&) which subclasses are forced to implement. This makes it easier to do quick and dirty things, although defining a visit for each type is tedious. Each subclass of ASTNode must implement an accept method which is used to traverse the tree structure.

class ASTNode {
public:
    virtual ~ASTNode();
    virtual void accept(Visitor& visitor) = 0;
};

However, this design is quickly becoming tedious because the accept methods are often very similar.

Who should be responsible for traversing the structure, the nodes or the visitor? I'm leaning towards having ASTNode provide an iterator for accessing its children, and then having the visitor traverse the structure. If you have any experience designing Abstract Syntax Trees, please share your wisdom with me!

Best Answer

Who is responsible for the traversal depends for a large part on the analysis you want to do in your visitors, the details of the language structure and also a part personal preference.

In particular, if there are cases where the visitor to a parent node needs to take an action halfway through the processing of the children, then you must put the traversal logic in the visitor. For example, if your language has a construct where a newly introduced variable is available only in some of the child nodes of the node that introduces the variable.
Another case is when you need a mixture of pre-order and post-order traversal. With traversal in the nodes, each node must call the visitor twice, once before and once after the children. In that case, it might be easier to let the visitor do the traversal.

Otherwise, it is mostly a matter of preference. The traversal can be either in the nodes or in the visitor.