Object Oriented Compiler Architecture

Out of some curiosity, I quickly read this paper today:

The Object-Oriented Architecture of High-Performance Compilers, Lutz Hamel, Diane Meirowitz and Spiro Michaylov, Technical Report, Thinking Machines Corporation, 1996.

Even though they assume that the compilers internal representation is some kind of AST, the underlying lessons are worth communicating, and are applicable even if your compiler has a block or sea-of-nodes representation.

First, we should separate the data structure of the IR from the algorithms (optimization passes and code transforms) which manipulate the IR. The Visitor pattern allows us to do this in the most generic way possible. This allows the algorithms to differ in their order of traversal. The only constraint is that they must adhere to the IR’s API, which should be kept very small. Since learning the Visitor pattern, I’ve been thinking that the phases of a compiler make it a Meta-Visitor or a Visitor of Visitors. It’s nice to see that viewpoint is shared by others.

Second, attributes (such as use-def chains) can be added to the IR separately, so that they are not visible outside of the transformations which require them. They propose attaching attributes to IR nodes by a templatized AttributeManager class, backed by an associative array. I would prefer if the Node class itself was extended (via subclass or decorator) so that it had a (possibly templatized) setAttr(...) method. However, we should still keep it possible to add and remove attributes “on-the-fly” so that they do not stick around past their usefulness, and so that they may be created on demand. I’m not sure how best to handle updating the attributes when the IR is transformed (i.e. lifting code from a loop might change the nesting count attribute). Perhaps a call-back mechanism to update or just invalidate that attribute’s manager.