How Compiler Construction Remains Relevant (part 2)

At the end of part 1, I said that we, as educators, should “consider requiring a software engineering class that gives students practice with larger code bases, before allowing enrollment in compilers.” I’d now like to rescind that suggest in favor of a better one. The compilers class should be the software engineering class!

Compilers are non-trivial projects. They have several components with strong interdependencies. Care must be taken in the design and layout of the data structures, lest one end up with a hairy confusing ball of text. Without good discipline, and clear direction it is very easy for a naive, new-comer approach to fall flat on it face.

Compilers require a good background in data structures. For example, the SymbolTable can be implemented as a LinkedList of HashTable’s, with a recursive lookup method. The intermediate representation is usually one of two popular options: Abstract Syntax Tree or Graph of BasicBlocks.

Compilers also require a good background in algorithms. Performing type-checking correctly is important. Nobody wants to have a language where the type-checker can make illogical or unsound deductions. It’s also impossible to get right if approached in an ad-hoc manner. Performing any optimization requires passing over the code and applying specific transforms applicable only if specific requirements are met. Register allocation and instruction scheduling also requires passing over code in a specific manner. Many of these algorithms are based in Graph Theory. If you don’t take care in the design of the algorithm, you will encounter an input program that creates a graph of basic blocks or instructions, which violates a hidden, implicit assumption.

Unifying both the algorithms and data structures are the Design Patterns. I’ve found in my experience, that the Composite Pattern is excellent for a type-checking system[1], and the Visitor Pattern is excellent for any pass that performs a code transformation. Because a compiler can be structured as a chain of small transforms, the Visitor pattern is especially relevant. At the high-level a compiler is a Visitor of Visitors.

Stepping back to look at the components involved: data structures, algorithms, design patterns and the implied code discipline to tie it all together into a cohesive whole, we see that the compiler is an excellent case-study in software engineering. If the students don’t already come into the class with a strong knowledge of the programming craft, then a compiler class is a great place for them to learn it! It nicely ties together all the stuff they should have observed (if not absorbed) in prior classes.

There is also supportive evidence[1], that when students must implement a clearly OO-designed component (such at the type-checker) they perform substantially better, than when they must implement the same functionality, but without explicit OO-design (such as an attribute grammar). I think that the experience reported in [1], demonstrates my main premise: The compiler construction course should be the culminating software engineering class of our undergraduate program.

[1]
@article{ortin2007design,
  title={Design patterns for teaching type checking in a compiler construction course},
  author={Ortin, F. and Zapico, D. and Cueva, J.M.},
  journal={Education, IEEE Transactions on},
  volume={50},
  number={3},
  pages={273--283},
  year={2007},
  publisher={IEEE}
}