Nanopass Compiler

Through a friend, I got hold of a provocative paper A Nanopass Framework for Compiler Education, by Sarkar, Waddell, and Dybvig. They describe a compiler written in scheme that makes 50ish passes. Each pass is described as a language transform, where both the starting input and output language must have precise grammars. To reduce the code bloat, languages descriptions can be inherited; so the developer must really only specify what is changed via each transform. The remove-not pass was 7 lines.

The framework also provides some nice debugging features. Since each pass is quite small, it’s easier to get right. It’s also logically separated from other transforms. Adherence to each IR language is checked via scheme’s dynamic typing system. And there’s also a parser/unparser pair that allow visualization of the IR language.

I find most fascinating the extreme language-centric view of the entire approach. It strikes me as something that only scheme/lisp people would invent. The framework itself reads as a DSL for describing incremental language transforms. The DSL removes the crufty parts from my idea: that the compiler is a Visitor of visitors.