Considerations of Programming Language Design

Reddit modded up a nice review of "Considerations When Designing your Own Programming/Scripting Language" (it’s worth following the links provided there to Clementson’s Blog to get a larger picture of the issue).

There’s really a ton of stuff to think about. Computer science as a field mostly concerns itself with taming complexity. All too often, software projects buckle under the weight of feature creep or code bloat. Eventually, in the career of any decent programmer, the ability to see and identify the common organizational patterns becomes a core focus. Anyone with exposure to several different languages notices that these organizational patterns are expressed differently among the babble, and that even the same old problems (e.g. parsing, printf, or search/logic) can be cast in a completely different light by a shift in linguistic perspective.

We see that language design suffers from exactly the same problems as other domains. But because it is a language, botching your solution to these problems will affect the way that programmers solve theirs. One such non-trivial task, which involves many sub-problems that bring these issues clearly to light, is writing a compiler. Compilers involve a great many data structures and algorithms: text manipulation, parsing, trees, graphs, fixed-point algorithms (dataflow analysis), and NP-complete problems (register allocation, instruction scheduling). From this simple observation, we see why we should follow Wirth’s language-complexity metric: languages should be compared based on the relative sizes of their self-compilers. (It really is an elegant fixed-point answer to the circularity involved in the complexity-of-expression/complexity-of-implementation trade-off.)
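To give a taste of what "fixed-point algorithm" means in a compiler, here is a minimal sketch of liveness analysis, one common dataflow analysis. The `Block` struct and the `liveIn` function are hypothetical names made up for illustration, not taken from any real compiler; the sketch assumes each basic block already knows which variables it reads (`use`) and writes (`def`).

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical basic block for illustration only.
struct Block {
    std::set<std::string> use;      // variables read before being written in this block
    std::set<std::string> def;      // variables written in this block
    std::vector<std::size_t> succ;  // indices of successor blocks in the CFG
};

// Backward liveness analysis: in[b] = use[b] + (out[b] - def[b]),
// where out[b] is the union of in[s] over all successors s.
// The equations are re-evaluated until nothing changes (the fixed point).
std::vector<std::set<std::string>> liveIn(const std::vector<Block>& cfg) {
    std::vector<std::set<std::string>> in(cfg.size());
    bool changed = true;
    while (changed) {
        changed = false;
        // Visiting blocks in reverse order tends to converge faster
        // for a backward analysis, but any order reaches the same result.
        for (std::size_t i = cfg.size(); i-- > 0;) {
            std::set<std::string> out;
            for (std::size_t s : cfg[i].succ)
                out.insert(in[s].begin(), in[s].end());

            std::set<std::string> next = cfg[i].use;
            for (const std::string& v : out)
                if (cfg[i].def.count(v) == 0) next.insert(v);

            if (next != in[i]) {
                in[i] = next;
                changed = true;
            }
        }
    }
    return in;
}
```

The loop terminates because the sets only ever grow and the number of variables is finite. A real compiler would more likely use bit vectors and a worklist than `std::set`, but the shape of the iteration is the same.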

So, to highlight these ideas, let’s take a particularly poor specimen: C++. This linguistic monstrosity is both verbose and inconvenient. Yes, it lets you get close to the ‘bare metal’, and it requires you to really think about what you are doing (to a fault, actually: I’ve noticed that the more C++ I learn, the more tricky and horrendous my bugs become). But even though I use it most often, I can’t help but feel that it scores incredibly poorly in the design space. The features don’t interact well together: inheritance vs. templates, memory management vs. exceptions, etc. The lower-level exposure consistently prevents higher-level conveniences. For example, pointer arithmetic gets in the way of garbage collection. (Although garbage collection was proposed for C++0x, it got thrown out at the last minute; hopefully it will be introduced eventually. I’m not sure I want to know what horrible contortions of logic must be followed to actually implement it.) Finally, it should be noted that the complexity of implementing C++ in C++ is incredibly daunting.
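The "memory management vs. exceptions" clash is easy to show in a few lines. This is only a sketch: `Tracked`, `mayThrow`, `manual`, and `scoped` are hypothetical names invented for the illustration, and the instance counter exists purely to make the leak visible.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical resource that counts live instances so a leak can be observed.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Stand-in for any operation that might fail.
void mayThrow(bool fail) {
    if (fail) throw std::runtime_error("failure");
}

// Manual management: if mayThrow throws, control never reaches the delete,
// so the Tracked object is leaked.
void manual(bool fail) {
    Tracked* t = new Tracked;
    mayThrow(fail);
    delete t;
}

// RAII: the unique_ptr destructor runs during stack unwinding,
// so the object is released whether or not mayThrow throws.
void scoped(bool fail) {
    std::unique_ptr<Tracked> t(new Tracked);
    mayThrow(fail);
}
```

Calling `manual(true)` inside a `try` block leaves `Tracked::live` one higher than before, while `scoped(true)` leaves it unchanged. The language does offer a way out, but only if you know to wrap every owning pointer, which is rather the point about features that do not compose on their own.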

So, designing a language is incredibly difficult. The implementation details of certain features can easily wreak havoc on the design. What does it take to implement closures? Do you want a fast linear stack, or would you like to have continuations? Do you want type safety, or does that drive you to drop first-class functions out of implementor’s laziness? Nor do the features always work well together, as demonstrated by the C++ trade-offs above. So yes, before designing your own language, take great heed of the accumulated wisdom within that introductory post.
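As a small illustration of the closure question, here is a sketch of what a language with a plain linear stack pushes onto the programmer. In C++ a closure that outlives the function that created it cannot keep its captured state in that function's stack frame, so the state has to be moved to the heap by hand. `makeCounter` is a hypothetical example function, not something from the linked post.

```cpp
#include <functional>
#include <memory>

// Returns a counter closure. The captured state lives on the heap (via
// shared_ptr) because makeCounter's stack frame is gone by the time the
// closure is called. Capturing a local by reference here would dangle.
std::function<int()> makeCounter(int start) {
    std::shared_ptr<int> count = std::make_shared<int>(start);
    return [count]() { return (*count)++; };
}
```

Each call to the returned function yields the next integer, and copies of the closure share the same counter because they share the same heap cell. A language with garbage-collected environments does this bookkeeping for you; the price is that activation records can no longer be assumed to live on a simple stack.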