Strong Typing for Security

I got into a mild argument about static vs. dynamic typing. I recognize that static typing can be verbose to the point of being repetitious. Take Java generics for example:

List<String> astr = new ArrayList<String>();

There really isn’t a great reason why the compiler can’t infer the type of the variable on the right hand side of the assignment. C# already implements type inference for this case, and C++ is adding it. ML and Haskell are strongly typed and have practiced type-inference since their inception. So we should actually dismiss the verbosity objection to static typing right now, because it’s an artifact of implementation that the more popular languages, C++ and Java, represent really poor examples of what could otherwise be a really good thing.

In my opinion, a static typing system is actually a proof system over your code. We shouldn’t complain about having compiler errors, rather we should rejoice that the compiler is able to automatically detect cases where we were ambiguous or tried to do something ill-defined. We should try to write our code so that the compiler can tell us when we make a mistake. Really, we want to express as many constraints as possible, so that the machine can do more checking and we end up with less buggy code. Statically typing all our variables and expressing systematic constraints is an effort that pays off in spades for large code bases.

But couldn’t we all just use the more flexible dynamic typing languages, and catch the bugs with testing? In my opinion, no. Testing should be done anyway, but it isn’t enough to prove the absence of a bug. Only a proof checker, such as a static typing system, can come close to doing that. I think I can really drive this point home by examining web applications.

The Problem

Web applications are really glorified string processors. HTML requests come in as strings, and web pages are emitted as strings. JavaScript processes more strings in the page layout, potentially requesting even more information from the server in response to user-generated events. Forums, Social Networking, and other participatory applications allow for user generated content. This widespread and popular practice actually leaves our glorified string parser (web app) at risk: for, if we are not careful, a malicious user can supply a string which, if it appears in the ‘wrong’ context, might be interpreted as legitimate JavaScript code by the application. That is, malicious users can execute arbitrary code, with the full rights and privileges as the application itself. This vulnerability is known as Cross-Site Scripting (XSS).

So, we find ourselves writing a string processor which must deal with strings of various encodings, special characters, and escape conventions. Namely, HTML, JavaScript, XML, CSS, URL. If one of these strings (even from our own database) manages to arrive in a context without first going through a filter to sanitize it, then our application has a security vulnerability. Do you think that it’s possible to write test cases (or even auto-generate them) given all the code paths, all the different sources (user, cookie, url, database, etc) and all the contexts in which a string might appear. In my opinion, the exponential complexity makes testing an infeasible approach. What we really need, then, is a proof system to verify that no strings end up in the wrong context.

Static Typing to the Rescue

If we are willing to go back to our application and examine it in detail, we find that we should really be treating each of the above strings as different types. HtmlString should be a different type from JSString, which are again both different from UrlString. Simply expressing each context as a different type enables our static typing system to verify that we never use the wrong kind of string in the wrong context. We can also provide explicit conversion functions, which provide the proper escaping and sanitization when moving from one context to another.

void addToDocument(HtmlString hStr);
HtmlString fromURL(UrlString uStr);
UnsafeString HttpRequest(UrlString uStr);

Language Support

What’s most unfortunate about this approach is that neither C++ nor Java provide us with an easy way to distinguish two strings. We certainly don’t want to use C’s typedef, because that enables automatic coercion between the different kinds of string, which defeats the point. So, we’re forced into creating a separate class for each of these strings, including implementing all the operators that make for convenient string manipulation. I’d really love a language that would allow me to extend my existing string type without fully re-implementing everything, yet still be able to treat the extension as a completely different type.

Conclusion

Essentially we’re using the static typing as a proof system to constrain our programming practices. The static type verification provides a proof that we never use a string in the wrong context. In my opinion, this coding technique is of enormous benefit, and represents a use-case that dynamic typing + unit testing simply cannot approach.

The real trick is recognizing that two strings aren’t necessarily the same type.

Just for reference, I did not come up with this example myself.
Joel Spolsky advocates using Hungarian notation, which I think is too weak for solving security vulnerabilities.
Tom Moertel provides an inplementation of this approach in Haskell.