The Future of the Web should be Lisp

I was reading Steve Yegge’s drunken rant on The Emacs Problem. It wasn’t able to convince me that Lisp was a great language for text processing, but it did convince me that Lisp is a fantastic language for data interchange, especially if that data happens to have hierarchical structure. Say, for example, something like HTML.

Steve was kind enough to point out a really nice XML logfile example, which I reproduce here:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE log SYSTEM "logger.dtd">
<log>
<record>
  <date>2005-02-21T18:57:39</date>
  <millis>1109041059800</millis>
  <sequence>1</sequence>
  <logger></logger>
  <level>SEVERE</level>
  <class>java.util.logging.LogManager$RootLogger</class>
  <method>log</method>
  <thread>10</thread>
  <message>A very very bad thing has happened!</message>
  <exception>
    <message>java.lang.Exception</message>
    <frame>
      <class>logtest</class>
      <method>main</method>
      <line>30</line>
    </frame>
  </exception>
</record>
</log>

And here is the same log rewritten as a Lisp S-expression:

(log
'(record
  (date "2005-02-21T18:57:39")
  (millis 1109041059800)
  (sequence 1)
  (logger nil)
  (level 'SEVERE)
  (class "java.util.logging.LogManager$RootLogger")
  (method 'log)
  (thread 10)
  (message "A very very bad thing has happened!")
  (exception
    (message "java.lang.Exception")
    (frame
      (class "logtest")
      (method 'main)
      (line 30)))))

What’s super-amazingly-awesome about this transformation is three-fold:

  1. The transformation is structure-preserving.
  2. The syntax is orders of magnitude simpler.
  3. The tags can be interpreted as Lisp functions.
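
To make the structure-preserving point concrete, here is a rough sketch of the translation in Python, using the standard library's xml.etree.ElementTree. The function name to_sexp and its conventions are my own assumptions, not anything from Steve's article: empty elements become nil, digit-only text becomes a number, everything else becomes a quoted string, and attributes and mixed content are simply ignored.

```python
import xml.etree.ElementTree as ET

def to_sexp(elem, indent=0):
    """Render an ElementTree element as an S-expression string."""
    pad = "  " * indent
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        # Assumed conventions: empty -> nil, digits -> number, else quoted string.
        if not text:
            atom = "nil"
        elif text.isdigit():
            atom = text
        else:
            atom = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
        return f"{pad}({elem.tag} {atom})"
    # Each child is rendered one level deeper; nesting mirrors the XML nesting.
    inner = "\n".join(to_sexp(child, indent + 1) for child in children)
    return f"{pad}({elem.tag}\n{inner})"

# A trimmed-down version of the log record above.
xml_source = """<record>
  <sequence>1</sequence>
  <logger></logger>
  <exception>
    <message>java.lang.Exception</message>
    <frame>
      <line>30</line>
    </frame>
  </exception>
</record>"""

sexp = to_sexp(ET.fromstring(xml_source))
print(sexp)
```

Every XML element maps to exactly one list, and every nesting level maps to one level of parentheses, which is all "structure-preserving" really means here.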

Let’s focus on these in more detail:

Because the transformation is structure-preserving, the transition is theoretically achievable. The Web is currently a huge stinking polyglot of HTML, XML, XHTML, JavaScript, and XMLHttpRequest. It’s been festering, and each time someone scratches an itch, we all have to deal with that solution and its interactions with existing technologies. Right now we have a huge Tower of Babel, and the web browser has to support it all!

In my work as a web security researcher, I’ve come to the conclusion that the root of code injection attacks is precisely this polyglot monstrosity! If the web had a simpler, unified syntax for all its technologies, many of these problems would go away, and the remaining ones could be more easily mitigated. No more special cases means less buggy code, fewer opportunities for things to go wrong, and a lower profile exposed to attacks.

Finally, because we’ve encoded the HTML data as a set of Lisp lists, the document can easily become self-modifying! HTML was envisioned to hold static documents, and roughly describe their structure to a browser that would render them. This worked well back in the early days, when all we had was some ASCII pr0n, Star Trek lore, and home pages of CERN employees. But over time, as more people started using the web, we craved more exciting things. For example, all the people on Geocities wanted that <blink> tag, the one that made me want to scratch out my own eyes but prevented that by triggering an epileptic fit.

Eventually, businesses got in on the action. And they had frighteningly different demands: they wanted more automation, they wanted glitz that would attract users. It wasn’t enough to have a server-side script create and deliver a page based on what’s currently present in the inventory database. No! What they wanted was User Interaction. How do you make HTML more dynamic? You have to give it the ability to self-modify. But HTML isn’t a programming language, it’s a document layout language!

Enter JavaScript. Netscape (now Mozilla) birthed a language that would allow HTML pages to self-modify and self-introspect, and respond to user interactions. People, Businesses, Everybody just ate it up. There’s probably more JavaScript out there now than code in any other language.

Not only has the introduction of JavaScript compounded the polyglot problem, it introduced a whole new class of security risks because of the self-modifying capabilities. Now, your browser actually downloads code from anywhere on the web, and happily executes it. Demand for dynamic content so overwhelmed folks at the time that nobody seriously questioned the security risks! A very common attack nowadays is the XSS attack. If you can find a way to get JavaScript onto a page (say by posting it on a web forum) then you can take control of every browser that sets eyes on that page. This was how Samy (my Hero!) took out MySpace.
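
Here is a toy sketch of that forum scenario in Python. Everything in it is hypothetical (the payload, the function names, the tuple-based tree), and it is certainly not how the MySpace worm actually worked. It only contrasts splicing untrusted text into markup with keeping the page as a tree in which user text is just a string atom that gets escaped on output.

```python
import html

def render_naive(comment):
    # Splices untrusted text straight into markup, so the comment can add tags.
    return "<div>" + comment + "</div>"

def render_tree(node):
    """Serialize a (tag, child, ...) tuple tree; plain strings are always escaped."""
    if isinstance(node, str):
        return html.escape(node)
    tag, *children = node
    return f"<{tag}>" + "".join(render_tree(c) for c in children) + f"</{tag}>"

# Hypothetical forum post written by an attacker.
comment = "<script>steal(document.cookie)</script>"

naive = render_naive(comment)                  # script tag survives intact
safe = render_tree(("div", ("p", comment)))    # comment stays inert text
print(naive)
print(safe)
```

In the second version there is only one syntax boundary to police (strings versus lists), which is the kind of simplification I’m arguing for.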

I’m not going to spend any more space here arguing against the idea of a self-modifying document. It’s way too late for that. AJAX applications like maps and mail are way too useful.

Let’s look at where the web is currently headed. I’ve heard about using web architecture as a service. I’ve heard about using it for application delivery. The natural extension here is that your browser becomes the next operating system in a box. But is this nasty polyglot, ad-hoc model of languages and their unholy spaghetti of interaction really the way to achieve that? It’s probably going to happen anyway. We can already see how: nobody writes HTML and JavaScript now, it’s all machine generated. Generated by Rails, and other web frameworks. When machine architectures became too tedious, we stopped programming assembly. We let the compilers figure it out. Now that web programming has become a nightmare, we turn to the frameworks to save us. Let the frameworks figure it out. The frameworks have become the compilers of the web.

But if you are really just going to machine generate so much… Why stick with the crufty interfaces? Why not replace them with what I now consider the best data-interchange format of all time? Is Lisp really that bad?

There’s one other feature that Steve mentioned in his article that I haven’t addressed yet. Suppose that we decide to replace that HTML with Lisp, then what? How do we get back those dynamic pages? Well, look at that example again. Go on, look. I’ll wait.

It’s in Lisp. That means it’s potentially executable. Each of those entries, log, record, date, etc… can be a Lisp function. For HTML, we’d have the DOM structure, and each item in it would be executable. Some convenient hooks into the renderer, and your Lispified HTML renders itself! Another hook, say for the script tag, and your document becomes self-modifiable! We’re missing none of the dynamic content, just making it easier to parse and manipulate. I think if we switched we could build cathedrals on this stuff!
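
To show what "each tag is a function" might look like, here is a minimal sketch of an evaluator, written in Python with nested lists standing in for Lisp lists. The tag functions and the tiny document are invented for illustration, and the script entry is only a loose stand-in for a real hook: it computes content at evaluation time, whereas true self-modification would need access to the tree itself.

```python
def evaluate(form, env):
    """Evaluate a nested-list document: the head names a function, the rest are arguments."""
    if not isinstance(form, list):
        return form  # atoms (strings, numbers) evaluate to themselves
    head, *args = form
    # Children are evaluated first, then handed to the function named by the tag.
    return env[head](*[evaluate(arg, env) for arg in args])

# Hypothetical tag functions; each one "renders" its already-evaluated children.
env = {
    "html": lambda *kids: "\n".join(kids),
    "h1": lambda text: text.upper(),
    "p": lambda *kids: " ".join(str(k) for k in kids),
    "em": lambda text: f"*{text}*",
    "script": lambda *nums: sum(nums),  # stand-in for dynamic content
}

document = ["html",
            ["h1", "Inventory"],
            ["p", "Widgets in stock:", ["script", 2, 3]],
            ["p", "Order", ["em", "now"]]]

rendered = evaluate(document, env)
print(rendered)
```

Swap in a different env and the same document renders to something else entirely, which is the hook-into-the-renderer idea in miniature.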

So please! What the web really needs is for this hideous architectural and syntactic nightmare to be slain like the monster it’s become! Since HTML really started as a document encoding format that focused on hierarchical structure, there’s no reason we can’t switch this to Lisp, like in Yegge’s logfile example. We lose none of the structure, and gain in simplified syntax. We lose none of the functionality, and gain enormously in our ability to parse, manipulate, and transform the document. Further, since Lisp is so elegant, we can also do more of the analyses required for securing, optimizing, and JIT-compiling.