May 2012
M T W T F S S
« Apr    
 123456
78910111213
14151617181920
21222324252627
28293031  

Categories

Archives

Information Uncertainty Principle

Through Denning’s Presentation Great Principles of Computing I heard of this fascinating tale regarding Buridan’s ass.

It refers to a hypothetical situation wherein an ass is placed precisely midway between a stack of hay and a pail of water. Since the paradox assumes the ass will always go to whichever is closer, it will die of both hunger and thirst since it cannot make any rational decision to choose one over the other.

What’s more interesting than the philosophy behind the donkey’s dilemma, are the actual applications:

Leslie Lamport calls this result Buridan’s principle, and states it as: A discrete decision based upon an input having a continuous range of values cannot be made within a bounded length of time.

A version of Buridan’s principle actually occurs in electrical engineering. Specifically, the input to a digital logic gate must convert a continuous voltage value into either a 0 or a 1 which is typically sampled and then processed. If the input is changing and at an intermediate value when sampled, the input stage acts like a comparator. The voltage value can then be likened to the position of the ass, and the values 0 and 1 represent the bales of hay. Like the situation of the starving ass, there exists an input on which the converter cannot make a proper decision, resulting in a metastable state. Having the converter make an arbitrary choice in ambiguous situations does not solve the problem, as the boundary between ambiguous values and unambiguous values introduces another binary decision with its own metastable state.

I think that it might be possible to turn this into an Information uncertainty principle, analogous to that of the physicist’s quantum uncertainty principle.

Specifically, an observer of a Turing machine calculation can only be in one of two scenarios:

  1. uncertain about the decision a computation will take within a given (determined) time, or
  2. certain about the decision, but uncertain about the time it takes to decide.

If the knowledge trade-off between two extremes can be quantified, in the same way that ħ quantifies the quantum uncertainty trade-off between position and momentum, we might really have something substantial!

Why Grading Sucks

It’s not just that the student provided solutions are unexpectedly “creative” and don’t fit the designed rubric. Nor that they make you wonder “Why am I even trying?”. Nor that they invoke thoughts such as “How did we ever make it as a species?” and “Dear God, I hope this kid never writes the code for my airplane or pacemaker!”. Nor that the responses drive one to despair and alcoholism. No, that’s not the worst part.

The worst part is that, according to some cogsci research that I picked up from the David Rock’s Google Tech Talk about his book Your Brain at Work, our brains have a very limited capacity for logical processing. So, when you have a rubric of a dozen items you’d like to see in each student solution, by the time that you’ve got to the 3rd submission you’ve executed at least 12 * 3 = 36 boolean decisions. Unless you’ve been training for exactly this activity (so well that you aren’t using the prefrontal cortex) you’ve worn out your hardware. Yet, there are still 30 to 70 solutions to grade!

It’s the limited capacity of the rational mind that makes grading so difficult.

Segregate Third-Party JS Libraries

Typically, web authors simply load whatever library they’d like to use with full trust. In JS, such loading amounts essentially to a #include. I’m flabbergasted that this practice remains normal. It could be paranoia, but even without invoking all the security concerns, I’d be reluctant to include other people’s code simply because of the potential for a naming conflict on a global variable.

But, recently, in reviewing a new paper about JS information flow security, I had an interesting thought. (I should credit this thought to my co-worker, Christoph Kerschbaumer, because it was his phrasing of a bullet point that made me question existing practice.) Why don’t we use iframes to segregate the included code?

Currently, that can’t be done because the inter-iframe communication channel is hideous: uses fragment identifiers in a url to re-navigation a frame, which continuously polls the document url looking for that update. Of course re-navigation has it’s own security issues[Securing Frame Communication in Browsers]. This mechanism just operates too slowly and unreliably to use with a library.

Additionally, there’s a strong reason to keep iframes completely separate from each other: advertisements. Where a JS library is third-party code that you want to use, advertisement code is something that you expressly don’t want polluting namespace or accessing your variables and memory. Because of syndication, it’s also code that you shouldn’t trust in any way. iframes currently provide the best mechanism to segregate an advertisement from the rest of the page.

But! what if the iframes came equipped with a communication channel specifically meant for inter-frame communication? This channel, should of course come with a mechanism to attach security monitors so that each use can be customized for an appropriate level of security. The presence of the channel should act also as a telephone: that is, JS on either side, should be able to opt-out, and not answer a request for communication from the transmitter. The channel will need to be high-bandwidth, so that it can be used with libraries. And the segregation can use either message-passing semantics [good for memory separation (parallelism), might cost copies during communication (c.f. erlang for how to optimize this)] or reference-passing semantics [less communication overhead, bad for memory separation, potential for abuses: library might access complete data structure via referential transitivity].

I think the advantages here are pretty cool:

  • If the browser can execute iframes in separate memory spaces, perhaps this could be a way to have parallel code execution.
  • The iframe keeps untrusted JS behind a monitored firewall.
  • The ability to attach a security monitor to the channel should mollify most complaints about cross-iframe script contamination.
  • Talking to an iframe can have its own linguistic support by creating a JS object with overloaded getter/setter that wraps the communication. Calling library methods should look nearly the same as before, modulo a prefix for the interface (which is good, more like including a module).

Update Tue Mar 13 20:45:34 PDT 2012: Christoph alerted me that I have to be more careful in my reading. Indeed, Barth’s paper (cited above) actually contains exactly this proposal, as well as an implementation (adopted in WebKit), postMessage. The remaining difficulty lies in library adoption. Many JS libs now have to be rewritten to use the new communication mechanism (or clients have to care enough to write wrappers).

Editor Wars

Normally, I use vi for most of my editing work. But, I’ve been hearing much about emacs and its ability to do a better job at syntax highlighting, code completion, spell checking, even a writegood-mode for detecting passive voice. Emacs might be a pretty good editing OS, but it doesn’t come with an editing language. Here’s what I mean:

The beginning of understanding the Zen of Vi comes when you realize that you are not memorizing key-bindings, but rather, you are learning a language.

What a beautiful concept! You are not memorizing that “d$” means “delete from here to the end of the line”. You are instead learning how to tell Vi that you want to delete (“d”) from here to the end of the line (“$”). The operation command, “d”, can be seen as a verb, while the target of the operation, the end of the line, “$”, can be seen as the object.

Because these are “movements” they can also be used as subjects for other “statements.”

Searching forwards or backwards are movements in vi. Thus they can also be used as “subjects” in our “statements.”

In addition to “verbs” and “subjects” vi also has “objects” (in the grammatical sense of the term).

This notion of “prefixes” also adds the analogs of grammatical “adjectives” and “adverbs’ to our text manipulation “language.” Most commands (verbs) and movement (verbs or objects, depending on context) can also take numeric prefixes.

I can’t help but think, that for all the verbiage spilled about how Lisp encourages solving your problem in a domain specific language you create just for that purpose[Clementson's Blog, The Lisp Difference, John Foderaro, Lisp is a Chameleon, Paul Graham, Beating The Averages] How did this concept of an editing language elude the author of Emacs?

So I read what RMS had to say about Emacs’s beginnings. It didn’t start out with Lisp, but rather was an interpreter written for TECO, which itself did have an editing language. But RMS describes it as “extremely ugly …, as ugly as could possibly be.” After adding several extension to TECO RMS observes “the language that you build your extensions on shouldn’t be thought of as a programming language in afterthought; it should be designed as a programming language. In fact, we discovered that the best programming language for that purpose was Lisp.”

I’m still flummoxed. Why, after seeing that TECO was an editing language, were the Emacs commands not created likewise? Why is delete-char a separate keystroke, rather than a command followed by a command? Why are Emacs movement commands flung over the keyboard according to name, rather than position? If you find you need Lisp to design an extension language, why didn’t you also see that editing itself should be a language? Why did this not escape Bill Joy, who was implementing vi in C, so that it could be used over a 300 baud modem?

Illegal Everything

John Stossel (I’m really starting to like this guy) recently did an episode titled “Illegal Everything” in which he make the following observations/arguments:

  • Legalizing drugs does not lead to more use, but does lead to less violence. (example: Portugal, as can be read about in Scientific American, and Wikipedia).
  • Cops have raided food co-ops for selling ‘raw milk’, and shut down children lemonade stands; for ‘not having the proper permits’.
  • Legalizing prostitution actually benefits the workers with legal protection, leading to less violence.
  • That the EPA and other government organizations operate with large budgets, and are too difficult for the average person to fight. (example: a couple wishing to build a house on their own property stopped by the EPA over accusations of building on a wetland, which it wasn’t.)
  • It usually starts by people arguing for big-brother oversight of morality (as in drug use and prostitution) or for professionalism and regulation (as in cab drivers requiring $1M medallions). Licencing serves to prevent new comers from entering the market.
  • New laws create power for activist government agencies, but almost never get repealed.

I have one question America: Are you going to let FULL TIME legislatures create laws which make so many rules that everyone becomes a criminal? So that, for anyone perceived as a threat or nuisance to their authority, can have charges trumped up by combing the books? So that, what you are comfortable doing today, makes you a criminal in the future?

What you should know about type systems

I read today, a really good essay on type systems. It covers what they can be used for and the differences between dynamic and static systems (although the word ‘type’ is used in both, it doesn’t mean the same thing at all.)

Even though I have not used Haskell, ML, or any of the other languages which make a compelling case for strong, static typing; I have been favoring such systems for a long time. I’m reaching the point in my education about programming practice, where I want the computer to do as much reasoning as possible about my programs. So any system of logic, which the computer can use to reason about my program, seems to me of enormous benefit. Especially so, when I can extend that system (by creating my own types) to prove properties of my programs that I’m worried about.

The article gives some really good examples, that serve to show how powerful a computer reasoning system can be, how much it can help you to know:

  • http://wiki.di.uminho.pt/twiki/pub/Personal/Xana/WebHome/report.pdf. Uses types to ensure that the correct kinds of data are gotten out of a relational database. Via the type system, the compiler ends up understanding how to work with concepts like functional dependencies and normal forms, and can statically prove levels of normalization.
  • http://www.cs.bu.edu/~hwxi/academic/papers/pldi98.pdf. Uses an extension to ML’s type system to prove that arrays are never accessed out of bounds. This is an unusually hard problem to solve without making the languages that solve it unusable, but it’s a popular one to work on.
  • http://www.cis.upenn.edu/~stevez/papers/LZ06a.pdf. This is great. This example uses Haskell’s type system to let someone define a security policy for a Haskell program, in Haskell, and then proves that the program properly implements that security. If a programmer gets security wrong, the compiler will complain rather than opening up a potential security bug in the system.
  • http://www.brics.dk/RS/01/16/BRICS-RS-01-16.pdf. Just in case you thought type systems only solved easy problems, this bit of Haskell gets the type system to prove two central theorems about the simply typed lambda calculus, a branch of computation theory!

And just like any good research paper, it ends with a series of open questions:

  1. For what interesting properties of programs can we build static type systems?
  2. How close can we bring those type systems to the unattainable ideal of never rejecting a correct program?
  3. How easy can it be made to program in a language with such a static type system?
  4. What is the cost associated with accidentally rejecting a correct computer program?
  5. For what interesting properties of programs can we build test suites via dynamic typing?
  6. How close can we bring those test suites to the unattainable ideal of never accepting a broken program?
  7. How easy can it be made to write and execute test suites for programs?
  8. What is the cost associated with accidentally accepting an incorrect computer program?

Grading

Ugh. I spent too much time today and yesterday doing grading. I shouldn’t be this slow at it. Here’s the general process:

Look at 5–10 submissions and note what are the most common errors. From this, a rubric can be generated. Then, comb through all assignments and dock points according to the rubric. Even after looking at only a small sample, almost every submission falls within the guidelines of the rubric.

However, in the interests of providing good feedback, each item in the rubric corresponds to a line (or small paragraph) explaining the student’s mistake.

So, here’s what I need: An application which has a grid (perhaps of checkboxes) along one axis the class roster, along the other the rubric. My job is to click checkboxes corresponding to each item in the rubric. The application can then calculate both the grade for that assignment, and generate a feedback report for the student.

The closest thing that I could find: http://androidforacademics.com/grade-rubric/

Tricks of the Trade

Today, Howard Rheingold was searching for examples of clear posts in which to use as examples of online communication. One of the respondents identified a really good series set of descriptions answering the question, What is it like to have an understanding of very advanced mathematics? (Interestingly Terry Tau is referenced quite often.)

That reminded me of some things that I’ve read previously, first from Feynman:

One day he told me to stay after class.”Feynman,” he said, “you talk too much and you make too much noise. I know why. You’re bored. So I’m going to give you a book. You go up there in the back, in the corner, and study this book, and when you know everything that’s in this book, you can talk again.”

So every physics class, I paid no attention to what was going on with Pascal’s Law or whatever they were doing. I was up in the back with this book: Advanced Calculus, by Woods. [He] knew I had studied Calculus for the Practical Man a little bit, so he gave me the real works – it was for a junior or senior course in college. It had Fourier series, Bessel functions, determinants, elliptic functions – all kinds of wonderful stuff I didn’t know anything about.

That book also showed how to differentiate parameters under the integral sign – it’s a certain operation. It turns out that’s not taught much in the universities; they don’t emphasize it. But I caught on how to use that method, and I used that one damn tool again and again. So because I was self-taught using that book, I had peculiar methods of doing integrals.

The result was, when the guys at MIT or Princeton had trouble doing a certain integral, it was because they couldn’t do it with the standard methods they had learned in school. If it was contour integration, they would have found it; if it was a simple series expansion, they would have found it. Then I come along and try differentiating under the integral sign, and often it worked. So I got a great reputation for doing integrals, only because my box of tools was different from everybody else’s, and they had tried all their tools on it before giving the problem to me.

[Feynman, R. P. "A Different Set of Tools." In 'Surely You're Joking, Mr. Feynman!': Adventures of a Curious Character. New York: W. W. Norton, 1997]

And second about Paul Erdos, which I read in [Bruce Schechter. "My Brain is Open: The Mathematical Journeys of Paul Erdos"], but for which I don’t have an exact quote. Erdos apparently used the same few tricks to solve hundreds of problems. I think his success in publishing was likely related to his ability to encounter (and remember) many thousands of problems. Using that strategy, not only might you increase the number of tools in your box; but you’ll definitely encounter problems which your set of tools can solve (as Feynman encountered above).

Generalizing from these two examples: Most trades with some sort of detailed work, develop a large body of ‘Tricks of the Trade’. For example, Physicists/Engineers discovered that if you wish to identify whether a joint of a water carrying pipe might be leaking, it helps to wrap an alka seltzer tablet in the tape seal around the joint. That way, should water leak, the seltzer will make it noticeable. And it’s not just the Engineering/Math/Physics: I’m willing to believe that even Lawyers develop a standard set of arguments and pattern match which arguments are best for which situations.

Computer Scientists collect similar tricks, but we codify (pun!) them into terminology: virtual dispatch, multi-method, closure, trampoline, call with continuation, worklist queue. And write up books about the more general ones: Design Patterns [GoF].

Introverts

Even though extroverts make up the majority of people on this pale blue dot, I happen to feel comfortably at home in academia. The profession seems overwhelmed with introverts. It’s not as if we are shy, or afraid of interaction, or uncomfortable around people. Rather it’s that we prefer the quiet solitude of our own mind. Scientific American’s article, The Power Of Introverts, has picked up an author, Susan Cain, who focuses on this issue; and nicely explains the subtlety of introversion. I found most fascinating three points that Cain mentioned:

  1. “It’s never a good idea to organize society in a way that depletes the energy of half the population. We discovered this with women decades ago, and now it’s time to realize it with introverts.”
  2. “Forty years of research shows that brainstorming in groups is a terrible way to produce creative ideas. … An interesting line of research by the psychologists Mihaly Csikszentmihalyi and Gregory Feist suggests that the most creative people in many fields are usually introverts. This is probably because introverts are comfortable spending time alone, and solitude is a crucial (and underrated) ingredient for creativity.”
  3. “According to groundbreaking new research by Adam Grant, a management professor at Wharton, introverted leaders sometimes deliver better outcomes than extroverts do. Introverts are more likely to let talented employees run with their ideas, rather than trying to put their own stamp on things. And they tend to be motivated not by ego or a desire for the spotlight, but by dedication to their larger goal.”

As I sit content and comfortably alone on a Monday holiday, I feel solace in articles like this. Too bad I feel the extroverts will hardly ever take the time (a) to read enough to find such an article and (b) attempt the understanding / attempt to relate.

Great Presentations

Back when my interest in teaching was only just a spark on the kindling of my brain, I started to investigate the tricks of communication. Up to that point in my life I had completely underestimated the importance of communication. I thought, as any myopic engineer would, that sound ideas will speak of their own merit, that they do not need to be advertised. Oh, how wrong I was. All we have to do is read about the merits the second oldest language from 1958, or to watch the Mother of all Demos from 1968, to get visceral sense of outrage: The best ideas remain sadly unadopted!

As an educator, I think the secret is to tell stories. Tell stories about what problem you had, and how you solved it. Put emotion into your technical communication, other researchers share your problems. Tell a story so that they can identify! Make each listener the protagonist of your tale. Nancy Duarte: The secret structure of great talks