Not All Flows are Considered Equal

When I was writing last post about information flow terminology, I noticed something interesting: when knowledge of control flows are used to determine the values of variables, some branches yield more information than others. Previously, I had only considered the binary if-then-else branch. Today, I shall examine a switch-case statement, which exhibits asymmetric information flow.

/* pick s from 1 to 3, inclusive */
switch(s) {
case 1: k = 1; break;
case 2: k = 2; break;
case 3: k = 2; break;
}

After this code segment has executed: k is one of two values: 1 or 2. However, one of these values is more informative for the observer. If an observer sees k == 1 then she can infer that s == 1. On the other hand, if the observer sees k == 2 then she can only infer s != 1 (or s in {2,3} with knowledge of s‘s distribution).

For control structures this simple, a static analysis could be run on the code, and provide a result graph of how influential each variable is on others. This graph could be used to quantify how much information might be leaked to an observer. Unfortunately, it comes with the limitation that each branch is considered as drawn from a uniform distribution, which may not be true of actual runtime values.

Nevertheless, I think the ability to statically quantify how much information can be leaked via control flow branches, might be useful for:

  1. deciding which variables need tracking,
  2. where instrumentation/wrappers/labels might be omitted,
  3. where inspections/checks must be inserted,
  4. finding variables which fall below a certain leak threshold.

I would really like such an analysis to tell me, for example, whether a variable holding a password is potentially leaked, and if so, how many bits. Or, to list me all the variables of my program in order of how much they might leak.

The inference here sounds like the same as in Probabilistic Predicative Programming by Eric C.R. Hehner (whom I met when Wayne Hayes invited him to give a Friday Seminar talk in Dec 2010), except that it runs backwards on the control flow.