18 Jul, 2017

Of Airbags and Modeling, Part 0

by Stefan Edwards

I was in a car accident the last year, and was talking with our CEO Jack after the fact. He asked if the air bags had deployed, which I said they didn’t (in fact, if they had, I probably wouldn’t have been injured). Jack responded with:

That’s the thing with air bags, you assume they work and they’ll save your life until they don’t.

Now, being the jerk that I am, I responded:

Basically like all security controls?

However, Jack actually kicked off another round of my philosophical questioning, which I plan on torturing you with through a series of articles beginning with this one.

An Epistemology of Code

I dual majored Computer Science & Philosophy in uni (university). I enjoyed both subjects equally, and going to a small liberal arts uni meant that I could soak in large amounts of both with all the spare time I was granted. One of the first questions a professor asks when teaching philosophy usually is along the lines of:

How do you know what you know, and how do you know if it’s true?

Usually first-year students flounder and provide all sorts of answers, whilst upper-class students laugh for a moment before getting the thousand-yard-stare and mention all the problems with this.

Despite the philsophical background, we can actually apply this to CS as well:

How do you know what the code you wrote does, and how do you know if that’s correct?

Let’s take a simple example:

int
f(int a, int b, int c, int d) {
    if(a <= b) {
        return foo(a + c, b, c, d + a);
    } else {
        return d;
    }
}

What does this program-snippet actually do, and how do you know what it’s supposed to do?

Usually I get answers similar to:

Read it! Clearly looping around a & b, with c as a step & d as an accumulator!

or

Rewrite it! Remove recursion, replace it with an explicit loop, and rename variables!

Ok, let’s take the second approach for the moment:

int
accumulator(int val, int max, int step) {
    int accum = 0;
    while( val <= max) {
        accum += val;
        val += step;
    }
    /* The above could just as easily be a `for` loop:
     * for(; val <= max; val += step, accum += val) ;
     * ... right?
     */
    return accum;
}

Now this is clear, right? We’re clearly trying to accumulate values, the variables are named as a human can understand them, and it works as we expect:

f(0, 10, 1, 0); // returns 55
accumulator(0, 10, 1); // also returns 55

Clearly, no one can misunderstand how this function is to be used…

accumulator(0, 10, -1);

Oh, bother. Clearly, we could make some of those ints into uints, and types are a great way to signal programmer intent (I’ve talked about this internally at nVisium, and I’ll publish something about it here as well), but we’re still not sure what this program is meant to do, and code alone isn’t useful per se.

Whither Modeling?

This notion of program correctness started in at least the late 1960’s with Tony Hoare’s An Axiomatic basis for computer programming, which was inspired by Robert Floyd’s work on program flowcharts in Assigning Meaning to Programs. Hoare’s work lead to Hoare Logic. Very roughly, Hoare logic allows us to state:

  • What are the preconditions to running this program?
  • What are the post-conditions of running this program?
  • What are some things that never change whilst running this program?
  • What is the result of running this program?

Let’s markup our previous example with some Hoare-style annotations:

//@ requires val >= 0 && step > 0 && max_step >= 0
//@ ensures \result > max_step 
int
accumulator(int val, int max_step, int step) {
    int accum = 0;
    while( val <= max) {
        accum += val;
        val += step;
    }
    return accum;
}

So, we’re informing our model checker that val and max_step must be greater than or equal to 0, and that step must be greater than 0. Indeed, if we attempt to use:

accumulator(0, 10, -1);

Our extended checker will note that step > 0 fails when step = -1. In effect, by adding some minor annotations to our program, we can model our program’s inputs and outputs, and check those relationships throughout. Note too that this is purely static; the model can warn on things that appear to violate the model as well, such as unbounded integers used from user input.

Next Steps

As we’ll see in part 1 (the second part) of the series, this level of modeling can be used to enforce quite a number of security properties, as well as general “program correctness” proofs. Then, once we’ve gotten used to modeling our programs, we’ll continue this series into modeling our architectures and indeed our entire infrastructure. Until next time!