An Incompleteness Theorem for Bytecode Verifiers

The bytecode verifier is a key component of Java security. Practical bytecode verifiers divide bytecode programs into three classes: those that will not cause problems when they run, those that will cause problems when they run, and those where the verifier is not certain. You can improve a bytecode verifier by reducing its area of uncertainty. Can you eliminate uncertainty completely? Can you build a complete bytecode verifier that determines whether a program is safe or not before it runs?

The answer is no, you cannot. It is mathematically impossible. This short chapter shows why. 1

To demonstrate this, we focus on one aspect of bytecode verification, stack-underflow checking. This involves determining whether a bytecode program will underflow the stack, by removing more items from it than were ever placed on it. Then we use the argument known as reductio ad absurdum. We assume that there is a complete stack-underflow checker and show that this assumption leads to a contradiction. This means that the assumption must have been false - a complete stack-underflow checker is impossible. Since a complete bytecode verifier must contain a complete stack-underflow checker, a complete bytecode verifier is impossible too.

Suppose then that there is such a thing as a complete stack-underflow checker. We write a method in standard Java bytecode which takes as its argument the name of a class file and returns the value true if the specified class file does not underflow the stack, and false if it does. 2 We call this method doesNotUnderflow( ).

We now consider the bytecode program Snarl, whose main method contains:

if doesNotUnderflow( classFile )

while true pop( ); // thus underflowing Snarl's stack

else

{ } // exiting gracefully

pop( ) - which removes the top element from the stack - may not be pure Java, but can certainly be written in bytecode. The bytecode program Snarl is compiled into the class file Snarl.class. 3

What happens if we give Snarl itself as a parameter? The first thing it does is to invoke the method doesNotUnderflow on Snarl.class:

This contradiction means that there could never have been a method doesNotUnderflow which worked for all class files. The quest for a way of determining statically that a class would behave itself at run time was doomed. Complete checking for stack underflow must be done at runtime if it is to be done at all.

This result can be generalized and applied to any aspect of bytecode verification where you try to determine statically something that happens at runtime. So all bytecode verifiers are incomplete. This does not, of course, mean that they are not useful - they contribute significantly to Java security - nor that they cannot be improved. It does mean, however, that some checking has to be left until runtime.

 


1. The problem has been deliberately stated in terms that mathematicians may recognize as being similar to the halting problem. The proof, a diagonalization argument, follows the flow of Christopher Strachey's halting-problem proof (Computer Journal 1967).

2. We have here used Church's Thesis, which states that a programming language (such as the Java bytecode language) which can code a Turing machine can code any computable function.

3. Snarl is a pretty nasty piece of programming, and most practical bytecode verifiers would reject it out of hand. The reason for this is that while true pop ( ); is disastrous if executed and has no practical purpose; a good rule of thumb is to leave it out. But there's nothing invalid about Snarl - if we really have finite bytecode for the method doesNotUnderflow( ), then we can readily construct the bytecode for Snarl - and doesNotUnderflow( ), being complete, has no need for rules of thumb.