The Bytecode Verifier in Detail

The first stage of the bytecode verifier process is the identifying of bytecode instructions and their arguments. This operation is completed in two passes. The first pass locates the start of each instruction and stores it in a table. Having found the start of each instruction, the verifier makes a second pass, parsing the instructions. This involves building a structure for each instruction, storing the instruction and its arguments. These arguments are checked for validity at this point. Specifically:

The Data Flow Analyzer

Having established that the bytecodes are syntactically correct, the bytecode verifier now has the task of analyzing the runtime behavior of the code (within the limitations we examined in See An Incompleteness Theorem for Bytecode Verifiers ).

To perform this analysis, the bytecode verifier has to keep track of two pieces of information for each instruction:

Where types are concerned, the analyzer does not need to distinguish between the various normal integer types (byte, short, char) since, as we discuss in See Java Bytecode , they all have the same internal representation.

The first stage is the initialization of the data flow analyzer:

Finally, the data flow analyzer runs, looping through the following steps:

Find a virtual machine instruction whose "changed" bit is set.
  1. If no instruction remains whose "changed" bit is set, the method has successfully been verified; otherwise turn off the changed bit of the instruction found and proceed to step 3.
  2. Emulate the effect of this instruction on the stack and local variables:
  3. If the instruction uses values from the stack, ensure that there are sufficient elements on the stack and that the element(s) on the top of the stack are of the appropriate type.
  4. If the instruction pushes values onto the stack, ensure that there is sufficient room on the stack for the new element(s) and update the stack status to reflect the pushed values.
  5. If the instruction reads a local variable, ensure that the specified variable contains a value of the appropriate type.
  6. If the instruction writes a value to a local variable, change the type of that variable to reflect that change.
  7. Determine the set of all possible instructions which could be executed next. These are:
  8. The next instruction in sequence if the current instruction isn't an unconditional goto, a return, or a throw.
  9. The target instruction of a conditional or unconditional branch.
  10. The first instruction of all exception handlers for this instruction.
  11. For each of the possible following instructions, merge the stack and local variables as they exist after executing the current instruction with the state prior to executing the following instruction. In the exception-handler case, change the stack so that it contains a single object of the exception type indicated by the exception handler information. Merging proceeds as follows:
  12. If the stacks are of different sizes then this is an error. Stop!
  13. If the stacks contain exactly the same types, then they are already merged
  14. If the stacks are identical other than having differently typed object references at corresponding places on the stacks then the merged stack will have this object reference replaced by an instance of the first common superclass or common superinterface of the two types. Such a reference type always exists because the type Object is a supertype of all class and interface types.
  15. If this is the first time the successor instruction has been visited, set up the stack and local variable values using those calculated in Step 2 and set the "changed" bit for the successor instruction. If the instruction has been seen before, merge the stack and local variable values calculated in Step 2 and Step 3 into the values already there; set the "change" bit if there is any modification.
  16. Go to Step 1.

If the data-flow analyzer runs on the method without reporting any failures, then the method has been successfully verified by Pass 3 of the class file verifier.