The Data Flow Analyzer
Having established that the bytecodes are syntactically correct, the bytecode verifier now has the task of analyzing the runtime behavior of the code (within the limitations we examined in
See An Incompleteness Theorem for Bytecode Verifiers
).
To perform this analysis, the bytecode verifier has to keep track of two pieces of information for each instruction:
-
The status of the stack prior to executing that instruction in the form of the number and type of items on the stack.
-
The contents of local variables prior to executing that instruction. Only the type of each local variable is tracked. The value is ignored.
Where types are concerned, the analyzer does not need to distinguish between the various normal integer types (byte, short, char) since, as we discuss in
See Java Bytecode
, they all have the same internal representation.
The first stage is the initialization of the data flow analyzer:
-
Each instruction is marked as unvisited. That is, the data flow analyzer has not yet examined that instruction.
-
For the first instruction, the stack is marked as empty and the local variables corresponding to the method's arguments are initialized with the appropriate types.
-
All other local variables declared as used by the method are marked as containing illegal values.
-
The "changed" bit of the first instruction is set, indicating that the analyzer should examine this instruction.
Finally, the data flow analyzer runs, looping through the following steps:
Find a virtual machine instruction whose "changed" bit is set.
-
If no instruction remains whose "changed" bit is set, the method has successfully been verified; otherwise turn off the changed bit of the instruction found and proceed to step 3.
-
Emulate the effect of this instruction on the stack and local variables:
-
If the instruction uses values from the stack, ensure that there are sufficient elements on the stack and that the element(s) on the top of the stack are of the appropriate type.
-
If the instruction pushes values onto the stack, ensure that there is sufficient room on the stack for the new element(s) and update the stack status to reflect the pushed values.
-
If the instruction reads a local variable, ensure that the specified variable contains a value of the appropriate type.
-
If the instruction writes a value to a local variable, change the type of that variable to reflect that change.
-
Determine the set of all possible instructions which could be executed next. These are:
-
The next instruction in sequence if the current instruction isn't an unconditional goto, a return, or a throw.
-
The target instruction of a conditional or unconditional branch.
-
The first instruction of all exception handlers for this instruction.
-
For each of the possible following instructions, merge the stack and local variables as they exist after executing the current instruction with the state prior to executing the following instruction. In the exception-handler case, change the stack so that it contains a single object of the exception type indicated by the exception handler information. Merging proceeds as follows:
-
If the stacks are of different sizes then this is an error. Stop!
-
If the stacks contain exactly the same types, then they are already merged
-
If the stacks are identical other than having differently typed object references at corresponding places on the stacks then the merged stack will have this object reference replaced by an instance of the first common superclass or common superinterface of the two types. Such a reference type always exists because the type
Object
is a supertype of all class and interface types.
-
If this is the first time the successor instruction has been visited, set up the stack and local variable values using those calculated in Step 2 and set the "changed" bit for the successor instruction. If the instruction has been seen before, merge the stack and local variable values calculated in Step 2 and Step 3 into the values already there; set the "change" bit if there is any modification.
-
Go to Step 1.
If the data-flow analyzer runs on the method without reporting any failures, then the method has been successfully verified by Pass 3 of the class file verifier.