SE Using Java, 2nd Edition -- Chapter 35

- Special Edition Using Java, 2nd Edition -

Chapter 35 Understanding the .class File

by Jordan Olin

The .class file is the fundamental unit of measure for a Java application with respect to the Java Virtual Machine (JVM). It represents a contract of sorts between a compiler and an implementation of the JVM. I mention “a compiler” versus “a Java compiler” because, as you will see, any language compiler could potentially generate .class files and Java bytecodes.

Physically, the .class file is an ordered set of bytes representing extremely dynamic structures and arrays that describe the compiled version (or runtime image) of an executable unit, called a class. Most of the components that make up the class file have a fixed structure followed by a set of variable length structures. Some pieces are mandatory, and others are optional. The important thing to keep in mind is that a process that generates a .class file must do so in the exact format and style described in this chapter. Otherwise, the JVM’s class loader and verifier will not accept the submitted .class file.

The basic structure of the .class file
The .class file is the fuel to the JVM and as such presents a well defined interface for Java compilers to generate. Understanding the elements of the .class file also helps you in using JDB to debug your applications.
The Constant Pool
The method information structure

Elements of the .class file

Keeping the class file in a byte-stream oriented format is critical for quickly loading and parsing the information it contains. Implementers of class loaders may take advantage of the stream I/O classes in Java, for example, either to read in a .class file piece by piece, parsing it as it is read, or to read it into a byte array and parse it manually.

The concept to keep in mind is that you must read in each section, in order, until its information is exhausted. You also can’t read in a section of the file without reading its “descriptive” information first. For example, a portion of the file is called the Constant Pool. The first item that you would read is the number of elements that will follow. Then, for each element you read in, a descriptor tells you the format of the next element. Finally, you read in the actual element based on its specific format.

The file itself can be broken up into logical sections:

At its highest level, the .class file represents a single compiled Java class. When you compile a Java source file (.java file), for every class defined within that one file, the compiler generates a .class file for each one.
The next level is the basic class structure. This level includes information about the class’s properties and subsections for describing the constant values collected for the class (the Constant Pool) when it was compiled, followed by the class’s interfaces, fields, methods, and class-level attributes.
Then, each subsection can be looked at independently: the Constant Pool and its elements, the table of interfaces, the field table including properties and attributes, and the method table including properties and attributes.

The information described in this chapter was gleaned from two sources:

The online reference materials at the JavaSoft web site (www.javasoft.com) and Sun Web site (www.sun.com).
Sessions at JavaOne, Sun’s Worldwide Java Developer Conference held in San Francisco in May, 1996.

Definitions

In order to fully understand the contents of the .class file, you need to first define some common structures that are used by the various sections it includes. You examine the Constant Pool, the format of a signature or type definition, and attributes.

The Constant Pool

The idea of a Constant Pool may be new to you, but constant pools have been used since the early days of compilers and runtime systems. A Constant Pool is used to contain each distinct literal value encountered while the source code for a class is being compiled. A literal value in this case may be an actual numeric value, a string literal, a class name, a type description, or a method signature.

Each time one of these literal values is encountered, the Constant Pool is searched for a matching value in order to avoid putting duplicate values into the pool. If the value was found, then its existing location in the Constant Pool is inserted into the class definition or compiled bytecode stream. If the value was not found in the Constant Pool, then it is added. At load time, the Constant Pool is placed into an array-like structure in memory for quick access. Then, as the rest of the class is loaded, and at runtime whenever a literal is needed, its value is located in the Constant Pool by its index and retrieved.
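The search-then-add behavior just described can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: the class name `ConstantPoolBuilder` and its methods are hypothetical, and it handles string values only, whereas a real compiler would also take the kind of entry into account.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the JDK or of javac.
final class ConstantPoolBuilder {
    private final List<String> entries = new ArrayList<>();
    private final Map<String, Integer> indexByValue = new HashMap<>();

    /** Returns the pool index for the literal, adding it only if it is not already present. */
    int indexOf(String literal) {
        Integer existing = indexByValue.get(literal);
        if (existing != null) {
            return existing;          // duplicate: reuse the existing location
        }
        entries.add(literal);
        int index = entries.size();   // the first stored entry gets index 1
        indexByValue.put(literal, index);
        return index;
    }

    /** Number of distinct values stored so far. */
    int size() {
        return entries.size();
    }
}
```

Asking for the same literal twice returns the same index, so the compiled class or bytecode stream only ever carries the index and never a second copy of the value.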

The use of a Constant Pool keeps the compiled class smaller, and hence it loads faster. At runtime, the Sun implementation of the JVM has a mechanism that resolves a Constant Pool reference only the first time a distinct value is needed. After that, the resolved value may be directly referenced in a special array off of the Constant Pool. The actual mechanism is supported by a special set of internal bytecodes called the _quick instructions. Because they are strictly implementation-dependent, they are not part of the formal definition of the Java bytecodes.

The Constant Pool as it is recorded in the .class file is in a very compacted format. It begins with a 16-bit unsigned integer value that is the count of elements that follow plus one. (The extra count is for the zero’th element which is only used at runtime and not included in the elements contained in the class file.) What follows the count value is a variable length array with each element being a variable length structure and no padding between elements.

Twelve different types of values and associated structures may be stored in the Constant Pool. Each structure begins with a single byte-sized integer value called a tag (see table 35.1). The tag is used to determine the format of the bytes that follow which make up the remainder of this element’s structure.

Table 35.1 Constant Pool Tags

Tag	Meaning	Note
1	`Utf8` string
2	`Unicode` string	Not used at this point.
3	`Integer` value
4	`Float` value
5	`Long` value
6	`Double` value
7	Class reference	Only refers to class name.
8	String reference
9	Field reference	Only used in bytecode stream.
10	Method reference	Only used in bytecode stream.
11	Interface Method reference	Only used in bytecode stream.
12	Name and Type reference

Now that you know the tag values, let’s look at each Constant Pool element type. All tags are one byte long, and all lengths and indexes are 16-bit unsigned integer values, unless otherwise noted.

`Tag 1: Utf8 String`

The Utf8 constant is used to represent Unicode string values in as small a representation as possible (see table 35.2). In a Utf8 string, a character will use from 1–3 bytes depending on its value. It is very oriented towards ASCII values in that all non-null ASCII characters will fit in a single byte. The class file depends heavily on this Constant Pool entry type in that all actual string values (including class, field, and method names, and types and method signatures) are stored in Utf8 constants.

Table 35.2 Utf8 String Constant

Field	Number of Bytes	Value
Tag	1	1
Size	2	Length in bytes of the `Utf8` string
Data	(Size)	The actual `Utf8` string
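As a small sketch of how such an entry might be read, the standard `java.io.DataInputStream.readUTF` method happens to expect the same layout as the Size and Data fields: a two-byte length followed by that many bytes of encoded characters. The class and method names below (`Utf8EntryReader`, `readUtf8Entry`) are made up for this example, and the input is assumed to be a byte array holding exactly one entry.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only.
final class Utf8EntryReader {
    /** Decodes one Constant Pool entry that is expected to carry tag 1 (Utf8). */
    static String readUtf8Entry(byte[] entry) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            int tag = in.readUnsignedByte();       // Tag field, one byte
            if (tag != 1) {
                throw new IllegalArgumentException("not a Utf8 entry, tag = " + tag);
            }
            return in.readUTF();                   // Size (2 bytes) followed by Data
        } catch (IOException e) {
            throw new UncheckedIOException(e);     // truncated or malformed entry
        }
    }
}
```

For example, the six bytes `01 00 03 66 6F 6F` decode to the string `foo`.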

`Tag 2: Unicode String`

The Unicode constant is intended to hold an actual Unicode string, but is not used in the class file itself (see table 35.3). It may be used internally to hold a true Unicode string at runtime. Its format is similar to the Utf8 constant, but each character is a true 16-bit Unicode character.

Table 35.3 Unicode String Constant

Field	Number of Bytes	Value
Tag	1	2
Size	2	Number of characters in the Unicode string.
Data	(Size * 2)	The actual Unicode string.

`Tags 3 and 4: Integer and Float Values`

The Integer and Float constants are used to hold integer and float constant values, respectively, that may be used as initializers to fields or variables, as well as hard-coded literal values within a Java statement (see table 35.4).

Table 35.4 Integer or Float Constant

Field	Number of Bytes	Value
Tag	1	3 for `Integer`; 4 for `Float`
Data	4	Actual integer or float value in big-endian (MSB first) order.

`Tags 5 and 6: Long and Double Values`

The Long and Double constants are used to hold long and double constant values, respectively, that may be used as initializers to fields or variables, as well as hard-coded literal values within a Java statement (see table 35.5). For internal reasons, each Long and Double constant uses up two elements in the Constant Pool. So, if a Long constant starts at Constant Pool location 4, then the next constant would be placed in location 6.

Table 35.5 Long or Double Constant

Field	Number of Bytes	Value
Tag	1	5 for `Long`; 6 for `Double`
Data	8	Actual long or double value in big-endian (MSB first) order.

`Tag 7: Class Reference`

The Class Reference constant is an indirection that is used to refer to the actual literal name of a class (see table 35.6). All class names used within the .class file are referred to in this way, except when used in a field, variable, argument, or return type declaration (see the upcoming section “Type Information”). Also, because arrays are objects in Java, all array references are based on a Class Reference constant.

Table 35.6 Class Reference Constant

Field	Number of Bytes	Value
Tag	1	7
Index	2	Location of a `Utf8` string in the Constant Pool containing the fully qualified class name.

`Tag 8: String Reference`

The String Reference is another indirection that is used whenever an actual string literal was encountered in the class definition or bytecode stream (see table 35.7). This string could have been used as an initializer to a String variable, directly in a Java expression, or perhaps as an argument to a method call.

Table 35.7 String Reference Constant

Field	Number of Bytes	Value
Tag	1	8
Index	2	Location of a `Utf8` string in the Constant Pool containing the actual string value.

`Tags 9, 10, and 11: Field, Method, and Interface Method Reference`

The Field, Method, and Interface Method reference constants are used within the compiled Java bytecode stream in order to dynamically reference a field or method that resides in another class or interface (see table 35.8). The Class Reference is used to dynamically load in the referenced class, and the Name and Type Reference is used to find the specified field to use or method to call.

Table 35.8 Field, Method, and Interface Method Reference Constant

Field	Number of Bytes	Value
Tag	1	9 for Field; 10 for Method; 11 for Interface Method Reference.
Class Index	2	Location of a Class Reference in the Constant Pool naming the class or interface that contains the referenced field or method.
Name/Type Index	2	Location of a Name and Type Reference in the Constant Pool describing a field or method.

`Tag 12: Name and Type Reference`

The Name and Type Reference is used to hold the actual name of a field, variable, method, or argument and its associated type or signature (see table 35.9). These constant types are used anywhere fields, variables, methods, or arguments are defined and used. See the following section “Type Information” for the exact format of the contents of the Description field.

Table 35.9 Name and Type Reference Constant

Field	Number of Bytes	Value
Tag	1	12
Name Index	2	Location of a `Utf8` string in the Constant Pool containing the name of a field, `var`, `arg`, or method.
Description Index	2	Location of a `Utf8` string in the Constant Pool containing the Name’s type or signature.
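Putting the entry layouts from tables 35.2 through 35.9 together, a reader can walk the whole Constant Pool by looking at each tag and skipping the right number of bytes. The following is a minimal sketch, not a full parser: the class name `ConstantPoolScanner` is hypothetical, the buffer is assumed to be positioned at the 16-bit count, and the entries are only skipped, not decoded or validated. `java.nio.ByteBuffer` reads big-endian values by default, which matches the file format.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
final class ConstantPoolScanner {
    /**
     * Returns the tag found at each pool index. Index 0 and the second
     * slot used up by a Long or Double constant are left as 0.
     */
    static int[] scanTags(ByteBuffer buf) {
        int count = buf.getShort() & 0xFFFF;      // number of stored entries plus one
        int[] tags = new int[count];
        for (int i = 1; i < count; i++) {
            int tag = buf.get() & 0xFF;
            tags[i] = tag;
            switch (tag) {
                case 1:                            // Utf8: Size, then Size bytes
                    skip(buf, buf.getShort() & 0xFFFF);
                    break;
                case 2:                            // Unicode: Size, then Size * 2 bytes
                    skip(buf, (buf.getShort() & 0xFFFF) * 2);
                    break;
                case 3: case 4:                    // Integer, Float
                    skip(buf, 4);
                    break;
                case 5: case 6:                    // Long, Double
                    skip(buf, 8);
                    i++;                           // these use up two pool locations
                    break;
                case 7: case 8:                    // Class and String references
                    skip(buf, 2);
                    break;
                case 9: case 10: case 11: case 12: // two 16-bit indexes each
                    skip(buf, 4);
                    break;
                default:
                    throw new IllegalArgumentException("unknown tag " + tag + " at index " + i);
            }
        }
        return tags;
    }

    private static void skip(ByteBuffer buf, int byteCount) {
        buf.position(buf.position() + byteCount);
    }
}
```

Note how a Long or Double advances the index twice, so that a Long at location 2 is followed by the next constant at location 4.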

Type Information

In order to have a consistent way of describing the data types of fields, variables, arguments, and the signatures of methods, the .class file uses a very abbreviated notation. Essentially, each native type known by the JVM is represented by a single-character shortcut for its full name, with classes and arrays denoted by a special character for modification. Each type and signature shortcut is kept in a Utf8 formatted string in the Constant Pool. For the type of a field or variable, it is just a single type description; for a method signature, it is a series of type descriptions put together with the arguments first (in order, surrounded by parentheses) followed by the shortcut for the method’s result type.

Table 35.10 shows the abbreviated type name followed by its real data type.

Table 35.10 Data Type Abbreviations Used by the .class File

Abbreviation	Java Type	Notes
B	`byte`
C	`char`
D	`double`
F	`float`
I	`int`
J	`long`
S	`short`
Z	`boolean`
V	`void`	Only used for methods.
L<classname>;	`class`	The capital letter L followed by a fully qualified class name terminated by a semicolon. Note that forward slashes, not periods, are used to delimit the actual package name tokens for the class name.
[	Array dimension	An open-bracket is used to denote each dimension of an array.

In order to see how these abbreviations are used, take a look at listing 35.1. You define a simple Java class, and for each variable and method, you put its shorthand version in a comment.

Listing 35.1 Short-Hand Types and Signatures

class foo {
    // TYPE AND FIELD NAME            // SHORT-HAND VERSION
    int simpleInt;                    // I
    boolean simpleBool;               // Z
    float[] floatArray;               // [F
    char[][] twoDimCharArray;         // [[C
    String[][][] threeDimStringArray; // [[[Ljava/lang/String;
                                      // Note the use of slashes here

    void DoSomething( long arg1, double[][] arg2 ) { }
    // (J[[D)V
    // Two arguments, a long and a two-dimensional double array, returning nothing

    java.net.Socket OpenSocket( String hostname, int port ) { return null; } // placeholder body so the class compiles
    // (Ljava/lang/String;I)Ljava/net/Socket;
    // Two arguments, a String object and an integer, returning a Socket object

    void NoArgsNoResult( ) { }
    // ()V
    // No arguments, returning nothing
}
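Going the other way, from the short-hand back to readable Java types, is a matter of walking the string one character at a time. The sketch below is one possible way to do it; `DescriptorDecoder` and its two methods are hypothetical names chosen for this example, and the output format for methods (result type followed by the argument list) is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class DescriptorDecoder {
    private final String s;
    private int pos;

    private DescriptorDecoder(String s) { this.s = s; }

    /** Decodes a single field or variable type, e.g. "[[C" becomes "char[][]". */
    static String fieldType(String descriptor) {
        DescriptorDecoder d = new DescriptorDecoder(descriptor);
        String type = d.nextType();
        d.expectEnd();
        return type;
    }

    /** Decodes a method signature, e.g. "(J[[D)V" becomes "void (long, double[][])". */
    static String methodSignature(String descriptor) {
        DescriptorDecoder d = new DescriptorDecoder(descriptor);
        if (descriptor.isEmpty() || descriptor.charAt(0) != '(') {
            throw new IllegalArgumentException("missing '(' in " + descriptor);
        }
        d.pos = 1;
        List<String> args = new ArrayList<>();
        while (d.pos < descriptor.length() && descriptor.charAt(d.pos) != ')') {
            args.add(d.nextType());
        }
        if (d.pos >= descriptor.length()) {
            throw new IllegalArgumentException("missing ')' in " + descriptor);
        }
        d.pos++;                                   // step over ')'
        String result = d.nextType();
        d.expectEnd();
        return result + " (" + String.join(", ", args) + ")";
    }

    private void expectEnd() {
        if (pos != s.length()) throw new IllegalArgumentException("trailing characters in " + s);
    }

    private String nextType() {
        int dims = 0;
        while (pos < s.length() && s.charAt(pos) == '[') { dims++; pos++; }
        if (pos >= s.length()) throw new IllegalArgumentException("truncated type in " + s);
        String base;
        char c = s.charAt(pos++);
        switch (c) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'V': base = "void"; break;
            case 'L': {
                int semi = s.indexOf(';', pos);
                if (semi < 0) throw new IllegalArgumentException("missing ';' in " + s);
                base = s.substring(pos, semi).replace('/', '.');  // slashes back to periods
                pos = semi + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("unknown abbreviation '" + c + "' in " + s);
        }
        StringBuilder out = new StringBuilder(base);
        for (int i = 0; i < dims; i++) out.append("[]");
        return out.toString();
    }
}
```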

Attributes

Attributes are the mechanism that the designers of the .class file structure created to allow additional descriptive information about the class to be included in the file without changing its semantics. Attributes are dynamically structured modifiers that contain both mandatory and optional properties affecting the class, its fields, and methods. For example, information on local variables, arguments, and the compiled bytecode for a method are contained in a mandatory attribute called the Code attribute.

Also, with respect to using attributes to extend the information in a .class file, Microsoft’s JVM implementation provides support for interoperability with COM objects by adding new attributes to the .class file. A class loader and JVM implementation only need to recognize the mandatory attributes and may ignore the rest. That way, a class compiled for one VM may still be read (and possibly executed) by another VM.

Obviously, if you created a .class file that depended on a VM that supported COM objects, for example, it would not run with the Sun JVM 1.0.2.

Table 35.11 gives a brief description of the attributes that are recognized by Sun’s JVM Version 1.0.2.

Table 35.11 Sun 1.0.2 Java .class File Attributes

Attribute Name	Mandatory	Level	Purpose
`SourceFile`	No	Class	Names the file containing Java source for this .class file.
`ConstantValue`	Yes	Field	Holds value of an initializer for a native typed field.
`Exceptions`	Yes	Method	Defines the exceptions that are thrown by this method.
`Code`	Yes	Method	Defines the physical structure and bytecodes for a method.
`LineNumberTable`	No	Code	Contains Program Counter to Line Number table for use in debugging.
`LocalVariableTable`	No	Code	Contains local variable descriptive information for use in debugging.

When .class file elements use attributes, they are kept in a table and are preceded by an unsigned 16-bit integer count field holding the number of attributes that immediately follow. Physically, the attributes are named variable-length structures that are similar in some respects to the entries in the Constant Pool described earlier in this section. Each attribute begins with a fixed length portion and is followed by a variable number of fields. Attributes may also be nested in order to allow for extensions to the information that they contain.

All attribute definitions have the same first two fields, as shown in table 35.12.

Table 35.12 Attribute Definition: Fixed Portion

Field	Number of Bytes	Value
`Name Index`	2	Location of a `Utf8` string in the Constant Pool containing the literal name of this attribute, as defined in table 35.11.
`Length`	4	An unsigned integer containing the number of bytes of data that follow, excluding the six bytes that make up the fixed portion (Name Index and Length).
`Data`	(`Length`)	The actual variable length structure associated with this specific attribute definition.
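Because every attribute carries its own Length, a reader can step over attributes it does not recognize without knowing anything about their contents. Here is a rough sketch of that idea. The names `AttributeSkipper` and `listAttributes` are hypothetical, the buffer is assumed to be positioned at the 16-bit attribute count, and the `Map` stands in for whatever lookup a real reader would use to turn a Constant Pool index into a Utf8 string.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only.
final class AttributeSkipper {
    /** Lists the attribute names in a table and skips over each attribute's data. */
    static List<String> listAttributes(ByteBuffer buf, Map<Integer, String> utf8ByIndex) {
        int count = buf.getShort() & 0xFFFF;              // number of attributes that follow
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int nameIndex = buf.getShort() & 0xFFFF;      // Name Index
            long length = buf.getInt() & 0xFFFFFFFFL;     // Length, treated as unsigned
            if (length > buf.remaining()) {
                throw new IllegalArgumentException("attribute data runs past the end of the buffer");
            }
            names.add(utf8ByIndex.getOrDefault(nameIndex, "#" + nameIndex));
            buf.position(buf.position() + (int) length);  // skip Data without interpreting it
        }
        return names;
    }
}
```

A reader built this way would pick out the attribute names it understands and simply ignore the rest.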

I describe each attribute’s meaning and structure in context with its actual position in the .class file. In those discussions, it is assumed that each attribute begins with the Name Index and Length fields described in table 35.12.

The .class File Structure

Now that I have defined the dynamic elements that are used in the .class file, you can finally discover its real structure. Table 35.13 shows the first level of description for the fields in the .class file.

Table 35.13 First Level Fields in the .class File Structure

Field	Number of Bytes	Value
`Magic Number`	4	This value acts as a signature and is used to help ensure the validity of the actual class file. As of this writing, it must be the 32-bit value `0xCAFEBABE`.
`Minor Version`	2	Minor version number used by the compiler that generated this .class. This integer value is currently 3 in the JDK 1.0.2 javac compiler.
`Major Version`	2	Major version number used by the compiler that generated this .class. This integer value is currently 45 in the JDK 1.0.2 javac compiler.
`Constant Pool Size`	2	Number of entries in the following Constant Pool plus one. That is, this value represents the actual number of entries in the runtime version of the Constant Pool, which includes the zero'th entry. That entry is not stored in the .class file.
`Constant Pool`	Varies	The actual Constant Pool entries as described in the earlier section "The Constant Pool."
`Class Flags`	2	A series of bit flags (defined in the following section) that specify the access permissions for this class or interface definition.
`Class Name`	2	Index to a Class Reference in the Constant Pool representing the fully qualified name of this class.
`Superclass Name`	2	Index to a Class Reference in the Constant Pool representing the fully qualified name for the ancestor class to this one. If this value is zero, then Class Name must refer to java.lang.Object (the only class without a direct ancestor).
`No. of Interfaces`	2	The count of interfaces implemented by this class.
`Interface List`	(No. of Interfaces * 2)	An array of Constant Pool indexes pointing to Class Reference entries that name the interfaces that this class implements. This array must be in the same order as the implements clause encountered when this class was compiled.
`No. of Fields`	2	The count of fields (`static` and instance) that are defined in this class.
`Field Table`	Varies	An array of field information structures as defined in the following section.
`No. of Methods`	2	The count of methods (`static` and instance) that are defined in this class.
`Method Table`	Varies	An array of method information structures as defined in the following section.
`No. of Attributes`	2	The count of attributes that are defined for this class.
`Attribute Table`	Varies	The table of attributes included in this .class file. The only attribute recognized at this level by the Sun JVM 1.0.2 is the `SourceFile` attribute defined previously.

As you can now see, the .class file even at its highest level is very dynamic. There is no way to read in the top level and then go deeper and read the parts that you are interested in. It is totally sequential in nature and physical structure. Most of the individual fields are pretty clear from their description. The only exceptions are the flags and embedded arrays for the fields and methods.
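To make the sequential nature concrete, here is a minimal sketch that reads just the first four fields of table 35.13 from a buffer holding the start of a .class file. The class name `ClassFileHeader` and its fields are hypothetical; everything after the Constant Pool Size would have to be read in order from the same buffer.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolSize;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolSize) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolSize = constantPoolSize;
    }

    /** Reads the Magic Number, both version fields, and the Constant Pool Size, in that order. */
    static ClassFileHeader read(ByteBuffer buf) {
        int magic = buf.getInt();                     // ByteBuffer is big-endian by default
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("bad magic number: 0x" + Integer.toHexString(magic));
        }
        int minor = buf.getShort() & 0xFFFF;          // mask to treat the value as unsigned
        int major = buf.getShort() & 0xFFFF;
        int poolSize = buf.getShort() & 0xFFFF;
        return new ClassFileHeader(minor, major, poolSize);
    }
}
```

For a file produced by the JDK 1.0.2 javac compiler, you would expect this to report a minor version of 3 and a major version of 45.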

The Class Flags Field

The Class Flags field is a 16-bit unsigned integer that is used to represent a set of boolean values that define the structure and access permissions for this .class file (see table 35.14). They are predominantly used by the Verification Pass of the JVM to denote whether this is a class or interface, and modifiers with respect to class visibility and extension.

Table 35.14 Class Flag Value Definitions

Bit Position (LSb = 1)	Logical Name	Applies To Class	Applies To Interface	Definition If Set
1	`PUBLIC`	Yes	Yes	The class is accessible from other classes outside of this package.
5	`FINAL`	Yes	No	This class may not be subclassed.
6	`SUPER`	Yes	Yes	Calls to methods in the superclass are special cased.
10	`INTERFACE`	No	Yes	This class represents an interface definition.
11	`ABSTRACT`	Yes	Yes	This class or interface is abstract and has methods that must be coded in a subclass or interface implementation.
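Since the table counts bit positions starting from 1 at the least significant bit, the mask for position n is 1 shifted left by n - 1. The sketch below turns a Class Flags value into the list of logical names from table 35.14; `ClassFlagDecoder` is a hypothetical name used only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class ClassFlagDecoder {
    // Bit positions and names as listed in table 35.14.
    private static final int[] BIT_POSITIONS = {1, 5, 6, 10, 11};
    private static final String[] NAMES = {"PUBLIC", "FINAL", "SUPER", "INTERFACE", "ABSTRACT"};

    /** Converts a bit position (least significant bit = 1) into a mask. */
    static int mask(int bitPosition) {
        return 1 << (bitPosition - 1);
    }

    /** Returns the logical names of all flags that are set, in table order. */
    static List<String> decode(int classFlags) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < BIT_POSITIONS.length; i++) {
            if ((classFlags & mask(BIT_POSITIONS[i])) != 0) {
                set.add(NAMES[i]);
            }
        }
        return set;
    }
}
```

For instance, a flags value of `0x0021` has bits 1 and 6 set and decodes to PUBLIC and SUPER. The same approach works for the field and method flags with their own tables of positions and names.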

The Field Information Structure

The Field Information structure is a second-level set of information used to describe the name, type, and access permissions associated with a field of this class (see table 35.15). The fields may be instance or static (class variables), and may represent native types, specific object references, or arrays of either one. The JVM uses this information to allocate the appropriate amount of space for the class definition in memory and each instance’s data space in memory.

Table 35.15 Fields in the Field Information Structure

Field	Number of Bytes	Value
`Field Flags`	2	A series of bit flags that define the access permissions for this field.
`Field Name`	2	Index to a `Utf8` string in the Constant Pool representing the name of this field.
`Type`	2	Index to a `Utf8` string in the Constant Pool representing the type definition in the format described in the “Type Information” section.
`No. of Attributes`	2	The count of attributes that are defined for this field.
`Attribute Table`	Varies	The table of attributes associated with this field. The only attribute recognized at this level by the Sun JVM 1.0.2 is the `ConstantValue` attribute defined previously.

Table 35.16 defines the meaning for the access flags associated with a field.

Table 35.16 Field Flag Value Definitions

Bit Pos. (LSb = 1)	Logical Name	Applies To Class	Applies To Interface	Definition If Set
1	`PUBLIC`	Yes	Yes	The field is accessible from other classes outside of this package.
2	`PRIVATE`	Yes	No	The field is only accessible from this class. No subclasses or classes outside of this package may access it.
3	`PROTECTED`	Yes	No	The field is only accessible from this class and its subclasses.
4	`STATIC`	Yes	Yes	The field is considered a class level field, and only has one occurrence in memory that is shared by all instances of this class.
5	`FINAL`	Yes	Yes	This field is only present in this class definition, and may not be overridden or have a value assigned into it after it is initialized.
7	`VOLATILE`	Yes	No	Denotes that this field’s value is not guaranteed to be consistent between accesses. So, the compiler will not generate optimized code with respect to this field.
8	`TRANSIENT`	Yes	No	This field’s value is only valid while an instance of the class is in memory at runtime. Its value, if written to, or read from persistent storage, is ignored.

The ConstantValue Attribute

This mandatory attribute is found in the field information structure of the .class file, and is used to hold the values that were used to initialize the native typed (non-object) fields in a class when they were defined (see table 35.17).

Table 35.17 Fields Unique to the ConstantValue Attribute

Field	Number of Bytes	Value
`Value`	2	Location in the Constant Pool of either an `Integer` constant, a `Long` constant, a `Float` constant, or a `Double` constant.

The type of constant referred to by the Value field is determined by the following table:

Constant Pool Type	Holds Values For
`Integer` constant	`boolean`, `byte`, `char`, `int`, and `short` initializers
`Long` constant	`long` initializers
`Float` constant	`float` initializers
`Double` constant	`double` initializers

The Method Information Structure

The Method Information structure is a second-level set of information that is used to describe the name, signature, and access permissions for a method in this class (see table 35.18). Methods may be instance-oriented (only callable from an instance of this class), or they may be static methods (callable whether an instance of this class is present or not). The JVM uses the information in these structures, along with the attributes for this method, to create the internal method table for instances of this class or interface to use.

Table 35.18 Fields in the Method Information Structure

Field	Number of Bytes	Value
`Method Flags`	2	A series of bit flags that define the access permissions for this method.
`Method Name`	2	Index to a `Utf8` string in the Constant Pool representing the name of this method.
`Signature`	2	Index to a `Utf8` string in the Constant Pool representing this method’s signature definition in the format described in the “Type Information” section.
`No. of Attributes`	2	The count of attributes that are defined for this method.
`Attribute Table`	Varies	The table of attributes associated with this method. The only attributes recognized at this level by the Sun JVM 1.0.2 are the `Exceptions` and `Code` attributes defined previously.

Table 35.19 defines the meaning for the access flags associated with a method.

Table 35.19 Method Flag Value Definitions

Bit Pos. (LSb = 1)	Logical Name	Applies To Class	Applies To Interface	Definition If Set
1	`PUBLIC`	Yes	Yes	The method is accessible from other classes outside of this package.
2	`PRIVATE`	Yes	No	The method is only accessible from this class. No subclasses or classes outside of this package may access it.
3	`PROTECTED`	Yes	No	The method is only accessible from this class and its subclasses.
4	`STATIC`	Yes	No	The method is considered a class level method, and may be called whether an instance of this class exists or not.
5	`FINAL`	Yes	No	This method is only present in this class definition, and may not be overridden.
6	`SYNCHRONIZED`	Yes	No	This method is callable in a multi-threaded scenario and will have its access controlled and locked with a monitor.
9	`NATIVE`	Yes	No	This method’s implementation is not in Java bytecodes, but in some other external form. It must conform to the native call interface specification of the JVM.
11	`ABSTRACT`	Yes	Yes	This method’s signature is only defined in this class and must be implemented in a subclass. It effectively turns this class into an abstract class.

The Exceptions Attribute

This mandatory attribute is found in the method information structure of the .class file for a given method (see table 35.20). It defines the list of exceptions that are thrown by the method containing this attribute. They are in the same order as found in the throws clause that was present in the .java source file when this class was compiled. This information is used by the class loader and JVM in order to verify that a method is permitted to throw a given exception.

Table 35.20 Fields Unique to the Exceptions Attribute

Field	Number of Bytes	Value
Count	2	Number of elements in the following table of Class Reference Constant Pool entries.
Table	(Count * 2)	An array of indexes to Class Reference Constant Pool entries, one for each exception this method may throw.

The Code Attribute

This mandatory attribute of the method information structure defines the actual compiled representation of its source statements (see table 35.21). The first two fields are used by the JVM to know how much space to define for its stack frame. The bytecodes are what is executed at runtime, the Exceptions are monitored and handled at runtime, and the attributes (if present at all) are used while debugging. In Sun’s javac compiler, the LineNumberTable and LocalVariableTable are inserted when using the -g option. These attributes are detailed following this description of the Code attribute.

Table 35.21 Fields Unique to the Code Attribute

Field	Number of Bytes	Value
`Stack Depth`	2	Maximum allowable depth of the JVM’s expression stack.
`No. Locals`	2	Number of local variables (including arguments) defined in this method.
`Code Length`	4	Number of bytes used by the following stream of bytecodes.
`bytecodes`	(Code Length)	Stream of Java bytecodes representing the compiled version of this method’s statements.
`Exception Count`	2	Number of exceptions that are caught inside of this method as described by table 35.22.
`Exceptions`	(Count * 8)	An ordered table of fixed length structures (described in table 35.22) that detail each `try-catch` clause coded in this method.
`Attribute Count`	2	Number of attributes defined in the following attribute table.
`Attribute Table`	Varies	Table of attributes provided for this method’s `Code` attribute. Currently, only the `LineNumberTable` and `LocalVariableTable` subattributes are supported.

The embedded Exception table has the following format, shown in table 35.22.

Table 35.22 Fields in the Code Attribute’s Embedded Exception Table

Field	Number of Bytes	Value
`PC Start`	2	First bytecode of the `try` block that this exception is to handle.
`PC End`	2	Bytecode address where this exception handler is no longer active (the bytecode immediately after the `try` block).
`PC Exception Handler`	2	Bytecode location of the beginning of the actual exception handler.
`Exception Type`	2	Index into the Constant Pool of a Class Reference constant representing the actual exception to be handled.
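One way to picture how this table is used is to read the entries and then look for the first one whose range covers a given bytecode location. The sketch below does only that. `ExceptionTableReader` is a hypothetical name, the buffer is assumed to be positioned at the Exception Count field, and the lookup deliberately ignores the Exception Type, which a real JVM would also have to match against the exception being thrown.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
final class ExceptionTableReader {
    /** Reads Exception Count and the 8-byte entries; each row is {pcStart, pcEnd, handlerPc, typeIndex}. */
    static int[][] read(ByteBuffer buf) {
        int count = buf.getShort() & 0xFFFF;
        int[][] rows = new int[count][4];
        for (int[] row : rows) {
            for (int field = 0; field < 4; field++) {
                row[field] = buf.getShort() & 0xFFFF;   // four unsigned 16-bit fields per entry
            }
        }
        return rows;
    }

    /** Returns the handler location of the first entry covering pc, or -1 if there is none. */
    static int handlerFor(int[][] rows, int pc) {
        for (int[] row : rows) {
            // PC End is the bytecode immediately after the try block, so it is excluded.
            if (pc >= row[0] && pc < row[1]) {
                return row[2];
            }
        }
        return -1;
    }
}
```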

The definitions of the embedded attributes of the Code attribute are discussed in the following sections.

The LineNumberTable Attribute

This optional attribute of the Code attribute contains a table of Program Counter to Line Number translation entries (see table 35.23). They are in order by PC location and may contain duplicate line number references. This anomaly is the result of the way that code is generated in general, and by optimizations performed on the generated Java bytecodes as they are created by Sun’s javac compiler.

Table 35.23 Fields Unique to the LineNumberTable Attribute

Field	Number of Bytes	Value
`Count`	2	Number of elements in the following line number information table.
Table	(Count * 4)	A table containing line number information elements as described in table 35.24.

The actual line number table elements have the following fixed length structure, as shown in table 35.24.

Table 35.24 Fields in the LineNumberTable Attribute’s Line Number Table

Field	Number of Bytes	Value
`PC Start`	2	Program Counter location of the start of some bytecodes associated with a given line number.
`Line Number`	2	The actual line number (relative to the start of the .java source file) where these generated bytecodes came from.
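A debugger that wants the source line for a given Program Counter value can take the last entry whose PC Start is not greater than that value. The following sketch assumes the entries have already been read into rows of {PC Start, Line Number} in PC order, as described above; `LineNumbers` and `lineFor` are hypothetical names.

```java
// Hypothetical helper for illustration only.
final class LineNumbers {
    /**
     * Returns the line number for pc, or -1 if the table is empty.
     * Each row is {pcStart, lineNumber}; rows are assumed to be ordered by pcStart.
     */
    static int lineFor(int[][] table, int pc) {
        int line = -1;
        for (int[] row : table) {
            if (row[0] > pc) {
                break;              // later entries start after pc
            }
            line = row[1];          // most recent entry starting at or before pc
        }
        return line;
    }
}
```

Because the same line number may appear more than once, two different PC ranges can legitimately map back to one source line.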

The LocalVariableTable Attribute

This optional attribute of the Code attribute contains a table of entries describing the local variables present in this method and their associated scope (see table 35.25). They are not in order, and include entries representing the arguments for this method. One point to note here is that every non-static method contains at least one argument (even if there are no arguments in the method's signature) representing the current object instance for this class.

Table 35.25 Fields Unique to the LocalVariableTable Attribute

Field	Number of Bytes	Value
`Count`	2	Number of elements in the following local variable information table.
`Table`	(Count * 10)	A table containing local variable information elements as described in table 35.26.

The actual local variable table elements have the following fixed length structure, as shown in table 35.26.

Table 35.26 Fields in the LocalVariableTable Attribute’s Local Variable Table

Field	Number of Bytes	Value
`PC Start`	2	Program Counter location where this variable goes into scope.
`Scope Size`	2	The number of bytes of bytecode, beginning at `PC Start`, over which this variable remains in scope. That is, Scope = [‘PC Start’ to (‘PC Start’ + ‘Scope Size’ - 1)].
`Name`	2	Location of a `Utf8` string in the Constant Pool containing the literal variable name.
`Type`	2	Location of a `Utf8` string in the Constant Pool containing the type information for this variable (as defined in the “Type Information” section).
`Variable Slot`	2	The slot, or offset, in this method’s stack frame where the variable’s value is kept.
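
The 10-byte element in table 35.26 and the scope formula can be sketched as follows. This is only an illustration under my own naming (LocalVariableTableSketch, Entry, and inScope are hypothetical), and it again assumes the stream is positioned just past the fields common to all attributes.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical illustration class; not part of the book's package.
public class LocalVariableTableSketch {

    // One fixed-length (10 byte) local variable element.
    public static final class Entry {
        public final int pcStart, scopeSize, nameIndex, typeIndex, slot;

        Entry(int pcStart, int scopeSize, int nameIndex, int typeIndex, int slot) {
            this.pcStart = pcStart;
            this.scopeSize = scopeSize;
            this.nameIndex = nameIndex;
            this.typeIndex = typeIndex;
            this.slot = slot;
        }

        // True when pc lies in [PC Start, PC Start + Scope Size - 1].
        public boolean inScope(int pc) {
            return pc >= pcStart && pc <= pcStart + scopeSize - 1;
        }
    }

    // Reads Count, then Count elements in the field order of table 35.26.
    public static Entry[] read(DataInputStream in) throws IOException {
        int count = in.readUnsignedShort();
        Entry[] entries = new Entry[count];
        for (int i = 0; i < count; i++) {
            int pcStart = in.readUnsignedShort();
            int scopeSize = in.readUnsignedShort();
            int nameIndex = in.readUnsignedShort();   // Utf8 entry in the Constant Pool
            int typeIndex = in.readUnsignedShort();   // Utf8 entry in the Constant Pool
            int slot = in.readUnsignedShort();
            entries[i] = new Entry(pcStart, scopeSize, nameIndex, typeIndex, slot);
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Hand-built sample: one variable in slot 0, in scope for PCs 0 through 9.
        byte[] sample = { 0, 1, 0, 0, 0, 10, 0, 7, 0, 8, 0, 0 };
        Entry[] entries = read(new DataInputStream(new ByteArrayInputStream(sample)));
        System.out.println("slot " + entries[0].slot
                + " in scope at pc 9? " + entries[0].inScope(9));
    }
}
```

The Name and Type fields are left as raw Constant Pool locations here; resolving them to strings requires that the Constant Pool has already been read.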

The SourceFile Attribute

This optional attribute is used in the high-level .class file structure to hold the name of the source file from which this .class file was compiled (see table 35.27). It is used primarily by debugging systems to locate the source file and display source lines as required.

Table 35.27 Fields Unique to the SourceFile Attribute

Field	Number of Bytes	Value
`File Name`	2	Location of a `Utf8` string in the Constant Pool containing the literal .java file name.
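
As a rough sketch of how a reader might turn the `File Name` field into an actual name, the following assumes the Constant Pool has already been loaded and that its `Utf8` strings are available in a plain String array indexed by pool location. That array, and the names SourceFileSketch and readSourceFileName, are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical illustration class; not part of the book's package.
public class SourceFileSketch {

    // Reads the 2-byte File Name field and looks it up in utf8Pool, a
    // hypothetical array holding the Utf8 strings by Constant Pool location
    // (null where the pool entry is not a Utf8 string).
    public static String readSourceFileName(DataInputStream in, String[] utf8Pool)
            throws IOException {
        int index = in.readUnsignedShort();
        // Location 0 is never a valid Constant Pool entry.
        if (index <= 0 || index >= utf8Pool.length || utf8Pool[index] == null) {
            throw new IOException("File Name does not refer to a Utf8 string: " + index);
        }
        return utf8Pool[index];
    }

    public static void main(String[] args) throws IOException {
        String[] pool = { null, "Code", "Hello.java" };  // sample pool contents
        byte[] sample = { 0, 2 };                        // File Name = location 2
        System.out.println(readSourceFileName(
                new DataInputStream(new ByteArrayInputStream(sample)), pool));
    }
}
```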

So Now What Can I Do?

Now that you have a fairly good understanding of the physical format of the class structure, you can put this information to use in several ways:

Understand how a Java language compiler represents source information in binary format.
Use JDB more effectively when debugging.
Parse the .class file when implementing a custom debugging aid with the Java Debugger API.
Create your own .class file reader.

Personally, I chose a derivative of the fourth alternative. To gain a full understanding of the nuances that a .class file reader must deal with, I implemented a Java application to help me out. I created a package and utility for parsing a .class file and converting its information into a displayable string format. The driver utility is called ClassFileDump, and the package is called com.Que.SEUsingJava.ClassFile.

The utility itself is very simple: it reads some command-line arguments and passes them on to the main class in the package. The package consists of 32 classes contained in eight Java language source files. The entry-point class of the package is called ClassHeader. It has a simple constructor taking no arguments and two primary methods. The first primary method is called read and takes a single argument, a java.io.DataInputStream instance that should be associated with an open .class file. read is completely responsible for loading and parsing the .class file. It does this by passing the input stream to the 31 other support classes in the package.

Each class in the package knows about a specific structure or attribute of the .class file and understands how to read it and convert it to a String. After the read method returns, the utility calls the toString method on the ClassHeader instance. The toString method takes advantage of the other class instances in the package to convert their respective member data items to String values. The toString method then returns this large string to the driver utility where it is sent to System.out.
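
The read-then-toString flow just described can be sketched as shown here. This is not the source from the CD-ROM: the nested ClassHeader is a deliberately tiny stand-in that parses only the first eight bytes of the file (the magic number and the minor and major version numbers), and the names ClassFileDumpSketch and dump are hypothetical.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration class; the real utility is on the CD-ROM.
public class ClassFileDumpSketch {

    // Stand-in for the package's ClassHeader: a no-argument constructor,
    // a read method taking a DataInputStream, and a toString method.
    public static final class ClassHeader {
        private int magic;
        private int minorVersion;
        private int majorVersion;

        public ClassHeader() { }

        public void read(DataInputStream in) throws IOException {
            magic = in.readInt();
            if (magic != 0xCAFEBABE) {
                throw new IOException("Not a .class file");
            }
            minorVersion = in.readUnsignedShort();
            majorVersion = in.readUnsignedShort();
            // A full reader would hand the stream on to its support classes here.
        }

        public String toString() {
            return "magic=0x" + Integer.toHexString(magic).toUpperCase()
                    + " version=" + majorVersion + "." + minorVersion;
        }
    }

    // Parses the stream, then converts the parsed structure to one String.
    public static String dump(InputStream raw) throws IOException {
        ClassHeader header = new ClassHeader();
        header.read(new DataInputStream(raw));
        return header.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ClassFileDumpSketch <.class file name>");
            return;
        }
        InputStream in = new FileInputStream(args[0]);
        try {
            System.out.println(dump(in));   // output goes to System.out
        } finally {
            in.close();
        }
    }
}
```

Keeping the parsing in read and the formatting in toString means the driver never needs to know the file layout; it only opens the stream and prints the result.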

The ClassFileDump utility can be found on the CD-ROM in two formats. The first is the source to the utility and package, in a file called CLASSDMP_SOURCE.ZIP. The second is the executable Java bytecode version, in a file called CLASSDMP_LIB.ZIP. This file is in the proper format to add to your CLASSPATH environment variable. For example, if you put CLASSDMP_LIB.ZIP in your JDK’s \LIB directory, you could modify your classpath to be:

;c:\java\lib\classes.zip;c:\java\lib\classdmp_lib.zip

Once you have done that, you may execute the utility from anywhere that the java command is available.

The command line for ClassFileDump looks like the following:

java ClassFileDump <.class file name>

For example,

java ClassFileDump ClassFileDump.class

would cause the contents of the ClassFileDump utility’s own .class file to be sent to System.out, the console. I chose to send output there because it can easily be redirected to a file.



Elements of the .class file

Definitions

The Constant Pool

Tag 1: Utf8 String

Tag 2: Unicode String

Tags 3 and 4: Integer and Float Values

Tags 5 and 6: Long and Double Values

Tag 7: Class Reference

Tag 8: String Reference

Tags 9, 10, and 11: Field, Method, and Interface Method Reference

Tag 12: Name and Type Reference