The Artima Developer Community

The Class File Lifestyle
The Structure and Lifestyle of the Java Class File
by Bill Venners
First Published in JavaWorld, June 1996

<<  Page 2 of 3  >>


Magic and version numbers
The first four bytes of every class file are always 0xCAFEBABE. This magic number makes Java class files easier to identify, because the odds are slim that non-class files would start with the same four bytes. The number is called magic because it can be pulled out of a hat by the file format designers. The only requirement is that it is not already being used by another file format that may be encountered in the real world. According to Patrick Naughton, a key member of the original Java team, the magic number was chosen "long before the name Java was ever uttered in reference to this language. We were looking for something fun, unique, and easy to remember. It is only a coincidence that 0xCAFEBABE, an oblique reference to the cute baristas at Peet's Coffee, was foreshadowing for the name Java."

The next four bytes of the class file contain the version numbers: a two-byte minor version followed by a two-byte major version. These numbers identify the version of the class file format to which a particular class file adheres and allow JVMs to verify that the class file is loadable. Every JVM has a maximum version it can load, and it will reject class files with later versions.
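As a rough illustration of the header check described above, here is a minimal sketch in Java. The class name ClassFileHeader and its methods are hypothetical, invented for this example; the sketch assumes the minor version is stored before the major version and that all multi-byte values are big-endian.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not part of any JVM or library.
final class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE;

    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // Reads the first eight bytes: magic, then minor and major version.
    static ClassFileHeader parse(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes); // big-endian by default
        if (in.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = in.getShort() & 0xFFFF; // treat the two bytes as unsigned
        int major = in.getShort() & 0xFFFF;
        return new ClassFileHeader(minor, major);
    }

    // True if a JVM whose newest supported version is maxMajor.maxMinor
    // could accept this file (assumed comparison rule for this sketch).
    boolean isLoadableBy(int maxMajor, int maxMinor) {
        return majorVersion < maxMajor
                || (majorVersion == maxMajor && minorVersion <= maxMinor);
    }
}
```

A caller would pass in the bytes of a .class file and reject the file if parse throws or isLoadableBy returns false.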

Constant pool
The class file stores constants associated with its class or interface in the constant pool. Some constants that may be seen frolicking in the pool are literal strings, final variable values, class names, interface names, variable names and types, and method names and signatures. A method signature is its return type and set of argument types.

The constant pool is organized as an array of variable-length elements. Each constant occupies one element in the array. Throughout the class file, constants are referred to by the integer index that indicates their position in the array. The first constant has an index of one, the second constant has an index of two, and so on. The constant pool array is preceded by a two-byte count, so JVMs will know how many constants to expect when loading the class file.

Each element of the constant pool starts with a one-byte tag specifying the type of constant at that position in the array. Once a JVM reads and interprets this tag, it knows what follows the tag. For example, if a tag indicates the constant is a string, the JVM expects the next two bytes to be the string length. Following this two-byte length, the JVM expects to find that many bytes, which make up the characters of the string.
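The tag-then-length-then-bytes layout of a string constant can be sketched as follows. The class and method names are hypothetical. The tag value 1 is the one the class file format assigns to string data (CONSTANT_Utf8); the sketch decodes the bytes as ordinary UTF-8, which is a simplification of the format's own string encoding but gives the same result for plain ASCII names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class Utf8ConstantReader {
    static final int CONSTANT_UTF8 = 1; // tag for string data

    // Reads one string constant starting at the buffer's current position.
    static String readUtf8Constant(ByteBuffer in) {
        int tag = in.get() & 0xFF;            // one-byte tag
        if (tag != CONSTANT_UTF8) {
            throw new IllegalArgumentException("unexpected tag " + tag);
        }
        int length = in.getShort() & 0xFFFF;  // two-byte length
        byte[] data = new byte[length];
        in.get(data);                         // exactly 'length' bytes follow
        // Simplification: assumes the bytes are valid standard UTF-8.
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

Because each element announces its own type and size, a reader can only find element n by walking through the elements before it.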

In the remainder of the article I'll sometimes refer to the nth element of the constant pool array as constant_pool[n]. This makes sense to the extent the constant pool is organized like an array, but bear in mind that these elements have different sizes and types and that the first element has an index of one.

Access flags
The first two bytes after the constant pool, the access flags, indicate whether this file defines a class or an interface, whether the class or interface is public or abstract, and (if it's a class and not an interface) whether the class is final.
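The access flags are a bit mask, so each property is tested with a bitwise AND. The following sketch uses the mask values defined by the class file format; the AccessFlags class and its describe method are hypothetical names made up for this example.

```java
// Hypothetical helper for illustration only.
final class AccessFlags {
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT  = 0x0400;

    // Turns the two-byte flags value into a readable declaration prefix.
    static String describe(int flags) {
        StringBuilder sb = new StringBuilder();
        if ((flags & ACC_PUBLIC) != 0)   sb.append("public ");
        if ((flags & ACC_FINAL) != 0)    sb.append("final ");
        if ((flags & ACC_ABSTRACT) != 0) sb.append("abstract ");
        sb.append((flags & ACC_INTERFACE) != 0 ? "interface" : "class");
        return sb.toString();
    }
}
```

For instance, a flags value of 0x0011 has the public and final bits set and the interface bit clear, so describe returns "public final class".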

This class
The next two bytes, the this class component, are an index into the constant pool array. The constant referred to by this class, constant_pool[this_class], has two parts, a one-byte tag and a two-byte name index. The tag will equal CONSTANT_Class, a value that indicates this element contains information about a class or interface. The element constant_pool[name_index] is a string constant containing the name of the class or interface.

The this class component provides a glimpse of how the constant pool is used. This class itself is just an index into the constant pool. When a JVM looks up constant_pool[this_class], it finds an element that identifies itself as a CONSTANT_Class with its tag. The JVM knows CONSTANT_Class elements always have a two-byte index into the constant pool, called name index, following their one-byte tag. So it looks up constant_pool[name_index] to get the string containing the name of the class or interface.
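The two-step lookup just described can be sketched with a tiny in-memory model of an already-decoded constant pool. Everything here (ThisClassLookup, Entry, the factory methods) is hypothetical and models only the two kinds of constants involved. The tag values 7 and 1 are the format's values for CONSTANT_Class and string data; slot 0 of the array is left empty so that indexes start at one.

```java
// Hypothetical model for illustration only.
final class ThisClassLookup {
    static final int CONSTANT_UTF8 = 1;
    static final int CONSTANT_CLASS = 7;

    // One decoded constant pool element.
    static final class Entry {
        final int tag;
        final int nameIndex; // used only by CONSTANT_Class entries
        final String text;   // used only by string entries

        private Entry(int tag, int nameIndex, String text) {
            this.tag = tag;
            this.nameIndex = nameIndex;
            this.text = text;
        }

        static Entry classRef(int nameIndex) {
            return new Entry(CONSTANT_CLASS, nameIndex, null);
        }

        static Entry utf8(String text) {
            return new Entry(CONSTANT_UTF8, 0, text);
        }
    }

    // Follows this_class -> CONSTANT_Class -> name_index -> string.
    static String className(Entry[] constantPool, int thisClass) {
        Entry cls = constantPool[thisClass];
        if (cls.tag != CONSTANT_CLASS) {
            throw new IllegalArgumentException("this_class is not a CONSTANT_Class");
        }
        Entry name = constantPool[cls.nameIndex];
        if (name.tag != CONSTANT_UTF8) {
            throw new IllegalArgumentException("name_index is not a string constant");
        }
        return name.text;
    }
}
```

The same helper would serve for super_class and for each interface index, since those are also indexes of CONSTANT_Class elements.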

Super class
Following the this class component is the super class component, another two-byte index into the constant pool. The element constant_pool[super_class] is a CONSTANT_Class element that points to the name of the superclass from which this class descends.

Interfaces
The interfaces component starts with a two-byte count of the number of interfaces implemented by the class (or interface) defined in the file. Immediately following is an array that contains one index into the constant pool for each interface implemented by the class. Each interface is represented by a CONSTANT_Class element in the constant pool that points to the name of the interface.
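Reading a count-prefixed array of indexes like this one is a short loop. In the sketch below the class and method names are hypothetical, and each index is assumed to be two bytes wide, like the other constant pool indexes in the file.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
final class InterfacesReader {
    // Reads the two-byte count, then that many two-byte constant pool indexes.
    static int[] readInterfaceIndexes(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;
        int[] indexes = new int[count];
        for (int i = 0; i < count; i++) {
            indexes[i] = in.getShort() & 0xFFFF;
        }
        return indexes;
    }
}
```

Each returned value is an index of a CONSTANT_Class element, which in turn leads to the interface's name.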

Fields
The fields component starts with a two-byte count of the number of fields in this class or interface. A field is an instance or class variable of the class or interface. Following the count is an array of variable-length structures, one for each field. Each structure reveals information about one field such as the field's name, type, and, if it is a final variable, its constant value. Some information is contained in the structure itself, and some is contained in constant pool locations pointed to by the structure.

The only fields that appear in the list are those that were declared by the class or interface defined in the file; no fields inherited from superclasses or superinterfaces appear in the list.

Methods
The methods component starts with a two-byte count of the number of methods in the class or interface. This count includes only those methods that are explicitly defined by this class, not any methods that may be inherited from superclasses. Following the method count are the methods themselves.

The structure for each method contains several pieces of information about the method, including the method descriptor (its return type and argument list), the number of stack words required for the method's local variables, the maximum number of stack words required for the method's operand stack, a table of exceptions caught by the method, the bytecode sequence, and a line number table.

Attributes
Bringing up the rear are the attributes, which give general information about the particular class or interface defined by the file. The attributes section has a two-byte count of the number of attributes, followed by the attributes themselves. For example, one attribute is the source code attribute; it reveals the name of the source file from which this class file was compiled. JVMs will silently ignore any attributes they don't recognize.
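Ignoring unrecognized attributes is practical because every attribute declares its own size. The sketch below relies on a layout detail not spelled out in this article but defined by the class file format: each attribute begins with a two-byte constant pool index for its name and a four-byte length, followed by that many bytes of data. The class and method names are hypothetical, and the sketch skips every attribute body rather than interpreting any of them.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class AttributeSkipper {
    // Returns the name index of each attribute, skipping over its contents.
    static List<Integer> readAttributeNameIndexes(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;       // two-byte attribute count
        List<Integer> nameIndexes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int nameIndex = in.getShort() & 0xFFFF; // index of the attribute's name
            int length = in.getInt();               // assumes length fits in a signed int
            in.position(in.position() + length);    // skip the body without decoding it
            nameIndexes.add(nameIndex);
        }
        return nameIndexes;
    }
}
```

A fuller reader would look up each name in the constant pool and decode only the attributes it understands, stepping over the rest exactly as shown.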
