All Java programs are compiled into class files that contain bytecodes, the machine language of the Java virtual machine. Here's a first look at Java's bytecodes.
Welcome to another installment of "Under The Hood." This column gives Java developers a glimpse of what is going on beneath their running Java programs. This month's article takes an initial look at the bytecode instruction set of the Java virtual machine (JVM). The article covers primitive types operated upon by bytecodes, bytecodes that convert between types, and bytecodes that operate on the stack. Subsequent articles will discuss other members of the bytecode family.
The bytecode format
Bytecodes are the machine language of the Java virtual machine. When a JVM loads a class file, it gets one stream of bytecodes for each method in the class. The bytecode streams are stored in the method area of the JVM. The bytecodes for a method are executed when that method is invoked during the course of running the program. They can be executed by interpretation, just-in-time compiling, or any other technique chosen by the designer of a particular JVM.
A method's bytecode stream is a sequence of instructions for the Java virtual machine. Each instruction consists of a one-byte opcode followed by zero or more operands. The opcode indicates the action to take. If more information is required before the JVM can take the action, that information is encoded into one or more operands that immediately follow the opcode.
Each type of opcode has a mnemonic. In the typical assembly language style, streams of Java bytecodes can be represented by their mnemonics followed by any operand values. For example, the following stream of bytecodes can be disassembled into mnemonics:
// Bytecode stream: 03 3b 84 00 01 1a 05 68 3b a7 ff f9
// Disassembly:
iconst_0      // 03
istore_0      // 3b
iinc 0, 1     // 84 00 01
iload_0       // 1a
iconst_2      // 05
imul          // 68
istore_0      // 3b
goto -7       // a7 ff f9
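The disassembly above can be reproduced mechanically. The following is a minimal sketch, not a general-purpose tool: the class name MiniDisassembler and its disassemble method are invented for illustration, and it recognizes only the seven opcodes that appear in the example stream. It shows how the opcode determines how many operand bytes follow, and how the two operand bytes of goto combine into a signed 16-bit branch offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; handles only the opcodes in the example stream.
public class MiniDisassembler {

    static List<String> disassemble(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            // Java bytes are signed, so mask to get the opcode as 0..255.
            int opcode = code[pc] & 0xff;
            switch (opcode) {
                case 0x03: out.add("iconst_0"); pc += 1; break;
                case 0x05: out.add("iconst_2"); pc += 1; break;
                case 0x1a: out.add("iload_0");  pc += 1; break;
                case 0x3b: out.add("istore_0"); pc += 1; break;
                case 0x68: out.add("imul");     pc += 1; break;
                case 0x84: {
                    // iinc: one unsigned byte (local variable index), one signed byte (increment).
                    int index = code[pc + 1] & 0xff;
                    int delta = code[pc + 2];
                    out.add("iinc " + index + ", " + delta);
                    pc += 3;
                    break;
                }
                case 0xa7: {
                    // goto: two bytes forming a signed, big-endian 16-bit offset.
                    int offset = (short) (((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff));
                    out.add("goto " + offset);
                    pc += 3;
                    break;
                }
                default:
                    throw new IllegalArgumentException(
                        "unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] stream = {
            0x03, 0x3b, (byte) 0x84, 0x00, 0x01, 0x1a,
            0x05, 0x68, 0x3b, (byte) 0xa7, (byte) 0xff, (byte) 0xf9
        };
        disassemble(stream).forEach(System.out::println);
    }
}
```

The operand bytes ff f9 decode to -7, which sends control from the goto at offset 9 back to the iinc at offset 2.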
The bytecode instruction set was designed to be compact. All instructions, except two that deal with table jumping, are aligned on byte boundaries. The total number of opcodes is small enough so that opcodes occupy only one byte. This helps minimize the size of class files that may be traveling across networks before being loaded by a JVM. It also helps keep the size of the JVM implementation small.
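Because every opcode fits in one byte, a naive interpreter can dispatch on it with a single switch statement. The sketch below is a toy under stated assumptions, not how any real JVM is implemented: the class MiniInterpreter, its run method, and the maxInstructions cutoff are all hypothetical (the example stream loops forever, so the cutoff exists only to make the demo terminate). It models the operand stack with an ArrayDeque and a single local variable slot.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter; supports only the opcodes in the example stream.
public class MiniInterpreter {

    // Executes at most maxInstructions instructions, then returns local variable 0.
    static int run(byte[] code, int maxInstructions) {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack
        int[] locals = new int[1];                 // local variable 0 only
        int pc = 0;
        for (int step = 0; step < maxInstructions; step++) {
            int opcode = code[pc] & 0xff;
            switch (opcode) {
                case 0x03: stack.push(0); pc += 1; break;              // iconst_0
                case 0x05: stack.push(2); pc += 1; break;              // iconst_2
                case 0x1a: stack.push(locals[0]); pc += 1; break;      // iload_0
                case 0x3b: locals[0] = stack.pop(); pc += 1; break;    // istore_0
                case 0x68: {                                           // imul
                    int b = stack.pop();
                    int a = stack.pop();
                    stack.push(a * b);
                    pc += 1;
                    break;
                }
                case 0x84:                                             // iinc index, const
                    locals[code[pc + 1] & 0xff] += code[pc + 2];
                    pc += 3;
                    break;
                case 0xa7:                                             // goto (signed 16-bit offset)
                    pc += (short) (((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff));
                    break;
                default:
                    throw new IllegalArgumentException(
                        "unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
        return locals[0];
    }

    public static void main(String[] args) {
        byte[] stream = {
            0x03, 0x3b, (byte) 0x84, 0x00, 0x01, 0x1a,
            0x05, 0x68, 0x3b, (byte) 0xa7, (byte) 0xff, (byte) 0xf9
        };
        // 2 setup instructions plus 3 loop iterations of 6 instructions each.
        System.out.println(run(stream, 20)); // prints 14
    }
}
```

Each pass through the loop increments the local variable and then doubles it, so its value goes 0, 2, 6, 14 over the first three iterations.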
All computation in the JVM centers on the stack. Because the JVM has no registers for storing arbitrary values, everything must be pushed onto the stack before it can be used in a calculation. Bytecode instructions therefore operate primarily on the stack. For example, in the above bytecode sequence a local variable is multiplied by two by first pushing the local variable onto the stack with the iload_0 instruction, then pushing two onto the stack with iconst_2. After both integers have been pushed onto the stack, the imul instruction effectively pops the two integers off the stack, multiplies them, and pushes the result back onto the stack. The result is popped off the top of the stack and stored back to the local variable by the istore_0 instruction. The JVM was designed as a stack-based machine rather than a register-based machine to facilitate efficient implementation on register-poor architectures such as the Intel