Compilation Overview
The process of compiling a set of source files into a corresponding set of class files is not a simple one, but can be generally divided into three stages. Different parts of source files may proceed through the process at different rates, on an "as needed" basis.
This process is handled by the JavaCompiler class.
- All the source files specified on the command line are read, parsed into syntax trees, and then all externally visible definitions are entered into the compiler's symbol tables.
- All appropriate annotation processors are called. If any annotation processors generate any new source or class files, the compilation is restarted, until no new files are created.
- Finally, the syntax trees created by the parser are analyzed and translated into class files. During the course of the analysis, references to additional classes may be found. The compiler will check the source and class path for these classes; if they are found on the source path, those files will be compiled as well, although they will not be subject to annotation processing.
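The stages above can be observed from outside the compiler through the public `javax.tools` and `com.sun.source.util.JavacTask` APIs. The following is a minimal sketch, not the internal `JavaCompiler` class described in this document: the class name `StagesSketch`, its helper `StringSource`, and the choice to pass `-proc:none` (which skips annotation processing) are assumptions made for illustration. It requires a JDK, since `ToolProvider.getSystemJavaCompiler()` returns null when no compiler is available.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.util.JavacTask;

import javax.lang.model.element.Element;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

final class StagesSketch {

    /** An in-memory source file, so the sketch needs nothing on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Runs parsing and analysis as separate steps and reports what each produced. */
    static List<String> run(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler available");
        }
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null,
                List.of("-proc:none"),   // assumption: skip annotation processing here
                null,
                List.of(new StringSource(className, code)));

        List<String> report = new ArrayList<>();
        try {
            // Stage 1: parse the source into syntax trees.
            for (CompilationUnitTree unit : task.parse()) {
                report.add("parsed " + unit.getSourceFile().getName());
            }
            // Stage 3 (first half): enter and analyze the parsed trees.
            for (Element element : task.analyze()) {
                report.add("analyzed " + element.getSimpleName());
            }
            // task.generate() would write class files; omitted so nothing is written.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return report;
    }

    public static void main(String[] args) {
        run("Hello", "class Hello { int answer() { return 42; } }")
                .forEach(System.out::println);
    }
}
```

Calling `parse()` and `analyze()` separately makes the stage boundaries visible; a normal compilation would simply invoke the task once and let the compiler schedule the work itself.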
Parse and Enter
Source files are processed for Unicode escapes and converted into a stream of tokens by the Scanner.
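As a rough illustration of the first step, the sketch below translates `\uXXXX` escapes in a string before any tokenizing would take place. It is a simplified stand-in written for this overview, not the code in `Scanner`; the class and method names are hypothetical. The eligibility rule it applies (a backslash starts an escape only when preceded by an even number of backslashes, and may be followed by one or more `u` characters) comes from the Java Language Specification.

```java
final class UnicodeEscapes {

    /** Returns the input with every eligible backslash-u escape replaced by its character. */
    static String translate(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int backslashRun = 0; // contiguous backslashes immediately before index i
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            boolean startsEscape = c == '\\'
                    && backslashRun % 2 == 0
                    && i + 1 < raw.length()
                    && raw.charAt(i + 1) == 'u';
            if (startsEscape) {
                int j = i + 1;
                while (j < raw.length() && raw.charAt(j) == 'u') {
                    j++; // any number of 'u' characters may follow the backslash
                }
                if (j + 4 > raw.length()) {
                    throw new IllegalArgumentException("truncated Unicode escape at index " + i);
                }
                int value = 0;
                for (int k = j; k < j + 4; k++) {
                    int digit = Character.digit(raw.charAt(k), 16);
                    if (digit < 0) {
                        throw new IllegalArgumentException("malformed Unicode escape at index " + i);
                    }
                    value = value * 16 + digit;
                }
                out.append((char) value);
                i = j + 4;
                backslashRun = 0; // the produced character takes no part in further escapes
                continue;
            }
            out.append(c);
            backslashRun = (c == '\\') ? backslashRun + 1 : 0;
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("int \\u0061 = 1;"));
    }
}
```

Doing this translation before tokenizing is what allows an escape to appear anywhere in a source file, including inside identifiers and keywords.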
The token stream is read by the Parser to create syntax trees, using a TreeMaker. Syntax trees are built from subtypes of JCTree, which implement com.sun.source.tree.Tree and its subtypes.
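Because the trees implement the public `com.sun.source.tree` interfaces, they can be inspected without touching `JCTree` directly. The sketch below parses a source string and collects the simple names of the classes it declares using `com.sun.source.util.TreeScanner`. The class name `TreeWalkSketch` and its method are hypothetical, and the example assumes it runs on a JDK that provides the system compiler.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

final class TreeWalkSketch {

    /** Parses the given source text and returns the class names in visiting order. */
    static List<String> classNames(String fileName, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler available");
        }
        JavaFileObject source = new SimpleJavaFileObject(
                URI.create("string:///" + fileName), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(source));

        List<String> names = new ArrayList<>();
        TreeScanner<Void, Void> scanner = new TreeScanner<Void, Void>() {
            @Override
            public Void visitClass(ClassTree node, Void unused) {
                names.add(node.getSimpleName().toString());
                return super.visitClass(node, unused); // descend into member classes
            }
        };
        try {
            for (CompilationUnitTree unit : task.parse()) {
                scanner.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(classNames("Outer.java", "class Outer { class Inner {} }"));
    }
}
```

Only parsing is performed here, so no symbols have been entered yet; the scanner sees pure syntax.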
Each tree is passed to Enter, which enters symbols for all the definitions encountered into the symbol table. This has to be done before analysis of trees which might reference those symbols.
The output from this phase is a To Do list, containing trees
that need to be analyzed and have class files generated.
Enter consists of phases; classes migrate from one phase to the next via queues.
class enter → Enter.uncompleted → MemberEnter (1)
  → MemberEnter.halfcompleted → MemberEnter (2)
  → To Do → (Attribute and Generate)
- In the first phase, all class symbols are entered into their enclosing scope, descending recursively down the tree for classes which are members of other classes. The class symbols are given a MemberEnter object as completer. In addition, if any package-info.java files are found, containing package annotations, then the top level tree node for the file is put on the To Do list as well.
- In the second phase, classes are completed using MemberEnter.complete(). Completion might occur on demand, but any classes that are not completed that way will be eventually completed by processing the uncompleted queue. Completion entails
  - (1) determination of a class's parameters, supertype and interfaces.
  - (2) entering all symbols defined in the class into its scope, with the exception of class symbols which have been entered in phase 1.
- After all symbols have been entered, any annotations that were encountered on those symbols will be analyzed and validated.
Whereas the first phase is organized as a sweep through all compiled syntax trees, the second phase is on demand. Members of a class are entered when the contents of a class are first accessed. This is accomplished by installing completer objects in class symbols for compiled classes which invoke the MemberEnter phase for the corresponding class tree.
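The completer mechanism can be pictured with a small model. The sketch below is a toy, not javac code: `ClassSym`, its `Consumer`-based completer, and the `uncompleted` queue are hypothetical stand-ins for the class symbols, `MemberEnter`, and `Enter.uncompleted` named above. It shows a symbol being completed on first access to its members, and the remaining symbols being completed later by draining the queue, with each completer running at most once.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

final class CompleterSketch {

    /** Toy stand-in for a class symbol; not javac's own symbol class. */
    static final class ClassSym {
        final String name;
        private final List<String> members = new ArrayList<>();
        private Consumer<ClassSym> completer; // non-null until the symbol is completed

        ClassSym(String name, Consumer<ClassSym> completer) {
            this.name = name;
            this.completer = completer;
        }

        /** Runs the completer if it has not run yet. */
        void complete() {
            if (completer != null) {
                Consumer<ClassSym> pending = completer;
                completer = null; // clear first so completion happens at most once
                pending.accept(this);
            }
        }

        /** Accessing the members triggers completion on demand. */
        List<String> members() {
            complete();
            return members;
        }

        void enterMember(String member) {
            members.add(member);
        }
    }

    /** Returns the order in which the toy symbols were completed. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Consumer<ClassSym> memberEnter = sym -> {
            log.add("complete " + sym.name);
            sym.enterMember(sym.name + ".field");
        };

        Queue<ClassSym> uncompleted = new ArrayDeque<>();
        ClassSym a = new ClassSym("A", memberEnter);
        ClassSym b = new ClassSym("B", memberEnter);
        uncompleted.add(a);
        uncompleted.add(b);

        b.members(); // on-demand completion: B is needed before the queue is processed

        while (!uncompleted.isEmpty()) {
            uncompleted.remove().complete(); // completes A; B is already done
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Clearing the completer before invoking it is the detail that keeps a symbol from being completed twice if completion is requested again while it is already in progress.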
Annotation Processing
This part of the process is handled by JavacProcessingEnvironment.
Conceptually, annotation processing is a preliminary step before compilation. This preliminary step consists of a series of rounds, each to parse and enter source files, and then to determine and invoke any appropriate annotation processors. After an initial round, subsequent rounds will be performed if any of the annotation processors that are called generate any new source files or class files that need to be part of the eventual compilation. Finally, when all necessary rounds have been completed, the actual compilation is performed.
In practice, the need to call any annotation processors may not
be known until after the files to be compiled have been parsed and
the declarations they contain have been determined. Therefore, to
avoid parsing and entering the source files unnecessarily in the
case where no annotation processing is performed,
JavacProcessingEnvironment
executes somewhat "out of
phase" with the conceptual model, while still fulfilling the
conceptual requirement that annotation processing as a whole
happens before the actual compilation.
JavacProcessingEnvironment
is invoked after the
files to be compiled have already been parsed and entered. It
determines whether any annotation processors need to be loaded and
called for any of the files being compiled. Normally, if any errors
occur during the overall compilation process, the process is
stopped at the next convenient point. However, an exception is made
if any missing symbols were detected during the Enter
phase, because definitions for these symbols may be generated as a
result of calling annotation processors.
If annotation processors are to be run, they are loaded and run in a separate class loader.
When the annotation processors have been run,
JavacProcessingEnvironment
determines if another round
of annotation processing is required. If so, it creates a new
JavaCompiler
object, reads any newly generated source
files that need to be parsed, and reuses any previously parsed
syntax trees. All these trees are entered into the symbol tables
for this new compiler instance, and annotation processors are
called as necessary. This step is repeated until no more rounds of
annotation processing are required.
Finally, JavacProcessingEnvironment
returns the
JavaCompiler
object to be used for the remainder of
the compilation. This will either be the original instance used to
parse and enter the initial set of files, or it will be the latest
instance created by JavacProcessingEnvironment
used to
start the final round of compilation.
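The round structure described in this section can be modeled as a loop that repeats until a round generates nothing new. The sketch below is a deliberately simplified model, not how `JavacProcessingEnvironment` is implemented: files are plain strings, the "processors" are a single hypothetical function from a round's inputs to the names of generated files, and details such as creating a fresh compiler instance per round are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class RoundsSketch {

    /**
     * Runs rounds until no new files are generated.
     * Returns the list of inputs seen by each round, in order.
     */
    static List<List<String>> runRounds(List<String> initialFiles,
                                        Function<List<String>, List<String>> processors) {
        List<List<String>> rounds = new ArrayList<>();
        Set<String> known = new LinkedHashSet<>(initialFiles);
        List<String> roundInput = new ArrayList<>(initialFiles);
        while (true) {
            rounds.add(roundInput);
            List<String> fresh = new ArrayList<>();
            for (String generated : processors.apply(roundInput)) {
                if (known.add(generated)) {
                    fresh.add(generated); // only files not seen before trigger a new round
                }
            }
            if (fresh.isEmpty()) {
                return rounds; // nothing new: processing is finished
            }
            roundInput = fresh;
        }
    }

    /** Hypothetical processor: generates "XImpl" for every input "X" that is not an Impl. */
    static List<String> generateImpls(List<String> inputs) {
        List<String> generated = new ArrayList<>();
        for (String file : inputs) {
            if (!file.endsWith("Impl")) {
                generated.add(file + "Impl");
            }
        }
        return generated;
    }

    public static void main(String[] args) {
        System.out.println(runRounds(List.of("Foo", "Bar"), RoundsSketch::generateImpls));
    }
}
```

In this model the second round sees only the generated files and produces nothing further, so the loop ends after two rounds.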
Analyse and Generate
Once all the files specified on the command line have been
parsed and entered into the compiler's symbol tables, and after any
annotation processing has occurred, JavaCompiler
can
proceed to analyse the syntax trees that were parsed with a view to
generating the corresponding class files.
While analysing the tree, references may be found to classes
which are required for successful compilation, but which were not
explicitly specified for compilation. Depending on the compilation
options, the source path and class path will be searched for such
class definitions. If the definition is found in a class file, the
class file will be read to determine the definitions in that class;
if the definition is found in a source file, the source file will
be automatically parsed, entered and put on the To Do list.
This is done by registering JavaCompiler as an implementation of Attr.SourceCompleter.
The work to analyse the tree and generate class files is performed by a series of visitors that process the entries on the compiler's To Do list. There is no requirement that these visitors should be applied in step for all source files, and indeed, memory issues would make that extremely undesirable. The only requirement is that each entry on the To Do list should eventually be processed by each of these visitors, unless the compilation is terminated early because of errors.
- Attr
  The top level classes are "attributed", using Attr, meaning that names, expressions and other elements within the syntax tree are resolved and associated with the corresponding types and symbols. Many semantic errors may be detected here, either by Attr, or by Check.
- Flow
  If there are no errors so far, flow analysis will be done for the class, using Flow. Flow analysis is used to check for definite assignment to variables, and unreachable statements, which may result in additional errors.
- TransTypes
  Code involving generic types is translated to code without generic types, using TransTypes.
- Lower
  "Syntactic sugar" is processed, using Lower, which rewrites syntax trees to eliminate particular types of subtree by substituting equivalent, simpler trees. This takes care of nested and inner classes, class literals, assertions, foreach loops, and so on. For each class that is processed, Lower returns a list of trees for the translated class and all its translated nested and inner classes.
  Although Lower normally processes top level classes, it will also process top level trees for package-info.java. For such trees, Lower will create a synthetic class to contain any annotations for the package.
- Gen
  Code for methods is generated by Gen, which creates the Code attributes containing the bytecodes needed by a JVM to execute the method. If that step is successful, the class is written out by ClassWriter.
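One schedule that satisfies the requirement above is to take each To Do entry through every visitor before moving to the next entry, rather than running each visitor over all entries in step. The sketch below models that schedule with strings only. It is an illustration, not javac's scheduler: the stage names are taken from the list above, while `TodoPipelineSketch`, the `failsInAttr` parameter, and the rule that later stages are skipped once any attribution error has been seen are simplifying assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.Set;

final class TodoPipelineSketch {

    static final List<String> STAGES = List.of("Attr", "Flow", "TransTypes", "Lower", "Gen");

    /**
     * Processes each To Do entry through the stages, one entry at a time.
     * Entries named in failsInAttr are treated as reporting an error during Attr.
     */
    static List<String> process(List<String> todo, Set<String> failsInAttr) {
        List<String> log = new ArrayList<>();
        Queue<String> queue = new ArrayDeque<>(todo);
        boolean errorsSoFar = false;
        while (!queue.isEmpty()) {
            String entry = queue.remove();
            for (String stage : STAGES) {
                if (errorsSoFar && !stage.equals("Attr")) {
                    break; // stages after Attr run only if there are no errors so far
                }
                log.add(stage + "(" + entry + ")");
                if (stage.equals("Attr") && failsInAttr.contains(entry)) {
                    errorsSoFar = true;
                }
            }
        }
        return log;
    }

    public static void main(String[] args) {
        process(List.of("A", "B"), Set.of()).forEach(System.out::println);
    }
}
```

Finishing one entry before starting the next means the trees for a class that has already been written out can be released while later entries are still being analyzed.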
Once a class has been written out as a class file, much of its syntax tree and the bytecodes that were generated will no longer be required. To save memory, references to these parts of the tree and symbols will be nulled out, to allow the memory to be recovered by the garbage collector.