Processing Code
Or: Doclets, Annotation Processors and Plugins: Oh My!
In addition to runtime reflection, the JDK provides several ways to analyze Java classes without loading them, whether they are found in source files or in compiled class files. This note discusses those ways, and the reasons to choose one over another.
History
- Before JDK 5.0
Prior to JDK 5.0, the only supported way to examine the structure and comments of Java classes was by using javadoc and the original Doclet API.
- JDK 5.0
In JDK 5.0, annotations were added to the Java language, along with other features, like enums and generics. Annotations were supported by an experimental "annotation processing tool" (apt) and a corresponding experimental Mirror API.
- JDK 6.0
In JDK 6.0, the apt tool was superseded by direct support for annotation processing in javac, and the Mirror API was replaced by two new Java SE APIs:
- The Language Model API, providing a way to model the elements of Java classes, based on information in either source files or compiled class files.
- The Annotation Processing API, providing a way to execute code within javac that could analyze the classes participating in the compilation, using the Language Model API.
Two further APIs were added in the same release:
- The Compiler API, providing a way to invoke compilers, such as javac, programmatically, including the ability to run annotation processors during the compilation.
- The Compiler Tree API, providing a way to examine the syntax tree of Java source code.
At this stage, support for accessing documentation comments was limited to the ability to access the raw string content of the comment using Elements.getDocComment(Element).
- JDK 8.0
In JDK 8.0, type annotations were added to the Java language, and the original Doclet API began to show its limitations. Array types in particular were imperfectly modeled as a reference to an underlying type, with the dimensionality indicated by a string containing an appropriate number of repetitions of "[]", which is inadequate for modeling the type annotations that may appear within the type. This was in contrast to the Language Model API, which was better designed, with future extensibility in mind.
Another shortcoming of the Doclet API was the Doclet type itself. Although documented as an interface, it was not an interface to be implemented by any instance of a doclet; it was merely a placeholder to specify the methods that should be provided by a doclet, and which would be called reflectively by the javadoc tool. This precluded use of features like the service loader to discover and load code found on an execution path, such as the class path or doclet path.
Also in JDK 8.0, the Compiler Tree API was extended to provide support for detailed analysis of documentation comments, by parsing them into a simple "syntax tree". The most notable benefit of the extension was the ability to determine the exact position in the underlying source file of any part of the documentation comment, making it possible to give informative error messages pointing at the exact character position in the source file. (This was previously impractical to do with the basic support for accessing the raw comment text.)
- JDK 9
In JDK 9, the Java Platform Module System was introduced. Instead of trying to extend the original Doclet API, the decision was made to replace it with a new Doclet API that leveraged both the Language Model API and Compiler Tree API, both of which were being extended to support modules, and which already had a better foundation than the original Doclet API.
Doclets, Annotation Processors, and Plugins
The Language Model API provides a way to analyze the elements of a Java program, based on information in source files and compiled class files. The Compiler Tree API provides a way to examine syntax trees for program elements and for documentation comments. But, in isolation, neither one directly provides the means necessary to invoke these APIs on specific source and class files. That task is achieved by using a doclet, annotation processor or javac plugin.
Doclets
Doclets provide code that can be executed by the JDK javadoc tool. Although the tool is primarily designed to support the ability to generate API documentation from element declarations and documentation comments, it is not limited to that purpose, and can run any user-supplied doclet, which can use the Language Model API and Compiler Tree API to analyze the packages, classes and files specified on the command line.
The javadoc tool provides a rich set of command line options to specify the elements to be processed, and individual doclets can declare additional doclet-specific command-line options as well.
As well as using the command line to invoke the javadoc tool, you can also invoke javadoc programmatically in two different ways:
- Use an instance of DocumentationTool, typically obtained from the javax.tools.ToolProvider class. This API uses JavaFileManager to access files, and provides direct API support for adding modules into the compilation environment.
- Use an instance of java.util.spi.ToolProvider obtained by calling ToolProvider.findFirst("javadoc"). This API provides functionality that is equivalent to command-line invocation of the javadoc tool without the overhead of creating a new separate process to run it.
Note that the two classes named "ToolProvider" are in different packages and are distinct and unrelated.
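The two programmatic entry points can be sketched side by side as follows. This is a minimal illustration, not a definitive recipe: the class name RunJavadoc and its method names are hypothetical, and the example assumes it runs on a full JDK (9 or later) in which the javadoc tool is available.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;
import javax.tools.DocumentationTool;
import javax.tools.JavaFileObject;
import javax.tools.StandardJavaFileManager;

public class RunJavadoc {   // hypothetical name, for illustration only

    /** Way 1: javax.tools.DocumentationTool, with files supplied through a file manager. */
    static boolean viaDocumentationTool(List<String> options, List<File> sources) throws IOException {
        DocumentationTool tool = javax.tools.ToolProvider.getSystemDocumentationTool();
        if (tool == null) {
            throw new IllegalStateException("javadoc is not available in this runtime");
        }
        try (StandardJavaFileManager fm = tool.getStandardFileManager(null, null, null)) {
            Iterable<? extends JavaFileObject> units = fm.getJavaFileObjectsFromFiles(sources);
            // A null doclet class selects the standard doclet; a null writer means System.err.
            return tool.getTask(null, fm, null, null, options, units).call();
        }
    }

    /** Way 2: java.util.spi.ToolProvider, equivalent to the command line but in-process. */
    static int viaToolProvider(PrintWriter out, PrintWriter err, String... args) {
        java.util.spi.ToolProvider javadoc = java.util.spi.ToolProvider.findFirst("javadoc")
                .orElseThrow(() -> new IllegalStateException("javadoc is not available"));
        return javadoc.run(out, err, args);
    }
}
```

The fully qualified names are used deliberately, to keep the two unrelated ToolProvider classes apart. The first form suits callers that already work with file managers and file objects; the second takes the same argument strings as the command-line tool and returns its exit code.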
Annotation Processors
Annotation processing is a standard feature of the Java SE platform, using standard APIs. Annotation processors are executed by javac while compiling code, and may even create additional files to be compiled. Despite the name, annotation processors are not restricted to just processing annotations, and may be used to analyze any classes involved in a compilation, whether found in source form or compiled class form, and whether or not they contain any annotations. An annotation processor may also access the documentation comments for declarations found in source files: it may access the comment as either raw text or as a parsed DocCommentTree.
Annotation processing occurs at a specific point in the timeline of a compilation, after all source files and classes specified on the command line have been read, and analyzed for the types and members they contain, but before the contents of any method bodies have been analyzed.
The javac -proc option can be used to disable annotation processing, or to instruct javac to discontinue the compilation when annotation processing has been completed. In the latter case, javac will not analyze method bodies or generate class files for any source files involved in the compilation.
Annotation processing imposes a certain overhead on the compilation, and the ability to pass options into an instance of an annotation processor is somewhat limited: you have to use the javac option -A<name>=<value> to pass name-value pairs to an annotation processor.
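As a rough sketch of how a processor might read such a name-value pair, the following example declares one supported option and reports its value. The class name OptionProcessor, the option name "greeting" and the demo method are all hypothetical; the demo uses the Compiler API to stand in for a command line containing -proc:only and -Agreeting=<value>.

```java
import java.net.URI;
import java.util.List;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.annotation.processing.SupportedOptions;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

@SupportedAnnotationTypes("*")
@SupportedOptions("greeting")   // hypothetical option, passed as -Agreeting=<value>
public class OptionProcessor extends AbstractProcessor {

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        if (!roundEnv.processingOver()) {
            // Options arrive as a map of name to value; the value is a plain string.
            String greeting = processingEnv.getOptions().getOrDefault("greeting", "(not set)");
            processingEnv.getMessager().printMessage(Diagnostic.Kind.NOTE, "greeting = " + greeting);
        }
        return false;   // do not claim any annotations
    }

    /** Runs the processor on a tiny in-memory source file and returns the reported messages. */
    public static String demo(String value) {
        JavaFileObject src = new SimpleJavaFileObject(
                URI.create("string:///Hello.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return "class Hello { }";
            }
        };
        DiagnosticCollector<JavaFileObject> diags = new DiagnosticCollector<>();
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        JavaCompiler.CompilationTask task = javac.getTask(null, null, diags,
                List.of("-proc:only", "-Agreeting=" + value), null, List.of(src));
        task.setProcessors(List.of(new OptionProcessor()));
        task.call();
        StringBuilder messages = new StringBuilder();
        for (Diagnostic<? extends JavaFileObject> d : diags.getDiagnostics()) {
            messages.append(d.getMessage(null)).append('\n');
        }
        return messages.toString();
    }
}
```

Because every option is a single string keyed by name, anything more structured (lists, nested settings) has to be encoded into that string by convention.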
javac Plugins
Compared to using an annotation processor, a javac plugin imposes almost no overhead on the compilation; it provides a more flexible mechanism to specify when plugin code should be executed; and provides a more flexible, albeit more basic, way of specifying options via a standard "argv"-style array of strings. However, it is also a JDK-specific feature and not a Java SE feature.
Plugins are loaded by the service loader. They are initially invoked early in the compilation lifecycle, after command line options have been analyzed. At that time, a typical plugin will call JavacTask.addTaskListener to register a listener to be called at subsequent times during the compilation.
While powerful, plugins are a lower-level feature than either doclets or annotation processors, and require a more detailed understanding of the overall compilation pipeline.
Examples
The following examples show how to use different APIs to analyze program elements and documentation comments.
Using visitors and scanners
One of the simplest ways to understand the structure of elements and tree nodes is to print them out. The Language Model API and Compiler Tree API use the visitor pattern to make it easy to navigate around the different data structures. "Visitor" classes are used to dispatch to different methods depending on the kind of Element or DocTree that is given, so that different actions can be taken for different kinds of item. "Scanner" classes are a special kind of visitor whose default behavior is to recursively navigate the children of each node. In both cases, you can either implement or override a single method to affect the behavior for all kinds of nodes, or you can implement or override individual methods to affect the behavior for different kinds of nodes.
The following class shows how to display the structure of each of a list of elements, and for any element that is encountered, if it has an associated documentation comment, the structure of the comment is displayed as well.
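A minimal sketch of such a class is given here. It is a reconstruction under stated assumptions rather than the original listing: the nested class names, the output format and the demo method are illustrative choices. It overrides the single scan method in each scanner so that every kind of node is handled the same way. ElementScanner9 is used for brevity; sources that use newer language features such as records would need a later scanner such as ElementScanner14.

```java
import com.sun.source.doctree.DocCommentTree;
import com.sun.source.doctree.DocTree;
import com.sun.source.doctree.TextTree;
import com.sun.source.util.DocTreeScanner;
import com.sun.source.util.DocTrees;
import com.sun.source.util.JavacTask;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.List;
import javax.lang.model.element.Element;
import javax.lang.model.util.ElementScanner9;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class ShowCode {
    private final DocTrees trees;
    private final PrintStream out;

    public ShowCode(DocTrees trees, PrintStream out) {
        this.trees = trees;
        this.out = out;
    }

    /** Prints each element, its documentation comment (if any), and its enclosed elements. */
    public void show(Iterable<? extends Element> elements) {
        new ShowElements().scan(elements, 0);
    }

    /** Scanner for elements; the second argument is the current indentation depth. */
    private class ShowElements extends ElementScanner9<Void, Integer> {
        @Override
        public Void scan(Element e, Integer depth) {
            out.println("  ".repeat(depth) + e.getKind() + " " + e);
            DocCommentTree comment = trees.getDocCommentTree(e);
            if (comment != null) {
                new ShowDocTrees().scan(comment, depth + 1);
            }
            return super.scan(e, depth + 1);   // recurse into enclosed elements
        }
    }

    /** Scanner for the nodes of a parsed documentation comment. */
    private class ShowDocTrees extends DocTreeScanner<Void, Integer> {
        @Override
        public Void scan(DocTree t, Integer depth) {
            if (t == null) {
                return null;
            }
            String text = (t instanceof TextTree) ? " \"" + ((TextTree) t).getBody() + "\"" : "";
            out.println("  ".repeat(depth) + "| " + t.getKind() + text);
            return super.scan(t, depth + 1);   // recurse into child nodes
        }
    }

    /** Illustrative driver: analyzes a small in-memory class and returns the printed structure. */
    public static String demo() {
        String code = "/**\n * A greeting.\n * @since 1.0\n */\npublic class Hello { void hi() { } }";
        JavaFileObject src = new SimpleJavaFileObject(
                URI.create("string:///Hello.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };
        JavacTask task = (JavacTask) ToolProvider.getSystemJavaCompiler()
                .getTask(null, null, null, null, null, List.of(src));
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            new ShowCode(DocTrees.instance(task), new PrintStream(bytes, true)).show(task.analyze());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toString();
    }
}
```

The demo method is only a convenience for trying the class in isolation; it is not part of the technique itself.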
The class is simple enough, but it does not show how it can be invoked.
The following examples show how code like this can be invoked in different ways, using either javadoc or javac.
Using a simple doclet
The following example shows how to write a doclet that runs code to analyze program elements and documentation comments, such as the ShowCode example.
For any doclet, the run method is the one called by javadoc to process the items specified on the command line. It is passed a DocletEnvironment object, which contains the information needed for the doclet to proceed.
In this example, the code obtains the collection of elements specified on the command line with any of the available command-line options, and passes that collection to an instance of ShowCode to display their details.
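A doclet along these lines might look like the following sketch. It is not the original listing: to keep the example self-contained, a small anonymous scanner stands in for ShowCode and prints only the kind and name of each element, noting whether a documentation comment is present.

```java
import com.sun.source.util.DocTrees;
import java.util.Locale;
import java.util.Set;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.util.ElementScanner9;
import jdk.javadoc.doclet.Doclet;
import jdk.javadoc.doclet.DocletEnvironment;
import jdk.javadoc.doclet.Reporter;

public class ShowDoclet implements Doclet {

    @Override
    public void init(Locale locale, Reporter reporter) {
        // Nothing to set up in this sketch.
    }

    @Override
    public String getName() {
        return "ShowDoclet";
    }

    @Override
    public Set<? extends Doclet.Option> getSupportedOptions() {
        return Set.of();   // no doclet-specific command-line options
    }

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean run(DocletEnvironment env) {
        DocTrees trees = env.getDocTrees();
        // Simplified stand-in for ShowCode: print each specified element and its children.
        new ElementScanner9<Void, Integer>() {
            @Override
            public Void scan(Element e, Integer depth) {
                String mark = (trees.getDocCommentTree(e) != null) ? " (documented)" : "";
                System.out.println("  ".repeat(depth) + e.getKind() + " " + e + mark);
                return super.scan(e, depth + 1);
            }
        }.scan(env.getSpecifiedElements(), 0);
        return true;   // report success to the javadoc tool
    }
}
```

Unlike the original Doclet API, the methods here are ordinary interface methods, so the compiler checks that the doclet provides them.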
The doclet can be run by using a command based on the following template, which puts the compiled classes for the doclet on the doclet path, specifies the name of the doclet, places the files to be analyzed on the sourcepath, and specifies a package to be analyzed.
When the code is run on itself, as shown in the preceding template, it generates the following output:
Using an annotation processor
An annotation processor will typically extend AbstractProcessor, in which case it must provide a process method to perform the work for each round of annotation processing, and may optionally provide an init method to access additional state provided by the execution environment (typically, javac).
The following example shows how to write an annotation processor that runs code to analyze program elements and documentation comments, such as the ShowCode example.
In this example, ShowCode is run on the set of root elements for each round of annotation processing.
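One possible shape for such a processor is sketched below. It is an illustration under assumptions, not the original listing: a simple recursive method stands in for ShowCode, the PrintStream constructor and the demo method are hypothetical conveniences, and the demo uses the Compiler API with -proc:only in place of a command line. The sketch also shows both forms of access to a documentation comment: raw text and parsed tree.

```java
import com.sun.source.doctree.DocCommentTree;
import com.sun.source.util.DocTrees;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.net.URI;
import java.util.List;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.ProcessingEnvironment;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

@SupportedAnnotationTypes("*")   // run even when no annotations are present
public class ShowProcessor extends AbstractProcessor {
    private final PrintStream out;
    private DocTrees trees;

    public ShowProcessor() { this(System.out); }                 // used when javac creates the processor
    public ShowProcessor(PrintStream out) { this.out = out; }    // hypothetical, used by demo()

    @Override
    public synchronized void init(ProcessingEnvironment env) {
        super.init(env);
        trees = DocTrees.instance(env);   // extra state supplied by javac
    }

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (Element root : roundEnv.getRootElements()) {
            show(root, 0);
        }
        return false;
    }

    private void show(Element e, int depth) {
        String indent = "  ".repeat(depth);
        out.println(indent + e.getKind() + " " + e);
        String raw = processingEnv.getElementUtils().getDocComment(e);   // raw text, or null
        DocCommentTree parsed = trees.getDocCommentTree(e);              // parsed tree, or null
        if (raw != null && parsed != null) {
            out.println(indent + "  raw: " + raw.strip());
            out.println(indent + "  block tags: " + parsed.getBlockTags().size());
        }
        for (Element child : e.getEnclosedElements()) {
            show(child, depth + 1);
        }
    }

    /** Illustrative driver: runs the processor on a small in-memory class. */
    public static String demo() {
        String code = "/**\n * A greeting.\n * @since 1.0\n */\npublic class Hello { }";
        JavaFileObject src = new SimpleJavaFileObject(
                URI.create("string:///Hello.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        JavaCompiler.CompilationTask task = ToolProvider.getSystemJavaCompiler()
                .getTask(null, null, null, List.of("-proc:only"), null, List.of(src));
        task.setProcessors(List.of(new ShowProcessor(new PrintStream(bytes, true))));
        task.call();
        return bytes.toString();
    }
}
```

The wildcard in @SupportedAnnotationTypes is what lets the processor see classes that carry no annotations at all.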
The annotation processor can be run by using a command based on the following template, which puts the compiled classes for the processor on the processor path, specifies the name of the processor, specifies that the compiler should stop after annotation processing, and specifies some source files to be analyzed.
When the code is run on itself, as shown in the preceding template, it generates the same output as when run with ShowDoclet, except for the first couple of lines, which were generated by the javadoc tool, and not the doclet.
Using a javac plugin
The following example shows how to write a javac plugin that runs code to analyze program elements and documentation comments, such as the ShowCode example.
In this example, ShowCode is run on each type element after it has been analyzed by javac, prior to code being generated for the element.
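A plugin of this kind could be sketched as follows. As with the other sketches, this is not the original listing: a simple recursive method stands in for ShowCode, and the PrintStream constructor and demo method are hypothetical. The demo calls init directly on a JavacTask, which approximates what javac does after loading a plugin through the service loader.

```java
import com.sun.source.util.DocTrees;
import com.sun.source.util.JavacTask;
import com.sun.source.util.Plugin;
import com.sun.source.util.TaskEvent;
import com.sun.source.util.TaskListener;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.List;
import javax.lang.model.element.Element;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class ShowPlugin implements Plugin {
    private final PrintStream out;

    public ShowPlugin() { this(System.out); }                 // required by the service loader
    public ShowPlugin(PrintStream out) { this.out = out; }    // hypothetical, used by demo()

    @Override
    public String getName() {
        return "ShowPlugin";   // the name by which the plugin is selected
    }

    @Override
    public void init(JavacTask task, String... args) {
        DocTrees trees = DocTrees.instance(task);
        // Called early in the compilation: register a listener for later events.
        task.addTaskListener(new TaskListener() {
            @Override
            public void finished(TaskEvent event) {
                if (event.getKind() == TaskEvent.Kind.ANALYZE) {
                    show(trees, event.getTypeElement(), 0);
                }
            }
        });
    }

    private void show(DocTrees trees, Element e, int depth) {
        String mark = (trees.getDocCommentTree(e) != null) ? " (documented)" : "";
        out.println("  ".repeat(depth) + e.getKind() + " " + e + mark);
        for (Element child : e.getEnclosedElements()) {
            show(trees, child, depth + 1);
        }
    }

    /** Illustrative driver: attaches the plugin to a task by hand and analyzes a small class. */
    public static String demo() {
        JavaFileObject src = new SimpleJavaFileObject(
                URI.create("string:///Hello.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return "/** A greeting. */\npublic class Hello { void hi() { } }";
            }
        };
        JavacTask task = (JavacTask) ToolProvider.getSystemJavaCompiler()
                .getTask(null, null, null, null, null, List.of(src));
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        new ShowPlugin(new PrintStream(bytes, true)).init(task);
        try {
            task.analyze();   // analysis only: no class files are generated
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toString();
    }
}
```

Filtering on TaskEvent.Kind.ANALYZE is the choice that places the work after analysis of a type and before code generation; other event kinds give access to other points in the compilation.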
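Plugins must normally be compiled and packaged into a JAR file, because they require the use of a service configuration file: a file in the JAR named META-INF/services/com.sun.source.util.Plugin, whose content is the fully qualified name of the plugin class.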
The plugin can be run by using a command based on the following template, which puts a jar file containing the plugin on the processor path, specifies the name of the plugin, and specifies some source files to be analyzed during the compilation.
When the code is run on itself, as shown in the preceding template, it generates the same output as when run with ShowDoclet, except for the first couple of lines, which were generated by the javadoc tool, and not the doclet.