Issues

Performance

javac using Java.g is significantly slower than javac using the standard parser. There are two potential solutions.

Tweak Java.g where possible to improve performance. There is a trade-off here between clarity and performance, since it is also a goal to unify the grammars used by javac and presented in the JLS.
Improve the code generated by ANTLR. See, for example, "Faster expression parsing for Antlr" by Terence Parr.

The following table compares the performance of the ANTLR javac compiler against that of the standard compiler. Two bodies of code were tested:

The OpenJDK langtools repository: 156,336 lines in 662 files
The OpenJDK jdk repository: 2,563,605 lines in 7,569 files

For each body of code, the following measurements were taken:

The time taken to just scan (lex) the .java source files
The time taken to scan and parse the .java source files
The time taken to compile the .java source files
The time taken to complete a standard build

Times in the following table are in seconds. For the first three rows, the times were measured as elapsed time using System.currentTimeMillis(). For the full build, the times were measured with the Unix time command, using the sum of user time and system time.
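As a rough illustration of the elapsed-time measurement described here, the following minimal sketch wraps an arbitrary task between two calls to System.currentTimeMillis(). The class name ElapsedTimer, the method timeSeconds, and the placeholder workload are hypothetical and are not taken from the actual measurement harness.

```java
public class ElapsedTimer {
    /** Runs the task once and returns the elapsed wall-clock time in seconds. */
    static double timeSeconds(Runnable task) {
        long start = System.currentTimeMillis();
        task.run();
        long end = System.currentTimeMillis();
        return (end - start) / 1000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption); a real measurement would
        // scan, parse, or compile the .java source files here.
        double seconds = timeSeconds(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
        });
        System.out.printf("elapsed: %.3f s%n", seconds);
    }
}
```

Elapsed time measured this way includes everything that happens in the process during the task, such as class loading and garbage collection, which is one reason it is reported separately from the user-plus-system time of the full build.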

	langtools			jdk
	standard javac	ANTLR javac	ratio	standard javac	ANTLR javac	ratio
lex source files	0.111	1.223	11.02	18.181	190.54	10.48
lex and parse source files	0.366	2.609	7.12	47.123	346.34	7.35
compile source files	9.176	18.730	2.04	84.481	168.161	1.99
full build	47.915	58.751	1.23	487.68	574.96	1.17

Although the ANTLR lexer and parser are roughly 7 to 11 times slower than their hand-written counterparts, the impact is much smaller in typical real-world usage: compiling the source files takes about twice as long, and a full build is roughly 17% to 23% slower.
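The ANTLR-to-standard ratios in the table (the third column of each group) can be recomputed from the raw times. The sketch below does so for the langtools figures. The class and method names are hypothetical, and the printed values may differ from the table in the last digit depending on rounding.

```java
public class SlowdownRatio {
    /** Returns how many times slower the ANTLR-based run was than the standard run. */
    static double ratio(double standardSeconds, double antlrSeconds) {
        return antlrSeconds / standardSeconds;
    }

    public static void main(String[] args) {
        // langtools figures copied from the table above, in seconds.
        String[] phases = {"lex", "lex and parse", "compile", "full build"};
        double[] standard = {0.111, 0.366, 9.176, 47.915};
        double[] antlr = {1.223, 2.609, 18.730, 58.751};
        for (int i = 0; i < phases.length; i++) {
            System.out.printf("%-14s %.2f%n", phases[i], ratio(standard[i], antlr[i]));
        }
    }
}
```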

Error handling

The error messages generated by ANTLR are typically not as detailed as those generated by the standard parser, which can often give more hints about what is expected at any point.

The standard parser has features that make it suitable for use in an IDE such as NetBeans but that are not necessarily required for a batch compiler. In particular, it supports improved error recovery after a syntax error has been found, and it supports retaining valid subtrees in the context of a syntax error. For example, consider this input:

for (int i = 0; i < 10; i++) ?

The trees for "int i = 0;", "i < 10", and "i++" should not necessarily be discarded just because there is an error in the body of the for-loop. In a case like this, the standard parser will return an ErrorTree containing any trees which were successfully parsed before the syntax error was discovered. This allows an IDE to analyze those trees even though the complete statement is syntactically malformed.
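A client can observe this behavior through the public Compiler Tree API, where the error node is exposed as com.sun.source.tree.ErroneousTree and its retained subtrees are available from getErrorTrees(). The following is a minimal sketch, assuming a JDK with the jdk.compiler module available at run time. The class ErrorRecoveryDemo, its Result holder, and the wrapper class T around the for-loop are hypothetical. Exactly which subtrees end up inside an error node, and which survive as ordinary trees, depends on the javac version, so the sketch only prints what it finds.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.ErroneousTree;
import com.sun.source.tree.Tree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class ErrorRecoveryDemo {
    /** Holds the number of reported errors and the subtrees kept inside error nodes. */
    static final class Result {
        long errors;
        final List<String> salvaged = new ArrayList<>();
    }

    /** Parses the source in memory (no attribution or code generation) and collects error nodes. */
    static Result parse(String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///T.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, diagnostics, null, null, Collections.singletonList(file));
        Result result = new Result();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitErroneous(ErroneousTree node, Void p) {
                        List<? extends Tree> kept = node.getErrorTrees();
                        if (kept != null) {
                            for (Tree t : kept) {
                                result.salvaged.add(t.getKind() + ": " + t);
                            }
                        }
                        return super.visitErroneous(node, p);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        result.errors = diagnostics.getDiagnostics().stream()
                .filter(d -> d.getKind() == Diagnostic.Kind.ERROR)
                .count();
        return result;
    }

    public static void main(String[] args) {
        // The for-loop from the text, wrapped in a hypothetical class and method.
        Result r = parse("class T { void m() { for (int i = 0; i < 10; i++) ? } }");
        System.out.println("syntax errors reported: " + r.errors);
        System.out.println("subtrees retained in error nodes: " + r.salvaged);
    }
}
```

Because the task stops after parse(), an IDE-style client could run further analysis on whatever trees were returned, even though the compilation unit as a whole would not compile.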

Interaction with IDEs and other tools

While this is neither encouraged nor necessarily supported, some downstream clients of javac may depend on internal APIs, such as the lexer and parser classes. These clients may be significantly affected by a change to an ANTLR-based lexer and parser.