JEP 296: Consolidate the JDK Forest into a Single Repository

Author: Joseph D. Darcy
Owner: Joe Darcy
Type: Infrastructure
Scope: Implementation
Status: Closed / Delivered
Release: 10
Component: infrastructure / build
Discussion: jdk9 dash dev at openjdk dot java dot net
Effort: M
Duration: M
Relates to: JEP 369: Migrate to GitHub
Reviewed by: Brian Goetz, Mikael Vidstedt
Endorsed by: Mark Reinhold
Created: 2016/10/07 18:42
Updated: 2019/11/07 19:40
Issue: 8167368

Summary

Combine the numerous repositories of the JDK forest into a single repository in order to simplify and streamline development.

Non-Goals

Adding the FX sources to the JDK forest is not part of the proposal.

Motivation

For many years, the full code base of the JDK has been broken into numerous Mercurial repositories. In JDK 9 there are eight repos: root, corba, hotspot, jaxp, jaxws, jdk, langtools, and nashorn.

While this model of multiple repos offers some advantages, it also has many downsides and does a poor job of supporting various desirable source-code management operations. In particular, it is not possible to atomically commit inter-dependent changesets across repositories. For example, if the code for a single bug fix or RFE spans both the jdk and hotspot repos today, the change to both repositories cannot be done atomically in the forest hosting those two distinct repos. Changes spanning multiple repos are a common occurrence; over 1,100 bug ids have been reused across repositories in the JDK forest. That figure is only a lower bound on the number of logically repo-crossing bugs, since some engineers use separate bug ids to push to different repos.

This mismatch between the divisions of the Mercurial repos and the unity of the engineering dilutes one of the main benefits of modern source-code management: tracking changes to sets of files rather than just individual files. As a corollary, this mismatch between SCM transactions and logical transactions complicates the use of tools such as Mercurial bisect.

The individual repos don't have a development cycle separate from the JDK as a whole; all the repos advance in lockstep with the JDK promotion cycle. The multiplicity of repos presents a larger-than-necessary barrier to entry for new developers and has led to workarounds such as the "get source" script.

Description

To address these issues, a prototype of a consolidated forest has been developed. The prototype is available at:

http://hg.openjdk.java.net/jdk10/consol-proto/

Some of the supporting conversion scripts used to create the prototype are attached as unify.zip.

In the prototype, the eight repositories have been combined into a single repository using an automated conversion script that preserves history on a per-file level, with the consolidated forest being synchronized at the tags used to mark JDK promotions. The changeset comments and creation dates are also preserved.

The prototype has another level of code reorganization. In the consolidated forest, code for Java modules is generally combined under a single top-level src directory. For example, today in the JDK forest there are module-based directories like

$ROOT/jdk/src/java.base
...
$ROOT/langtools/src/java.compiler
...

In the consolidated forest, this code is instead organized as

$ROOT/src/java.base
$ROOT/src/java.compiler
...

As a consequence, from the root of the repository the relative path of a source file in a module is preserved after the consolidation and src directory combination.
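
The src directory combination can be sketched as a small path-mapping function. This is a minimal illustration, not the conversion script used for the prototype: the function name, the `MODULE_REPOS` set, and the example file name are assumptions, and only the two repos shown in the listing above are covered.

```python
from pathlib import PurePosixPath

# Repos whose src directories hold Java modules in the examples above.
# Other repos (for example hotspot) are deliberately left out: their
# layout after consolidation is not spelled out in the text.
MODULE_REPOS = {"jdk", "langtools"}  # assumption: only the repos shown above


def consolidated_src_path(forest_path: str) -> str:
    """Map a forest-relative path like 'jdk/src/java.base/...' to 'src/java.base/...'."""
    parts = PurePosixPath(forest_path).parts
    if len(parts) >= 3 and parts[0] in MODULE_REPOS and parts[1] == "src":
        # Drop the leading repo name; everything from 'src' down is kept,
        # which is why the path relative to the old repo root is preserved.
        return str(PurePosixPath(*parts[1:]))
    return forest_path  # not covered by this sketch; returned unchanged


# 'Example.java' is a hypothetical file name used only for illustration.
print(consolidated_src_path("jdk/src/java.base/Example.java"))
```

Paths that do not match the pattern are returned unchanged rather than guessed at, so the sketch never invents a location for repos the text does not describe.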

An analogous but less aggressive reorganization is done for the test directories to go from

$ROOT/jdk/test/Foo.java
$ROOT/langtools/test/Bar.java

to

$ROOT/test/jdk/Foo.java
$ROOT/test/langtools/Bar.java
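
The test directory move swaps the repo name and the test directory rather than dropping the repo name. A minimal sketch of that rule follows; the function name and the `TEST_REPOS` set are assumptions, limited to the two repos shown above.

```python
from pathlib import PurePosixPath

# Assumption: only the repos that appear in the example above are handled.
TEST_REPOS = {"jdk", "langtools"}


def consolidated_test_path(forest_path: str) -> str:
    """Map '<repo>/test/...' in the split forest to 'test/<repo>/...'."""
    parts = PurePosixPath(forest_path).parts
    if len(parts) >= 3 and parts[0] in TEST_REPOS and parts[1] == "test":
        repo = parts[0]
        # Swap the first two components; the rest of the path is kept as-is.
        return str(PurePosixPath("test", repo, *parts[2:]))
    return forest_path  # not covered by this sketch; returned unchanged


print(consolidated_test_path("jdk/test/Foo.java"))        # test/jdk/Foo.java
print(consolidated_test_path("langtools/test/Bar.java"))  # test/langtools/Bar.java
```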

Since the effort is currently a prototype, not all portions of it are entirely complete and the fit and finish can be improved in some areas. The HotSpot C/C++ sources are moved to the shared src directory alongside the modularized Java code.

While the regression tests will run with the current state of the prototype, further consolidations of the jtreg configuration files are possible and may be done in the future.

Alternatives

One alternative is to simply stay with the current set of repositories. The history of some or all of the repositories could have been dropped when moving to a single repository, but that was rejected. Consolidating a core subset of the repositories was considered, but rejected in favor of the simplicity of a single repository.

Testing

To validate the file contents, for each promotion tag a script was used to verify that the contents of the split forest at that tag matched the contents of the consolidated repository at that tag. For a recent JDK 9 tag, builds of the split forest and consolidated forest at the same tag were compared; there were only minor and explainable differences.
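
A content check of this kind might look like the sketch below. It is not the script used for the JEP. It assumes both trees have already been checked out at the same tag, and the names `tree_digests`, `compare_trees`, and the caller-supplied `map_path` function (which translates split-forest paths to consolidated paths) are all hypothetical.

```python
import hashlib
import os
from typing import Callable, Dict, List


def tree_digests(root: str) -> Dict[str, str]:
    """Map each file's '/'-separated path relative to root to its SHA-256 digest."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Mercurial metadata directories; only working-copy files matter.
        dirnames[:] = [d for d in dirnames if d != ".hg"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as f:
                digests[rel] = hashlib.sha256(f.read()).hexdigest()
    return digests


def compare_trees(split_root: str, consolidated_root: str,
                  map_path: Callable[[str], str]) -> List[str]:
    """Return a sorted list of differences; an empty list means the trees match."""
    split = {map_path(p): d for p, d in tree_digests(split_root).items()}
    merged = tree_digests(consolidated_root)
    problems = []
    for path in sorted(split.keys() | merged.keys()):
        if path not in merged:
            problems.append(f"missing in consolidated: {path}")
        elif path not in split:
            problems.append(f"extra in consolidated: {path}")
        elif split[path] != merged[path]:
            problems.append(f"content differs: {path}")
    return problems
```

Running `compare_trees` once per promotion tag, after updating both working copies to that tag, would approximate the per-tag check described above.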

Risks and Assumptions

The testing described above should mitigate the most serious risks of file corruption and faulty builds. While the major portions of the needed work for the consolidation are complete in the prototype, various smaller supporting features may not be finished before the consolidation is put into production. The pre- and post-consolidation code bases are not related in a Mercurial sense. Diffs (with suitably massaged paths) will have to be used for forward- and back-ports as opposed to exporting and importing changesets.
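
Massaging the paths in a diff could be sketched as follows. This is an illustration under stated assumptions, not a supported tool: `massage_path` and `massage_diff` are hypothetical names, only the jdk and langtools layouts shown in the Description are handled, and only the `--- a/` and `+++ b/` header lines of a unified diff are rewritten. A `diff --git` line, if present, would need the same treatment, and a robust tool would parse hunks so that body lines resembling headers are not touched.

```python
import re

# Assumption: only repos whose layout is shown in the Description are handled.
KNOWN_REPOS = {"jdk", "langtools"}

# Matches the file header lines of a unified diff, e.g. '--- a/test/Foo.java'.
_HEADER = re.compile(r"^(--- a/|\+\+\+ b/)(.*)$")


def massage_path(repo: str, path: str) -> str:
    """Translate a path relative to a split repo into the consolidated layout."""
    if repo not in KNOWN_REPOS:
        raise ValueError(f"layout for repo {repo!r} is not covered by this sketch")
    if path.startswith("test/"):
        # test/Foo.java in the jdk repo becomes test/jdk/Foo.java.
        return f"test/{repo}/{path[len('test/'):]}"
    # src/<module>/... keeps the same repo-relative path after consolidation.
    return path


def massage_diff(repo: str, diff_text: str) -> str:
    """Rewrite the file header lines of a diff taken from the given split repo."""
    out = []
    for line in diff_text.splitlines(keepends=True):
        m = _HEADER.match(line)
        if m:
            # line[m.end():] restores the line ending that '.*' did not consume.
            line = m.group(1) + massage_path(repo, m.group(2)) + line[m.end():]
        out.append(line)
    return "".join(out)
```

The rewritten diff could then be applied to the consolidated repository with a standard patch tool; going in the other direction would need the inverse mapping.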