JEP draft: Linkable JDK runtimes

Owner: Severin Gehwolf
Type: Feature
Scope: JDK
Status: Draft
Component: tools / jlink
Discussion: core dash libs dash dev at openjdk dot org
Effort: M
Duration: M
Created: 2024/06/07 13:36
Updated: 2024/07/17 09:07
Issue: 8333799

Summary

Optionally produce a JDK build that allows jlink to create custom runtimes without the need for packaged modules.

Goals

Reduce the size of a JDK installation without losing the ability to create custom runtimes.

Non-Goals

Motivation

To understand the motivation for the Linkable JDK Runtimes JEP, we first need a brief discussion of Modular Run-Time Images and how to reason about their content, of the JDK's on-disk footprint, and of custom runtimes.

Modular Run-Time Images and their Size

Let's look at a current JDK installation's on-disk-size footprint. After extraction of the JDK tarball we would see something like this:

$ du -sh .
352M .
$ du -s ./* | sort -n
4        ./release
160      ./conf
224      ./include
316      ./legal
464      ./bin
800      ./man
87828    ./jmods
269784   ./lib

That is, the total on-disk size of the JDK is about 352MB, of which the lib folder contributes 269784KB (or 264MB). Specifically, we see a lib/modules file which takes up a large chunk of the total size of the lib folder:

$ du -s ./lib/* | sort -n | tail -n5
1684     ./lib/libfontmanager.so
9920     ./lib/ct.sym
52056    ./lib/src.zip
57088    ./lib/server
141292   ./lib/modules

To be precise, that file amounts to 141292 KB or 138 MB. How does one know what that file contains? This is what the java --list-modules command can help with. As the name implies, it lists the modules included in a modular JDK. More importantly, the set of included modules has a direct impact on the size of the lib/modules file and on the on-disk footprint of your runtime as a whole:

$ ./bin/java --list-modules | sort | head
java.base@23-ea
java.compiler@23-ea
java.datatransfer@23-ea
java.desktop@23-ea
java.instrument@23-ea
java.logging@23-ea
java.management@23-ea
java.management.rmi@23-ea
java.naming@23-ea
java.net.http@23-ea

The above command lists only the first 10 modules. Note, however, that by default the JDK includes 69 different modules. Do you need all of them for your application? For example, if your application doesn't use the Java compiler (javac) at runtime, you probably won't need the java.compiler module. The set of Java SE modules and their exported packages is governed by the Java SE Platform Specification. The full set of JDK modules is governed by the Platform and evolves through JEPs (e.g. JEP 320).
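The module list can also be obtained programmatically. The following is a minimal sketch using the public java.lang.module API; the class and method names are hypothetical, and the output is only roughly equivalent to java --list-modules when no module path is in use:

```java
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: lists the modules of the runtime image it runs on.
final class ListSystemModules {

    // Returns "name@version" (or just "name" if no version is recorded)
    // for every system module, sorted alphabetically.
    static List<String> systemModules() {
        return ModuleFinder.ofSystem().findAll().stream()
                .map(ModuleReference::descriptor)
                .map(ModuleDescriptor::toNameAndVersion)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> modules = systemModules();
        modules.forEach(System.out::println);
        System.out.println(modules.size() + " modules in this runtime");
    }
}
```

Run on a default JDK and on a custom runtime, the reported module count should differ in the same way as the wc -l output of java --list-modules.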

jlink: Reducing the Size of the Runtime

In the previous section we discussed the on-disk-size footprint of the JDK. In this section we'll discuss how one could reduce the size of the Java runtime by making it tailored for a specific application.

JDK 9 introduced a tool, the Java Linker, also known as jlink, which can be used to select only the JDK modules that a specific application needs. It allows for creating custom Java runtimes tailored to your application. Doing that has a couple of advantages:

  1. It reduces the size of the overall runtime that you need to bundle with your application.
  2. It reduces the attack surface of classes included in your application's runtime.
  3. In a cloud deployment, it results in a smaller total image size for the application container. In orchestration environments like Kubernetes, this can improve the bring-up time of new nodes when clusters scale horizontally, since smaller container images potentially take less time to pull.

In the following example, we illustrate creating a custom Java runtime using jlink for a specific application. The example application is a simple fruit-managing REST application, running from the class path, packaged as JARs. We first run the application on a default JDK installation. Later, we run the same application on a custom runtime illustrating possible on-disk-size savings between the two runtimes.

The simple demo application manages fruits and is available at http://127.0.0.1:8080. All fruits, stored so far, can be listed with the /fruits endpoint:

$ ./bin/java [...] -jar ../app-dir/app-run.jar &
[1] 27040
$ curl -s -w "\n" http://127.0.0.1:8080/fruits | jq '.[] | .name'
"Apple"
"Pineapple"

If we look at the on-disk-size footprint of the JDK that we used as the runtime for this application we see the same as we've seen in the initial section when we discussed the size of the JDK installation:

$ du -sh .
352M .
$ du -s ./* | sort -n
4        ./release
160      ./conf
224      ./include
316      ./legal
464      ./bin
800      ./man
87828    ./jmods
269784   ./lib

The total JDK installation is about 352MB in size while the lib folder takes up the bulk of it. We also notice that the jmods folder is second on the list, consuming 86MB (or ~25% of the total on-disk-size). More about that later.
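The percentages quoted here follow directly from the du output. As a small worked example (the class and method names are hypothetical, and the sizes are copied from the listing above in KB):

```java
// Hypothetical helper that reproduces the size arithmetic from the du listing.
final class DiskShare {

    // Sums directory sizes given in KB.
    static long totalKb(long... sizesKb) {
        long sum = 0;
        for (long size : sizesKb) {
            sum += size;
        }
        return sum;
    }

    // Share of one directory in the total, in percent.
    static double sharePercent(long partKb, long totalKb) {
        return 100.0 * partKb / totalKb;
    }

    public static void main(String[] args) {
        // release, conf, include, legal, bin, man, jmods, lib (KB, from du -s)
        long total = totalKb(4, 160, 224, 316, 464, 800, 87828, 269784);
        System.out.printf("total: %d KB (%.1f MB)%n", total, total / 1024.0);
        System.out.printf("jmods: %.1f%%%n", sharePercent(87828, total));
        System.out.printf("lib:   %.1f%%%n", sharePercent(269784, total));
    }
}
```

The sum of the listed directories is 359580 KB, which du -sh reports rounded up as 352M, and jmods accounts for roughly a quarter of it.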

The JDK currently comprises 69 modules, which account for a significant share of the size of the modules image and of the runtime as a whole:

$ ./bin/java --list-modules | wc -l
69

Given that our example application requires only a known set of JDK modules, we can reduce the size of the runtime, as discussed in the next section. More specifically, the size of a custom runtime can be reduced by the following means:

  1. Compression
  2. Only including JDK modules that the application actually needs

Since compression can be applied to a full runtime with all 69 modules as well as to a runtime with fewer modules, we'll focus on the latter approach in the next section: Creating custom runtimes. For the impatient, higher compression of the modules image can be achieved with the --compress option of jlink (e.g. use --compress zip-9 for the highest compression level). The default compression level when using jlink is zip-6.
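For completeness, jlink can also be driven from Java code through the java.util.spi.ToolProvider API. The sketch below passes the --compress zip-9 value mentioned above; the class name and the output directory ../custom-jdk-zip9 are hypothetical, and the link only succeeds if the running JDK contains the jdk.jlink module and can find modules to link from:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.spi.ToolProvider;

// Hypothetical example: invoke jlink with a higher compression level.
final class JlinkCompressExample {

    // Builds the jlink argument list; nothing is executed here.
    static List<String> jlinkArgs(String modules, String output, String compression) {
        List<String> args = new ArrayList<>();
        args.add("--add-modules");
        args.add(modules);
        args.add("--compress");
        args.add(compression);
        args.add("--output");
        args.add(output);
        return args;
    }

    public static void main(String[] args) {
        // jlink is only available if jdk.jlink is part of the running runtime.
        ToolProvider jlink = ToolProvider.findFirst("jlink")
                .orElseThrow(() -> new IllegalStateException("jlink is not available"));
        List<String> jlinkArgs = jlinkArgs(
                "java.base,java.logging,java.naming,jdk.unsupported",
                "../custom-jdk-zip9",   // hypothetical output directory
                "zip-9");
        int exitCode = jlink.run(System.out, System.err, jlinkArgs.toArray(new String[0]));
        System.exit(exitCode);
    }
}
```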

jlink: Creating Custom Runtimes

In the previous section we have seen that the default on-disk-size footprint of a JDK installation can be quite large. Also, many modules are included in a default JDK. For the demo application shown in the previous section we do not need all of them. In fact, it only requires modules java.base, java.logging, java.naming and jdk.unsupported for the application to work. The following command can be used to create such a custom runtime including only those modules:

$ ./bin/jlink --add-modules java.base,java.logging,java.naming,jdk.unsupported \
              --output ../custom-jdk --verbose
java.base file:///home/test/jdk/jmods/java.base.jmod
java.logging file:///home/test/jdk/jmods/java.logging.jmod
java.naming file:///home/test/jdk/jmods/java.naming.jmod
java.security.sasl file:///home/test/jdk/jmods/java.security.sasl.jmod
jdk.unsupported file:///home/test/jdk/jmods/jdk.unsupported.jmod

Providers:
  java.base provides java.nio.file.spi.FileSystemProvider used by java.base
  java.naming provides java.security.Provider used by java.base
  java.security.sasl provides java.security.Provider used by java.base
  java.logging provides jdk.internal.logger.DefaultLoggerFinder used by java.base

The resulting image contains only the transitive closure of those modules, which is why java.security.sasl is included even though it was not requested explicitly:

$ ../custom-jdk/bin/java --list-modules
java.base@23-ea
java.logging@23-ea
java.naming@23-ea
java.security.sasl@23-ea
jdk.unsupported@23-ea
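The transitive closure can be computed with the public java.lang.module API. The following is a minimal sketch, assuming it runs on a JDK that contains the requested modules; the class and method names are hypothetical. It resolves only "requires" dependencies and does not bind services:

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleFinder;
import java.lang.module.ResolvedModule;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper: resolves root modules against the system modules.
final class ModuleClosure {

    // Returns the names of all modules needed to satisfy the given roots.
    static Set<String> closure(Set<String> roots) {
        Configuration cf = Configuration.empty()
                .resolve(ModuleFinder.ofSystem(), ModuleFinder.of(), roots);
        return cf.modules().stream()
                .map(ResolvedModule::name)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        Set<String> roots = Set.of(
                "java.base", "java.logging", "java.naming", "jdk.unsupported");
        closure(roots).forEach(System.out::println);
    }
}
```

The result includes modules that are pulled in indirectly, such as java.security.sasl in the jlink output above.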

Looking at the total size of the custom runtime, we see that its size is significantly reduced:

$ du -sh ../custom-jdk
59M     ../custom-jdk
$ du -s ../custom-jdk/* | sort -n
4       ../custom-jdk/release
28      ../custom-jdk/bin
108     ../custom-jdk/conf
120     ../custom-jdk/legal
196     ../custom-jdk/include
300     ../custom-jdk/man
59536   ../custom-jdk/lib
$ du -sh ../custom-jdk/lib/modules
31M     ../custom-jdk/lib/modules

We've just reduced the size of the runtime from about 350MB to about 60MB! Also, the lib/modules file shrank from 138MB to 31MB. Note that we didn't change the compression of the modules image. So what did change?

First, we need to make this a fair comparison. The default JDK build includes the sources of the JDK libraries (src.zip, ~50MB), which many IDEs use. It is also produced with the --keep-packaged-modules jlink option, which essentially creates the jmods folder and its content, and it includes Class Data Sharing (CDS) archives for the JVM. If we exclude those files from our size measurements, we end up with a size of about 185MB for the default JDK. That is, we actually reduced the size of our runtime from about 185MB to about 60MB. That's still a huge win!

Most of the size reduction comes from reducing the set of modules included in the custom runtime from 69 to only 5. This allowed the modules image to shrink from 138MB to only 31MB.

Now that we've created a custom runtime for our application, we can verify that the application still runs with it:

$ ../custom-jdk/bin/java [...] -jar ../app-dir/app-run.jar &
[1] 33492
$ curl -s -w "\n" http://127.0.0.1:8080/fruits | jq '.[] | .name'
"Apple"
"Pineapple"

Success!

jlink: Challenges and Default Runtimes

Unfortunately, using jlink also has a drawback. It requires a JDK installation to include "packaged modules", as we show later in this section. By default, packaged modules are included in the jmods folder that we saw in the size analysis in one of the previous sections. They can amount to an extra ~80MB of disk space (or more). Could we do something about that? If so, would jlink still be usable? Would the JDK work without them as a runtime?

As for the latter question, we've already answered it by creating and using a custom runtime for the example application in the previous section. The attentive reader will have noticed that no jmods folder was present in the resulting custom runtime. The application ran fine without it. In fact, all that is needed at run time is the modular run-time image found in the JDK installation directory, minus the jmods folder.

What about the former question? Can jlink be used without the jmods folder? In order to answer this question, we first make sure that the jlink tool is included in the default JDK:

$ ./bin/java --list-modules | grep jlink
jdk.jlink@23-ea

Yes, it is. If it weren't, we wouldn't have the jlink binary to begin with, as that is also part of the jdk.jlink module. For our simple experiment, we remove the jmods folder and then attempt to create a custom runtime again, as we did in the previous section:

$ rm -rf jmods/
$ ./bin/jlink --add-modules java.base,java.logging,java.naming,jdk.unsupported --output ../custom-jdk --verbose
Error: --module-path is not specified and this runtime image does not contain jmods directory.
Usage: jlink <options> --module-path <modulepath> --add-modules <module>[,<module>...]
Use --help for a list of possible options

Unfortunately, this doesn't work. This is the answer to our question: jlink indeed needs the jmods folder, or, in other words, "packaged modules". Packaged modules are typically contained in the jmods folder of the JDK installation. For example, the packaged module for the java.base module is contained in the $JAVA_HOME/jmods/java.base.jmod file.
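Whether a given JDK installation still has its packaged modules can be checked before attempting a link. This is a minimal sketch that only mirrors the condition described by the error message above; it is not jlink's own code, and the class and method names are hypothetical:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: does a JDK home directory contain a jmods folder?
final class PackagedModulesCheck {

    static boolean hasPackagedModules(Path javaHome) {
        return Files.isDirectory(javaHome.resolve("jmods"));
    }

    public static void main(String[] args) {
        // java.home points at the runtime that executes this program.
        Path javaHome = Paths.get(System.getProperty("java.home"));
        System.out.println(javaHome + " has packaged modules: "
                + hasPackagedModules(javaHome));
    }
}
```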

Packaged modules, however, duplicate content already present in the JDK runtime image and are only needed for linking custom runtimes. For example, one can list the contents of the packaged module for java.base (file java.base.jmod) using the jmod tool and observe some of the duplication:

$ ./bin/jmod list ./jmods/java.base.jmod | grep libjvm.so
lib/server/libjvm.so
$ ls ./lib/server/libjvm.so 
./lib/server/libjvm.so

To summarize, packaged modules don't influence Java runtime behaviour. In fact, jlink, by default, doesn't include them when it creates custom runtimes. So, in a way, the default JDK installation without packaged modules can be viewed as the default Java runtime that includes all 69 JDK modules. That is, it would satisfy the runtime needs of any Java application as far as Java library classes are concerned. On the other hand, it is a rather large runtime: the largest possible one as far as the JDK is concerned.

We have also seen that contents of the packaged modules are already present in the JDK installation elsewhere.

What if there was a way to reduce the installed size of the default JDK by ~80MB while still allowing users to create custom runtimes? It would be a win in today's cloud-enabled world. For example, creating custom runtimes on demand inside a Kubernetes cluster would benefit, since the download (or container pull) of the base JDK supporting jlink would be smaller than it is today. As a result, cluster auto-scaling could potentially become leaner for Java application containers on Kubernetes.

This is where Linkable JDK Runtimes fit into the picture. They allow jlink to continue to work without requiring packaged modules to be present, thus making the overall installation size of the default JDK smaller.

Description

Creating linkable JDK runtimes needs to be enabled at JDK build time with the configure option --enable-linkable-runtime. This option is disabled by default. This is a precautionary measure: it lets OpenJDK distributors enable the build-time option and gather feedback without making it the default for all OpenJDK builds right away. Having said that, it may become the default build option in a future JDK release. A JDK produced with the --enable-linkable-runtime configure option does not contain packaged modules, which makes the on-disk size of a linkable JDK runtime installation smaller. It is worth noting that a linkable JDK runtime still includes all JDK modules in the produced JDK image, as before.

Using Linkable JDK runtimes

Even though a linkable JDK runtime does not include packaged modules, it still supports most jlink use-cases (see section "Restrictions" for limitations). To support this, jlink reconstitutes the packaged-modules view of a JDK module's resource bytes by the following means:

  1. It uses the jimage file, lib/modules, as the primary source of Java classes and resources, and reads the required bytes for those resources from it at link time. Those bytes mostly equate to the classes section in jmod archives (see point 3 for differences).
  2. Files not in the jimage file, such as native binaries, native libraries, configuration files or legal files, are taken directly from the file system instead of from packaged modules. A JDK installation produced by a default OpenJDK build already includes these files.
  3. Some classes or resources might be generated at link time and are therefore not present in packaged modules. To account for this, a difference file in a simple binary format is included in the jdk.jlink module on a per-module basis. The difference file tracks not only generated (added) class files, but also removed or modified ones.
  4. To determine which file on the file system belongs to which JDK module, jlink consults a per-module meta-data file that is included in the jdk.jlink module. This meta-data file lists the files that are not themselves included in the jimage (lib/modules).
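The first point can be illustrated with public APIs. The following is a minimal sketch, not jlink's actual implementation: it reads the bytes of a class from the run-time image (lib/modules) through the jrt:/ file system. The class and method names are hypothetical:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads resource bytes from the run-time image.
final class JimageResourceReader {

    // Reads a resource such as "java/lang/Object.class" of the given module.
    static byte[] readResource(String module, String resource) {
        FileSystem jrt = FileSystems.getFileSystem(URI.create("jrt:/"));
        Path path = jrt.getPath("/modules", module, resource);
        try {
            return Files.readAllBytes(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = readResource("java.base", "java/lang/Object.class");
        System.out.println("java/lang/Object.class: " + bytes.length + " bytes");
    }
}
```

No jmods folder is needed for this to work, since the bytes come from the run-time image itself.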

To illustrate the changed jlink run-time behaviour, two figures are included in this JEP. Figure 1 illustrates a jlink workflow at run time using packaged modules. Figure 2, on the other hand, illustrates the jlink workflow at run time for linkable JDK runtimes. The primary difference from a regular jlink run is the input from which resource bytes are taken. A regular jlink run uses packaged modules, and only packaged modules, as input. A jlink run on a linkable JDK runtime uses the installed JDK image instead.

Since linkable JDK runtimes use the installed JDK image as input, a few limitations apply when creating derived custom runtimes (see section "Restrictions" below). Experience with jlink in practice has shown that most use-cases revolve around creating smaller runtimes, that is, removing JDK built-in modules from the default runtime that a specific application doesn't need. A default OpenJDK build currently includes a total of 69 modules. Linkable JDK runtimes recognize that creating custom runtimes is an important use-case that needs to keep working.

Linkable JDK runtimes take configuration files and other binaries from the JDK installation. Therefore, the meta-data files associating files with JDK modules include a hash sum per file, so that changes to the actual on-disk files can be detected. If any of those files has been modified, running jlink on such an installation fails and reports an appropriate error message:

$ ./bin/jlink --add-modules java.base --output ../mini-runtime
Linking based on the current run-time image.
Error: [..]/conf/security/java.security has been modified.
$ echo $?
1
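The idea behind this check can be sketched as follows. This is an illustration only: the JEP does not state which hash algorithm or meta-data format jlink uses, so SHA-512 and all class and method names here are assumptions:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: detects modified files via a recorded hash sum.
final class FileHashCheck {

    // Hex-encoded SHA-512 of the file content (algorithm is an assumption).
    static String sha512(Path file) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            byte[] digest = md.digest(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the file no longer matches the hash recorded earlier.
    static boolean isModified(Path file, String recordedHash) {
        return !sha512(file).equals(recordedHash);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("java", ".security");
        Files.write(file, "original content".getBytes(StandardCharsets.UTF_8));
        String recorded = sha512(file);          // recorded at build time
        System.out.println("modified: " + isModified(file, recorded));
        Files.write(file, "edited by the user".getBytes(StandardCharsets.UTF_8));
        System.out.println("modified: " + isModified(file, recorded));
        Files.delete(file);
    }
}
```

A link based on the run-time image would abort as soon as such a check reports a modified file.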

Restrictions

Linkable JDK runtimes will come with the following limitations on using jlink:

Alternatives

As far as we are aware, there is no alternative that would satisfy the goal of reducing the JDK installation size by removing packaged modules while still allowing jlink to produce customized runtimes.

Testing

This JEP does not intend to remove the ability to use jlink with packaged modules as input. Therefore, linking from a linkable JDK runtime will be tested and compared against jlink running on packaged modules. In particular, creating a custom runtime including the java.se modules without packaged modules must be byte-for-byte comparable to a runtime created with packaged modules as input.
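Such a comparison could be sketched as follows. This is not the JEP's actual test code; it is a minimal illustration, with hypothetical class and method names, that walks two runtime directories and reports every file that is missing on one side or differs in content:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: byte-for-byte comparison of two runtime images.
final class RuntimeImageDiff {

    // Returns the relative paths of files that differ or exist only on one side.
    static List<String> differences(Path left, Path right) {
        try {
            Set<String> names = new TreeSet<>(regularFiles(left));
            names.addAll(regularFiles(right));
            List<String> diffs = new ArrayList<>();
            for (String name : names) {
                Path a = left.resolve(name);
                Path b = right.resolve(name);
                if (!Files.isRegularFile(a) || !Files.isRegularFile(b)
                        || !Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b))) {
                    diffs.add(name);
                }
            }
            return diffs;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Collects all regular files below root as paths relative to root.
    private static Set<String> regularFiles(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                    .map(p -> root.relativize(p).toString())
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: RuntimeImageDiff <runtime-a> <runtime-b>");
            System.exit(2);
        }
        List<String> diffs = differences(Paths.get(args[0]), Paths.get(args[1]));
        diffs.forEach(d -> System.out.println("differs: " + d));
        System.exit(diffs.isEmpty() ? 0 : 1);
    }
}
```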

Risks and Assumptions

Creating a linkable JDK runtime needs to be enabled with a configure option at JDK build time: --enable-linkable-runtime. Therefore, the risk only applies to JDK builds enabling that option. For default OpenJDK builds, there should be little to no risk.

One basic assumption of this JEP is that users who create custom Java runtimes using jlink tend to not include the jdk.jlink module itself in the custom runtime.

Using the run-time JDK image as base for linking custom runtimes has minimal risk:

  1. Prior to this change, jlink aborted with an error when the jmods folder didn't exist. After this change, jlink continues the link using the linkable JDK runtime image as input. jlink indicates this to the user by printing an appropriate message.
  2. Files on the file system could be changed before creating a custom runtime image from them. Mitigation of the risk of modified files in the file system is provided by including hash sums of those files and failing the runtime-based link when files have been externally modified.
  3. With linkable JDK runtimes, using jlink is possible without packaged modules as long as the jdk.jlink module is present in the base JDK. Note, however, that including jdk.jlink in a custom runtime is not possible with linkable JDK runtimes at this time; jlink aborts with an error in this case.