JEP draft: Launch Multi-File Source-Code Programs

Owner: Ron Pressler
Type: Feature
Scope: JDK
Status: Submitted
Component: tools / launcher
Effort: S
Relates to: JEP 330: Launch Single-File Source-Code Programs
Reviewed by: Alex Buckley, Brian Goetz
Created: 2023/03/17 10:17
Updated: 2023/09/05 23:11
Issue: 8304400

Summary

Enhance the java launcher to run a program supplied as one or more files of Java source code. This allows programs to defer the overhead of configuring a build tool and packaging until this is desired.

Goals

Non-Goals

Motivation

Java excels at writing large, complex applications developed and maintained over many years by large teams. Still, even large programs start small. In the early stages, developers tinker and explore and don't care about deliverable artifacts; the project's structure may not yet exist, and when it emerges, it changes frequently. Fast iteration and radical change are the order of the day. Several features to assist with tinkering and exploration have been added to the JDK in recent years, such as JShell (an interactive shell for playing with snippets of code) and a simple web server (for quick prototyping of web apps).

In JDK 11 the java launcher was enhanced to be able to run a .java file directly, without an explicit compilation step. For example, suppose the file Prog.java declares two classes:

class Prog {
    public static void main(String[] args) { Helper.run(); }
}

class Helper {
    static void run() { System.out.println("Hello!"); }
}

Running java Prog.java will compile both classes in memory, then execute the main method of the first class declared in Prog.java.

This low-ceremony approach to running a program has a major limitation: all the source code of the program must be placed in a single .java file. To work with more than one .java file, developers must return to compiling source files explicitly. For experienced developers, this often entails creating a project configuration for a build tool. Having to shift from amorphous tinkering to a project structure capable of producing runnable artifacts at such an early stage is a bump on the road from an early idea to a finished product, encountered just when ideas and experiments should flow smoothly. For people learning Java, the transition from a single .java file to two or more files is an even bigger step: at that point they must pause their learning of the language and learn to operate javac, pick and learn a third-party build tool, or rely on the magic of their IDE.

Ideally, a developer could defer the "project setup" stage until learning more about the shape of the project, or possibly avoid it altogether when quickly hacking and then throwing away a prototype. Some simple programs may even remain in their source form forever. This motivates enhancing the java launcher to be able to run a program that has grown beyond a single .java file, without forcing an explicit compilation step. The traditional "edit-build-run" cycle becomes simply "edit-run". Developers can then decide when it's time to set up a build process rather than be forced to do it by a limitation of the tooling.

Description

The java launcher's source-file mode is enhanced to be able to run a program supplied as one or more files of Java source code.

For example, suppose a directory contains two files, Prog.java and Helper.java, where each file declares a class:

// Prog.java
class Prog {
    public static void main(String[] args) { Helper.run(); }
}

// Helper.java
class Helper {
    static void run() { System.out.println("Hello!"); }
}

Running java Prog.java will compile the Prog class in memory and invoke its main method. Because code in this class refers to the class Helper, Helper.java will be found in the filesystem and its classes compiled in memory. If code in class Helper refers to some other class, e.g., HelperAux, then HelperAux.java will be found and compiled too.

When classes in different .java files refer to each other, the java launcher does not guarantee any particular order or timing for the compilation of the .java files. It is possible, for example, that Helper.java is compiled before Prog.java. Some code may be compiled before the program starts executing while other code may be compiled lazily, on the fly. See the section Launch-time semantics and operation for details about the process of compiling and executing source-file programs.

Only .java files whose classes are referenced by the program will be compiled. This allows developers to play with new versions of code without worrying that old versions will be compiled accidentally. For example, suppose the directory also contains OldProg.java, whose older version of the Prog class expects the Helper class to have a method go rather than run. The presence of OldProg.java, with its latent error, is immaterial when running java Prog.java.

Multiple classes can be declared in one .java file, and will all be compiled together. Classes co-declared in a .java file are preferred to classes declared in other .java files. For example, suppose the file Prog.java above is changed to declare a class Helper, despite a class of that name already being declared in Helper.java. When code in Prog.java refers to Helper, the class that is co-declared in Prog.java will be used, and the launcher will not search for the file Helper.java.

Duplicate classes in the source code program are prohibited. That is, two declarations of a class with the same name in either the same .java file, or across different .java files that form part of the program, are not permitted. For example, suppose that after some edits, Prog.java and Helper.java end up as shown below, with the class Aux accidentally declared in both:

// Prog.java
class Prog {
    public static void main(String[] args) { Helper.run(); Aux.cleanup(); }
}
class Aux {
    static void cleanup() { ... }
}

// Helper.java
class Helper {
    static void run() { ... }
}
class Aux {
    static void cleanup() { ... }
}

Running java Prog.java will compile the Prog and Aux classes in Prog.java, invoke the main method of Prog, and then (due to main's reference to Helper) find Helper.java and compile its classes Helper and Aux. The duplicate declaration of Aux in Helper.java is not permitted, so the program stops and the launcher gives an error.

Source-file mode is triggered by passing the name of one .java file to the java launcher. If additional filenames are passed, they become arguments to the main method of the first class, e.g., java Prog.java Helper.java will result in the string "Helper.java" being an argument to the main method of the Prog class.
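The argument handling described above can be observed with a small program. This is only an illustrative sketch: the file name ArgsProg.java and the helper method describe are hypothetical and not part of this proposal.

```java
// ArgsProg.java (hypothetical file name, used only for illustration)
class ArgsProg {
    // Formats the arguments received by main so they can be inspected.
    static String describe(String[] args) {
        return args.length + " argument(s): " + String.join(", ", args);
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Running java ArgsProg.java Helper.java should print "1 argument(s): Helper.java", because Helper.java is passed to main as a plain string rather than treated as a second source file to launch.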

The name of the launched .java file need not match the name of the public class declared inside it, but giving it a different name is mostly useful for single-file programs, in particular "shebang" files (see below).

Using pre-compiled classes

Small programs that are written to run in source-file mode will often wish to use libraries provided on the class path. For example, suppose a directory contains a number of small programs plus a helper class, all in the unnamed package, alongside some JAR files:

Prog1.java
Prog2.java
Helper.java
library1.jar
library2.jar

A developer can quickly run these programs by passing -cp '*' to the java launcher. This option puts all the JAR files in the directory on the class path; the asterisk is quoted to avoid expansion by the shell:

java -cp '*' Prog1.java
java -cp '*' Prog2.java

As a developer continues to experiment, it may be appropriate to put the JAR files in a separate directory, then use, e.g., -cp 'libs/*' to make them available. If libraries are available as modular JARs, then jlink can be used to create a Java image which contains exactly the JDK modules and library modules needed by the source-file program; in this scenario, no options are needed for the java launcher.

How the launcher finds source files

The java launcher expects that the source files of a multi-file program are located in a standard directory hierarchy, where the directory structure follows the package structure. This means that (1) source files in the same directory are expected to declare classes in the same package, and (2) a source file in directory foo/bar declares a class in package foo.bar.

For example, suppose a directory contains Prog.java, which declares classes in the unnamed package, and a subdirectory pkg, where Helper.java declares the class Helper in the package pkg:

// Prog.java
class Prog {
    public static void main(String[] args) { pkg.Helper.run(); }
}

// pkg/Helper.java
package pkg;
class Helper {
    static void run() { System.out.println("Hello!"); }
}

Running java Prog.java will cause Helper.java to be found in the pkg subdirectory and compiled in memory, resulting in the class pkg.Helper needed by code in class Prog.

If Prog.java declared classes in a named package, or Helper.java declared classes in a package other than pkg, then java Prog.java would fail.

The java launcher computes the root of the source tree from the package and the filesystem location of the initial .java file. For java Prog.java, the initial file is Prog.java and it declares a class in the unnamed package, so the root of the source tree is the directory containing Prog.java. On the other hand, if Prog.java declared a class in a named package a.b.c, then Prog.java must be placed in the corresponding directory hierarchy:

a/
  b/
    c/
      Prog.java

and must be launched by running java a/b/c/Prog.java. The root of the source tree is the directory containing the subdirectory a.

If Prog.java declared classes in a different named package, then java a/b/c/Prog.java would fail. This is a change in behavior of the java launcher's source-file mode. Prior to JDK NN, source-file mode was permissive about which package, if any, was declared in a .java file at a given location; java a/b/c/Prog.java would succeed as long as Prog.java was found in a/b/c/, regardless of its package declaration. Since it is unusual for a .java file to declare classes in a named package without residing in the corresponding directory hierarchy, it is unlikely that the package is important; the simple fix is to remove the package declaration from the file.
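The root computation described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the launcher's actual code: the class SourceRootSketch and the method computeRoot are hypothetical names, and the package name is assumed to have already been read from the initial file.

```java
import java.nio.file.Path;

class SourceRootSketch {
    /**
     * Walks up from the directory of the initial file, one level per package
     * component, checking that each directory name matches the package.
     * An empty packageName stands for the unnamed package.
     */
    static Path computeRoot(Path sourceFile, String packageName) {
        Path dir = sourceFile.toAbsolutePath().normalize().getParent();
        if (packageName.isEmpty()) {
            return dir; // unnamed package: the root is the file's own directory
        }
        String[] parts = packageName.split("\\.");
        Path root = dir;
        for (int i = parts.length - 1; i >= 0; i--) {
            Path name = (root == null) ? null : root.getFileName();
            if (name == null || !name.toString().equals(parts[i])) {
                throw new IllegalArgumentException(
                    "package " + packageName + " does not match location " + dir);
            }
            root = root.getParent();
        }
        if (root == null) {
            throw new IllegalArgumentException("no directory above package " + packageName);
        }
        return root;
    }

    public static void main(String[] args) {
        // For a/b/c/Prog.java declaring package a.b.c, the root is the directory containing a.
        System.out.println(computeRoot(Path.of("a", "b", "c", "Prog.java"), "a.b.c"));
    }
}
```

A mismatch between the declared package and the directory hierarchy is reported here as an IllegalArgumentException; that choice is an assumption of the sketch, standing in for the launch failure described above.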

Launch-time semantics and operation

Since JDK 11, source-file mode has worked as if:

java  <other options> --class-path <path> <.java file>

is informally equivalent to:

javac  <other options> -d <memory> --class-path <path> <.java file>
java  <other options> --class-path <memory>:<path> <first class in .java file>

With the ability to launch multi-file source code programs, source-file mode works as if:

java  <other options> --class-path <path> <.java file>

is informally equivalent to:

javac  <other options> -d <memory> --class-path <path> --source-path <root> <.java file>
java  <other options> --class-path <memory>:<path> <first class in .java file>

where <root> is the computed root of the source tree, as explained earlier.

(The use of --source-path indicates that classes co-located in a .java file are preferred to classes located in other .java files. For example, invoking javac --source-path dir dir/Prog.java will not compile Helper.java if Prog.java declares the class Helper.)

When the java launcher is run in source-file mode (e.g., java Prog.java), the following steps occur:

  1. The launcher computes the directory which is the root of the source tree.

  2. The launcher determines the module of the source code program. If a module-info.java file exists in the root, its module declaration is used to define a named module that will contain all the classes compiled from .java files in the source tree. If module-info.java does not exist, all the classes compiled from .java files will reside in the unnamed module.

  3. The launcher compiles all the classes in the initial .java file, and possibly other .java files which declare classes referenced by code in the initial file, and stores the resulting class files in an in-memory cache (rather than writing the class files to disk).

  4. The launcher uses a custom class loader to load the first class declared in the initial file from the in-memory cache, then invokes the main entry point of the class.
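Step 2 above amounts to a file-existence check in the computed root. The following sketch illustrates that check only; the class ModuleCheckSketch and the method declaresNamedModule are hypothetical names, and the sketch does not parse the module declaration.

```java
import java.nio.file.Files;
import java.nio.file.Path;

class ModuleCheckSketch {
    /** Returns true if the source tree rooted at root contains a module-info.java file. */
    static boolean declaresNamedModule(Path root) {
        return Files.isRegularFile(root.resolve("module-info.java"));
    }

    public static void main(String[] args) {
        // The root is taken from the first argument, defaulting to the current directory.
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        System.out.println(declaresNamedModule(root)
            ? "classes would reside in a named module"
            : "classes would reside in the unnamed module");
    }
}
```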

When the custom class loader is asked to load a class called C -- either the first class in the initial file, or any other class that needs to be loaded while running the program -- the loader performs a search that mimics the order of javac -Xprefer:source at compile time. Notably, if a class exists both in the source tree (declared in a .java file) and on the class path (in a .class file), the class in the source tree is preferred. The loader's search algorithm for a class called C is:

  1. If a class file for C is found in the in-memory cache, the loader defines the cached class file to the JVM, and loading of C is complete.

  2. Otherwise, the loader delegates to the application class loader to search for a class file for C that is exported by a named module which is (i) read by the module of the source code program and (ii) present on the module path or in the Java runtime image. (The unnamed module, in which the source code program may reside, reads a default set of modules in the Java runtime image.) If found, loading of C is completed by the application class loader.

  3. Otherwise, the loader searches for a .java file matching the name of the class (or the enclosing class if the requested class is a member class), i.e. C.java, located in the directory corresponding to the package of the class. If found, all the classes declared in the .java file are compiled. If compilation succeeds, the resulting class files are stored in the in-memory cache, the loader defines the class C to the JVM using the cached class file, and loading of C is complete. If compilation fails, the launcher reports the error and terminates with a non-zero exit status.

    When compiling C.java, the launcher may choose to eagerly compile other .java files that declare classes referenced by C.java, and store the resulting class files in the in-memory cache. This choice is based on heuristics that may change between JDK releases.

  4. Otherwise, if the source code program resides in an unnamed module, the loader delegates to the application class loader to search for a class file for C on the class path. If found, loading of C is completed by the application class loader.

  5. Otherwise, a class called C cannot be found, and a ClassNotFoundException is thrown.

Classes loaded from the class path or module path cannot reference classes that are compiled in memory from .java files. That is, when class references in pre-compiled classes are encountered, the source tree is never consulted.
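The search order above can be modeled as a small decision function. The sketch below is a simplified model, not the launcher's class loader: it replaces the in-memory cache, the modules, the source tree, and the class path with plain sets of names, and the names SearchOrderSketch, Source, resolve, and sourceFileFor are all hypothetical. It also assumes that a '$' in a binary name always marks a member class, which is a simplification.

```java
import java.util.Optional;
import java.util.Set;

class SearchOrderSketch {
    enum Source { MEMORY_CACHE, MODULE, SOURCE_TREE, CLASS_PATH }

    /** Maps a binary class name to the relative path of the .java file expected to declare it. */
    static String sourceFileFor(String binaryName) {
        int dollar = binaryName.indexOf('$');
        // For a member class such as pkg.Helper$Inner, search for the enclosing class's file.
        String topLevel = dollar < 0 ? binaryName : binaryName.substring(0, dollar);
        return topLevel.replace('.', '/') + ".java";
    }

    /** Applies search steps 1 to 5 in order; an empty result models ClassNotFoundException. */
    static Optional<Source> resolve(String name,
                                    Set<String> cache,
                                    Set<String> moduleClasses,
                                    Set<String> sourceFiles,
                                    Set<String> classPathClasses,
                                    boolean unnamedModule) {
        if (cache.contains(name)) return Optional.of(Source.MEMORY_CACHE);            // step 1
        if (moduleClasses.contains(name)) return Optional.of(Source.MODULE);           // step 2
        if (sourceFiles.contains(sourceFileFor(name))) return Optional.of(Source.SOURCE_TREE); // step 3
        if (unnamedModule && classPathClasses.contains(name)) return Optional.of(Source.CLASS_PATH); // step 4
        return Optional.empty();                                                       // step 5
    }

    public static void main(String[] args) {
        Set<String> cache = Set.of("Prog");
        Set<String> modules = Set.of("java.util.List");
        Set<String> sources = Set.of("Prog.java", "pkg/Helper.java");
        Set<String> classPath = Set.of("pkg.Helper", "lib.Util");
        for (String name : new String[] {"Prog", "java.util.List", "pkg.Helper$Inner", "lib.Util", "Missing"}) {
            String result = resolve(name, cache, modules, sources, classPath, true)
                .map(Source::name)
                .orElse("ClassNotFoundException");
            System.out.println(name + " -> " + result);
        }
    }
}
```

In this model, pkg.Helper exists both in the source tree and on the class path, and the source tree wins because step 3 is tried before step 4.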

Differences between compilation at compile-time and launch-time

There are some major differences between how the Java compiler compiles code on the source path when using javac and how it compiles code when using the java launcher in source-file mode:

  1. In source-file mode, the classes that are referenced and found in .java files may be compiled during program execution, rather than all being compiled before execution starts. This means that a compilation error may occur, causing the launcher to terminate, after the program has already started executing. This developer experience is very different from prototyping with explicit compilation via javac, but it works effectively in the fast-moving "edit-run" cycle enabled by source-file mode.

  2. In source-file mode, classes that are accessed via reflection are loaded in the same manner as classes that are accessed directly. For example, if the program calls Class.forName("pkg.Helper"), then the launcher's custom class loader will attempt to load the class Helper in the package pkg, potentially causing compilation of pkg/Helper.java. Similarly, if a package's annotations are queried via Package.getAnnotations, then an appropriately-placed package-info.java file in the source tree will be compiled in memory and loaded.

  3. In source-file mode, annotation processing is disabled, similar to when --proc:none is passed to javac.

  4. In source-file mode, it is not possible to run a source code program whose .java files span multiple modules.

The limitations imposed by #3 and #4 may be removed in the future.
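The reflective case in item 2 can be illustrated with a program that looks up a class by name. This is a sketch only: the class ReflectiveProg and the method describe are hypothetical, and the presence of pkg/Helper.java in the source tree is an assumption about the surrounding directory.

```java
// ReflectiveProg.java (hypothetical file name, used only for illustration)
class ReflectiveProg {
    /** Attempts to load the named class reflectively and reports the outcome. */
    static String describe(String className) {
        try {
            // Class.forName uses the class loader of the calling class.
            Class<?> c = Class.forName(className);
            return "loaded " + c.getName();
        } catch (ClassNotFoundException e) {
            return "not found: " + className;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("pkg.Helper"));
    }
}
```

When run as java ReflectiveProg.java next to a pkg/Helper.java that declares pkg.Helper, the lookup would be expected to go through the launcher's custom class loader and trigger in-memory compilation of that file, as described in item 2. Run as a pre-compiled class with no pkg.Helper on the class path, the same call reports that the class was not found.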

"Shebang" files

So-called "shebang" files – files whose first line starts with #! – can be used to write a script in Java (see "shebang" files in JEP 330). Because such scripts are expected to be self-contained, the launcher's behavior is unchanged from that described in JEP 330: only the launched file will be compiled and no others, as if the source path passed to the compiler is empty.

Alternatives

We could keep source-code programs restricted to a single file and require a separate compilation step for multi-file programs. While this would not impose significantly more work on the programmer, the reality is that most Java programmers have grown unfamiliar with the direct use of javac, and prefer relying on a build tool when class file generation is required. Use of the java command is less intimidating than javac.

Even if we made javac friendlier to use, with convenient defaults for compiling complete source trees, the need to set up a directory for the generated class files (or otherwise have them pollute the source tree) is a speed bump we'd like to remove. Many programmers place their .java files under version control even at the tinkering stage, and would need to set up their version control repository to exclude the class files generated by javac.