JEP draft: Code reflection (Incubator)

Owner: Paul Sandoz
Type: Feature
Scope: JDK
Status: Draft
Component: core-libs
Effort: L
Duration: L
Reviewed by: Maurizio Cimadamore
Created: 2025/06/30 19:54
Updated: 2026/01/29 09:11
Issue: 8361105

Summary

Enhance core reflection with a standard API to model Java code, build and transform models of Java code, and access models of Java code in methods and lambda expressions. Libraries can use this API to analyze Java code and extend its reach, such as executing it as code on GPUs. This is an incubating API.

Goals

  1. Enable Java developers to interface with non-Java (foreign) programming models using familiar Java language constructs, such as lambda expressions and static typing.
  2. Encourage libraries to expose novel programming models to Java developers without requiring developers to embed non-Java code inside Java code, or to write tedious Java code that builds data structures to model Java code or other (foreign) code.
  3. Enable access at run time to a high fidelity model of Java code, specifically code in a method or lambda expression.
  4. Provide APIs for building models of Java code and transforming them to Java code or other (foreign) code.

Non-Goals

  1. It is not a goal to change the meaning of Java programs as specified by the Java Language Specification, to compile Java source code to anything other than the instruction set specified by the Java Virtual Machine Specification, to change the JVM’s instruction set, or to change HotSpot to support the instruction sets of specialized processing units. For example, it is not a goal to make such changes to the Java platform to execute Java methods on GPUs.
  2. It is not a goal to standardize the internal Abstract Syntax Tree of javac to serve as the model for Java code.
  3. It is not a goal to enable access at run time to bytecode and for it to serve as the model for Java code.
  4. It is not a goal to devise a general metaprogramming or macro facility for the Java language.
  5. It is not a goal to introduce language constructs, like class literals, to concisely express access to a model of code.

Enabling the incubating API

Code reflection is an incubating API, disabled by default. The code reflection API is offered in the incubator module jdk.incubator.code. To try out code reflection you must request that the incubator module jdk.incubator.code be resolved with the --add-modules command-line option.
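
For example, to compile and run a program whose main class is, hypothetically, Main:

javac --add-modules jdk.incubator.code Main.java
java --add-modules jdk.incubator.code Main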

For the benefit of readers wishing to follow along in more detail, it is possible to start jshell with

jshell -R-ea --enable-preview --add-modules jdk.incubator.code

and copy code snippets associated with the Example class, in order, into the jshell session.

Motivation

Core reflection is a powerful feature that enables inspection of Java code at run time. For example, consider the following Java code that we want to inspect: a class containing a field and a method, and a nested class also containing a field and a method.

static class Example {
    static Runnable R = () -> IO.println("Example:field:R");
    static int add(int a, int b) {
        IO.println("Example:method:add");
        return a + b;
    }

    static class Nested {
        static Runnable R = () -> IO.println("Example.Nested:field:R");
        void m() { IO.println("Example.Nested:method:m"); }
    }
}

We can write a simple stream that uses core reflection and traverses program structure, a tree of annotated elements, starting from a given class and reporting elements in a topological order.

static Stream<AnnotatedElement> elements(Class<?> c) {
    return Stream.of(c).mapMulti((e, mapper) -> traverse(e, mapper));
}
private static void traverse(AnnotatedElement e,
                             Consumer<? super AnnotatedElement> mapper) {
    mapper.accept(e);
    if (e instanceof Class<?> c) {
        for (Field df : c.getDeclaredFields()) { traverse(df, mapper); }
        for (Method dm : c.getDeclaredMethods()) { traverse(dm, mapper); }
        for (Class<?> dc : c.getDeclaredClasses()) { traverse(dc, mapper); }
    }
}

(AnnotatedElement is the common super type of Class, Field, and Method.) The traverse method recursively traverses a class’s declared fields, methods and classes. Starting from Example, using a class literal expression, we can print out the classes, fields, and methods we encounter.

elements(Example.class)
    .forEach(IO::println);

More interestingly we can perform some simple analysis, such as counting the number of static fields whose type is Runnable.

static boolean isStaticRunnableField(Field f) {
    return f.accessFlags().contains(AccessFlag.STATIC)
        && Runnable.class.isAssignableFrom(f.getType());
}
assert 2 == elements(Example.class)
    .filter(e -> e instanceof Field f && isStaticRunnableField(f))
    .count();

However, if we want to perform some analysis of the code in the lambda expressions and methods we are out of luck. Core reflection can only inspect the classes, fields, and methods – it provides no facility to go deeper and inspect code. This can severely limit what Java libraries can do, such as a library that wants to expose a novel parallel programming model and execute parallel programs on specialized hardware.

Parallel programming

Many Java programs need to process large amounts of data in parallel, and Java libraries make it easy to implement parallel computations. For example, in a face detection algorithm, we need to convert RGB pixels to grayscale; here is simplified code to do that using a lambda expression and the parallel streams built into the JDK:

IntConsumer rgbToGray = i -> {
    byte r = rgbImage[i * 3 + 0];
    byte g = rgbImage[i * 3 + 1];
    byte b = rgbImage[i * 3 + 2];
    grayImage[i] = gray(r, g, b);
};
IntStream.range(0, N)
    .parallel()
    .forEach(rgbToGray);

If the number of pixels N is sufficiently large and/or the work to compute each pixel is sufficiently demanding, then the stream will compute the result faster than a single-threaded for loop, even with the overhead of starting and coordinating multiple threads.

for (int i = 0; i < N; i++) {
    byte r = rgbImage[i * 3 + 0];
    byte g = rgbImage[i * 3 + 1];
    byte b = rgbImage[i * 3 + 2];
    grayImage[i] = gray(r, g, b);
}

Gustafson's Law states that as we increase the number of threads M, each working on a sufficiently large number of pixels, the estimated speedup of a program will approach M as the fraction of time spent on parallel tasks grows.
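
Stated as a formula (the standard formulation of Gustafson's Law, with s denoting the fraction of execution time spent on serial work and M the number of threads):

S(M) = s + (1 - s) × M

As s approaches zero, S(M) approaches M.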

Unfortunately, the number of threads that can run compute-intensive tasks is limited by the CPU, e.g., an AMD EPYC 9005 Zen 5c has 384 threads. Java 21 introduced virtual threads to run large numbers of I/O-intensive tasks in parallel, but virtual threads do not create new compute resources and cannot speed up code that is already CPU-bound.

General-purpose computing with Graphics Processing Units

There is a class of computing device, the Graphics Processing Unit (GPU), whose architecture is very different from that of the CPU: rather than a few hundred threads, a modern GPU such as an NVIDIA Blackwell B200 can simultaneously execute a few hundred thousand threads.

Originally GPUs were designed for rendering images and video games, but now we can use them for general-purpose computations such as face detection, General Matrix Multiplication (GEMM), or the Fast Fourier Transform (FFT).

If we could run parallel tasks on a GPU instead of a CPU, with orders of magnitude more threads, we could either greatly reduce the execution time or compute more in the same execution time.

Historically, one approach was to write the multithreaded computation in a language supported by the GPU, e.g., CUDA C, and embed it as a string in a Java program. You could then use JNI to run the CUDA C compiler and transfer the compiled code to the GPU for execution.

static void gpuComputation(int N, byte[] rgbImage, byte[] grayImage) {
    var cudaCCode = """
        __device__
        char gray(char r, char g, char b) { return ...; }

        __global__
        void computeGrayImage(int N, char* rgbImage, char* grayImage) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < N) {
                char r = rgbImage[i * 3 + 0];
                char g = rgbImage[i * 3 + 1];
                char b = rgbImage[i * 3 + 2];
                grayImage[i] = gray(r, g, b);
            }
        }
        """;
    var kernel = compileGpuCode(cudaCCode);
    executeGpuCode(kernel, N, rgbImage, grayImage);
}

This approach is problematic: it presents a leaky abstraction that forces developers to be familiar with CUDA artifacts, and the code is no longer hardware independent. Expecting developers to write Java code that merely carries non-Java code is misguided, since javac can neither compile nor check that code; indeed, any language, not just Java, could serve as a carrier for CUDA C.

Exploring Better Abstractions

In the 2010s, OpenJDK Project Sumatra aimed to let Java developers take advantage of GPUs by enhancing the JVM and parallel streams. The Sumatra JVM could generate code for AMD GPUs and place parts of the JVM’s heap in GPU memory. This approach, where the Java Platform obscures the presence of a GPU from Java code, is in stark contrast to manually embedding GPU code into a Java program. Neither approach provides the right abstraction.

Obscuring the GPU is particularly challenging. First, memory is split between CPU and GPU; managing the JVM’s heap across the CPU and GPU can be a continual drag on performance. Second, the idiomatic Java code in lambda expressions is polymorphic: methods are commonly invoked on interfaces rather than classes and each invocation triggers virtual method lookup and possibly class loading, initialization, etc. It is counterproductive to bring this highly variable behavior, where each thread may run different code, to the GPU, where each thread is intended to run identical code in lock step on different data elements using a Single Instruction Multiple Thread (SIMT) execution model.

Empowering libraries

We believe the best way to support GPUs in the Java Platform is to introduce primitives that enable the creation of libraries which, in turn, introduce novel programming models and APIs that harness the unique memory and execution capabilities of GPUs.

One such primitive is the Foreign Function & Memory (FFM) API, introduced in Java 22. While the FFM API has no built-in knowledge of GPUs, it allows libraries to interact efficiently with native device drivers on the CPU and thereby control the GPU indirectly.

If libraries are to translate the parallel parts of Java programs to GPU code, they need high-fidelity access to Java code. Fortunately, the Java Platform has a longstanding primitive, reflection, which allows a library to inspect the structure of a Java program. Java 1.1 introduced reflection at run time, core reflection, kick-starting an ecosystem of libraries for data access, unit testing, messaging, etc. Java 5 introduced reflection at compile time, allowing annotation processors to generate code that extends the application with no maintenance overhead.

Unfortunately, as we have shown, reflection is limited: it does not provide high-fidelity access to the code in methods or lambda expressions.

A library can access the source code of methods and lambdas with internal APIs of javac, but this is only available at compile time and is too complex since it contains many extraneous syntactic details. At run time, a library can access the bytecode of methods (but not lambdas) with the Class-File API, but bytecode is a poor substitute for source code and class files are not always available. As such, libraries must choose between complex, high-fidelity access to code with low availability, or low-fidelity access to code that is likewise not always available.

Code Reflection: The Missing Primitive

To support libraries effectively, we propose to enhance reflection to expose not just classes, fields, and methods but also the code of methods and lambda expressions. With this enhancement, we can develop libraries that translate Java code, e.g., the lambda expression used in a parallel stream, into GPU code, eliminating the need to manually write CUDA C code. With knowledge of both Java code and GPU code, libraries can model data dependencies and optimize data transfer between CPU and GPU for better performance.

Just as the FFM API is not specific to GPUs, an API providing access to Java code is not specific to GPUs. Libraries could use it to, e.g., automatically differentiate Java code, pass translated Java code to native machine learning runtimes, or translate Java code to SQL statements.

Description

We propose to enhance core reflection with code reflection. The code reflection API provides access, at run time, to a model of the code in a method or lambda expression, called a code model, that is suited to analysis and transformation.

We shall introduce code reflection by continuing with the two examples we presented earlier, extending analysis to the code in the Example class and then describing how a library can use code reflection to translate Java code to GPU code. Then we shall describe code reflection in more detail.

Let’s update our Example class so that the code of lambda expressions and methods is accessible just like the fields and methods.

import jdk.incubator.code.*;
import jdk.incubator.code.bytecode.*;
import jdk.incubator.code.dialect.core.*;
import jdk.incubator.code.dialect.java.*;
import static jdk.incubator.code.dialect.core.CoreOp.*;
import static jdk.incubator.code.dialect.java.JavaOp.*;

static class Example {
    @Reflect
    static Runnable R = () -> IO.println("Example:field:R");
    @Reflect
    static int add(int a, int b) {
        IO.println("Example:method:add");
        return a + b;
    }

    static class Nested {
        @Reflect
        static Runnable R = () -> IO.println("Example.Nested:field:R");
        @Reflect
        void m() { IO.println("Example.Nested:method:m"); }
    }
}

We declare that the lambda expressions and methods are reflectable by annotating their declarations with @Reflect. By doing so we grant access to their code. When javac compiles the source of the Example class it translates its internal model of the add method’s code to a standard model, called a code model, and stores the code model in a class file related to the Example class file, where add’s code is compiled to bytecode. (The same occurs for the other method and the lambda expressions.)

A code model is an immutable tree of code elements, where each element models some Java statement or expression (for further details see the Code models section).

We can use the code reflection API to access the code model of an annotated element, which loads the corresponding code model that was stored in the related class file.

static Object getStaticFieldValue(Field f) {
    try { return f.get(null); }
    catch (IllegalAccessException e) { throw new RuntimeException(e); }
}
static Optional<? extends CodeElement<?, ?>> getCodeModel(AnnotatedElement ae) {
    return switch (ae) {
        case Method m -> Op.ofMethod(m);
        case Field f when isStaticRunnableField(f) ->
                Op.ofLambda(getStaticFieldValue(f)).map(Quoted::op);
        default -> Optional.empty();
    };
}

(Note: since code reflection is an incubating API we cannot add new APIs in packages of other modules, such as in the java.lang.reflect package of the java.base module. For now, we must provide such methods in the incubating code reflection module.)

The method getCodeModel returns the code model for a reflectable method or lambda expression, a code element that is the root of the code model tree. By default methods and lambda expressions are not reflectable, so we return an optional value. If the annotated element is a method we retrieve the code model from the method. If the annotated element is a static field whose type is Runnable we access its value, an instance of Runnable produced from a lambda expression, and from that instance we retrieve the lambda expression’s code model. The retrieval is slightly different for lambda expressions since they can capture values (for more details see the Declaring reflectable code section).

We can use getCodeModel to map from Example’s annotated elements to their code models.

elements(Example.class)
        // AnnotatedElement -> CodeModel?
        .flatMap(ae -> getCodeModel(ae).stream())
        .forEach(IO::println);

More interestingly we can now perform some simple analysis of code, such as extracting the values of the string literal expressions that are printed.

static final MethodRef PRINTLN = MethodRef.method(IO.class, "println",
        void.class, Object.class);
static Optional<String> isPrintConstantString(CodeElement<?, ?> e) {
    if (e instanceof InvokeOp i &&
            i.invokeDescriptor().equals(PRINTLN) &&
            i.operands().get(0).declaringElement() instanceof ConstantOp cop &&
            cop.value() instanceof String s) {
        return Optional.of(s);
    } else {
        return Optional.empty();
    }
}
static List<String> analyzeCodeModel(CodeElement<?, ?> codeModel) {
    return codeModel.elements()
            // CodeElement -> String?
            .flatMap(e -> isPrintConstantString(e).stream())
            .toList();
}

The method analyzeCodeModel streams over all elements of a code model and returns the list of string literal values passed to invocations of IO.println. The code to match such an invocation is straightforward but verbose, and therefore can be hard to read. We hope to address this in a future JEP using future advancements in pattern matching, specifically the capability to declare member patterns. Until then we will avoid making near-term improvements that we think can be better solved with better pattern matching.

We can then use analyzeCodeModel to further refine our stream expression to print out all such string literal values.

elements(Example.class)
        // AnnotatedElement -> CodeModel?
        .flatMap(ae -> getCodeModel(ae).stream())
        // CodeModel -> List<String>
        .map(codeModel -> analyzeCodeModel(codeModel))
        .forEach(IO::println);

Translating Java code to GPU code

With code reflection, a library can generate CUDA C code from Java code. Recall the lambda- and stream-based example.

IntConsumer rgbToGray = i -> {
    byte r = rgbImage[i * 3 + 0];
    byte g = rgbImage[i * 3 + 1];
    byte b = rgbImage[i * 3 + 2];
    grayImage[i] = gray(r, g, b);
};
IntStream.range(0, N)
        .parallel()
        .forEach(rgbToGray);

First, we declare that the lambda expression is reflectable and thereby grant access to its code. We do so by casting our lambda expression to its target interface type, annotating the type in the cast with @Reflect.

IntConsumer rgbToGray = (@Reflect IntConsumer) i -> {
    byte r = rgbImage[i * 3 + 0];
    byte g = rgbImage[i * 3 + 1];
    byte b = rgbImage[i * 3 + 2];
    grayImage[i] = gray(r, g, b);
};

We use the code reflection API to access the lambda expression’s code model.

var rgbToGrayModel = Op.ofLambda(rgbToGray).orElseThrow().op();

Once we have the Java code model we can pass it to our GPU library.

String cudaCCode = translateJavaCodeToGpuCode(rgbToGrayModel);

The GPU library uses code reflection APIs to traverse the code model and translate it to CUDA C code embedded in a string, after which the example proceeds as before to compile the CUDA C code and execute it. Ordinarily the GPU library would call the methods to translate, compile and execute on behalf of the user, so the user simply passes the reflectable lambda expression as an argument.

dispatchKernel((@Reflect IntConsumer) i -> {
    byte r = rgbImage[i * 3 + 0];
    byte g = rgbImage[i * 3 + 1];
    byte b = rgbImage[i * 3 + 2];
    grayImage[i] = gray(r, g, b);
});

As the GPU library traverses the code model it will encounter an element that models the invocation expression to the gray method. This method also needs to be translated to CUDA C code, otherwise we will generate an incomplete CUDA program. However, the library has no intrinsic understanding of what this method does. The library needs the code model of this method so that it can traverse and translate it just as it did the lambda expression’s code model.

To achieve this we must declare that the gray method is also reflectable and thereby grant access to its code. We do so by also annotating our method with the @Reflect annotation.

@Reflect
static byte gray(byte r, byte g, byte b) {
    return ...;
}

The library uses code reflection to traverse from the lambda expression’s code model to the gray method’s code model, accessing the code model of the method.
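
For illustration, here is a minimal sketch, using only the API shown above and not taken from any particular GPU library, of the first step of such a traversal: collecting the method descriptors of all invocation operations in a code model. Resolving each MethodRef to a java.lang.reflect.Method, and then calling Op.ofMethod on it, is left to the library and is not shown.

static List<MethodRef> invokedMethods(CodeElement<?, ?> codeModel) {
    // Stream over all elements of the code model and collect the descriptors
    // of the invocation operations encountered
    return codeModel.elements()
            .filter(e -> e instanceof InvokeOp)
            .map(e -> ((InvokeOp) e).invokeDescriptor())
            .toList();
}

Applied to the lambda expression’s code model, the resulting list would include the descriptor of the gray method, which the library can then resolve and reflect over in turn.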

Foreign programming models

GPU programming is an example of a foreign programming model; in our GPU example, specifically, it is the CUDA C programming model specified by NVIDIA. As presented, we can use code reflection to develop a GPU library that translates Java code to CUDA C code. The GPU library specifies the rules as to what constitutes GPU Java code. Those rules are foreign to the Java programming model as specified by the Java Language Specification, which knows nothing about GPU Java code. We can then use the CUDA runtime to compile and execute the CUDA C code. So not only do we leverage a foreign programming model, but we also leverage foreign code and a foreign runtime. Thanks to code reflection and the Foreign Function & Memory API the Java world can embrace a foreign world and orchestrate complex activity between the two.

Declaring reflectable code

We have previously shown how to declare reflectable lambda expressions and methods, using the @Reflect annotation, and access their code models using the code reflection API. For the purposes of incubation we can only incubate APIs, so we must avoid any changes to language syntax and semantics. In some future non-incubating JEP we might devise a new language feature. Until then use of the annotation serves as a temporary declarative mechanism that is good enough for experimentation.

Declaration serves two purposes. First, we explicitly grant that other parts of our Java application, such as a library we may not be directly responsible for, may have run time access to the code. Not all code needs to be reflected over, and not all code should be, so we can restrict access to only the code that is necessary to share. Second, it informs javac that it needs to perform additional tasks so that a code model can be produced and made accessible at run time.

In total there are four syntactic locations where @Reflect can appear; they govern, in increasing scope, what is declared reflectable.

The annotation is ignored if it appears in any other valid syntactic location.

Declaring a reflectable lambda expression or method does not implicitly broaden the scope of what is reflectable to the methods it invokes. (In the GPU example we needed to annotate the gray method.) Declaring a reflectable lambda expression does, however, broaden the scope to the final, or effectively final, variables of the surrounding code that are used but not declared in the lambda expression.

We access the code model of a reflectable method by invoking the method Op.ofMethod with a given Method instance, which returns an optional instance of the code model, a root code element. The root code element models the method declaration (see the Code models section).

We access the code model of a reflectable lambda expression by invoking the method Op.ofLambda with a given instance of a functional interface associated with the lambda expression, which returns an optional instance of Quoted<JavaOp.LambdaOp>. From the Quoted instance we can obtain the root code element that models the lambda expression. In addition, we can obtain a mapping of run time values to items in the code model that model final, or effectively final, variables used but not declared in the lambda expression.
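
For illustration, here is a minimal sketch of such access. It assumes a captured local variable named offset, and it assumes that Quoted exposes the captured mapping through an accessor named capturedValues; both names are assumptions of this sketch rather than details taken from the description above.

int offset = 42;  // effectively final local variable, captured by the lambda expression
var addOffset = (@Reflect IntUnaryOperator) x -> x + offset;

var quoted = Op.ofLambda(addOffset).orElseThrow();  // Quoted value describing the lambda expression
var lambdaModel = quoted.op();                      // root code element modeling the lambda expression
var captured = quoted.capturedValues();             // assumed accessor: relates the code model items that model
                                                    // captured variables to their run time values (here, 42)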

Code models

A code model is an immutable instance of data structures that can, in general, model many kinds of code, be it Java code or foreign code. It has some properties of an Abstract Syntax Tree (AST) used by a source compiler, such as modeling code as a tree of arbitrary depth, and some properties of an intermediate representation used by an optimizing compiler, such as modeling control flow and data flow as graphs. These properties ensure that code models can preserve many important details of the code they model and that they are suited to analysis and transformation.

The primary data structure of a code model is a tree of code elements. There are three kinds of code elements: operation, body, and block. The root of a code model is an operation, and descendant operations form a tree of arbitrary depth. We shall see more in subsequent sections.

The code reflection API supports representing the data structures of a code model, including code elements that model Java language constructs and behavior, and it supports traversing, building, and transforming code models. We shall explain with further examples.

Traversing code models

We shall continue with our Example class, reflecting over the add method, accessing the method’s code model, and traversing it to print the model’s tree structure.

var addMethod = Example.class.getDeclaredMethod("add", int.class, int.class);
FuncOp addModel = Op.ofMethod(addMethod).orElseThrow();
assert addModel == Op.ofMethod(addMethod).orElseThrow();

We access the method’s code model as we have previously shown. The root of the code model is an operation, an instance of FuncOp, which is a function declaration operation modeling the method. Further, we assert that if we obtain the code model a second time the same instance is returned. Items in a code model have stable identity, and therefore they can be used as stable keys for associating items with other information.

One way to traverse the code model is to write a recursive method that iterates over code elements and their children. That way we can get a sense of what a code model contains.

static void traverse(int depth, CodeElement<?, ?> e) {
    IO.println("  ".repeat(depth) + e.getClass());

    for (CodeElement<?, ?> c : e.children()) {
        traverse(depth + 1, c);
    }
}
traverse(0, addModel);

The traverse method prints out the class of the code element it encounters and prefixes that with white space proportionate to the depth of the element in the code model tree.

jshell> traverse(0, addModel);
class jdk.incubator.code.dialect.core.CoreOp$FuncOp
  class jdk.incubator.code.Body
    class jdk.incubator.code.Block
      class jdk.incubator.code.dialect.core.CoreOp$VarOp
      class jdk.incubator.code.dialect.core.CoreOp$VarOp
      class jdk.incubator.code.dialect.core.CoreOp$ConstantOp
      class jdk.incubator.code.dialect.java.JavaOp$InvokeOp
      class jdk.incubator.code.dialect.core.CoreOp$VarAccessOp$VarLoadOp
      class jdk.incubator.code.dialect.core.CoreOp$VarAccessOp$VarLoadOp
      class jdk.incubator.code.dialect.java.JavaOp$AddOp
      class jdk.incubator.code.dialect.core.CoreOp$ReturnOp

We can observe that the top of the tree is the FuncOp, which contains one child, a Body, which in turn contains one child, a Block, which in turn contains a sequence of eight operations. Bodies and blocks provide additional structure for modeling code. Each operation models some part of the method’s code; for example, variable declaration operations (instances of VarOp) model Java variable declarations, in this case the method parameters, and the add operation (an instance of AddOp) models the Java + operator.

Alternatively, we can stream over elements of the code model (as we did previously when analyzing the code for string literals) in the same topologically sorted order using the CodeElement.elements method:

addModel.elements().forEach((CodeElement<?, ?> e) -> {
    int depth = 0;
    var parent = e;
    while ((parent = parent.parent()) != null) depth++;
    IO.println("  ".repeat(depth) + e.getClass());
});

We compute the depth for each code element by traversing back up the code model tree until the root element is reached. So, it is possible to traverse up and down the code model tree.

To get a better sense of what the code model contains we can convert it to a text string and print it out.

IO.println(addModel.toText());

The toText method will traverse the code elements in a similar manner as we presented but print out more detail.

func @loc="22:5:string:///REPL/$JShell$8D.java" @"add" (
        %0 : java.type:"int", %1 : java.type:"int")java.type:"int" -> {
    %2 : Var<java.type:"int"> = var %0 @loc="22:5" @"a";
    %3 : Var<java.type:"int"> = var %1 @loc="22:5" @"b";
    %4 : java.type:"java.lang.String" = constant @loc="24:20" @"Example:method:add";
    invoke %4 @loc="24:9" @java.ref:"java.lang.IO::println(java.lang.Object):void";
    %5 : java.type:"int" = var.load %2 @loc="25:16";
    %6 : java.type:"int" = var.load %3 @loc="25:20";
    %7 : java.type:"int" = add %5 %6 @loc="25:16";
    return %7 @loc="25:9";
};

A code model’s text is designed to be human-readable, primarily intended for debugging and testing. It is also invaluable for explaining code models. To aid debugging each operation has line number information, and the root operation also has source information from where the code model originated. Also notice how the text output mirrors the structure of the source code.

The code model text shows that the code model’s root element is a function declaration (func) operation. The lambda-like expression represents the fusion of the function declaration operation’s single body and the body’s first and only block, called the entry block. Then there is a sequence of operations in the entry block. For each operation there is an instance of a corresponding Java class, all of which extend the abstract class jdk.incubator.code.Op and which we have already seen when we printed out the classes. Unsurprisingly, the printed operations and the printed operation classes occur in the same order, since the toText method traverses the model in the same order as we traversed it explicitly.

The entry block declares two values called block parameters, %0 and %1, which model the method’s initial values for parameters a and b. The method parameter declarations are modeled as embedded var operations, each initialized with a corresponding block parameter used as the var operation’s single operand. The var operations produce values called operation results, variable values %2 and %3, which model the variables a and b. A variable value can be loaded from or stored to using variable access operations, respectively modeling an expression that denotes a variable and assignment to a variable. The expressions denoting parameters a and b are modeled as var.load operations that use the variable values %2 and %3 respectively as operands. The operation results of these operations are used as operands of subsequent operations and so on, e.g., %7 the result of the add operation modeling the + operator is used as an operand of the return operation modeling the return statement.

The source code of our add method might contain all sorts of syntactic details that javac rightly needs to know about but are extraneous for modeling purposes. This complexity is not present in the code model. For example, the same code model would be produced if the return statement’s expression was ((a) + (b)) instead of a + b.

In addition to the code elements forming a tree, a code model contains other code items: the values (block parameters or operation results) we previously introduced form bidirectional dependency graphs between their declaration and their uses. A value also has a type element, another code item, modeling the set of all possible values. In our example many of the type elements model Java types, and some model the type of variable values (the type element of the operation result of a var operation). In summary, a code model contains five kinds of code item: operation, body, block, value, and type element.

Astute readers may observe that code models are in Static Single-Assignment (SSA) form, and there is no explicit distinction, as there is in the source code, between statements and expressions. Block parameters and operation results are declared before they are used and cannot be reassigned (and we therefore require special operations and type elements to model variables as we previously showed).

Finally, we can execute the code model by transforming it to bytecode, wrapping it in a method handle, and invoking the handle.

var handle = BytecodeGenerator.generate(MethodHandles.lookup(), addModel);
assert Example.add(1, 1) == (int) handle.invokeExact(1, 1);

Building code models

The code reflection API provides functionality to build code models. We can use the API to build a model equivalent to the one we previously accessed and traversed.

var builtAddModel = func(
    "add",
    CoreType.functionType(JavaType.INT, JavaType.INT, JavaType.INT))
    .body((Block.Builder builder) -> {
        // Check the entry block parameters
        assert builder.parameters().size() == 2;
        assert builder.parameters().stream().allMatch(
                (Block.Parameter param) -> param.type().equals(JavaType.INT));

        // int a
        VarOp varOpA = var("a", builder.parameters().get(0));
        Op.Result varA = builder.op(varOpA);

        // int b
        VarOp varOpB = var("b", builder.parameters().get(1));
        Op.Result varB = builder.op(varOpB);

        // IO.println("Example:method:add")
        builder.op(invoke(PRINTLN,
                builder.op(constant(JavaType.J_L_STRING, "Example:method:add"))));

        // return a + b;
        builder.op(return_(
                builder.op(add(
                        builder.op(varLoad(varA)),
                        builder.op(varLoad(varB))))));
    });
IO.println(builtAddModel.toText());

The consuming lambda expression passed to the body method operates on a block builder, an instance of Block.Builder, representing the entry block being built. We use it to append operations to the entry block. When an operation is appended it produces an operation result that can be used as an operand of a further operation, and so on. When the body method returns, the body element and the entry block element it contains will be fully built.

Notice how building, like the text output, mirrors the source code structure. Building is carefully designed so that structurally invalid models cannot be built. We can approximately test equivalence with our previously accessed model as follows.

var builtAddModelElements = builtAddModel.elements()
        .map(CodeElement::getClass).toList();
var addModelElements = addModel.elements()
        .map(CodeElement::getClass).toList();
assert builtAddModelElements.equals(addModelElements);

We don’t anticipate that most users will commonly build complete models of Java code, since it is a rather verbose and tedious process, although potentially less so than other approaches, e.g., building bytecode, or using code combinators that must be composed from the inside out. Javac already knows how to build models: in fact, javac uses the same API to build models, and the run time uses it to produce the models that are accessed. Instead we anticipate that many users will build parts of models when they transform them.

Transforming code models

The code reflection API supports the transformation of code models by combining traversing and building. A code model transformation is represented by a function that takes an operation, encountered in the (input) model being transformed, and a code model builder for the resulting transformed (output) model, and mediates how, if at all, that operation is transformed into other code elements that are built. We were inspired by the functional transformation approach devised by the Class-File API and adapted that design to work on the nested structure of immutable code model trees.

We can write a simple code model transform that transforms our method’s code model, replacing the operation modeling the + operator with an invocation operation modeling an invocation expression to the method Integer.sum.

static final MethodRef SUM = MethodRef.method(Integer.class, "sum", int.class,
        int.class, int.class);
CodeTransformer addToMethodTransformer = CodeTransformer.opTransformer((
        Function<Op, Op.Result> builder,
        Op inputOp,
        List<Value> outputOperands) -> {
    switch (inputOp) {
        // Replace a + b; with Integer.sum(a, b);
        case AddOp _ -> builder.apply(invoke(SUM, outputOperands));
        // Copy operation
        default -> builder.apply(inputOp);
    }
});

The code transformation function, passed as a lambda expression to CodeTransformer.opTransformer, accepts as parameters a block builder function, builder, an operation encountered when traversing the input code model, inputOp, and a list of values in the output model being built that are associated with the input operation’s operands, outputOperands. We must have previously encountered and transformed the input operations whose results are associated with those values, since values can only be used after they have been declared.

In the code transformation we switch over the input operation; in this case we just match on the add operation and, by default, on any other operation. In the latter case we apply the input operation to the builder function, which creates a new output operation that is a copy of the input operation, appends the new operation to the block being built, and associates the new operation’s result with the input operation’s result. When we match on an add operation we replace it by building part of a code model: a method invoke operation to the Integer.sum method, constructed with the given output operands. The result of the output invoke operation is automatically associated with the result of the input add operation.

We can then transform the method’s code model by calling the FuncOp.transform method and passing the code transformer as an argument.

FuncOp transformedAddModel = addModel.transform(addToMethodTransformer);
IO.println(transformedAddModel.toText());

The transformed code model is naturally very similar to the input code model.

func @loc="22:5:string:///REPL/$JShell$8D.java" @"add" (
        %0 : java.type:"int", %1 : java.type:"int")java.type:"int" -> {
    %2 : Var<java.type:"int"> = var %0 @loc="22:5" @"a";
    %3 : Var<java.type:"int"> = var %1 @loc="22:5" @"b";
    %4 : java.type:"java.lang.String" = constant @loc="24:20" @"Example:method:add";
    invoke %4 @loc="24:9" @java.ref:"java.lang.IO::println(java.lang.Object):void";
    %5 : java.type:"int" = var.load %2 @loc="25:16";
    %6 : java.type:"int" = var.load %3 @loc="25:20";
    %7 : java.type:"int" = invoke %5 %6 @java.ref:"java.lang.Integer::sum(int, int):int";
    return %7 @loc="25:9";
};

We can observe the add operation has been replaced with the invoke operation. Also, by default, each operation that was copied preserves line number information. The code transformation function can also be applied unmodified to more complex code containing many + operators in arbitrarily nested positions. (Such application is left as an exercise for the curious reader.)

The code transformation function is not a direct implementation of the functional interface CodeTransformer. Instead it is adapted from another functional interface, which is easier to implement for simpler transformations on operations. Direct implementations of CodeTransformer are more complex but are also capable of more complex transformations, such as building new blocks and retaining more control over associating values in the input and output models. The code reflection API provides many complex code transformers, such as those for progressively lowering code models, converting models into pure SSA form, and inlining models into other models. We will continue to explore the code model transformation design to better understand how we can improve the API across the spectrum of simple to complex transformations.

Alternatives

Compiler Tree API

The com.sun.source package of the jdk.compiler module contains the javac API for accessing the abstract syntax trees (ASTs) representing Java source code. Javac uses an implementation of this API when parsing source code. This API is not suitable for standardization as it is too intertwined with javac’s implementation; javac reserves the right to make breaking changes to this API as the language evolves. More generally, ASTs can be difficult to analyze and transform. For example, a modern optimizing compiler will transform its AST representing source code into another, slightly lower form, an intermediate representation, that is easier to analyze and transform to executable code.

Bytecode

Bytecode is not easily accessible, nor guaranteed to be available at run time, and even if we made it so it would not be ideal. The translation of Java source to bytecode by javac results in numerous Java language features being translated away, making them hard to recover; e.g., lambda expressions are translated into invokedynamic instructions and synthetic methods. Bytecode is also, by default, too low-level, which makes it difficult to analyze and transform. For example, the HotSpot C2 compiler transforms bytecode into another, higher form, an intermediate representation, that is easier to analyze and transform to executable code.

Testing

Testing will focus on a suite of unit tests for the compiler and runtime that give high modeling coverage and code coverage. Where possible we will reuse the code reflection APIs operationally, such as when storing and loading models in class files.

We need to ensure that Java code models produced by the compiler preserve Java program meaning. We will select an existing suite of Java tests and recompile the source they test, using a special javac internal flag, such that the bytecode is generated from code models produced by the compiler. Testing against these specially compiled sources must yield the same results as testing against the ordinarily compiled sources.

Risks and Assumptions

While incubating we will strive to keep the number of changes required to code in the java.base and jdk.compiler modules to a minimum, thereby reducing the burden on maintainers and reviewers. So far the changes are modest.

Introduction of a new language feature, even a modest one, is a significant effort with numerous tasks to update many areas of the platform. Code reflection will add to that list of tasks, since the language feature will need to be modeled and supported like existing modeled features. There is a risk it will require significant effort to model, especially with high fidelity. We think this risk is mitigated by the generic modeling capabilities of code models, and that we can currently model all Java statements and expressions with high fidelity.

Future work

We shall explore access to code models at compile time. The code reflection API provides very basic support for annotation processors to access code models of program elements, and while useful for advanced experimentation it needs more consideration.

As the language evolves we shall look for opportunities to enhance the code reflection API to take advantage of new language features, especially features related to pattern matching and data-oriented programming. We anticipate pattern matching will strongly influence the code reflection API and enhance the querying of code models. Furthermore, this is an opportunity to provide feedback on language features.

We need to explore the language feature for declaration of reflectable code. Use of the @Reflect annotation is a temporary solution that is good enough for incubation but insufficient for preview.

We need to ensure that a library using code reflection can operate on code models produced by a JDK version greater than the version it was compiled against. Such forward compatibility is challenging. We shall explore solutions, such as a library declaring an upper bound on the JDK versions of reflective code it supports, or enabling the lowering of a modeled language feature that a library does not know about into modeled features that it does (potentially compromising high fidelity but still preserving program meaning).