JEP 371: Hidden Classes

Owner	Mandy Chung
Type	Feature
Scope	SE
Status	Closed / Delivered
Release	15
Component	core-libs / java.lang.invoke
Discussion	valhalla dash dev at openjdk dot java dot net
Effort	L
Duration	L
Reviewed by	Alex Buckley, David Holmes, John Rose, Maurizio Cimadamore, Paul Sandoz
Endorsed by	John Rose
Created	2019/03/13 17:37
Updated	2020/10/07 16:39
Issue	8220607

Summary

Introduce hidden classes, which are classes that cannot be used directly by the bytecode of other classes. Hidden classes are intended for use by frameworks that generate classes at run time and use them indirectly, via reflection. A hidden class may be defined as a member of an access control nest, and may be unloaded independently of other classes.

Goals

Allow frameworks to define classes as non-discoverable implementation details of the framework, so that they cannot be linked against by other classes nor discovered through reflection.
Support extending an access control nest with non-discoverable classes.
Support aggressive unloading of non-discoverable classes, so that frameworks have the flexibility to define as many as they need.
Deprecate the non-standard API sun.misc.Unsafe::defineAnonymousClass, with the intent to deprecate it for removal in a future release.
Do not change the Java programming language in any way.

Non-Goals

It is not a goal to support all the functionality of sun.misc.Unsafe::defineAnonymousClass, such as constant-pool patching.

Motivation

Many language implementations built on the JVM rely upon dynamic class generation for flexibility and efficiency. For example, in the case of the Java language, javac does not translate a lambda expression into a dedicated class file at compile time but, rather, emits bytecode that dynamically generates and instantiates a class to yield an object corresponding to the lambda expression when needed. Similarly, runtimes for non-Java languages often implement the higher-order features of those languages by using dynamic proxies, which also generate classes dynamically.

Language implementors usually intend for a dynamically generated class to be logically part of the implementation of a statically generated class. This intent suggests various properties that are desirable for dynamically generated classes:

Non-discoverability. Being independently discoverable by name is not only unnecessary but harmful. It undermines the goal that the dynamically generated class is merely an implementation detail of the statically generated class.
Access control. It may be desirable to extend the access control context of the statically generated class to include the dynamically generated class.
Lifecycle. Dynamically generated classes may only be needed for a limited time, so retaining them for the lifetime of the statically generated class might unnecessarily increase memory footprint. Existing workarounds for this situation, such as per-class class loaders, are cumbersome and inefficient.

Unfortunately, the standard APIs that define a class -- ClassLoader::defineClass and Lookup::defineClass -- are indifferent to whether the bytecodes of the class were generated dynamically (at run time) or statically (at compile time). These APIs always define a visible class that will be used every time another class in the same loader hierarchy tries to link a class of that name. Consequently, the class may be more discoverable or have a longer lifecycle than desired. In addition, the APIs can only define a class that will act as a member of a nest if the nest's host class knows the name of the member class in advance; practically speaking, this prevents dynamically generated classes from being members of a nest.

If a standard API could define hidden classes that are not discoverable and have a limited lifecycle, then frameworks both inside and outside of the JDK that generate classes dynamically could instead define hidden classes. This would improve the efficiency of all language implementations built on the JVM. For example:

java.lang.reflect.Proxy could define hidden classes to act as the proxy classes which implement proxy interfaces;
java.lang.invoke.StringConcatFactory could generate hidden classes to hold the constant-concatenation methods;
java.lang.invoke.LambdaMetaFactory could generate hidden nestmate classes to hold lambda bodies that access enclosing variables; and
A JavaScript engine could generate hidden classes for the bytecode translated from JavaScript programs, knowing that the classes will be unloaded when the engine no longer uses them.

Description

The Lookup API introduced in Java 7 allows a class to obtain a lookup object that provides reflective access to classes, methods, and fields. Crucially, no matter what code ends up using a lookup object, the reflective access always occurs in the context of the class which originally obtained the lookup object -- the lookup class. In effect, a lookup object transmits the access rights of the lookup class to any code which receives the object.

Java 9 enhanced the transmission capabilities of lookup objects by introducing the method Lookup::defineClass(byte[]). From the bytes supplied, this method defines a new class in the same context as the class which originally obtained the lookup object. That is, the newly-defined class has the same defining class loader, run-time package, and protection domain as the lookup class.

This JEP proposes to extend the Lookup API to support defining a hidden class that can only be accessed by reflection. A hidden class is not discoverable by the JVM during bytecode linkage, nor by programs making explicit use of class loaders (via, e.g., Class::forName and ClassLoader::loadClass). A hidden class can be unloaded when it is no longer reachable, or it can share the lifetime of a class loader so that it is unloaded only when the class loader is garbage collected. Optionally, a hidden class can be created as a member of an access control nest.

For brevity, this JEP speaks of a "hidden class", but it should be understood to mean a hidden class or interface. Similarly, a "normal class" means a normal class or interface, the result of ClassLoader::defineClass.

Creating a hidden class

Whereas a normal class is created by invoking ClassLoader::defineClass, a hidden class is created by invoking Lookup::defineHiddenClass. This causes the JVM to derive a hidden class from the supplied bytes, link the hidden class, and return a lookup object that provides reflective access to the hidden class. The invoking program should store the lookup object carefully, for it is the only way to obtain the Class object of the hidden class.

The supplied bytes must be a ClassFile structure (JVMS 4.1). The derivation of a hidden class by Lookup::defineHiddenClass is similar to the derivation of a normal class by ClassLoader::defineClass, with one major difference discussed below. After the hidden class is derived, it is linked as for a normal class (JVMS 5.4), except that no loading constraints are imposed. After the hidden class is linked, it is initialized if the initialize argument of Lookup::defineHiddenClass is true; if the argument is false, then the hidden class will be initialized when reflective methods instantiate it or access its members.

The major difference in how a hidden class is created lies in the name it is given. A hidden class is not anonymous. It has a name that is available via Class::getName and may be shown in diagnostics (such as the output of java -verbose:class), in JVM TI class loading events, in JFR events, and in stack traces. However, the name has a sufficiently unusual form that it effectively makes the class invisible to all other classes. The name is the concatenation of:

The binary name in internal form (JVMS 4.2.1) specified by this_class in the ClassFile structure, say A/B/C;
The '.' character; and
An unqualified name (JVMS 4.2.2) that is chosen by the JVM implementation.

For example, if this_class specifies com/example/Foo (the internal form of the binary name com.example.Foo), then a hidden class derived from the ClassFile structure may be named com/example/Foo.1234. This string is neither a binary name nor the internal form of a binary name.

Given a hidden class whose name is A/B/C.x, the result of Class::getName is the concatenation of:

The binary name A.B.C (obtained by taking A/B/C and replacing each '/' with '.');
The '/' character; and
The unqualified name x.

For example, if a hidden class is named com/example/Foo.1234, then the result of Class::getName is com.example.Foo/1234. Again, this string is neither a binary name nor the internal form of a binary name.

The namespace of hidden classes is disjoint from the namespace of normal classes. Given a ClassFile structure where this_class specifies com/example/Foo/1234, invoking cl.defineClass("com.example.Foo.1234", bytes, ...) merely results in a normal class named com.example.Foo.1234, distinct from the hidden class named com.example.Foo/1234. It is impossible to create a normal class named com.example.Foo/1234 because cl.defineClass("com.example.Foo/1234", bytes, ...) will reject the string argument as being not a binary name.

We acknowledge that not using binary names for the names of hidden classes is potentially a source of problems, but it is compatible with the longstanding practice of Unsafe::defineAnonymousClass (see discussion here). The use of / to indicate a hidden class in the Class::getName output is also aligned stylistically with the use of / in stack traces to qualify a class by its defining module and loader (see StackTraceElement::toString). The error log below reveals two hidden classes, both in module m1: one hidden class has a method test, the other has a method apply.
java.lang.Error: thrown from hidden class com.example.Foo/0x0000000800b7a470
    at m1/com.example.Foo/0x0000000800b7a470.toString(Foo.java:16)
    at m1/com.example.Foo_0x0000000800b7a470$$Lambda$29/0x0000000800b7c040.apply(<Unknown>:1000001)
    at m1/com.example.Foo/0x0000000800b7a470.test(Foo.java:11)

Hidden classes and class loaders

Despite the fact that a hidden class has a corresponding Class object, and the fact that a hidden class's supertypes are created by class loaders, no class loader is involved in the creation of the hidden class itself. Notice that this JEP never says that a hidden class is "loaded". No class loaders are recorded as initiating loaders of a hidden class, and no loading constraints are generated that involve hidden classes. Consequently, hidden classes are not known to any class loader: A symbolic reference in the run-time constant pool of a class D to a class C denoted by N will never resolve to a hidden class for any value of D, C, and N. The reflective methods Class::forName, ClassLoader::findLoadedClass, and Lookup::findClass will not find hidden classes.

Notwithstanding this detachment from class loaders, a hidden class is deemed to have a defining class loader. This is necessary to resolve types used by the hidden class's own fields and methods. In particular, a hidden class has the same defining class loader, runtime package, and protection domain as the lookup class, which is the class that originally obtained the lookup object on which Lookup::defineHiddenClass is invoked.

Using a hidden class

Lookup::defineHiddenClass returns a Lookup object whose lookup class is the newly created hidden class. A Class object can be obtained for the hidden class by invoking Lookup::lookupClass on the returned Lookup object. Via the Class object, the hidden class can be instantiated and its members accessed as if it was a normal class, except for four restrictions:

Class::getName returns a string that is not a binary name, as described earlier.
Class::getCanonicalName returns null, indicating the hidden class has no canonical name. (Note that the Class object for an anonymous class in the Java language has the same behavior.)
Final fields declared in a hidden class are not modifiable. Field::set and other setter methods on a final field of a hidden class will throw IllegalAccessException regardless of the field's accessible flag.
The Class object is not modifiable by instrumentation agents, and cannot be redefined or retransformed by JVM TI agents. We will, however, extend JVM TI and JDI to support hidden classes, such as testing whether a class is hidden, including hidden classes in any list of "loaded" classes, and sending JVM TI events when hidden classes are created.

It is important to realize that the only way for other classes to use a hidden class is indirectly, via its Class object. The hidden class cannot be used directly by bytecode instructions in other classes because it cannot be referenced nominally, that is, by name. For example, suppose a framework learns of a hidden class named com.example.Foo/1234, and manufactures a class file which attempts to instantiate the hidden class. Code in the class file would contain a new instruction that ultimately points to a constant pool entry which denotes the name. If the framework attempts to denote the name as com/example/Foo.1234, then the class file will be invalid -- com/example/Foo.1234 is not a valid internal form of a binary name. On the other hand, if the framework attempts to denote the name in the valid internal form com/example/Foo/1234, then the JVM would resolve the constant pool entry by first converting the name in internal form to a binary name, com.example.Foo.1234, and then trying to load a class of that name; this will most likely fail, and will certainly not find the hidden class named com.example.Foo/1234. The hidden class is not truly anonymous, since its name is exposed, but it is effectively invisible.

Without the ability of the constant pool to refer nominally to a hidden class, there is no way to use a hidden class as a superclass, field type, return type, or parameter type. This lack of usability is reminiscent of anonymous classes in the Java language, but hidden classes go further: An anonymous class can enclose other classes in order to let them access its members, but a hidden class cannot enclose other classes (their InnerClasses attributes cannot name it). Even a hidden class is unable to use itself as a field type, return type, or parameter type in its own field and method declarations.

Importantly, code in a hidden class can use the hidden class directly, without relying on the Class object. This is because bytecode instructions in a hidden class can refer to the hidden class symbolically (without concern for its name) rather than nominally. For example, a new instruction in a hidden class can instantiate the hidden class via a constant pool entry which refers directly to the this_class item in the current ClassFile. Other instructions, such as getstatic, getfield, putstatic, putfield, invokestatic, and invokevirtual, can access members of the hidden class via the same constant pool entry. Direct use inside the hidden class is important because it simplifies generation of hidden classes by language runtimes and frameworks.

A hidden class generally has the same powers of reflection as a normal class. That is, code in a hidden class may define normal classes and hidden classes, and may manipulate normal classes and hidden classes via their Class objects. A hidden class may even act as a lookup class. That is, code in a hidden class may obtain a lookup object on itself, which helps with hidden nestmates (see below).

Hidden classes in stack traces

Methods of hidden classes are not shown in stack traces by default. They represent implementation details of language runtimes, and are never expected to be useful to developers diagnosing application issues. However, they can be included in stack traces via the options -XX:+UnlockDiagnosticVMOptions -XX:+ShowHiddenFrames.

There are three APIs which reify stack traces: Throwable::getStackTrace, Thread::getStackTrace and the newer StackWalker API introduced in Java 9. For the Throwable::getStackTrace and Thread::getStackTrace API, stack frames for hidden classes are omitted by default; they can be included with the same options as for stack traces above. For the StackWalker API, stack frames for hidden classes should be included by a JVM implementation only if the SHOW_HIDDEN_FRAMES option is set. This allows stack-trace filtering to omit unnecessary information when developers are diagnosing application issues.

Hidden classes in access control nests

Introduced in Java 11 by JEP 181, a nest is a set of classes that allow access to each other's private members but without any of the backdoor accessibility-broadening methods usually associated with nested classes in the Java language. The set is defined statically: One class serves as the nest host, its class file enumerating the other classes that are nest members; in turn, the nest members indicate in their class files which class hosts the nest. While static membership works well for class files generated from Java source code, it is usually insufficient for class files generated dynamically by language runtimes. To help such runtimes, and to encourage the use of Lookup::defineHiddenClass over Unsafe::defineAnonymousClass, a hidden class can join a nest at run time; a normal class cannot.

A hidden class can be created as a member of an existing nest by passing the NESTMATE option to Lookup::defineHiddenClass. The nest which the hidden class joins is not determined by an argument to Lookup::defineHiddenClass. Instead, the nest to be joined is inferred from the lookup class, that is, from the class whose code initially obtained the lookup object: The hidden class is a member of the same nest as the lookup class (see below).

In order for Lookup::defineHiddenClass to add hidden classes to the nest, the lookup object must have the proper permissions, namely PRIVATE and MODULE access. These permissions assert that the lookup object was obtained by the lookup class with the intent of allowing other code to expand the nest.

The JVM disallows nested nests. A member of one nest cannot serve as the host of another nest, regardless of whether nest membership is defined statically or dynamically.

The lookup class's membership of a nest may be indicated statically (via NestHost) if the lookup class is a normal class, or it may have been set dynamically if the lookup class is a hidden class. Static nest membership is validated lazily. It is important for a language runtime or framework library to be able to add hidden classes to the nest of a lookup class that may have a bad nest membership. As an example, consider the LambdaMetaFactory framework introduced in Java 8. When the source code of a class C contains a lambda expression, the corresponding C.class file uses LambdaMetaFactory at run time to define a hidden class that holds the body of the lambda expression and implements the required functional interface. C.class may have a bad NestHost attribute but the execution of C never references the class H named in the NestHost attribute. Since the lambda body may access private members of C, the hidden class needs to be able to access them too; accordingly, LambdaMetaFactory attempts to define the hidden class as a member of the nest hosted by C.

Suppose that we have a lookup class, C, and that defineHiddenClass is invoked with the NESTMATE option to create a hidden class and add it into a nest of C. The nest host of the hidden class is determined as follows:

If C is a normal class and lacks a NestHost attribute, then C is its own host and also is the nest host of the hidden class.
If C is a normal class with a valid NestHost attribute named H, then the nest host of C, H, is the nest host of the hidden class. In this case, the hidden class is added as a member of the nest of H.
If C is a normal class with a bad NestHost attribute, then C is used as the nest host of the hidden class.
If C is a hidden class created without NESTMATE option, then C is its own host and also is the nest host of the hidden class.
If C is a hidden class created with NESTMATE option and dynamically added to the nest of D, then the nest host of D is used as the nest host of the hidden class.

If a hidden class is created without the NESTMATE option then the hidden class is the host of its own nest. This aligns with the policy that every class is either a member of a nest with another class as nest host, or else is itself the nest host of a nest. The hidden class can create additional hidden classes as members of its nest: Code in the hidden class first obtains a lookup object on itself, then invokes Lookup::defineHiddenClass on the object and passes the NESTMATE option.

Given the Class object for a hidden class created as a member of a nest, Class::getNestHost and Class::isNestmateOf will work as expected. Class::getNestMembers can be called on the Class object of any class in the nest -- whether member or host, whether normal or hidden -- but returns only the members defined statically (that is, the normal classes enumerated by NestMembers in the host) along with the nest host.

Class::getNestMembers does not include the hidden classes added to the nest dynamically because hidden classes are non-discoverable and should only be of interest to the code that created them, which knows the nest membership already. This prevents a hidden class from leaking through the nest membership if intended to be kept private.

Unloading hidden classes

A class defined by a class loader has a strong relationship with that class loader. In particular, every Class object has a reference to the ClassLoader that defined it. This tells the JVM which loader to use when resolving symbols in the class. One consequence of this relationship is that a normal class cannot be unloaded unless its defining loader can be reclaimed by the garbage collector (JLS 12.7). Being able to reclaim the defining loader implies there are no live references to the loader, which in turn implies there are no live references to any of the classes defined by the loader. (Such classes, if they were reachable, would refer to the loader.) This widespread lack of liveness is the only state where it is safe to unload a normal class.

Accordingly, to maximize the chance of unloading a normal class, it is important to minimize references to both the class and its defining loader. Language runtimes typically achieve this by creating many class loaders, each dedicated to defining just one class, or perhaps a small number of related classes. When all instances of a class are reclaimed, and assuming the runtime does not hold on to the class loader, both the class and its defining loader can be reclaimed. However, the resulting large number of class loaders is demanding on memory. In addition, ClassLoader::defineClass is considerably slower than Unsafe::defineAnonymousClass according to microbenchmarks.

A hidden class is not created by a class loader and has only a loose connection to the class loader deemed to be its defining loader. We can turn these facts to our advantage by allowing a hidden class to be unloaded even if its notional defining loader cannot be reclaimed by the garbage collector. As long as there are live references to a hidden class -- either to instances of the hidden class, or to its Class object -- then the hidden class keeps its notional defining loader alive so that the JVM can use that loader to resolve symbols in the hidden class. When the last live reference to the hidden class goes away, however, the loader need not return the favor by keeping the hidden class alive.

Unloading a normal class while its defining loader is reachable is unsafe because the loader may later be asked, either by the JVM or or by code using reflection, to reload the class, that is, to load a class with the same name. This can have unpredictable effects when static initializers are run for a second time. There is no such concern about unloading a hidden class, since hidden classes are not created in the same manner. Because a hidden class's name is an output of Lookup::defineHiddenClass, not an input, there is no way to recreate the "same" hidden class that was unloaded previously.

By default, Lookup::defineHiddenClass will create a hidden class that can be unloaded regardless of whether its notional defining loader is still alive. That is, when all instances of the hidden class are reclaimed and the hidden class is no longer reachable, it may be unloaded even though its notional defining loader is still reachable. This behavior is useful when a language runtime creates a hidden class to serve multiple classes defined by arbitrary class loaders: The runtime will see an improvement in footprint and performance relative to both ClassLoader::defineClass and Unsafe::defineAnonymousClass. In other cases, a language runtime may link a hidden class to just one normal class, or perhaps a small number of normal classes, with the same defining loader as the hidden class. In such cases, where the hidden class must be coterminous with a normal class, the STRONG option may be passed to Lookup::defineHiddenClass. This arranges for a hidden class to have the same strong relationship with its notional defining loader as a normal class has with its defining loader, which is to say, the hidden class will only be unloaded if its notional defining loader can be reclaimed.

Alternatives

There is no alternative to injecting a nestmate at run time aside from the existing workaround of generating package-private access bridges for the proxy class to access private members of a target class. There is no alternative to hide a class from other classes if it is visible to a class loader.

Testing

We will update LambdaMetaFactory, StringConcatFactory, and LambdaForms to use the new APIs. Performance testing will ensure no regressions in lambda linkage or string concatenation.
Unit tests for the new APIs will be developed.

Risks and Assumptions

We assume that developers who currently use Unsafe::defineAnonymousClass will be able to migrate to Lookup::defineHiddenClass easily. Developers should be aware of three minor constraints on the functionality of hidden classes relative to VM-anonymous classes.

Protected access. Surprisingly, a VM-anonymous class can access protected members of its host class even if the VM-anonymous class exists in a different run-time package and is not a subclass of the host class. In contrast, access control rules are applied properly for hidden classes: A hidden class can only access protected members of another class if the hidden class is in the same run-time package as, or a subclass of, the other class. There is no special access for a hidden class to the protected members of the lookup class.
Constant-pool patching. A VM-anonymous class can be defined with its constant-pool entries already resolved to concrete values. This allows critical constants to be shared between a VM-anonymous class and the language runtime that defines it, and between multiple VM-anonymous classes. For example, a language runtime will often have MethodHandle objects in its address space that would be useful to newly-defined VM-anonymous classes. Instead of the runtime serializing the objects to constant-pool entries in VM-anonymous classes and then generating bytecode in those classes to laboriously ldc the entries, the runtime can simply supply Unsafe::defineAnonymousClass with references to its live objects. The relevant constant-pool entries in the newly-defined VM-anonymous class are pre-linked to those objects, improving performance and reducing footprint. In addition, this allows VM-anonymous classes to refer to each other: Constant-pool entries in a class file are based on names. They thus cannot refer to nameless VM-anonymous classes. A language runtime can, however, easily track the live Class objects for its VM-anonymous classes and supply them to Unsafe::defineAnonymousClass, thus pre-linking the new class's constant pool entries to other VM-anonymous classes. The Lookup::defineHiddenClass method will not have these capabilities because a future enhancement may offer pre-linking of constant pool entries to all classes uniformly.
Self-control of optimization. VM-anonymous classes were designed on the assumption that only JDK code would define them. Consequently, VM-anonymous classes have an unusual ability that was previously available only to classes in the JDK, namely control of their own optimization by the HotSpot JVM. Control is exerted via annotation attributes in a VM-anonymous class's defining bytes: @ForceInline or @DontInline causes HotSpot to always-inline or never-inline a method, while @Stable causes HotSpot to treat a non-null field as a foldable constant. However, very few of the VM-anonymous classes dynamically defined by JDK code have needed this ability. It is even possible that future enhancements will make these optimizations obsolete. Accordingly, hidden classes will not have the ability to control their optimization, even when defined by JDK code. (This is not thought to present any risk to the migration of JDK code from defining VM-anonymous classes to defining hidden classes.)

As a related matter, VM-anonymous classes can use the @Hidden annotation to prevent their methods from appearing in stack traces. Of course, this functionality is automatic for hidden classes, and may be offered to other classes in future.

Migration should take the following into account:

To invoke private nestmate instance methods from code in a hidden class, use invokevirtual or invokeinterface instead of invokespecial. Generated bytecode that uses invokespecial to invoke a private nestmate instance method will fail verification. invokespecial should only be used to invoke private nestmate constructors.
As noted earlier, invoking getName on the Class object of a hidden class returns a string that is not a binary name, since it contains a / character. User-level code is not expected to come into contact with such Class objects, but framework-level code that assumes every class has a binary name may need to be updated to handle hidden classes. Framework-level code that was previously updated to handle VM-anonymous classes will continue to work, since hidden classes use the same naming convention as VM-anonymous classes.
JVM TI GetClassSignature returns a JNI-style signature and returns a string that is not a binary name in internal form, such as a string containing a . character. JVM TI agents and tools that assume every class has a binary name may need to be updated to handle hidden classes. On the other hand, the JDI implementation has been updated to handle hidden classes. Hidden classes are not modifiable by JVM TI agents. The tools impacted by the class signature of hidden classes should be limited.

Dependencies

JEP 181 (Nest-Based Access Control) introduced nest-based access control contexts, in which all classes and interfaces in a nest share private access among the nestmates.