JEP 371: Hidden Classes
Owner | Mandy Chung |
Type | Feature |
Scope | SE |
Status | Closed / Delivered |
Release | 15 |
Component | core-libs / java.lang.invoke |
Discussion | valhalla dash dev at openjdk dot java dot net |
Effort | L |
Duration | L |
Reviewed by | Alex Buckley, David Holmes, John Rose, Maurizio Cimadamore, Paul Sandoz |
Endorsed by | John Rose |
Created | 2019/03/13 17:37 |
Updated | 2020/10/07 16:39 |
Issue | 8220607 |
Summary
Introduce hidden classes, which are classes that cannot be used directly by the bytecode of other classes. Hidden classes are intended for use by frameworks that generate classes at run time and use them indirectly, via reflection. A hidden class may be defined as a member of an access control nest, and may be unloaded independently of other classes.
Goals
-
Allow frameworks to define classes as non-discoverable implementation details of the framework, so that they cannot be linked against by other classes nor discovered through reflection.
-
Support extending an access control nest with non-discoverable classes.
-
Support aggressive unloading of non-discoverable classes, so that frameworks have the flexibility to define as many as they need.
-
Deprecate the non-standard API sun.misc.Unsafe::defineAnonymousClass, with the intent to deprecate it for removal in a future release.
-
Do not change the Java programming language in any way.
Non-Goals
- It is not a goal to support all the functionality of sun.misc.Unsafe::defineAnonymousClass, such as constant-pool patching.
Motivation
Many language implementations built on the JVM rely upon dynamic class generation for flexibility and efficiency. For example, in the case of the Java language, javac does not translate a lambda expression into a dedicated class file at compile time but, rather, emits bytecode that dynamically generates and instantiates a class to yield an object corresponding to the lambda expression when needed. Similarly, runtimes for non-Java languages often implement the higher-order features of those languages by using dynamic proxies, which also generate classes dynamically.
Language implementors usually intend for a dynamically generated class to be logically part of the implementation of a statically generated class. This intent suggests various properties that are desirable for dynamically generated classes:
-
Non-discoverability. Being independently discoverable by name is not only unnecessary but harmful. It undermines the goal that the dynamically generated class is merely an implementation detail of the statically generated class.
-
Access control. It may be desirable to extend the access control context of the statically generated class to include the dynamically generated class.
-
Lifecycle. Dynamically generated classes may only be needed for a limited time, so retaining them for the lifetime of the statically generated class might unnecessarily increase memory footprint. Existing workarounds for this situation, such as per-class class loaders, are cumbersome and inefficient.
Unfortunately, the standard APIs that define a class -- ClassLoader::defineClass and Lookup::defineClass -- are indifferent to whether the bytecodes of the class were generated dynamically (at run time) or statically (at compile time). These APIs always define a visible class that will be used every time another class in the same loader hierarchy tries to link a class of that name. Consequently, the class may be more discoverable or have a longer lifecycle than desired. In addition, the APIs can only define a class that will act as a member of a nest if the nest's host class knows the name of the member class in advance; practically speaking, this prevents dynamically generated classes from being members of a nest.
If a standard API could define hidden classes that are not discoverable and have a limited lifecycle, then frameworks both inside and outside of the JDK that generate classes dynamically could instead define hidden classes. This would improve the efficiency of all language implementations built on the JVM. For example:
-
java.lang.reflect.Proxy could define hidden classes to act as the proxy classes which implement proxy interfaces;
-
java.lang.invoke.StringConcatFactory could generate hidden classes to hold the constant-concatenation methods;
-
java.lang.invoke.LambdaMetafactory could generate hidden nestmate classes to hold lambda bodies that access enclosing variables; and
-
A JavaScript engine could generate hidden classes for the bytecode translated from JavaScript programs, knowing that the classes will be unloaded when the engine no longer uses them.
Description
The Lookup API introduced in Java 7 allows a class to obtain a lookup object that provides reflective access to classes, methods, and fields. Crucially, no matter what code ends up using a lookup object, the reflective access always occurs in the context of the class which originally obtained the lookup object -- the lookup class. In effect, a lookup object transmits the access rights of the lookup class to any code which receives the object.
Java 9 enhanced the transmission capabilities of lookup objects by introducing the method Lookup::defineClass(byte[]). From the bytes supplied, this method defines a new class in the same context as the class which originally obtained the lookup object. That is, the newly-defined class has the same defining class loader, run-time package, and protection domain as the lookup class.
This JEP proposes to extend the Lookup API to support defining a hidden class that can only be accessed by reflection. A hidden class is not discoverable by the JVM during bytecode linkage, nor by programs making explicit use of class loaders (via, e.g., Class::forName and ClassLoader::loadClass). A hidden class can be unloaded when it is no longer reachable, or it can share the lifetime of a class loader so that it is unloaded only when the class loader is garbage collected. Optionally, a hidden class can be created as a member of an access control nest.
For brevity, this JEP speaks of a "hidden class", but it should be understood to mean a hidden class or interface. Similarly, a "normal class" means a normal class or interface, the result of ClassLoader::defineClass.
Creating a hidden class
Whereas a normal class is created by invoking ClassLoader::defineClass, a hidden class is created by invoking Lookup::defineHiddenClass. This causes the JVM to derive a hidden class from the supplied bytes, link the hidden class, and return a lookup object that provides reflective access to the hidden class. The invoking program should store the lookup object carefully, for it is the only way to obtain the Class object of the hidden class.
The supplied bytes must be a ClassFile structure (JVMS 4.1). The derivation of a hidden class by Lookup::defineHiddenClass is similar to the derivation of a normal class by ClassLoader::defineClass, with one major difference discussed below. After the hidden class is derived, it is linked as for a normal class (JVMS 5.4), except that no loading constraints are imposed. After the hidden class is linked, it is initialized if the initialize argument of Lookup::defineHiddenClass is true; if the argument is false, then the hidden class will be initialized when reflective methods instantiate it or access its members.
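The creation steps above can be sketched as follows. This is a minimal illustration that assumes JDK 15 or later. The class name HiddenClassCreation, the helper names, and the generated class name Gen are hypothetical, and the class bytes are assembled by hand only so that the example needs no bytecode library; a real framework would normally produce them with a bytecode generator.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;

final class HiddenClassCreation {

    // Hand-assembles a minimal ClassFile structure (JVMS 4.1): a public class
    // that extends java.lang.Object and declares no fields or methods.
    static byte[] minimalClassBytes(String internalName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor_version
            out.writeShort(52);                                  // major_version (Java 8 format)
            out.writeShort(5);                                   // constant_pool_count (4 entries + 1)
            out.writeByte(1); out.writeUTF(internalName);        // #1 Utf8: name of this class
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8: name of the superclass
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeShort(0x0021);                              // access_flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // interfaces_count
            out.writeShort(0);                                   // fields_count
            out.writeShort(0);                                   // methods_count
            out.writeShort(0);                                   // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // The hidden class must be in the same run-time package as the lookup class.
    static String internalName(String simpleName) {
        String pkg = HiddenClassCreation.class.getPackageName();
        return pkg.isEmpty() ? simpleName : pkg.replace('.', '/') + "/" + simpleName;
    }

    // Defines a hidden class and returns its Class object, which is reachable
    // only through the Lookup returned by defineHiddenClass.
    static Class<?> defineHidden(boolean initialize) {
        try {
            Lookup caller = MethodHandles.lookup();
            Lookup hidden = caller.defineHiddenClass(minimalClassBytes(internalName("Gen")), initialize);
            return hidden.lookupClass();
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> c = defineHidden(true);
        System.out.println(c.getName() + " hidden=" + c.isHidden());
    }
}
```

The printed name ends in a suffix chosen by the JVM, so it differs from run to run.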
The major difference in how a hidden class is created lies in the name it is given. A hidden class is not anonymous. It has a name that is available via Class::getName and may be shown in diagnostics (such as the output of java -verbose:class), in JVM TI class loading events, in JFR events, and in stack traces. However, the name has a sufficiently unusual form that it effectively makes the class invisible to all other classes. The name is the concatenation of:
-
The binary name in internal form (JVMS 4.2.1) specified by this_class in the ClassFile structure, say A/B/C;
-
The '.' character; and
-
An unqualified name (JVMS 4.2.2) that is chosen by the JVM implementation.
For example, if this_class specifies com/example/Foo (the internal form of the binary name com.example.Foo), then a hidden class derived from the ClassFile structure may be named com/example/Foo.1234. This string is neither a binary name nor the internal form of a binary name.
Given a hidden class whose name is A/B/C.x, the result of Class::getName is the concatenation of:
- The binary name A.B.C (obtained by taking A/B/C and replacing each '/' with '.');
- The '/' character; and
- The unqualified name x.
For example, if a hidden class is named com/example/Foo.1234, then the result of Class::getName is com.example.Foo/1234. Again, this string is neither a binary name nor the internal form of a binary name.
The namespace of hidden classes is disjoint from the namespace of normal classes. Given a ClassFile structure where this_class specifies com/example/Foo/1234, invoking cl.defineClass("com.example.Foo.1234", bytes, ...) merely results in a normal class named com.example.Foo.1234, distinct from the hidden class named com.example.Foo/1234. It is impossible to create a normal class named com.example.Foo/1234 because cl.defineClass("com.example.Foo/1234", bytes, ...) will reject the string argument as being not a binary name.
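The two name forms described above can be observed with a short sketch, assuming JDK 15 or later. It relies on Class::descriptorString, which this JEP does not mention, to expose the internal-style form (A/B/C.x wrapped in L...;). The class name HiddenClassNaming, the helper names, and the generated name Gen are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;

final class HiddenClassNaming {

    // Minimal ClassFile (JVMS 4.1): public class extending Object, no members.
    static byte[] minimalClassBytes(String internalName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor_version
            out.writeShort(52);                                  // major_version
            out.writeShort(5);                                   // constant_pool_count
            out.writeByte(1); out.writeUTF(internalName);        // #1 Utf8
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // interfaces_count
            out.writeShort(0);                                   // fields_count
            out.writeShort(0);                                   // methods_count
            out.writeShort(0);                                   // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Internal form of a class name in the same package as this class.
    static String internalName(String simpleName) {
        String pkg = HiddenClassNaming.class.getPackageName();
        return pkg.isEmpty() ? simpleName : pkg.replace('.', '/') + "/" + simpleName;
    }

    // Returns { Class::getName, Class::descriptorString } for a fresh hidden class.
    static String[] names() {
        try {
            Class<?> hidden = MethodHandles.lookup()
                    .defineHiddenClass(minimalClassBytes(internalName("Gen")), false)
                    .lookupClass();
            return new String[] { hidden.getName(), hidden.descriptorString() };
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] n = names();
        System.out.println("getName:          " + n[0]); // binary name, '/', suffix
        System.out.println("descriptorString: " + n[1]); // 'L', internal name, '.', suffix, ';'
    }
}
```

Neither string is a binary name or a valid descriptor, which is why no constant pool entry in another class can name the hidden class.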
We acknowledge that not using binary names for the names of hidden classes is potentially a source of problems, but it is compatible with the longstanding practice of Unsafe::defineAnonymousClass. The use of / to indicate a hidden class in the Class::getName output is also aligned stylistically with the use of / in stack traces to qualify a class by its defining module and loader (see StackTraceElement::toString). The error log below reveals two hidden classes, both in module m1: one hidden class has a method test, the other has a method apply.
java.lang.Error: thrown from hidden class com.example.Foo/0x0000000800b7a470
    at m1/com.example.Foo/0x0000000800b7a470.toString(Foo.java:16)
    at m1/com.example.Foo_0x0000000800b7a470$$Lambda$29/0x0000000800b7c040.apply(<Unknown>:1000001)
    at m1/com.example.Foo/0x0000000800b7a470.test(Foo.java:11)
Hidden classes and class loaders
Despite the fact that a hidden class has a corresponding Class object, and the fact that a hidden class's supertypes are created by class loaders, no class loader is involved in the creation of the hidden class itself. Notice that this JEP never says that a hidden class is "loaded". No class loaders are recorded as initiating loaders of a hidden class, and no loading constraints are generated that involve hidden classes. Consequently, hidden classes are not known to any class loader: A symbolic reference in the run-time constant pool of a class D to a class C denoted by N will never resolve to a hidden class for any value of D, C, and N. The reflective methods Class::forName, ClassLoader::findLoadedClass, and Lookup::findClass will not find hidden classes.
Notwithstanding this detachment from class loaders, a hidden class is deemed to have a defining class loader. This is necessary to resolve types used by the hidden class's own fields and methods. In particular, a hidden class has the same defining class loader, run-time package, and protection domain as the lookup class, which is the class that originally obtained the lookup object on which Lookup::defineHiddenClass is invoked.
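A small sketch of these two properties, assuming JDK 15 or later: the hidden class cannot be found by name, yet it reports the lookup class's defining loader and package. HiddenClassVisibility, its helper methods, and the generated name Gen are hypothetical names chosen for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;

final class HiddenClassVisibility {

    // Minimal ClassFile (JVMS 4.1): public class extending Object, no members.
    static byte[] minimalClassBytes(String internalName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor_version
            out.writeShort(52);                                  // major_version
            out.writeShort(5);                                   // constant_pool_count
            out.writeByte(1); out.writeUTF(internalName);        // #1 Utf8
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // interfaces_count
            out.writeShort(0);                                   // fields_count
            out.writeShort(0);                                   // methods_count
            out.writeShort(0);                                   // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static Class<?> defineHidden() {
        String pkg = HiddenClassVisibility.class.getPackageName();
        String name = pkg.isEmpty() ? "Gen" : pkg.replace('.', '/') + "/Gen";
        try {
            return MethodHandles.lookup()
                    .defineHiddenClass(minimalClassBytes(name), false)
                    .lookupClass();
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Asks the class's own defining loader to find it by the name it reports.
    static boolean findableByName(Class<?> c) {
        try {
            Class.forName(c.getName(), false, c.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Class<?> hidden = defineHidden();
        System.out.println("findable by name: " + findableByName(hidden));
        System.out.println("same loader:      "
                + (hidden.getClassLoader() == HiddenClassVisibility.class.getClassLoader()));
        System.out.println("same package:     "
                + hidden.getPackageName().equals(HiddenClassVisibility.class.getPackageName()));
    }
}
```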
Using a hidden class
Lookup::defineHiddenClass returns a Lookup object whose lookup class is the newly created hidden class. A Class object can be obtained for the hidden class by invoking Lookup::lookupClass on the returned Lookup object. Via the Class object, the hidden class can be instantiated and its members accessed as if it were a normal class, except for four restrictions:
-
Class::getName returns a string that is not a binary name, as described earlier.
-
Class::getCanonicalName returns null, indicating the hidden class has no canonical name. (Note that the Class object for an anonymous class in the Java language has the same behavior.)
-
Final fields declared in a hidden class are not modifiable. Field::set and other setter methods on a final field of a hidden class will throw IllegalAccessException regardless of the field's accessible flag.
-
The Class object is not modifiable by instrumentation agents, and cannot be redefined or retransformed by JVM TI agents. We will, however, extend JVM TI and JDI to support hidden classes, such as testing whether a class is hidden, including hidden classes in any list of "loaded" classes, and sending JVM TI events when hidden classes are created.
It is important to realize that the only way for other classes to use a hidden class is indirectly, via its Class object. The hidden class cannot be used directly by bytecode instructions in other classes because it cannot be referenced nominally, that is, by name. For example, suppose a framework learns of a hidden class named com.example.Foo/1234, and manufactures a class file which attempts to instantiate the hidden class. Code in the class file would contain a new instruction that ultimately points to a constant pool entry which denotes the name. If the framework attempts to denote the name as com/example/Foo.1234, then the class file will be invalid -- com/example/Foo.1234 is not a valid internal form of a binary name. On the other hand, if the framework attempts to denote the name in the valid internal form com/example/Foo/1234, then the JVM would resolve the constant pool entry by first converting the name in internal form to a binary name, com.example.Foo.1234, and then trying to load a class of that name; this will most likely fail, and will certainly not find the hidden class named com.example.Foo/1234. The hidden class is not truly anonymous, since its name is exposed, but it is effectively invisible.
Without the ability of the constant pool to refer nominally to a hidden class, there is no way to use a hidden class as a superclass, field type, return type, or parameter type. This lack of usability is reminiscent of anonymous classes in the Java language, but hidden classes go further: An anonymous class can enclose other classes in order to let them access its members, but a hidden class cannot enclose other classes (their InnerClasses attributes cannot name it). Even a hidden class is unable to use itself as a field type, return type, or parameter type in its own field and method declarations.
Importantly, code in a hidden class can use the hidden class directly, without relying on the Class object. This is because bytecode instructions in a hidden class can refer to the hidden class symbolically (without concern for its name) rather than nominally. For example, a new instruction in a hidden class can instantiate the hidden class via a constant pool entry which refers directly to the this_class item in the current ClassFile. Other instructions, such as getstatic, getfield, putstatic, putfield, invokestatic, and invokevirtual, can access members of the hidden class via the same constant pool entry. Direct use inside the hidden class is important because it simplifies generation of hidden classes by language runtimes and frameworks.
A hidden class generally has the same powers of reflection as a normal class. That is, code in a hidden class may define normal classes and hidden classes, and may manipulate normal classes and hidden classes via their Class objects. A hidden class may even act as a lookup class. That is, code in a hidden class may obtain a lookup object on itself, which helps with hidden nestmates (see below).
Hidden classes in stack traces
Methods of hidden classes are not shown in stack traces by default. They represent implementation details of language runtimes, and are never expected to be useful to developers diagnosing application issues. However, they can be included in stack traces via the options -XX:+UnlockDiagnosticVMOptions -XX:+ShowHiddenFrames.
There are three APIs which reify stack traces: Throwable::getStackTrace, Thread::getStackTrace, and the newer StackWalker API introduced in Java 9. For the Throwable::getStackTrace and Thread::getStackTrace APIs, stack frames for hidden classes are omitted by default; they can be included with the same options as for stack traces above. For the StackWalker API, stack frames for hidden classes should be included by a JVM implementation only if the SHOW_HIDDEN_FRAMES option is set. This allows stack-trace filtering to omit unnecessary information when developers are diagnosing application issues.
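The StackWalker behavior can be sketched without generating any bytecode, under the assumption that the JDK in use (15 or later) implements lambda expressions with hidden classes, as the Testing section of this JEP plans for LambdaMetafactory. The class and method names below are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.LongSupplier;

final class HiddenFrames {

    // Counts the frames, as seen by the given walker, whose declaring class is hidden.
    // getDeclaringClass requires the RETAIN_CLASS_REFERENCE option.
    private static long countHiddenFrames(StackWalker walker) {
        Long count = walker.walk(frames ->
                frames.filter(frame -> frame.getDeclaringClass().isHidden()).count());
        return count;
    }

    // Performs the walk from inside a lambda so that the lambda's generated class
    // (assumed to be a hidden class) contributes a frame to the walked stack.
    static long hiddenFramesSeen(boolean showHidden) {
        Set<StackWalker.Option> options = EnumSet.of(StackWalker.Option.RETAIN_CLASS_REFERENCE);
        if (showHidden) {
            options.add(StackWalker.Option.SHOW_HIDDEN_FRAMES);
        }
        StackWalker walker = StackWalker.getInstance(options);
        LongSupplier viaLambda = () -> countHiddenFrames(walker);
        return viaLambda.getAsLong();
    }

    public static void main(String[] args) {
        System.out.println("default walker:     " + hiddenFramesSeen(false));
        System.out.println("SHOW_HIDDEN_FRAMES: " + hiddenFramesSeen(true));
    }
}
```

With the default options the walker filters out frames of hidden classes; with SHOW_HIDDEN_FRAMES the frame of the lambda's generated class becomes visible.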
Hidden classes in access control nests
Introduced in Java 11 by JEP 181, a nest is a set of classes that allow access to each other's private members but without any of the backdoor accessibility-broadening methods usually associated with nested classes in the Java language. The set is defined statically: One class serves as the nest host, its class file enumerating the other classes that are nest members; in turn, the nest members indicate in their class files which class hosts the nest. While static membership works well for class files generated from Java source code, it is usually insufficient for class files generated dynamically by language runtimes. To help such runtimes, and to encourage the use of Lookup::defineHiddenClass over Unsafe::defineAnonymousClass, a hidden class can join a nest at run time; a normal class cannot.
A hidden class can be created as a member of an existing nest by passing the NESTMATE option to Lookup::defineHiddenClass. The nest which the hidden class joins is not determined by an argument to Lookup::defineHiddenClass. Instead, the nest to be joined is inferred from the lookup class, that is, from the class whose code initially obtained the lookup object: The hidden class is a member of the same nest as the lookup class (see below).
In order for Lookup::defineHiddenClass to add hidden classes to the nest, the lookup object must have the proper permissions, namely PRIVATE and MODULE access. These permissions assert that the lookup object was obtained by the lookup class with the intent of allowing other code to expand the nest.
The JVM disallows nested nests. A member of one nest cannot serve as the host of another nest, regardless of whether nest membership is defined statically or dynamically.
The lookup class's membership of a nest may be indicated statically (via NestHost) if the lookup class is a normal class, or it may have been set dynamically if the lookup class is a hidden class. Static nest membership is validated lazily. It is important for a language runtime or framework library to be able to add hidden classes to the nest of a lookup class that may have a bad nest membership. As an example, consider the LambdaMetafactory framework introduced in Java 8. When the source code of a class C contains a lambda expression, the corresponding C.class file uses LambdaMetafactory at run time to define a hidden class that holds the body of the lambda expression and implements the required functional interface. C.class may have a bad NestHost attribute but the execution of C never references the class H named in the NestHost attribute. Since the lambda body may access private members of C, the hidden class needs to be able to access them too; accordingly, LambdaMetafactory attempts to define the hidden class as a member of the nest hosted by C.
Suppose that we have a lookup class, C, and that defineHiddenClass is invoked with the NESTMATE option to create a hidden class and add it to the nest of C. The nest host of the hidden class is determined as follows:
- If C is a normal class and lacks a NestHost attribute, then C is its own host and also is the nest host of the hidden class.
- If C is a normal class with a valid NestHost attribute naming H, then the nest host of C, H, is the nest host of the hidden class. In this case, the hidden class is added as a member of the nest of H.
- If C is a normal class with a bad NestHost attribute, then C is used as the nest host of the hidden class.
- If C is a hidden class created without the NESTMATE option, then C is its own host and also is the nest host of the hidden class.
- If C is a hidden class created with the NESTMATE option and dynamically added to the nest of D, then the nest host of D is used as the nest host of the hidden class.
If a hidden class is created without the NESTMATE option then the hidden class is the host of its own nest. This aligns with the policy that every class is either a member of a nest with another class as nest host, or else is itself the nest host of a nest. The hidden class can create additional hidden classes as members of its nest: Code in the hidden class first obtains a lookup object on itself, then invokes Lookup::defineHiddenClass on the object and passes the NESTMATE option.
Given the Class object for a hidden class created as a member of a nest, Class::getNestHost and Class::isNestmateOf will work as expected. Class::getNestMembers can be called on the Class object of any class in the nest -- whether member or host, whether normal or hidden -- but returns only the members defined statically (that is, the normal classes enumerated by NestMembers in the host) along with the nest host.
Class::getNestMembers does not include the hidden classes added to the nest dynamically because hidden classes are non-discoverable and should only be of interest to the code that created them, which knows the nest membership already. This prevents a hidden class from leaking through the nest membership if intended to be kept private.
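The nest rules above can be exercised with the following sketch, assuming JDK 15 or later. HiddenNestmates, its helpers, and the generated name Gen are hypothetical. The lookup object comes from MethodHandles.lookup(), so it carries the PRIVATE and MODULE access that the NESTMATE option requires.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup.ClassOption;
import java.util.Arrays;

final class HiddenNestmates {

    // Minimal ClassFile (JVMS 4.1): public class extending Object, no members.
    static byte[] minimalClassBytes(String internalName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor_version
            out.writeShort(52);                                  // major_version
            out.writeShort(5);                                   // constant_pool_count
            out.writeByte(1); out.writeUTF(internalName);        // #1 Utf8
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // interfaces_count
            out.writeShort(0);                                   // fields_count
            out.writeShort(0);                                   // methods_count
            out.writeShort(0);                                   // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Defines a hidden class with this class as the lookup class.
    static Class<?> defineHidden(ClassOption... options) {
        String pkg = HiddenNestmates.class.getPackageName();
        String name = pkg.isEmpty() ? "Gen" : pkg.replace('.', '/') + "/Gen";
        try {
            return MethodHandles.lookup()
                    .defineHiddenClass(minimalClassBytes(name), false, options)
                    .lookupClass();
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> nestmate = defineHidden(ClassOption.NESTMATE);
        Class<?> standalone = defineHidden();

        // The NESTMATE class shares the nest host of the lookup class.
        System.out.println(nestmate.getNestHost() == HiddenNestmates.class.getNestHost());
        System.out.println(nestmate.isNestmateOf(HiddenNestmates.class));
        // Dynamically added members are not reported by getNestMembers.
        System.out.println(Arrays.asList(HiddenNestmates.class.getNestMembers()).contains(nestmate));
        // Without NESTMATE, the hidden class hosts its own nest.
        System.out.println(standalone.getNestHost() == standalone);
    }
}
```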
Unloading hidden classes
A class defined by a class loader has a strong relationship with that class loader. In particular, every Class object has a reference to the ClassLoader that defined it. This tells the JVM which loader to use when resolving symbols in the class. One consequence of this relationship is that a normal class cannot be unloaded unless its defining loader can be reclaimed by the garbage collector (JLS 12.7). Being able to reclaim the defining loader implies there are no live references to the loader, which in turn implies there are no live references to any of the classes defined by the loader. (Such classes, if they were reachable, would refer to the loader.) This widespread lack of liveness is the only state where it is safe to unload a normal class.
Accordingly, to maximize the chance of unloading a normal class, it is important to minimize references to both the class and its defining loader. Language runtimes typically achieve this by creating many class loaders, each dedicated to defining just one class, or perhaps a small number of related classes. When all instances of a class are reclaimed, and assuming the runtime does not hold on to the class loader, both the class and its defining loader can be reclaimed. However, the resulting large number of class loaders is demanding on memory. In addition, ClassLoader::defineClass is considerably slower than Unsafe::defineAnonymousClass according to microbenchmarks.
A hidden class is not created by a class loader and has only a loose connection to the class loader deemed to be its defining loader. We can turn these facts to our advantage by allowing a hidden class to be unloaded even if its notional defining loader cannot be reclaimed by the garbage collector. As long as there are live references to a hidden class -- either to instances of the hidden class, or to its Class object -- then the hidden class keeps its notional defining loader alive so that the JVM can use that loader to resolve symbols in the hidden class. When the last live reference to the hidden class goes away, however, the loader need not return the favor by keeping the hidden class alive.
Unloading a normal class while its defining loader is reachable is unsafe because the loader may later be asked, either by the JVM or by code using reflection, to reload the class, that is, to load a class with the same name. This can have unpredictable effects when static initializers are run for a second time. There is no such concern about unloading a hidden class, since hidden classes are not created in the same manner. Because a hidden class's name is an output of Lookup::defineHiddenClass, not an input, there is no way to recreate the "same" hidden class that was unloaded previously.
By default, Lookup::defineHiddenClass will create a hidden class that can be unloaded regardless of whether its notional defining loader is still alive. That is, when all instances of the hidden class are reclaimed and the hidden class is no longer reachable, it may be unloaded even though its notional defining loader is still reachable. This behavior is useful when a language runtime creates a hidden class to serve multiple classes defined by arbitrary class loaders: The runtime will see an improvement in footprint and performance relative to both ClassLoader::defineClass and Unsafe::defineAnonymousClass. In other cases, a language runtime may link a hidden class to just one normal class, or perhaps a small number of normal classes, with the same defining loader as the hidden class. In such cases, where the hidden class must be coterminous with a normal class, the STRONG option may be passed to Lookup::defineHiddenClass. This arranges for a hidden class to have the same strong relationship with its notional defining loader as a normal class has with its defining loader, which is to say, the hidden class will only be unloaded if its notional defining loader can be reclaimed.
Alternatives
There is no alternative to injecting a nestmate at run time aside from the existing workaround of generating package-private access bridges for the proxy class to access private members of a target class. There is no alternative way to hide a class from other classes if it is visible to a class loader.
Testing
-
We will update LambdaMetafactory, StringConcatFactory, and LambdaForms to use the new APIs. Performance testing will ensure no regressions in lambda linkage or string concatenation.
-
Unit tests for the new APIs will be developed.
Risks and Assumptions
We assume that developers who currently use Unsafe::defineAnonymousClass will be able to migrate to Lookup::defineHiddenClass easily. Developers should be aware of three minor constraints on the functionality of hidden classes relative to VM-anonymous classes.
-
Protected access. Surprisingly, a VM-anonymous class can access protected members of its host class even if the VM-anonymous class exists in a different run-time package and is not a subclass of the host class. In contrast, access control rules are applied properly for hidden classes: A hidden class can only access protected members of another class if the hidden class is in the same run-time package as, or a subclass of, the other class. There is no special access for a hidden class to the protected members of the lookup class.
-
Constant-pool patching. A VM-anonymous class can be defined with its constant-pool entries already resolved to concrete values. This allows critical constants to be shared between a VM-anonymous class and the language runtime that defines it, and between multiple VM-anonymous classes. For example, a language runtime will often have MethodHandle objects in its address space that would be useful to newly-defined VM-anonymous classes. Instead of the runtime serializing the objects to constant-pool entries in VM-anonymous classes and then generating bytecode in those classes to laboriously ldc the entries, the runtime can simply supply Unsafe::defineAnonymousClass with references to its live objects. The relevant constant-pool entries in the newly-defined VM-anonymous class are pre-linked to those objects, improving performance and reducing footprint. In addition, this allows VM-anonymous classes to refer to each other: Constant-pool entries in a class file are based on names. They thus cannot refer to nameless VM-anonymous classes. A language runtime can, however, easily track the live Class objects for its VM-anonymous classes and supply them to Unsafe::defineAnonymousClass, thus pre-linking the new class's constant pool entries to other VM-anonymous classes. The Lookup::defineHiddenClass method will not have these capabilities because a future enhancement may offer pre-linking of constant pool entries to all classes uniformly.
-
Self-control of optimization. VM-anonymous classes were designed on the assumption that only JDK code would define them. Consequently, VM-anonymous classes have an unusual ability that was previously available only to classes in the JDK, namely control of their own optimization by the HotSpot JVM. Control is exerted via annotation attributes in a VM-anonymous class's defining bytes: @ForceInline or @DontInline causes HotSpot to always-inline or never-inline a method, while @Stable causes HotSpot to treat a non-null field as a foldable constant. However, very few of the VM-anonymous classes dynamically defined by JDK code have needed this ability. It is even possible that future enhancements will make these optimizations obsolete. Accordingly, hidden classes will not have the ability to control their optimization, even when defined by JDK code. (This is not thought to present any risk to the migration of JDK code from defining VM-anonymous classes to defining hidden classes.)
As a related matter, VM-anonymous classes can use the @Hidden annotation to prevent their methods from appearing in stack traces. Of course, this functionality is automatic for hidden classes, and may be offered to other classes in future.
Migration should take the following into account:
-
To invoke private nestmate instance methods from code in a hidden class, use invokevirtual or invokeinterface instead of invokespecial. Generated bytecode that uses invokespecial to invoke a private nestmate instance method will fail verification. invokespecial should only be used to invoke private nestmate constructors.
-
As noted earlier, invoking getName on the Class object of a hidden class returns a string that is not a binary name, since it contains a / character. User-level code is not expected to come into contact with such Class objects, but framework-level code that assumes every class has a binary name may need to be updated to handle hidden classes. Framework-level code that was previously updated to handle VM-anonymous classes will continue to work, since hidden classes use the same naming convention as VM-anonymous classes.
-
JVM TI GetClassSignature returns a JNI-style signature; for a hidden class, the signature holds a string that is not a binary name in internal form, because it contains a . character. JVM TI agents and tools that assume every class has a binary name may need to be updated to handle hidden classes. On the other hand, the JDI implementation has been updated to handle hidden classes. Hidden classes are not modifiable by JVM TI agents. The tools impacted by the class signature of hidden classes should be limited.
Dependencies
JEP 181 (Nest-Based Access Control) introduced nest-based access control contexts, in which all classes and interfaces in a nest share private access among the nestmates.