JEP 402: Classes for the Basic Primitives (Preview)
Owner | Dan Smith |
Type | Feature |
Scope | SE |
Status | Candidate |
Discussion | valhalla dash dev at openjdk dot java dot net |
Effort | XL |
Duration | L |
Reviewed by | Brian Goetz |
Created | 2021/01/13 22:40 |
Updated | 2022/04/29 03:08 |
Issue | 8259731 |
Summary
Repurpose the primitive wrapper classes to act as declarations for the basic primitives (int, double, etc.), unifying the treatment of these types with that of other types declared by primitive classes.
This is a preview language and VM feature.
Goals
- Migrate the eight wrapper classes (java.lang.Integer, java.lang.Double, etc.) to be primitive classes.
- In the Java programming language, treat basic primitive values as instances of these classes. Support type keywords like int and double as the way to refer to the declared primitive types. Support method invocation, value object conversion, and array covariance on these types.
- In the Java virtual machine, treat the basic primitive array types as equivalent to the corresponding Q array types.
- In the core reflection API, change the behavior of the eight Class objects representing the basic primitive types (int.class, double.class, etc.) to more closely resemble that of other primitive class types.
Non-Goals
- The core functionality of primitive classes is introduced by JEP 401. This JEP is only concerned with applying those features to the eight basic primitive types.
- This JEP does not address the interaction of primitive types, including int, double, etc., with Java's generics. Separate JEPs will address the need for primitive types as type arguments, and eventually optimize the performance of these parameterizations.
- This JEP does not propose any new kinds of numeric primitives, or any new capabilities for Java's unary and binary operators.
Motivation
Java's classes and interfaces provide an expressive mechanism to model data and associated operations. But the basic primitive types of the language—booleans, integers, and floating-point numbers—do not make use of this mechanism. Instead, they support a predetermined set of operations and conversions, and cannot otherwise interoperate with other types.
As a workaround, the standard library provides wrapper classes, instances of which store a single primitive value and present it as an object. In Java 5, implicit boxing and unboxing conversions were introduced, transparently converting the basic primitive values to wrapper class instances, and vice versa, as required by the program.
But the wrapper class workaround is imperfect. It doesn't entirely hide the effects of conversions: boxing the same value twice, for example, may yield two objects that are not == to each other. More importantly, in many applications wrapping primitive values in objects has significant runtime costs, and developers must weigh those costs against the benefit of greater expressiveness.
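The identity problem can be observed on current JDK releases. The sketch below is illustrative only: the names BoxingIdentityDemo, sameReference, and sameValue are hypothetical and not part of this JEP. It boxes the same int twice and compares the results by reference and by value. The Java Language Specification only guarantees shared boxes for int values from -128 to 127, so the result of the reference comparison for larger values is printed but no particular outcome is claimed.

```java
class BoxingIdentityDemo {
    // Boxes the same int twice and compares the two boxes by reference.
    static boolean sameReference(int value) {
        Integer first = value;   // boxing conversion
        Integer second = value;  // boxed again, independently
        return first == second;  // reference comparison, not numeric comparison
    }

    // Boxes the same int twice and compares the two boxes by value.
    static boolean sameValue(int value) {
        Integer first = value;
        Integer second = value;
        return first.equals(second);
    }

    public static void main(String[] args) {
        // Guaranteed to share a box: 127 is inside the -128..127 range.
        System.out.println("127 same reference:    " + sameReference(127));
        // Not guaranteed either way: depends on the runtime's box cache.
        System.out.println("100000 same reference: " + sameReference(100_000));
        // Value comparison is always reliable.
        System.out.println("100000 same value:     " + sameValue(100_000));
    }
}
```

Code that needs a dependable answer today has to use equals or compare the unboxed values.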
The primitive classes feature, introduced by JEP 401, eliminates most of the overhead of modeling primitive values with classes. As a result, it is now practical to treat the basic primitives as class types, gaining all the capabilities of classes and delegating many details of these types to the standard library.
These new primitive classes will be unique in some ways—for example, the primitive type introduced by the class is named with a keyword. But, in most ways, we can treat a primitive class that models a basic primitive type just like any other primitive class.
A lot of existing code assumes that an Object modeling a basic primitive value will belong to a wrapper class. Since there is no longer any need to wrap basic primitive values, we can minimize disruption by repurposing the wrapper classes to treat int values as instances of java.lang.Integer, double values as instances of java.lang.Double, etc.
Description
The features described below are preview features, enabled with the --enable-preview compile-time and runtime flags.
Basic primitive classes
The eight basic primitive classes are the following:
java.lang.Boolean
java.lang.Character
java.lang.Byte
java.lang.Short
java.lang.Integer
java.lang.Long
java.lang.Float
java.lang.Double
The compiler and bootstrap class loader use special logic to locate these class files; when preview features are enabled, modified versions of these classes that are declared primitive are located.
The public constructors of these classes were deprecated for removal in Java 16 by JEP 390. To avoid subtle binary compatibility issues involving identity and primitive class constructors being compiled differently, the constructors in the modified classes are private.
Java language model
Unlike other primitive classes, the primitive type of a basic primitive class is expressed with one of the eight type keywords: boolean, char, byte, short, int, long, float, or double. The name of the class (Boolean, Character, etc.) instead refers to the class's reference type. (To do: do we support/encourage int.ref syntax as well?)
Also unlike other primitive classes, a basic primitive class may declare an instance field of its own primitive type. (For example, the Integer class has a field of type int.)
Java supports a number of conversions between different basic primitive types, such as int to double; those behaviors are unchanged. For clarity, we now call them widening numeric conversions and narrowing numeric conversions. There are no similar conversions between reference types, such as Integer to Double.
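The distinction between numeric conversions on the keyword types and the absence of such conversions on the reference types can be seen in code that compiles on current JDK releases. This is a minimal sketch, not part of the JEP; the names NumericConversionDemo, widen, narrow, and toDouble are hypothetical.

```java
class NumericConversionDemo {
    // Widening numeric conversion: int to double happens implicitly.
    static double widen(int i) {
        return i;
    }

    // Narrowing numeric conversion: double to int needs an explicit cast
    // and discards the fractional part (rounding toward zero).
    static int narrow(double d) {
        return (int) d;
    }

    // There is no Integer-to-Double conversion between the reference types:
    //     Double d = boxed;   // does not compile
    // The value has to pass through the primitive types instead.
    static Double toDouble(Integer boxed) {
        double widened = boxed;  // convert to int, then widen to double
        return widened;          // convert the double to a Double
    }

    public static void main(String[] args) {
        System.out.println(widen(7));
        System.out.println(narrow(3.99));
        System.out.println(toDouble(42));
    }
}
```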
The boxing and unboxing conversions are superseded by primitive classes' value object and primitive value conversions. The supported types are the same, but the runtime behavior is more efficient.
Java provides a number of unary and binary operators for manipulating basic primitive values (e.g., 23*12, !true). The rules and behaviors of these operators are unchanged.
Because the basic primitive types are class types, they now have methods. Code such as 23.compareTo(42) is legal. (To do: does this introduce any parsing problems? And do the behaviors of equals and compareTo make sense?)
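For comparison, on JDK releases without this preview feature the same method can only be reached through the wrapper class. The following sketch shows the two forms available today; the names PrimitiveMethodDemo, compareViaWrapper, and compareViaStatic are hypothetical, and the direct form appears only in a comment because it does not compile without the preview.

```java
class PrimitiveMethodDemo {
    // Today: obtain an Integer first, then invoke the instance method.
    // With the preview feature, the text above suggests a.compareTo(b)
    // could be written directly on the int value.
    static int compareViaWrapper(int a, int b) {
        return Integer.valueOf(a).compareTo(b);
    }

    // Today's usual alternative: a static helper on the wrapper class.
    static int compareViaStatic(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        System.out.println(compareViaWrapper(23, 42) < 0);  // 23 orders before 42
        System.out.println(compareViaStatic(42, 23) > 0);   // 42 orders after 23
    }
}
```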
As with other primitive types, arrays of basic primitive types are covariant: An int[] can now be treated as an Integer[], Number[], etc.
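To make the change concrete, here is how the same situation looks on JDK releases without this preview feature, where an int[] is unrelated to Integer[] and Number[] and must be copied element by element. This is a sketch of current behavior only; the names ArrayCovarianceDemo, isNumberArray, and toReferenceArray are hypothetical.

```java
class ArrayCovarianceDemo {
    // Runtime check: is the argument an array of Number (or of a subclass)?
    static boolean isNumberArray(Object array) {
        return array instanceof Number[];
    }

    // The element-by-element copy needed today to view int values as Integers.
    static Integer[] toReferenceArray(int[] values) {
        Integer[] copy = new Integer[values.length];
        for (int i = 0; i < values.length; i++) {
            copy[i] = values[i];  // each element is converted individually
        }
        return copy;
    }

    public static void main(String[] args) {
        int[] ints = {1, 2, 3};
        // Without the preview feature an int[] is not a Number[].
        System.out.println("int[] is a Number[]:     " + isNumberArray(ints));
        // Integer[] is already covariant with Number[].
        System.out.println("Integer[] is a Number[]: " + isNumberArray(toReferenceArray(ints)));
    }
}
```

Under the covariance rule described above, the copy would no longer be needed for this purpose.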
Compilation and run time
The JVM treats the basic primitive types as distinct from primitive class types: The type D represents 64-bit floating-point values that span two stack slots and support a full suite of dedicated opcodes (dload, dstore, dadd, dcmpg, etc.), while the type Qjava/lang/Double; represents primitive values of class Double that span a single stack slot and respond to the reference type opcodes (aload, astore, invokevirtual, etc.).
A Java compiler is responsible for adapting between the two types as needed, via methods such as Double.valueOf and Double.doubleValue (or some other mechanism TBD?). The resulting bytecode will look similar to boxing and unboxing code, but the runtime overhead is greatly reduced.
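The two adapter methods named above already exist, so the shape of the adaptation can be written out by hand on a current JDK. This is only a sketch of the source-level equivalent, assuming the compiler keeps using these methods (the text leaves the mechanism open); the names AdaptationDemo, toReference, and toPrimitive are hypothetical.

```java
class AdaptationDemo {
    // Adapt from the JVM type D to the class type, as a compiler might.
    static Double toReference(double d) {
        return Double.valueOf(d);
    }

    // Adapt from the class type back to the JVM type D.
    static double toPrimitive(Double ref) {
        return ref.doubleValue();
    }

    public static void main(String[] args) {
        Double asReference = toReference(1.5);
        double roundTripped = toPrimitive(asReference);
        System.out.println(roundTripped == 1.5);
    }
}
```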
Compiler adaptations are not sufficient for basic primitive arrays. For example, an array of type [D created with newarray may be passed to a method expecting a [Ljava/lang/Double;, and an array of type [Qjava/lang/Double; created with anewarray may be cast to type [D. To support this behavior, the JVM treats the types [D and [Qjava/lang/Double; as compatible with each other, and supports both families of opcodes on their values (daload and aaload, dastore and aastore), regardless of how the arrays were created.
For consistency, basic primitive value types appearing in field types and method signatures are always translated to basic primitive JVM types (D, not Qjava/lang/Double;). To reduce complexity for consumers of class files, we might consider it illegal for any bytecode (whether generated by javac or some other tool) to mention the Q type of a basic primitive class in a descriptor.
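The descriptor strings mentioned in this section can be inspected with existing APIs. The sketch below assumes Java 12 or later for Class.descriptorString and uses hypothetical names (DescriptorDemo, fieldDescriptor, methodDescriptor). It shows only the D and L forms that released JDKs produce; the Q form belongs to the preview feature and is not shown.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Descriptor of a single type, as it would appear for a field.
    static String fieldDescriptor(Class<?> type) {
        return type.descriptorString();
    }

    // Descriptor of a method with the given return and parameter types.
    static String methodDescriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(fieldDescriptor(double.class));    // basic primitive JVM type
        System.out.println(fieldDescriptor(Double.class));    // reference type
        System.out.println(fieldDescriptor(double[].class));  // basic primitive array type
        System.out.println(methodDescriptor(double.class, Double.class));
    }
}
```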
Core reflection
There are two Class objects that developers may encounter for each basic primitive class. In the case of class double, these are:
- Double.class, corresponding to the JVM descriptor type Ljava/lang/Double;. Returns false from isPrimitive(). Behaves like a standard Class object modeling a primitive class, except that it represents a reference type rather than a primitive type.
- double.class, corresponding to the JVM descriptor type D. Returns true from isPrimitive(). Aligning with the language model, it behaves in most other ways like a "secondary" Class object for a primitive class, except that it represents a primitive type. In particular, it reflects the methods and supertypes of the Double class declaration.
The getClass method of a basic primitive class instance returns a Class object of the first kind: Double.class, Integer.class, etc. As with all primitive objects, the method's result is the same whether invoked via the value type ((23.0).getClass()) or the reference type (((Double)23.0).getClass()).
The JVM type Qjava/lang/Double; cannot be encoded with a Class object.
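Parts of this reflection behavior can be checked against a current JDK, which also shows where the proposal differs. The sketch below uses hypothetical names (ReflectionDemo, runtimeClassOf, publicMethodCount). On released JDKs double.class reports no methods at all, whereas the text above proposes that it reflect the methods of the Double class declaration.

```java
class ReflectionDemo {
    // getClass() on a double value viewed as an Object.
    static Class<?> runtimeClassOf(Object value) {
        return value.getClass();
    }

    // Number of public methods reported by a Class object.
    static int publicMethodCount(Class<?> type) {
        return type.getMethods().length;
    }

    public static void main(String[] args) {
        System.out.println("double.class.isPrimitive(): " + double.class.isPrimitive());
        System.out.println("Double.class.isPrimitive(): " + Double.class.isPrimitive());
        System.out.println("getClass of 23.0:           " + runtimeClassOf(23.0));
        // Without the preview feature, a primitive Class object reflects no methods.
        System.out.println("methods on double.class:    " + publicMethodCount(double.class));
    }
}
```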
Alternatives
The language could be left unchanged, continuing to fully specify the basic primitives without relying on a class declaration. But it will be useful to eliminate the rift between basic primitives and developer-declared primitives, especially as Java's generics are enhanced to work with primitive class types.
The wrappers could be left behind as legacy API. But assumptions about boxing behavior run deep in some code, and a new set of classes would break those programs.
The JVM could follow the Java language in fully unifying its basic primitive types (I, D, etc.) with its primitive class types (Qjava/lang/Integer;, Qjava/lang/Double;, etc.). But this would be an expensive change for little ultimate benefit. For example, there would have to be a way to reconcile the two-slot size of type D with the single-slot size of type Qjava/lang/Double;, perhaps requiring a disruptive versioned change to the class file format.
Risks and Assumptions
Removing the wrapper class constructors breaks binary compatibility for a significant subset of legacy Java programs. There are also behavioral changes associated with migration to primitive classes. JEP 390, along with some expected followup efforts, mitigates these concerns. But some programs that invoke the constructors or rely on boxed object identity will break.
Changes in reflection behavior, due to the new status of basic primitive types as class types, may cause problems for some programs.
Dependencies
JEP 401, Primitive Classes, is a prerequisite.
In anticipation of this feature we already added warnings about potential incompatible changes to primitive class candidates to javac and HotSpot, via JEP 390. Some followup work will come in additional JEPs.
We anticipate modifying the generics model in Java to make type parameters universal—instantiable by all types, both reference and value. This will be pursued in a separate JEP.