JEP 402: Classes for the Basic Primitives (Preview)

Owner: Dan Smith
Type: Feature
Scope: SE
Status: Candidate
Discussion: valhalla dash dev at openjdk dot java dot net
Effort: XL
Duration: L
Reviewed by: Brian Goetz
Created: 2021/01/13 22:40
Updated: 2022/04/29 03:08
Issue: 8259731

Summary

Repurpose the primitive wrapper classes to act as declarations for the basic primitives (int, double, etc.), unifying the treatment of these types with that of other types declared by primitive classes. This is a preview language and VM feature.

Goals

Non-Goals

Motivation

Java's classes and interfaces provide an expressive mechanism to model data and associated operations. But the basic primitive types of the language—booleans, integers, and floating-point numbers—do not make use of this mechanism. Instead, they support a predetermined set of operations and conversions, and cannot otherwise interoperate with other types.

As a workaround, the standard library provides wrapper classes, instances of which store a single primitive value and present it as an object. In Java 5, implicit boxing and unboxing conversions were introduced, transparently converting the basic primitive values to wrapper class instances, and vice versa, as required by the program.

But the wrapper class workaround is imperfect. It doesn't entirely hide the effects of conversions—boxing the same value twice, for example, may yield two objects that are not == to each other. More importantly, in many applications wrapping primitive values in objects has significant runtime costs, and developers must weigh those costs against the benefit of greater expressiveness.
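The identity problem can be observed on any current JDK. The following is a minimal sketch, not part of the proposal; the class and method names are hypothetical. It boxes the same int twice and compares the results by value and by identity. Identity is only guaranteed for small values that the language specification requires to be cached (such as 127); for other values the outcome depends on the implementation, so the sketch makes no claim about it.

```java
// Illustrative sketch for a JDK without this JEP's preview features.
// Class and method names are hypothetical.
final class BoxingIdentityDemo {

    // Boxes the same int twice and compares the boxes by value.
    static boolean equalByValue(int v) {
        Integer first = v;   // implicit boxing conversion
        Integer second = v;  // a second, independent boxing conversion
        return first.equals(second);
    }

    // Boxes the same int twice and compares the boxes by identity (==).
    static boolean sameObject(int v) {
        Integer first = v;
        Integer second = v;
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println("1000 equal by value: " + equalByValue(1000));
        System.out.println("127 same object:     " + sameObject(127));
        // Not guaranteed either way; depends on the box cache of the runtime.
        System.out.println("1000 same object:    " + sameObject(1000));
    }
}
```

Code that compares wrapper instances with == is therefore relying on behavior that is not uniform across values.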

The primitive classes feature, introduced by JEP 401, eliminates most of the overhead of modeling primitive values with classes. As a result, it is now practical to treat the basic primitives as class types, gaining all the capabilities of classes and delegating many details of these types to the standard library.

These new primitive classes will be unique in some ways—for example, the primitive type introduced by the class is named with a keyword. But, in most ways, we can treat a primitive class that models a basic primitive type just like any other primitive class.

A lot of existing code assumes that an Object modeling a basic primitive value will belong to a wrapper class. Since there is no longer any need to wrap basic primitive values, we can minimize disruption by repurposing the wrapper classes to treat int values as instances of java.lang.Integer, double values as instances of java.lang.Double, etc.

Description

The features described below are preview features, enabled with the --enable-preview compile-time and runtime flags.

Basic primitive classes

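The eight basic primitive classes are java.lang.Boolean, java.lang.Character, java.lang.Byte, java.lang.Short, java.lang.Integer, java.lang.Long, java.lang.Float, and java.lang.Double.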

The compiler and bootstrap class loader use special logic to locate these class files; when preview features are enabled, they locate modified versions of these classes that are declared primitive.

The public constructors of these classes were deprecated for removal in Java 16 by JEP 390. To avoid subtle binary compatibility issues involving identity and primitive class constructors being compiled differently, the constructors in the modified classes are private.

Java language model

Unlike other primitive classes, the primitive type of a basic primitive class is expressed with one of the eight type keywords—boolean, char, byte, short, int, long, float, or double. The name of the class—Boolean, Character, etc.—instead refers to the class's reference type. (To do: do we support/encourage int.ref syntax as well?)

Also unlike other primitive classes, a basic primitive class may declare an instance field of its own primitive type. (For example, the Integer class has a field of type int.)

Java supports a number of conversions between different basic primitive types, such as int to double; those behaviors are unchanged. For clarity, we now call them widening numeric conversions and narrowing numeric conversions. There are no similar conversions between reference types, such as Integer to Double.
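As a sketch of the distinction, the following code runs on a current JDK and uses hypothetical class and method names. It shows a widening conversion, a narrowing conversion, and the absence of any conversion from Integer to Double. Because a direct assignment from Integer to Double does not compile, the failed conversion is demonstrated with a cast through Object.

```java
// Illustrative sketch; class and method names are hypothetical.
final class NumericConversionDemo {

    // Widening numeric conversion: int to double, no cast required.
    static double widen(int i) {
        return i;
    }

    // Narrowing numeric conversion: double to int, explicit cast,
    // fractional part discarded (rounds toward zero).
    static int narrow(double d) {
        return (int) d;
    }

    // There is no Integer-to-Double conversion: the cast fails at run time.
    static boolean integerCastsToDouble() {
        Object boxed = Integer.valueOf(7);
        try {
            Double d = (Double) boxed;
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    // The supported route goes through the primitive value.
    static double viaPrimitive(Integer i) {
        return i.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(widen(3));
        System.out.println(narrow(3.9));
        System.out.println(integerCastsToDouble());
        System.out.println(viaPrimitive(7));
    }
}
```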

The boxing and unboxing conversions are superseded by primitive classes' value object and primitive value conversions. The supported types are the same, but the runtime behavior is more efficient.

Java provides a number of unary and binary operators for manipulating basic primitive values (e.g., 23*12, !true). The rules and behaviors of these operators are unchanged.

Because the basic primitive types are class types, they now have methods. Code such as 23.compareTo(42) is legal. (To do: does this introduce any parsing problems? And do the behaviors of equals and compareTo make sense?)
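For comparison, this is roughly how the same comparison has to be written on a JDK without the preview features, where an int literal cannot be the receiver of a method call. The sketch uses hypothetical class and method names and is not part of the proposal; it only shows the existing APIs that an expression like 23.compareTo(42) would presumably correspond to.

```java
// Illustrative sketch for a JDK without this JEP's preview features.
// Class and method names are hypothetical.
final class PrimitiveMethodDemo {

    // Explicitly obtain an Integer, then call the instance method.
    static int compareViaInstance(int a, int b) {
        return Integer.valueOf(a).compareTo(b); // b is boxed implicitly
    }

    // Static helper that avoids creating wrapper instances.
    static int compareViaStatic(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        System.out.println(Integer.signum(compareViaInstance(23, 42)));
        System.out.println(Integer.signum(compareViaStatic(23, 42)));
    }
}
```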

As with other primitive types, arrays of basic primitive types are covariant: An int[] can now be treated as an Integer[], Number[], etc.
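The following sketch, with hypothetical class and method names, shows the baseline on a JDK without the preview features: arrays of reference types are already covariant, while an int[] is not an instance of Number[]. Under the proposal described above, the int[] check would be expected to succeed as well; that outcome cannot be demonstrated on a released JDK and is not asserted here.

```java
// Illustrative sketch for a JDK without this JEP's preview features.
// Class and method names are hypothetical.
final class ArrayCovarianceDemo {

    // Run-time check of whether an arbitrary object is usable as a Number[].
    static boolean isNumberArray(Object array) {
        return array instanceof Number[];
    }

    // Existing covariance between reference array types: no cast needed.
    static Number[] asNumbers(Integer[] values) {
        return values;
    }

    public static void main(String[] args) {
        System.out.println(isNumberArray(new Integer[] {1, 2}));
        System.out.println(isNumberArray(new int[] {1, 2}));
        System.out.println(asNumbers(new Integer[] {1, 2}).length);
    }
}
```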

Compilation and run time

The JVM treats the basic primitive types as distinct from primitive class types: The type D represents 64-bit floating-point values that span two stack slots and support a full suite of dedicated opcodes (dload, dstore, dadd, dcmpg, etc.), while the type Qjava/lang/Double; represents primitive values of class Double that span a single stack slot and respond to the reference type opcodes (aload, astore, invokevirtual, etc.).

A Java compiler is responsible for adapting between the two types as needed, via methods such as Double.valueOf and Double.doubleValue (or some other mechanism TBD?). The resulting bytecode will look similar to boxing and unboxing code, but the runtime overhead is greatly reduced.
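To make the adaptation concrete, the sketch below writes out by hand the Double.valueOf and Double.doubleValue calls mentioned above, as they would appear around a collection of Double references. It runs on a current JDK; the class and method names are hypothetical, and the mechanism ultimately chosen for the feature may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; class and method names are hypothetical.
final class DoubleAdaptationDemo {

    // Adapts each double (JVM type D) to a Double reference.
    static List<Double> toReferences(double... values) {
        List<Double> out = new ArrayList<>();
        for (double v : values) {
            out.add(Double.valueOf(v)); // explicit form of the adaptation
        }
        return out;
    }

    // Adapts each Double reference back to a double before adding.
    static double sum(List<Double> values) {
        double total = 0.0;
        for (Double v : values) {
            total += v.doubleValue(); // explicit form of the adaptation
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(toReferences(1.5, 2.5)));
    }
}
```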

Compiler adaptations are not sufficient for basic primitive arrays. For example, an array of type [D created with newarray may be passed to a method expecting a [Ljava/lang/Double;, and an array of type [Qjava/lang/Double; created with anewarray may be cast to type [D. To support this behavior, the JVM treats the types [D and [Qjava/lang/Double; as compatible with each other, and supports both families of opcodes on their values (daload and aaload, dastore and aastore), regardless of how the arrays were created.

For consistency, basic primitive value types appearing in field types and method signatures are always translated to basic primitive JVM types (D, not Qjava/lang/Double;). To reduce complexity for consumers of class files, we might consider it illegal for any bytecode (whether generated by javac or some other tool) to mention the Q type of a basic primitive class in a descriptor.
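The existing descriptor forms can be inspected on a current JDK with java.lang.invoke.MethodType. This sketch uses hypothetical class and method names. It prints the method descriptor for a signature that uses double and for one that uses the reference type Double. No Q descriptor can be produced this way, since that form exists only in the preview design.

```java
import java.lang.invoke.MethodType;

// Illustrative sketch; class and method names are hypothetical.
final class DescriptorDemo {

    // Descriptor of a method shaped like: double f(double)
    static String primitiveDescriptor() {
        return MethodType.methodType(double.class, double.class)
                .toMethodDescriptorString();
    }

    // Descriptor of a method shaped like: Double f(Double)
    static String referenceDescriptor() {
        return MethodType.methodType(Double.class, Double.class)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(primitiveDescriptor());  // (D)D
        System.out.println(referenceDescriptor());  // (Ljava/lang/Double;)Ljava/lang/Double;
    }
}
```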

Core reflection

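There are two Class objects that developers may encounter for each basic primitive class. In the case of class double, these are Double.class, which represents the class and its reference type, and double.class, which represents the basic primitive type (JVM type D).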

The getClass method of a basic primitive class instance returns a Class object of the first kind—Double.class, Integer.class, etc. As with all primitive objects, the method's result is the same whether invoked via the value type ((23.0).getClass()) or the reference type (((Double)23.0).getClass()).

The JVM type Qjava/lang/Double cannot be encoded with a Class object.
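Both Class objects already exist on current JDKs and can be told apart as in the sketch below, which uses hypothetical class and method names. An expression such as (23.0).getClass() does not compile without the preview features, so the sketch calls getClass through a reference of type Object instead.

```java
// Illustrative sketch for a JDK without this JEP's preview features.
// Class and method names are hypothetical.
final class ClassObjectDemo {

    // getClass on a boxed double value yields the Class object of the wrapper class.
    static Class<?> runtimeClassOf(Object value) {
        return value.getClass();
    }

    // The two Class objects associated with double are distinct objects.
    static boolean mirrorsAreDistinct() {
        Class<?> primitive = double.class;
        Class<?> reference = Double.class;
        return primitive != reference;
    }

    public static void main(String[] args) {
        System.out.println(runtimeClassOf(23.0) == Double.class);
        System.out.println(double.class.isPrimitive());
        System.out.println(Double.class.isPrimitive());
        System.out.println(Double.TYPE == double.class);
        System.out.println(mirrorsAreDistinct());
    }
}
```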

Alternatives

The language could be left unchanged, continuing to fully specify the basic primitives without relying on a class declaration. But it will be useful to eliminate the rift between basic primitives and developer-declared primitives, especially as Java's generics are enhanced to work with primitive class types.

The wrappers could be left behind as legacy API. But assumptions about boxing behavior run deep in some code, and a new set of classes would break those programs.

The JVM could follow the Java language in fully unifying its basic primitive types (I, D, etc.) with its primitive class types (Qjava/lang/Integer;, Qjava/lang/Double;, etc.). But this would be an expensive change for little ultimate benefit. For example, there would have to be a way to reconcile the two-slot size of type D with the single-slot size of type Qjava/lang/Double;, perhaps requiring a disruptive versioned change to the class file format.

Risks and Assumptions

Removing the wrapper class constructors breaks binary compatibility for a significant subset of legacy Java programs. There are also behavioral changes associated with migration to primitive classes. JEP 390, along with some expected followup efforts, mitigates these concerns. But some programs that invoke the constructors or rely on boxed object identity will break.

Changes in reflection behavior, due to the new status of basic primitive types as class types, may cause problems for some programs.

Dependencies

JEP 401, Primitive Classes, is a prerequisite.

In anticipation of this feature, JEP 390 already added warnings to javac and HotSpot about potentially incompatible changes to primitive class candidates. Some followup work will come in additional JEPs.

We anticipate modifying the generics model in Java to make type parameters universal—instantiable by all types, both reference and value. This will be pursued in a separate JEP.