JEP 401: Value Classes and Objects (Preview)

Owner: Dan Smith
Type: Feature
Scope: SE
Status: Draft
Component: specification
Discussion: valhalla dash dev at openjdk dot java dot net
Effort: XL
Duration: XL
Reviewed by: Alex Buckley, Brian Goetz
Created: 2020/08/13 19:31
Updated: 2025/08/20 00:01
Issue: 8251554

Summary

Enhance the Java Platform with value objects: class instances that have only final fields and lack object identity. This is a preview language and VM feature.

Goals

Non-Goals

Motivation

Java developers often need to represent domain values: the date of an event, the color of a pixel, the shipping address of an order, and so on. Developers usually model these values with immutable classes that contain just enough business logic to construct, validate, and transform instances. The toString, equals, and hashCode methods in these classes are defined so that equivalent instances can be used interchangeably.

As an example, event dates can be represented with the JDK's LocalDate class:

jshell> LocalDate d1 = LocalDate.of(1996, 1, 23)
d1 ==> 1996-01-23

jshell> LocalDate d2 = d1.plusYears(30)
d2 ==> 2026-01-23

jshell> LocalDate d3 = d2.minusYears(30)
d3 ==> 1996-01-23

jshell> d1.equals(d3)
$4 ==> true

Developers will regard the "essence" of a LocalDate object as its year, month, and day values. But to Java, the essence of any object is its identity. Each time the of method in LocalDate invokes new LocalDate(...), an object with a unique identity is allocated, distinguishable from every other object in the system.

The easiest way to observe the identity of an object is with the == operator:

jshell> d1 == d3
$6 ==> false

Even though d1 and d3 represent the same year-month-day triple (d1.equals(d3) is true), they are two objects with distinct identities.

Domain values don't need identity

For mutable objects, identity is important: it lets us distinguish two objects that have the same state now but will have different state in the future. For example, suppose a class Customer has a field lastOrderedDate that is mutated when the customer makes a new order. Two Customer objects might have the same lastOrderedDate, but it would be a coincidence; when one of the customers makes a new order, the application will mutate the lastOrderedDate of one Customer object but not the other, relying on identity to pick the right one.

In other words, when objects are mutable, they are not interchangeable. But most domain values are not mutable and are interchangeable. There is no practical difference between two LocalDate objects representing 1996-01-23, because their state is fixed and unchanging. They represent the same domain value, both now and in the future. There is no need to distinguish the two objects via their identities.

In fact, object identity is actively confusing when objects are immutable and are meant to be interchangeable. Most developers will recall the experience of unwittingly using == to compare objects, as in d1 == d3 above, and being mystified by a false result even though the objects' state and behavior seem identical.

The JDK tries to reduce confusion for the immutable classes that model primitive values, such as Integer. In particular, the autoboxing of small int values to Integer uses a cache to avoid creating Integer objects with unique identities. However, this cache covers only the range -128 to 127 by default, so, somewhat arbitrarily, it does not extend to four-digit int values like 1996:

jshell> Integer i = 96, j = 96;
i ==> 96
j ==> 96

jshell> i == j
$3 ==> true

jshell> Integer x = 1996, y = 1996;
x ==> 1996
y ==> 1996

jshell> x == y
$6 ==> false

For domain values like Integer, the fact that each object has unique identity is unwanted complexity that leads to surprising behavior and exposes incidental implementation choices. This extra complexity could be avoided if objects whose state and behavior make them interchangeable could be freed from the legacy requirement to have distinct identities.

Object identity is expensive at run time

Java's requirement that every object have identity, even if some domain values don't want it, is a performance impediment. It means the JVM has to allocate memory for each newly created object, distinguishing it from every object already in the system, and reference the location in memory whenever the object is used or stored.

For example, suppose a program creates arrays of int values and LocalDate references:

jshell> int[] ints = { 1996, 2006, 1996, 1, 23 }
ints ==> int[5] { 1996, 2006, 1996, 1, 23 }

jshell> LocalDate[] dates = { d1, d1, d2, null, d3 }
dates ==> LocalDate[5] { 1996-01-23, 1996-01-23, 2026-01-23,
                         null, 1996-01-23 }

The int array can be allocated by the JVM as a simple block of memory:

+----------+
| int[5]   |
+----------+
| 1996     |
| 2006     |
| 1996     |
| 1        |
| 23       |
+----------+

In contrast, the LocalDate array must be represented as a sequence of pointers, each referencing a location in memory where an object has been allocated:

+--------------+
| LocalDate[5] |
+--------------+
| 87fa1a09     | -----------------------> +-----------+
| 87fa1a09     | -----------------------> | LocalDate |
| 87fb4ad2     | ------> +-----------+    +-----------+
| 00000000     |         | LocalDate |    | y=1996    |
| 87fb5366     | ---     +-----------+    | m=1       |
+--------------+   |     | y=2026    |    | d=23      |
                   v     | m=1       |    +-----------+
        +-----------+    | d=23      |
        | LocalDate |    +-----------+
        +-----------+
        | y=1996    |
        | m=1       |
        | d=23      |
        +-----------+

Even though the data modeled by the LocalDate array is not significantly more complex than the int array -- a year-month-day triple is effectively 48 bits of primitive data -- the memory footprint is far greater.

Worse, when a program iterates over the LocalDate array, each pointer may need to be dereferenced. CPUs use caches to enable fast access to chunks of memory; if the array exhibits poor memory locality (a distinct possibility if the LocalDate objects were allocated at different times or out of order), every dereference may require caching a different chunk of memory, frustrating performance.

In some application domains, developers program for speed by creating as few objects as possible, thus de-stressing the garbage collector and improving locality. For example, they might encode event dates with an int representing an epoch day. Unfortunately, this approach gives up the functionality of classes that makes Java code so maintainable: meaningful names, private state, data validation by constructors, convenience methods, etc. A developer operating on dates represented as int values might accidentally interpret the value relative to a start date in 1601 or 1980 rather than the intended 1970 start date.
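To make the trade-off concrete, the following sketch encodes dates as int epoch days using the JDK's LocalDate.toEpochDay and LocalDate.ofEpochDay methods. The class EpochDayDemo and its method names are hypothetical, chosen only for illustration. The method decodeWithWrongBase simulates the kind of mistake described above, where the same int is read relative to 1980 instead of 1970.

```java
import java.time.LocalDate;

// Illustrative only: EpochDayDemo and its methods are hypothetical names.
final class EpochDayDemo {

    // Encode a date as days since 1970-01-01, stored in a bare int.
    static int encode(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    // Decode correctly, relative to the 1970 epoch.
    static LocalDate decode(int epochDay) {
        return LocalDate.ofEpochDay(epochDay);
    }

    // Simulated bug: the same int interpreted relative to 1980.
    static LocalDate decodeWithWrongBase(int days) {
        return LocalDate.of(1980, 1, 1).plusDays(days);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(1996, 1, 23);
        int encoded = encode(date);
        System.out.println(decode(encoded));              // round-trips to the original date
        System.out.println(decodeWithWrongBase(encoded)); // a different date, with no compile-time error
    }
}
```

Nothing in the type int records which epoch was intended, so the compiler accepts both decoders equally.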

Programming without identity

Trillions of Java objects are created every day, each one bearing a unique identity. We believe the time has come to let Java developers choose which objects in the program need identity, and which do not. An immutable class like LocalDate that represents domain values could opt out of identity, so that it would be impossible to distinguish between two LocalDate objects representing the date 1996-01-23, just as it is impossible to distinguish between two int values representing the number 4.

By opting out of identity, developers are opting in to a programming model that provides the best of both worlds: the abstraction of classes with the simplicity and performance benefits of primitives.

In the future, this programming model will support new Java Platform APIs, such as classes that encode different kinds of integers and floating-point values, and new Java language features, such as user-defined conversions and mathematical operators for domain values.

Description

JDK NN introduces value objects to model immutable domain values. A value object is an instance of a value class, declared with the value modifier. Classes without the value modifier are called identity classes, and their instances are identity objects.

Java programs manipulate objects through references. A reference to an object is stored in a variable and lets us find the object's fields. Traditionally, a reference also encodes the unique identity of an object: each execution of new allocates a fresh object and returns a unique reference, which can then be stored in multiple variables (aliasing). The == operator compares objects by comparing references, so references to two objects are not == even if the objects have identical field values.
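The traditional identity behavior can be observed with any ordinary class in current Java. The following sketch reuses the Customer example from the Motivation section. The field layout of Customer and the IdentityDemo wrapper class are assumptions made for illustration.

```java
import java.time.LocalDate;

// Illustrative only: IdentityDemo and this Customer layout are hypothetical.
final class IdentityDemo {

    // A mutable identity class.
    static final class Customer {
        LocalDate lastOrderedDate;
        Customer(LocalDate lastOrderedDate) { this.lastOrderedDate = lastOrderedDate; }
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(1996, 1, 23);
        Customer a = new Customer(date);   // each 'new' yields a distinct identity
        Customer b = new Customer(date);   // same state as a, but a different object
        Customer alias = a;                // a second reference to the same object

        System.out.println(a == b);        // false: two identities
        System.out.println(a == alias);    // true: one identity, two variables

        // Mutation through the alias is visible through a, but not through b.
        alias.lastOrderedDate = LocalDate.of(2026, 1, 23);
        System.out.println(a.lastOrderedDate.equals(b.lastOrderedDate)); // false
    }
}
```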

A reference to a value object is stored in a variable and lets us find the object's fields, but it does not serve as the unique identity of the object. Executing new might not allocate a fresh object and might instead return a reference to an existing object, or even a "reference" that embodies the object directly. The == operator compares value objects by comparing their field values, so references to two objects are == if the objects have identical field values.

Developers can save memory and improve performance by using value objects for immutable data. Because programs cannot tell the difference between two value objects with identical field values (not even with ==), the Java Virtual Machine is able to change how a value object is laid out in memory without affecting the program; for example, its fields could be stored on the stack rather than the heap.

The following sections explore how value objects differ from identity objects and illustrate how to declare value classes. This is followed by an in-depth treatment of the special behaviors of value objects, considerations for value class declarations, and the JVM's handling of value classes and objects.

Enabling preview features

Value classes and objects are a preview language feature, disabled by default.

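To try the examples below in JDK NN, you must enable preview features, for example by passing --enable-preview to jshell or java, or --release NN --enable-preview to javac.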

Programming with value objects

In JDK NN with preview features enabled, 29 classes in the JDK are declared as value classes. Some of these classes include:

All instances of these classes are value objects. This includes the boxed primitives that are instances of Integer, Long, etc. The == operator compares value objects by their field values, so, e.g., Integer objects are == if they box the same primitive values:

% -> jshell --enable-preview
|  Welcome to JShell -- Version 25-internal
|  For an introduction type: /help intro

jshell> Integer x = 1996, y = 1996;
x ==> 1996
y ==> 1996

jshell> x == y
$3 ==> true

Similarly, two LocalDate objects are == if they have the same year, month, and day values:

jshell> LocalDate d1 = LocalDate.of(1996, 1, 23)
d1 ==> 1996-01-23

jshell> LocalDate d2 = d1.plusYears(30)
d2 ==> 2026-01-23

jshell> LocalDate d3 = d2.minusYears(30)
d3 ==> 1996-01-23

jshell> d1 == d3
$7 ==> true

The String class has not been made a value class. Instances of String are always identity objects. We can use the Objects.hasIdentity method, new in JDK NN, to observe whether an object is an identity object.

jshell> String s = "abcd"
s ==> "abcd"

jshell> Objects.hasIdentity(s)
$9 ==> true

jshell> Objects.hasIdentity(d1)
$10 ==> false

jshell> String t = "aabcd".substring(1)
t ==> "abcd"

jshell> s == t
$13 ==> false

In most respects, value objects work the way that objects have always worked in Java. However, a few identity-sensitive operations, such as synchronization, are not supported by value objects.

jshell> synchronized (d1) { d1.notify(); }
|  Error:
|  unexpected type
|    required: a type with identity
|    found:    java.time.LocalDate
|  synchronized (d1) { d1.notify(); }
|  ^--------------------------------^

jshell> Object o = d1
o ==> 1996-01-23

jshell> synchronized (o) { o.notify(); }
|  Exception java.lang.IdentityException: Cannot synchronize on
   an instance of value class java.time.LocalDate
|        at (#19:1)

The JVM has a lot of freedom to encode references to value objects at run time in ways that optimize memory footprint, locality, and garbage collection efficiency. For example, we saw the following array earlier, implemented with pointers to heap objects:

jshell> LocalDate[] dates = { d1, d1, d2, null, d3 }
dates ==> LocalDate[5] { 1996-01-23, 1996-01-23, 2026-01-23,
                         null, 1996-01-23 }

Now that LocalDate objects lack identity, the JVM could implement the array using "references" that encode the fields of each LocalDate directly. Each array component can be represented as a 64-bit word that indicates whether the reference is null, and if not, directly stores the year, month, and day field values of the value object:

+--------------+
| LocalDate[5] |
+--------------+
| 1|1996|01|23 |
| 1|1996|01|23 |
| 1|2026|01|23 |
| 0|0000|00|00 |
| 1|1996|01|23 |
+--------------+

The performance characteristics of this LocalDate array may be similar to those of an ordinary int array:

+----------+
| int[5]   |
+----------+
| 1996     |
| 2006     |
| 1996     |
| 1        |
| 23       |
+----------+

This optimization is just one example; some value classes, like LocalDateTime, are too large to take advantage of this particular technique. Still, the lack of identity enables the JVM to optimize references to value objects in many ways.
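As a rough illustration of the idea, not of any actual JVM layout, the sketch below packs a nullable LocalDate into a single 64-bit long: one non-null flag bit, 32 bits of year, and 8 bits each for month and day. The class FlatDateSketch, its methods, and the bit positions are all assumptions made for this example; a real JVM chooses its own encodings.

```java
import java.time.LocalDate;

// Illustrative only: a hand-written imitation of a flattened encoding.
// The bit layout is an assumption, not what any JVM actually does.
final class FlatDateSketch {
    private static final long NON_NULL = 1L << 48;

    // Pack a nullable date into one 64-bit word: flag | year | month | day.
    static long pack(LocalDate d) {
        if (d == null) return 0L;                    // flag bit clear means null
        return NON_NULL
             | ((d.getYear() & 0xFFFF_FFFFL) << 16)  // bits 16..47: year
             | ((long) d.getMonthValue() << 8)       // bits 8..15: month
             | d.getDayOfMonth();                    // bits 0..7: day
    }

    // Recover the date, or null if the flag bit is clear.
    static LocalDate unpack(long word) {
        if ((word & NON_NULL) == 0) return null;
        int year  = (int) (word >>> 16);             // low 32 bits after the shift
        int month = (int) (word >>> 8) & 0xFF;
        int day   = (int) word & 0xFF;
        return LocalDate.of(year, month, day);
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(1996, 1, 23);
        long[] flat = { pack(d1), pack(null) };      // no per-element heap objects
        System.out.println(unpack(flat[0]));         // 1996-01-23
        System.out.println(unpack(flat[1]));         // null
    }
}
```

In this encoding two packed words are equal exactly when the dates are equal, so comparing the long values with == mirrors the statewise comparison that == performs on value objects.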

Declaring value classes

Developers can declare their own value classes by applying the value modifier to any class whose instances should be immutable and interchangeable:

When the value modifier is applied to a class, its fields are implicitly final. The class itself is also implicitly final, so it cannot be extended.

There is no restriction on the types of the fields in a value class. The fields may store references to other value objects, or to identity objects, e.g., strings.

Record classes are transparent data carriers whose fields are always final, so they are often good candidates to be value classes.

jshell> value record Point(int x, int y) {}
|  created record Point

jshell> Point p = new Point(17, 3)
p ==> Point[x=17, y=3]

jshell> Objects.hasIdentity(p)
$7 ==> false

jshell> new Point(17, 3) == p
$8 ==> true

Many classes represent immutable and interchangeable domain values but cannot be record classes because they are not transparent: the internal state encodes the external state indirectly, and is often encapsulated for good measure. For example, the following class represents currency values with a private field internally, so it cannot be a record class; nevertheless, it is a good candidate to be a value class.

value class EURCurrency {
    private int cs; // implicitly final
    private EURCurrency(int cs) { this.cs = cs; }

    public EURCurrency(int euros, int cents) {
        this(euros * 100 + (euros < 0 ? -cents : cents));
    }

    public int euros() { return cs/100; }
    public int cents() { return Math.abs(cs%100); }
    public String toString() {
        return "€%d,%02d".formatted(euros(), cents());
    }
}

Comparing value objects

The purpose of the == operator in Java is to test whether two referenced objects are indistinguishable. If two references are ==, the JVM can freely replace one object with the other, and no code will be able to tell the difference.

For identity objects, the == operator works the same in JDK NN as in 1.0: it checks whether two references are to the same object, at the same location in memory.

For value objects, the == operator checks for statewise equivalence. This means the two references are to objects with the same field values. Two value objects are statewise equivalent if:

== and equals will often produce the same results for value objects. However, for some value classes, instances may be interchangeable (so equals) even if their field values are different (so not ==). Developers who want to test whether two value objects represent the same domain value should use the equals method, and class authors should define equals in a way that always returns true for interchangeable domain values.

An example where == and equals may differ for value objects involves the LazySubstring value class below. It represents a substring of a string lazily, without allocating a new char[] in memory. The internal state of a LazySubstring instance is a source string and two coordinates, while the domain value represented by the instance is a character sequence produced by toString. Accordingly, two instances may model the same character sequence (so equals) even though their internal state is different (so not ==).

value class LazySubstring {
    private String str;
    private int start, end;

    public LazySubstring(String s, int i, int j) {
        str = s; start = i; end = j;
    }

    public String toString() {
        return str.substring(start, end);
    }
   
    public boolean equals(Object o) {
        return o instanceof LazySubstring &&
            toString().equals(o.toString());
    }

    public int hashCode() {
        return Objects.hash(LazySubstring.class, toString());
    }
}

jshell> LazySubstring sub1 = new LazySubstring("ringing", 1, 4);
sub1 ==> ing

jshell> LazySubstring sub2 = new LazySubstring("ringing", 4, 7);
sub2 ==> ing

jshell> sub1.equals(sub2)
$3 ==> true

jshell> sub1 == sub2
$4 ==> false

Another scenario where == and equals may differ is where value objects have fields that refer to identity objects. Even if the identity objects are interchangeable according to equals, they may not be statewise equivalent according to ==, so the value objects will not be statewise equivalent even if they are interchangeable.

jshell> value record Country(String code) {}
|  created record Country

jshell> Country c1 = new Country("SWE")
c1 ==> Country[code=SWE]

jshell> Country c2 = new Country("SWEDEN".substring(0,3))
c2 ==> Country[code=SWE]

jshell> c1.equals(c2)   // The equals method of a record class compares using equals
$8 ==> true

jshell> c1 == c2
$9 ==> false

Yet another situation where == and equals may differ is where value objects have fields that are float or double. The primitive floating-point types support multiple encodings of NaN using different bit patterns. These NaN values are treated as interchangeable by most floating-point operations, but because each bit pattern is distinct, value objects that wrap different encodings of NaN are not statewise equivalent according to ==. The value class author must decide whether the distinction is meaningful for the equals method. For example, the default behavior of equals in a value record class does not consider NaN encodings to be a meaningful distinction.

jshell> value record Length(float val) {}
|  created record Length

jshell> Length l1 = new Length(Float.intBitsToFloat(0x7ff80000))
l1 ==> Length[val=NaN]

jshell> Length l2 = new Length(Float.intBitsToFloat(0x7ff80001))
l2 ==> Length[val=NaN]

jshell> l1.equals(l2)
$13 ==> true

jshell> l1 == l2
$14 ==> false

jshell> Float.floatToRawIntBits(l1.val())
$15 ==> 2146959360

jshell> Float.floatToRawIntBits(l2.val())
$16 ==> 2146959361

Note that == performs a "deep" comparison of nested references to other value objects. The number of comparisons is unbounded. In the following example, two deep nests of Box objects require a full traversal to determine whether the objects are statewise equivalent.

jshell> value record Box(Object val) {}
|  created record Box

jshell> var b1 = new Box(new Box(new Box(new Box(sub1))))
b1 ==> Box[val=Box[val=Box[val=Box[val=ing]]]]

jshell> var b2 = new Box(new Box(new Box(new Box(sub2))))
b2 ==> Box[val=Box[val=Box[val=Box[val=ing]]]]

jshell> b1.equals(b2)
$20 ==> true

jshell> b1 == b2
$21 ==> false

Constructors of value classes are constrained (discussed later) so that the recursive application of == to value objects will never cause an infinite loop.
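The comparison rules in this section can be approximated on a JDK without this preview feature. The sketch below is a hypothetical emulation, not the JVM's algorithm: it treats every record as if it were a value object and compares components the way the text describes == behaving (float and double by raw bit pattern, nested records recursively, everything else by reference). The class StatewiseSketch and the method statewiseEquals are invented names. The Country, Length, and Box records mirror the examples above but are declared here as ordinary identity records.

```java
import java.lang.reflect.RecordComponent;

// Illustrative only: emulates the described behavior of == on value objects,
// pretending that every record is a value class.
final class StatewiseSketch {
    record Country(String code) {}
    record Length(float val) {}
    record Box(Object val) {}

    static boolean statewiseEquals(Object a, Object b) {
        if (a == b) return true;                          // same reference, or both null
        if (a == null || b == null) return false;
        if (a.getClass() != b.getClass()) return false;
        if (!a.getClass().isRecord()) return false;       // identity objects: distinct references differ
        try {
            for (RecordComponent rc : a.getClass().getRecordComponents()) {
                Object x = rc.getAccessor().invoke(a);
                Object y = rc.getAccessor().invoke(b);
                Class<?> type = rc.getType();
                boolean same;
                if (type == float.class) {                // bit patterns, so NaN encodings differ
                    same = Float.floatToRawIntBits((Float) x) == Float.floatToRawIntBits((Float) y);
                } else if (type == double.class) {
                    same = Double.doubleToRawLongBits((Double) x) == Double.doubleToRawLongBits((Double) y);
                } else if (type.isPrimitive()) {
                    same = x.equals(y);                   // boxed by reflection; equals compares the values
                } else {
                    same = statewiseEquals(x, y);         // "deep" comparison of nested references
                }
                if (!same) return false;
            }
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Country c1 = new Country("SWE");
        Country c2 = new Country("SWEDEN".substring(0, 3));
        System.out.println(c1.equals(c2));                // true: String.equals
        System.out.println(statewiseEquals(c1, c2));      // false: two distinct String objects

        Length l1 = new Length(Float.intBitsToFloat(0x7ff80000));
        Length l2 = new Length(Float.intBitsToFloat(0x7ff80001));
        System.out.println(l1.equals(l2));                // true: both are NaN
        System.out.println(statewiseEquals(l1, l2));      // false: different bit patterns

        System.out.println(statewiseEquals(new Box(new Box(c1)), new Box(new Box(c1)))); // true
    }
}
```

One deliberate simplification: a boxed Integer stored in an Object component is compared by reference here, whereas with this JEP's preview features enabled Integer is itself a value class and would be compared by its int value.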

Value classes and subclassing

Every value class belongs to a class hierarchy with java.lang.Object at its root, just like every identity class. There is no java.lang.Value superclass of all value classes.

By default, a value class extends java.lang.Object and can implement interfaces. This means variables declared with Object, or with interfaces, can store references to both value objects and identity objects.

jshell> Object o = LocalDate.of(1996, 1, 23)
o ==> 1996-01-23

jshell> Objects.hasIdentity(o)
$2 ==> false

jshell> Comparable<?> comp = 123
comp ==> 123

jshell> Objects.hasIdentity(comp)
$2 ==> false

jshell> comp = "abc"
comp ==> "abc"

jshell> Objects.hasIdentity(comp)
$4 ==> true

By default, a value class is implicitly final and cannot be extended. However, a value class may be declared abstract, allowing it to be extended by other classes. These subclasses may be value classes or identity classes. Thus, a value class can extend either java.lang.Object or an abstract value class.

The rules for abstract value class declarations are the same as for concrete value class declarations. For example, all instance fields of an abstract value class are implicitly final.

Many existing abstract classes are good candidates to be value classes. Applying the value modifier to an abstract class indicates that the class has no need for identity but does not restrict subclasses from having identity. For example, the abstract class Number has no fields, nor any code that depends on identity-sensitive features, so it can be safely migrated to an abstract value class.

abstract value class Number implements Serializable {
    public abstract int intValue();
    public abstract long longValue();
    public byte byteValue() { return (byte) intValue(); }
    ...
}

Both the value class Integer and the identity class java.math.BigInteger extend Number.

jshell> Number num = 123
num ==> 123

jshell> Objects.hasIdentity(num)
$6 ==> false

jshell> num = BigInteger.valueOf(123)
num ==> 123

jshell> Objects.hasIdentity(num)
$8 ==> true

An abstract value class can be sealed to limit who can extend the class.

sealed abstract value class UserID
        permits EmailID, PhoneID, UsernameID {
    ...
}

value record EmailID(String name, String domain) { ... }

value record PhoneID(String digits) { ... }

value record UsernameID(String name) { ... }

Safe construction

Constructors initialize newly created objects, including setting the values of the objects' fields. Because value objects do not have identity, their initialization requires special care.

An object being constructed is "larval": it has been created, but it is not yet fully formed. Larval objects must be handled carefully, because the expected properties and invariants of the object may not yet hold; for example, the fields of a larval object may not be set. If a larval object is shared with outside code, that code may even observe the mutation of a final field!

Traditionally, a constructor begins the initialization process by invoking a superclass constructor, super(...). After the superclass returns, the subclass then proceeds to set its declared instance fields and perform other initialization tasks. This pattern exposes a completely uninitialized subclass to any larval object leakage occurring in a superclass constructor.
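The hazard can be reproduced in current Java with ordinary identity classes. In the sketch below (all class names are hypothetical), a superclass constructor calls an overridable method, which leaks the larval object to subclass code before the subclass's final field has been assigned.

```java
// Illustrative only: LarvalLeakDemo, Base, and Named are hypothetical names.
final class LarvalLeakDemo {

    static class Base {
        final String seenDuringSuper;

        Base() {
            // Calls an overridable method while the subclass part is still uninitialized.
            seenDuringSuper = describe();
        }

        String describe() { return "base"; }
    }

    static class Named extends Base {
        private final String name;

        Named(String name) {
            super();            // Base() runs first, before 'name' is assigned
            this.name = name;
        }

        @Override
        String describe() { return "name=" + name; }
    }

    public static void main(String[] args) {
        Named n = new Named("Ada");
        System.out.println(n.seenDuringSuper); // name=null (observed while larval)
        System.out.println(n.describe());      // name=Ada  (the final field has "mutated")
    }
}
```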

The Flexible Constructor Bodies feature enables an alternative approach to initialization, in which fields can be set and other code executed before the super(...) invocation. There is a two-phase initialization process: early construction before the super(...) invocation, and late construction afterwards.

During the early construction phase, larval object leakage is impossible: the constructor may set the fields of the larval object, but may not invoke instance methods or otherwise make use of this. Fields that are initialized in the early phase are set before they can ever be read, even if a superclass leaks the larval object. Final fields, in particular, can never be observed to mutate.

In a value class, all constructor and initializer code normally occurs in the early construction phase. This means that attempts to invoke instance methods or otherwise use this will fail:

value class Name {
    String name;
    int length;

    Name(String n) {
        name = n;
        length = computeLength(); // error!
    }

    private int computeLength() {
        return name.length();
    }
}

Fields that are declared with initializers get set at the start of the constructor (as usual), but any implicit super() call gets placed at the end of the constructor body.

When a constructor includes code that needs to work with this, an explicit super(...) or this(...) call can be used to mark the transition to the late phase. But all fields must be initialized before the super(...) call, without reference to this:

value class Name {
    String name;
    int length;

    Name(String n) {
        name = n;
        length = computeLength(name); // ok
        super(); // all fields must be set at this point
        System.out.println("Name: " + this);
    }

    // refactored to be static:
    private static int computeLength(String n) {
        return n.length();
    }
}

For convenience, the early construction rules are relaxed by this JEP to allow the class's fields to be read as well as written—both references to the field name in the above constructor are legal. It continues to be illegal to refer to inherited fields, invoke instance methods, or share this with other code until the late construction phase.

Instance initializer blocks (a rarely-used feature) continue to run in the late phase, and so may not assign to value class instance fields.

This scheme is also appropriate for identity records, so this JEP modifies the language rules for records such that their constructors always run in the early construction phase. This is not a source-compatible language change, but is not expected to be disruptive.

In the rare case that a record constructor needs to access this, an explicit super() can be inserted, but the record's fields must be set beforehand. The following record declaration will fail to compile when preview features are enabled, because it now makes reference to this in the early construction phase.

record Node(String label, List<Node> edges) {
    public Node {
        validateNonNull(this, label); // error!
        validateNonNull(this, edges); // error!
    }

    static void validateNonNull(Object o, Object val) {
        if (val == null) {
            throw new IllegalArgumentException(
                "null arg for " + o);
        }
    }
}

(Note that this attempt to provide useful diagnostics by sharing this is misguided anyway: in a record's compact constructor, the fields are not set until the end of the constructor body; before they are set, the toString result will always be Node[label=null, edges=null].)

Finally, in normal identity classes, we think developers should write constructors and initializers that avoid the risk of larval object leakage by generally adopting the early construction constraints: read and write the declared fields of the class, but otherwise avoid any dependency on this, and where a dependency is necessary, mark it as deliberate by putting it after an explicit super(...) or this(...) call. To encourage this style, javac provides lint warnings indicating this dependencies in normal identity class constructors. (In the future, we anticipate that normal identity classes will have a way to adopt the constructor timing of value classes and records. A class that compiles without warning will likely be able to cleanly make that transition.)

Inherited methods of java.lang.Object

By default, a value class inherits equals, hashCode, and toString from java.lang.Object.

In a value record, as for all records, the default equals, hashCode, and toString behavior is to recursively apply the same operations to the record components.

A few other methods of Object interact with value objects in interesting ways:

Limitations of value classes

Value classes, and especially value records, are useful tools for modeling immutable domain values that are interchangeable when two instances represent the same value.

As a general rule, if a class with immutable state doesn't need identity, it should be a value class. This includes abstract classes, which often have no state at all and shouldn't impose an identity requirement on their subclasses.

For final classes with final fields, applying or removing the value keyword is a binary-compatible change. It is also source-compatible in most respects, except where a program attempts to apply a synchronized statement to an expression of the class's type.

Before applying the value keyword, class authors should be aware of some limitations of value classes:

Incompatible behavior. Developers who maintain published identity classes should decide whether any users are likely to depend on identity-sensitive behavior. For example, if these classes have not overridden equals and hashCode, those methods' behavior is identity-sensitive.

Classes with public constructors are particularly at risk, because in the past users could count on the new operation producing a unique identity that the user "owned". Subsequent uses of == or synchronization may depend on that assumption of unique ownership.

In anticipation of changing behavior, the value classes in the standard library have long been marked as value-based, warning that users should not depend on the unique identities of instances. Most of these value classes have allowed object creation only through factory methods. Since Java 16, JEP 390 (Warnings for Value-Based Classes) has discouraged the use of synchronization with these JDK classes.

Exposure of sensitive state. Because the == operator and System.identityHashCode depend on a value object's state, a malicious user could use those operations to try to infer the internal state of an object. Value classes are not designed to protect sensitive data against such attacks.

Limited serialization support. Some classes that model domain values are likely to implement Serializable. Traditional object deserialization in the JDK does not safely initialize the class's fields, and so is incompatible with value classes. Attempts to serialize or deserialize a value object will generally fail unless it is a value record or a JDK class instance.

Value class authors can work around this limitation with a serialization proxy, using the writeReplace and readResolve methods. (A value record may be a good candidate for a proxy class.) In the future, enhancements to the serialization mechanism are anticipated that will allow value classes to be serialized and deserialized directly.
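A minimal sketch of the serialization proxy pattern follows. EuroAmount stands in for a value class; it is declared without the value modifier so that the sketch compiles on JDKs without this preview feature, and whether the pattern needs adjustment for real value classes is not covered here. The names EuroAmount, Proxy, and roundTrip are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Illustrative only: EuroAmount stands in for a value class.
final class EuroAmount implements Serializable {
    private final int cents;

    EuroAmount(int cents) { this.cents = cents; }
    int cents() { return cents; }

    // Serialization writes the proxy in place of this object.
    private Object writeReplace() { return new Proxy(cents); }

    // The proxy is a record, so it is rebuilt through its canonical constructor.
    record Proxy(int cents) implements Serializable {
        // Deserialization replaces the proxy with a EuroAmount built by its constructor.
        private Object readResolve() { return new EuroAmount(cents); }
    }

    // Helper for the demonstration: serialize to bytes and read back.
    static EuroAmount roundTrip(EuroAmount amount) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(amount);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (EuroAmount) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        EuroAmount copy = roundTrip(new EuroAmount(1250));
        System.out.println(copy.cents()); // 1250
    }
}
```

The point of the pattern is that the deserialized EuroAmount is always created by its own constructor, so its fields are never set by the serialization machinery directly.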

No garbage collector interaction. Value class authors give up the ability to manually interact with memory management and garbage collection via finalize methods and the java.lang.ref API. Code in finalize methods will never be run. Attempts to create Reference objects throw an IdentityException at run time, and javac produces identity warnings about uses of the API at compile time.

Since JDK 25, javac has produced identity warnings about the java.lang.ref API being applied to value-based classes.

No deep reflection. Third-party tools that modify final fields via Field.setAccessible are incompatible with safe initialization and will not be able to modify value class fields. These tools will need to initialize instances of a value class using the class's constructors, not indirectly with deep reflection.

Run-time optimizations for value objects

At run time, the JVM will typically optimize the use of value object references by avoiding traditional heap object allocation as much as possible, preferring reference encodings that refer to data stored on the stack, or that embed the data in the reference itself.

As we saw earlier, an array of LocalDate references might be flattened so that the array stores the objects' data directly. (The details of flattened encodings will vary, of course, at the discretion of the JVM implementation.)

+--------------+
| LocalDate[5] |
+--------------+
| 1|1996|01|23 |
| 1|1996|01|23 |
| 1|2026|01|23 |
| 0|0000|00|00 |
| 1|1996|01|23 |
+--------------+

An array of boxed Integer objects can be similarly flattened, in this case by simply concatenating a null flag to each int value.

+--------------+
| Integer[5]   |
+--------------+
| 1|1996       |
| 1|2006       |
| 1|1996       |
| 1|1          |
| 1|23         |
+--------------+

The layout of this array is not significantly different from that of a plain int array, except that it requires some extra bits for each null flag (in practice, this probably means that each reference takes up 64 bits).
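As a rough model of this layout, the sketch below stores each element in one 64-bit word: a null flag in bit 32 and the int payload in the low 32 bits. FlatIntegerArray and the choice of bit positions are assumptions made for illustration; a real JVM chooses its own encoding.

```java
// Hypothetical model of a flattened Integer[]: one 64-bit slot per element.
final class FlatIntegerArray {
    // Bit 32 is set when the element is non-null (assumed position).
    private static final long NON_NULL = 1L << 32;

    private final long[] slots;

    FlatIntegerArray(int length) {
        // All-zero slots have the flag cleared, so every element starts as null.
        slots = new long[length];
    }

    void set(int index, Integer value) {
        slots[index] = (value == null)
                ? 0L
                : NON_NULL | (value & 0xFFFF_FFFFL); // mask off sign extension
    }

    Integer get(int index) {
        long slot = slots[index];
        // The narrowing cast keeps the low 32 bits, restoring the original int.
        return (slot & NON_NULL) == 0 ? null : Integer.valueOf((int) slot);
    }

    public static void main(String[] args) {
        FlatIntegerArray arr = new FlatIntegerArray(2);
        arr.set(0, 1996);
        System.out.println(arr.get(0) + " " + arr.get(1));
    }
}
```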

Flattening can be applied to fields as well—a LocalDateTime on the heap could store flattened LocalDate and LocalTime references directly in its object layout.

+----------------------+
| LocalDateTime        |
+----------------------+
| date=1|2026|01|23    |
| time=1|09|00|00|0000 |
+----------------------+

Heap flattening must maintain the integrity of object data. For example, the flattened reference must be read and written atomically, or it could become corrupted. On common platforms, this limits the size of most flattened references to no more than 64 bits. So while it would theoretically be possible to flatten LocalDateTime references too, in practice they would probably be too big. In the future, 128-bit flattened encodings may be used on platforms that support atomic reads and writes of that size. And the Null-Restricted Value Types JEP will enable heap flattening for even larger value classes if the programmer is willing to opt out of atomicity guarantees.
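The size argument can be checked with simple arithmetic. The sketch below assumes the field widths suggested elsewhere in this document (an int year with byte month and day for LocalDate, three byte fields and an int of nanoseconds for LocalTime) and one bit per null flag. FlatteningBudget is a hypothetical helper, and actual layouts may include padding.

```java
// Hypothetical bit-budget calculation for flattened references.
final class FlatteningBudget {
    static final int NULL_FLAG_BITS = 1;

    // null flag + int year + byte month + byte day
    static int localDateBits() {
        return NULL_FLAG_BITS + Integer.SIZE + Byte.SIZE + Byte.SIZE;
    }

    // null flag + byte hour, minute, second + int nanoseconds
    static int localTimeBits() {
        return NULL_FLAG_BITS + 3 * Byte.SIZE + Integer.SIZE;
    }

    // null flag + nested flattened date and time
    static int localDateTimeBits() {
        return NULL_FLAG_BITS + localDateBits() + localTimeBits();
    }

    // True if the encoding fits in a single 64-bit atomic read or write.
    static boolean fitsIn64Bits(int bits) {
        return bits <= Long.SIZE;
    }

    public static void main(String[] args) {
        System.out.println("LocalDate: " + localDateBits() + " bits");
        System.out.println("LocalTime: " + localTimeBits() + " bits");
        System.out.println("LocalDateTime: " + localDateTimeBits() + " bits");
    }
}
```

Under these assumptions LocalDate needs 49 bits and LocalTime 57, so each fits in 64 bits, while LocalDateTime needs 107 bits and would only fit a 128-bit encoding.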

When these flattened references are read from heap storage, they need to be re-encoded in a form that the JVM can readily work with. One strategy is to store each field of the flattened reference in a separate local variable. This set of local variables constitutes a scalarized encoding of the value object reference.

In HotSpot, scalarization is a JIT compilation technique, affecting the representation of references to value objects in the bodies and signatures of JIT-compiled methods.

The following code reads a LocalDate from an array and invokes the plusYears method. A simplified version of the plusYears method is included for reference.

LocalDate d = arr[0];
arr[0] = d.plusYears(30);

public LocalDate plusYears(long yearsToAdd) {
    // avoid overflow:
    int newYear = YEAR.checkValidIntValue(this.year + yearsToAdd);
    // (simplification, skipping leap year adjustment)
    return new LocalDate(newYear, this.month, this.day);
}

In pseudo-code, the JIT-compiled code might look like the following, where the notation { ... } refers to a vector of multiple values. (Importantly, this is purely notational: there is no wrapper at run time.)

{ d_null, d_year, d_month, d_day } = $decode(arr[0]);
arr[0] = $encode($plusYears(d_null, d_year, d_month, d_day, 30));

static { boolean, int, byte, byte }
    $plusYears(boolean this_null, int this_year,
               byte this_month, byte this_day,
               long yearsToAdd) {
    if (this_null) throw new NullPointerException();
    int newYear = YEAR.checkValidIntValue(this_year + yearsToAdd);
    return { false, newYear, this_month, this_day };
}

Notice that this code never interacts with a pointer to a heap-allocated LocalDate—a flattened reference is converted to a scalarized reference, a new scalarized reference is created, and then that reference is converted to another flattened reference.
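The same flow can be modeled in ordinary Java, with a long standing in for the flattened reference. This is only a simulation under assumed names and an assumed bit layout (ScalarizedPlusYears, encode, and the accessor methods are hypothetical), and because Java methods return a single value, the scalarized plusYears here re-encodes its result before returning.

```java
import java.time.temporal.ChronoField;

// Hypothetical simulation of flattened and scalarized LocalDate references.
final class ScalarizedPlusYears {
    // Assumed layout: bit 48 = non-null flag, bits 16-47 = year,
    // bits 8-15 = month, bits 0-7 = day. An all-zero word means null.
    static long encode(boolean isNull, int year, byte month, byte day) {
        if (isNull) return 0L;
        return (1L << 48)
                | ((year & 0xFFFF_FFFFL) << 16)
                | ((month & 0xFFL) << 8)
                | (day & 0xFFL);
    }

    // Decoding helpers: each extracts one scalar component.
    static boolean isNull(long flat) { return ((flat >>> 48) & 1L) == 0; }
    static int year(long flat)       { return (int) (flat >>> 16); }
    static byte month(long flat)     { return (byte) (flat >>> 8); }
    static byte day(long flat)       { return (byte) flat; }

    // Scalarized form of plusYears: the receiver arrives as four scalars.
    // As in the simplified method above, leap-year adjustment is skipped.
    static long plusYears(boolean thisNull, int thisYear,
                          byte thisMonth, byte thisDay, long yearsToAdd) {
        if (thisNull) throw new NullPointerException();
        int newYear = ChronoField.YEAR.checkValidIntValue(thisYear + yearsToAdd);
        return encode(false, newYear, thisMonth, thisDay);
    }

    // Models "LocalDate d = arr[0]; arr[0] = d.plusYears(years);"
    static void bumpFirst(long[] arr, long years) {
        long flat = arr[0];
        arr[0] = plusYears(isNull(flat), year(flat), month(flat), day(flat), years);
    }

    public static void main(String[] args) {
        long[] arr = { encode(false, 1996, (byte) 1, (byte) 23) };
        bumpFirst(arr, 30);
        System.out.println(year(arr[0]) + "-" + month(arr[0]) + "-" + day(arr[0]));
    }
}
```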

Unlike heap flattening, scalarization is not constrained by the size of the data—local variables being operated on in the stack are not at risk of data races. A scalarized encoding of a LocalDateTime reference might consist of a null flag, four components for the LocalDate reference, and five components for the LocalTime reference.

JVMs have used similar techniques to scalarize identity objects in local code when the JVM is able to prove that an object's identity is never used. But scalarization of value objects is more predictable and far-reaching, even across non-inlinable method invocation boundaries.

One limitation of both heap flattening and scalarization is that neither is typically applied to a variable whose type is a supertype of a value class type. Notably, this includes method parameters of generic code whose erased type is Object. Instead, when an assignment to a supertype occurs, a scalarized value object reference may be converted to an ordinary heap object reference. But this allocation occurs only when necessary, and as late as possible.

Scope of changes

Value classes and objects have a broad and deep impact on the Java Platform. This JEP includes preview language, VM, and library features, summarized as follows.

In the Java language:

In the JVM:

(Additionally, compiled value classes use the features of Strict Field Initialization in the JVM (Preview) to guarantee that the class's fields are properly initialized.)

In the Java platform API:

Future Work

Null-Restricted Value Class Types (Preview) will build on this JEP, allowing programmers to manage the storage of nulls and enable more dense heap flattening in fields and arrays.

Enhanced Primitive Boxing (Preview) will enhance the language's use of primitive types, taking advantage of the lighter-weight characteristics of boxing to value objects.

JVM class and method specialization (JEP 218, with revisions) will allow generic classes and methods to specialize field, array, and local variable layouts when parameterized by value class types.

Alternatives

As discussed, JVMs have long performed escape analysis to identify objects that never rely on identity throughout their lifespan and can be scalarized. These optimizations are somewhat unpredictable, and do not help with objects that escape the scope of the optimization, including storage in fields and arrays.

Hand-coded optimizations via primitive values are possible to improve performance, but as noted in the "Motivation" section, these techniques require giving up valuable abstractions.

The C language and its relatives support flattened storage for structs and similar class-like abstractions. For example, the C# language has value types. Unlike value objects, instances of these abstractions have identity, meaning they support operations such as field mutation. As a result, the semantics of copying on assignment, invocation, etc., must be carefully specified, leading to a more complex user model and less flexibility for runtime implementations. We prefer an approach that leaves these low-level details to the discretion of JVM implementations.

Risks and Assumptions

The feature makes significant changes to the Java object model. Developers may be surprised by, or encounter bugs due to, changes in the behavior of operations such as == and synchronized. We expect such disruptions to be rare and tractable.

Some changes could potentially affect the performance of identity objects. The if_acmpeq test, for example, typically costs only one instruction cycle, but will now need an additional check to detect value objects. The identity class case can be optimized as a fast path, however, and we believe we have minimized any performance regressions.

There is a security risk that == and hashCode can indirectly expose private field values. Further, two large trees of value objects can take unbounded time to compute ==. Developers need to understand these risks.
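To illustrate the cost concern, the sketch below models a field-by-field comparison of two trees and counts how many node pairs it visits. Node, DeepEquality, and sameState are hypothetical; Node is an ordinary record here, and the method is only a stand-in for the comparison that == would perform on value objects.

```java
// Hypothetical tree node; under this JEP it could be declared as a value record.
record Node(int label, Node left, Node right) {}

final class DeepEquality {
    // Number of node pairs compared by sameState since the last reset.
    static long visited;

    // Models a recursive state comparison: equal labels and equivalent children.
    static boolean sameState(Node a, Node b) {
        if (a == null || b == null) return a == b;
        visited++;
        return a.label() == b.label()
                && sameState(a.left(), b.left())
                && sameState(a.right(), b.right());
    }

    // Builds a full binary tree with 2^depth - 1 nodes.
    static Node fullTree(int depth) {
        return depth == 0
                ? null
                : new Node(depth, fullTree(depth - 1), fullTree(depth - 1));
    }

    public static void main(String[] args) {
        boolean same = sameState(fullTree(16), fullTree(16));
        System.out.println(same + " after visiting " + visited + " node pairs");
    }
}
```

In this model the work grows with the number of nodes, so a single comparison of two equal trees has no fixed upper bound on its running time.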

Dependencies

Strict Field Initialization in the JVM (Preview) provides the JVM mechanism necessary to require, through verification, that value class instance fields are initialized during early construction.