JEP 401: Value Classes and Objects (Preview)
Owner | Dan Smith |
Type | Feature |
Scope | SE |
Status | Draft |
Component | specification |
Discussion | valhalla dash dev at openjdk dot java dot net |
Effort | XL |
Duration | XL |
Reviewed by | Alex Buckley, Brian Goetz |
Created | 2020/08/13 19:31 |
Updated | 2025/08/20 00:01 |
Issue | 8251554 |
Summary
Enhance the Java Platform with value objects: class instances that have only final fields and lack object identity.
This is a preview language and VM feature.
Goals
- Allow developers to opt in to a programming model for domain values in which objects are distinguished solely by the values of their fields, much as the int value 3 is distinguished from the int value 4.
- Support compatible migration of existing classes that represent domain values to this programming model. Migrate suitable classes in the JDK, such as Integer and LocalDate, to be value classes.
- Maximize the freedom of the JVM to store domain values in ways that improve memory footprint, locality, and garbage collection efficiency.
Non-Goals
- It is not a goal to automatically treat existing classes as value classes, even if they meet the requirements for how value classes are declared and used. The behavioral changes require an explicit opt-in.
- It is not a goal to "fix" the == operator so that programmers can use it in place of equals. This JEP redefines == only as much as necessary to cope with a new kind of identity-free object. The usual advice to compare objects in most contexts using the equals method still applies.
- It is not a goal to introduce a struct feature in the Java language. Java programmers are not asked to understand new semantics for memory management or variable storage. Java continues to operate on just two kinds of data: primitives and object references.
- It is not a goal to change the treatment of primitive types. Primitive types behave like value classes in many ways, but are a distinct concept. A separate JEP will provide enhancements to make primitive types more class-like and compatible with generics.
- It is not a goal to guarantee any particular optimization strategy or memory layout. This JEP enables many potential optimizations; only some will be implemented initially. Some optimizations, such as layouts that exclude null, will only be possible after future language and JVM enhancements.
Motivation
Java developers often need to represent domain values: the date of an event, the
color of a pixel, the shipping address of an order, and so on. Developers
usually model these values with immutable classes that contain just enough
business logic to construct, validate, and transform instances. The toString, equals, and hashCode methods in these classes are defined so that equivalent instances can be used interchangeably.
As an example, event dates can be represented with the JDK's LocalDate class:
jshell> LocalDate d1 = LocalDate.of(1996, 1, 23)
d1 ==> 1996-01-23
jshell> LocalDate d2 = d1.plusYears(30)
d2 ==> 2026-01-23
jshell> LocalDate d3 = d2.minusYears(30)
d3 ==> 1996-01-23
jshell> d1.equals(d3)
$4 ==> true
Developers will regard the "essence" of a LocalDate object as its year, month, and day values. But to Java, the essence of any object is its identity. Each time the LocalDate.of method invokes new LocalDate(...), an object with a unique identity is allocated, distinguishable from every other object in the system.
The easiest way to observe the identity of an object is with the == operator:
jshell> d1 == d3
$6 ==> false
Even though d1 and d3 represent the same year-month-day triple (d1.equals(d3) is true), they are two objects with distinct identities.
Domain values don't need identity
For mutable objects, identity is important: it lets us distinguish two objects that have the same state now but will have different state in the future. For example, suppose a class Customer has a field lastOrderedDate that is mutated when the customer makes a new order. Two Customer objects might have the same lastOrderedDate, but it would be a coincidence; when one of the customers makes a new order, the application will mutate the lastOrderedDate of one Customer object but not the other, relying on identity to pick the right one.
In other words, when objects are mutable, they are not interchangeable. But most domain values are not mutable and are interchangeable. There is no practical difference between two LocalDate objects representing 1996-01-23, because their state is fixed and unchanging. They represent the same domain value, both now and in the future. There is no need to distinguish the two objects via their identities.
In fact, object identity is actively confusing when objects are immutable and are meant to be interchangeable. Most developers will recall the experience of unwittingly using == to compare objects, as in d1 == d3 above, and being mystified by a false result even though the objects' state and behavior seem identical.
The JDK tries to reduce confusion for the immutable classes that model primitive values, such as Integer. In particular, the autoboxing of small int values to Integer uses a cache to avoid creating Integer objects with unique identities. However, this cache, somewhat arbitrarily, does not extend to four-digit int values like 1996:
jshell> Integer i = 96, j = 96;
i ==> 96
j ==> 96
jshell> i == j
$3 ==> true
jshell> Integer x = 1996, y = 1996;
x ==> 1996
y ==> 1996
jshell> x == y
$6 ==> false
For domain values like Integer, the fact that each object has unique identity is unwanted complexity that leads to surprising behavior and exposes incidental implementation choices. This extra complexity could be avoided if objects whose state and behavior make them interchangeable could be freed from the legacy requirement to have distinct identities.
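The cache boundary described above can be probed on a current JDK without preview features. The following is a minimal sketch, not part of the JEP; the class name IntegerCacheDemo and its helper method are invented for illustration. Integer.valueOf is documented to cache values from -128 to 127; results outside that range depend on JVM settings.

```java
final class IntegerCacheDemo {
    /** Returns true if two separately boxed copies of n are the same object. */
    static boolean boxesShareIdentity(int n) {
        Integer a = Integer.valueOf(n);
        Integer b = Integer.valueOf(n);
        return a == b; // reference comparison, on purpose
    }

    public static void main(String[] args) {
        // Inside the documented cache range: the same object is reused.
        System.out.println(boxesShareIdentity(127));
        // Outside the range: typically false without preview features,
        // but this depends on how the JVM was configured.
        System.out.println(boxesShareIdentity(1996));
    }
}
```

Only the result for values inside the documented range is stable, which is exactly the kind of incidental implementation choice the text refers to.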
Object identity is expensive at run time
Java's requirement that every object have identity, even if some domain values don't want it, is a performance impediment. It means the JVM has to allocate memory for each newly created object, distinguishing it from every object already in the system, and reference the location in memory whenever the object is used or stored.
For example, suppose a program creates arrays of int values and LocalDate references:
jshell> int[] ints = { 1996, 2006, 1996, 1, 23 }
ints ==> int[5] { 1996, 2006, 1996, 1, 23 }
jshell> LocalDate[] dates = { d1, d1, d2, null, d3 }
dates ==> LocalDate[5] { 1996-01-23, 1996-01-23, 2026-01-23,
null, 1996-01-23 }
The int array can be allocated by the JVM as a simple block of memory:
+----------+
|  int[5]  |
+----------+
|   1996   |
|   2006   |
|   1996   |
|      1   |
|     23   |
+----------+
In contrast, the LocalDate array must be represented as a sequence of pointers, each referencing a location in memory where an object has been allocated:
+--------------+
| LocalDate[5] |
+--------------+
|   87fa1a09   | -----------------------> +-----------+
|   87fa1a09   | -----------------------> | LocalDate |
|   87fb4ad2   | ------> +-----------+    +-----------+
|   00000000   |         | LocalDate |    | y=1996    |
|   87fb5366   | ---     +-----------+    | m=1       |
+--------------+    |    | y=2026    |    | d=23      |
                    v    | m=1       |    +-----------+
           +-----------+ | d=23      |
           | LocalDate | +-----------+
           +-----------+
           | y=1996    |
           | m=1       |
           | d=23      |
           +-----------+
Even though the data modeled by the LocalDate array is not significantly more complex than the int array -- a year-month-day triple is effectively 48 bits of primitive data -- the memory footprint is far greater.
Worse, when a program iterates over the LocalDate array, each pointer may need to be dereferenced. CPUs use caches to enable fast access to chunks of memory; if the array exhibits poor memory locality (a distinct possibility if the LocalDate objects were allocated at different times or out of order), every dereference may require caching a different chunk of memory, frustrating performance.
In some application domains, developers program for speed by creating as few objects as possible, thus de-stressing the garbage collector and improving locality. For example, they might encode event dates with an int representing an epoch day. Unfortunately, this approach gives up the functionality of classes that makes Java code so maintainable: meaningful names, private state, data validation by constructors, convenience methods, etc. A developer operating on dates represented as int values might accidentally interpret the value relative to a start date in 1601 or 1980 rather than the intended 1970 start date.
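The hazard of the int encoding can be sketched as follows. This is an illustrative example only; the class EpochDayDemo and its methods are hypothetical names, and the "wrong" decoder simply assumes a 1980 start date to show that nothing in the type system catches the mistake.

```java
import java.time.LocalDate;

final class EpochDayDemo {
    /** Encodes a date as days since 1970-01-01, the convention LocalDate uses. */
    static int encode(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    /** Correct decoding: interprets the int relative to 1970-01-01. */
    static LocalDate decode(int epochDay) {
        return LocalDate.ofEpochDay(epochDay);
    }

    /** Mistaken decoding: the caller wrongly assumes the count starts in 1980. */
    static LocalDate decodeAssuming1980(int days) {
        return LocalDate.of(1980, 1, 1).plusDays(days);
    }

    public static void main(String[] args) {
        int raw = encode(LocalDate.of(1996, 1, 23));
        System.out.println(decode(raw));             // prints 1996-01-23
        System.out.println(decodeAssuming1980(raw)); // a different date, silently
    }
}
```

Both decoders accept the same int, so the compiler cannot tell them apart; a class such as LocalDate carries the interpretation with the data.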
Programming without identity
Trillions of Java objects are created every day, each one bearing a unique identity. We believe the time has come to let Java developers choose which objects in the program need identity, and which do not. An immutable class like LocalDate that represents domain values could opt out of identity, so that it would be impossible to distinguish between two LocalDate objects representing the date 1996-01-23, just as it is impossible to distinguish between two int values representing the number 4.
By opting out of identity, developers are opting in to a programming model that provides the best of both worlds: the abstraction of classes with the simplicity and performance benefits of primitives.
In future, this programming model will support new Java Platform APIs, such as classes that encode different kinds of integers and floating-point values, and new Java language features, such as user-defined conversions and mathematical operators for domain values.
Description
JDK NN introduces value objects to model immutable domain values. A value object is an instance of a value class, declared with the value modifier. Classes without the value modifier are called identity classes, and their instances are identity objects.
Java programs manipulate objects through references. A reference to an object is stored in a variable and lets us find the object's fields. Traditionally, a reference also encodes the unique identity of an object: each execution of new allocates a fresh object and returns a unique reference, which can then be stored in multiple variables (aliasing). The == operator compares objects by comparing references, so references to two objects are not == even if the objects have identical field values.
A reference to a value object is stored in a variable and lets us find the object's fields, but it does not serve as the unique identity of the object. Executing new might not allocate a fresh object and might instead return a reference to an existing object, or even a "reference" that embodies the object directly. The == operator compares value objects by comparing their field values, so references to two objects are == if the objects have identical field values.
Developers can save memory and improve performance by using value objects for immutable data. Because programs cannot tell the difference between two value objects with identical field values (not even with ==), the Java Virtual Machine is able to change how a value object is laid out in memory without affecting the program; for example, its fields could be stored on the stack rather than the heap.
The following sections explore how value objects differ from identity objects and illustrate how to declare value classes. This is followed by an in-depth treatment of the special behaviors of value objects, considerations for value class declarations, and the JVM's handling of value classes and objects.
Enabling preview features
Value classes and objects are a preview language feature, disabled by default. To try the examples below in JDK NN you must enable preview features:
- Compile the program with javac --release NN --enable-preview Main.java and run it with java --enable-preview Main; or,
- When using the source code launcher, run the program with java --enable-preview Main.java; or,
- When using jshell, start it with jshell --enable-preview.
Programming with value objects
In JDK NN with preview features enabled, 29 classes in the JDK are declared as value classes. These classes include:
- In java.lang: Integer, Long, Float, Double, Byte, Short, Boolean, and Character
- In java.util: Optional, OptionalInt, OptionalLong, and OptionalDouble
- In java.time: LocalDate, LocalTime, Instant, Duration, LocalDateTime, OffsetDateTime, and ZonedDateTime
All instances of these classes are value objects. This includes the boxed primitives that are instances of Integer, Long, etc. The == operator compares value objects by their field values, so, e.g., Integer objects are == if they box the same primitive values:
% -> jshell --enable-preview
| Welcome to JShell -- Version 25-internal
| For an introduction type: /help intro
jshell> Integer x = 1996, y = 1996;
x ==> 1996
y ==> 1996
jshell> x == y
$3 ==> true
Similarly, two LocalDate objects are == if they have the same year, month, and day values:
jshell> LocalDate d1 = LocalDate.of(1996, 1, 23)
d1 ==> 1996-01-23
jshell> LocalDate d2 = d1.plusYears(30)
d2 ==> 2026-01-23
jshell> LocalDate d3 = d2.minusYears(30)
d3 ==> 1996-01-23
jshell> d1 == d3
$7 ==> true
The String class has not been made a value class. Instances of String are always identity objects. We can use the Objects.hasIdentity method, new in JDK NN, to observe whether an object is an identity object.
jshell> String s = "abcd"
s ==> "abcd"
jshell> Objects.hasIdentity(s)
$9 ==> true
jshell> Objects.hasIdentity(d1)
$10 ==> false
jshell> String t = "aabcd".substring(1)
t ==> "abcd"
jshell> s == t
$13 ==> false
In most respects, value objects work the way that objects have always worked in Java. However, a few identity-sensitive operations, such as synchronization, are not supported by value objects.
jshell> synchronized (d1) { d1.notify(); }
| Error:
| unexpected type
| required: a type with identity
| found: java.time.LocalDate
| synchronized (d1) { d1.notify(); }
| ^--------------------------------^
jshell> Object o = d1
o ==> 1996-01-23
jshell> synchronized (o) { o.notify(); }
| Exception java.lang.IdentityException: Cannot synchronize on
an instance of value class java.time.LocalDate
| at (#19:1)
The JVM has a lot of freedom to encode references to value objects at run time in ways that optimize memory footprint, locality, and garbage collection efficiency. For example, we saw the following array earlier, implemented with pointers to heap objects:
jshell> LocalDate[] dates = { d1, d1, d2, null, d3 }
dates ==> LocalDate[5] { 1996-01-23, 1996-01-23, 2026-01-23,
null, 1996-01-23 }
Now that LocalDate objects lack identity, the JVM could implement the array using "references" that encode the fields of each LocalDate directly. Each array component can be represented as a 64-bit word that indicates whether the reference is null, and if not, directly stores the year, month, and day field values of the value object:
+--------------+
| LocalDate[5] |
+--------------+
| 1|1996|01|23 |
| 1|1996|01|23 |
| 1|2026|01|23 |
| 0|0000|00|00 |
| 1|1996|01|23 |
+--------------+
The performance characteristics of this LocalDate array may be similar to those of an ordinary int array:
+----------+
|  int[5]  |
+----------+
|   1996   |
|   2006   |
|   1996   |
|      1   |
|     23   |
+----------+
This optimization is just one example; some value classes, like LocalDateTime, are too large to take advantage of this particular technique. Still, the lack of identity enables the JVM to optimize references to value objects in many ways.
Declaring value classes
Developers can declare their own value classes by applying the value modifier to any class whose instances should be immutable and interchangeable:
- Immutable: All instance fields of the class should be final, and the domain value represented by an instance will not change over time; and
- Interchangeable: It's not necessary to distinguish between two separately-created instances that represent the same domain value.
When the value modifier is applied to a class, its fields are implicitly final. The class itself is also implicitly final, so cannot be extended.
There is no restriction on the types of the fields in a value class. The fields may store references to other value objects, or to identity objects, e.g., strings.
Record classes are transparent data carriers whose fields are always final, so they are often good candidates to be value classes.
jshell> value record Point(int x, int y) {}
| created record Point
jshell> Point p = new Point(17, 3)
p ==> Point[x=17, y=3]
jshell> Objects.hasIdentity(p)
$7 ==> false
jshell> new Point(17, 3) == p
$8 ==> true
Many classes represent immutable and interchangeable domain values but cannot be record classes because they are not transparent: the internal state encodes the external state indirectly, and is often encapsulated for good measure. For example, the following class represents currency values with a private field internally, so it cannot be a record class; nevertheless, it is a good candidate to be a value class.
value class EURCurrency {
    private int cs; // total in cents; implicitly final
    private EURCurrency(int cs) { this.cs = cs; }
    public EURCurrency(int euros, int cents) {
        this(euros * 100 + (euros < 0 ? -cents : cents));
    }
    public int euros() { return cs/100; }
    public int cents() { return Math.abs(cs%100); }
    public String toString() {
        // %02d pads the cents to two digits, so 1 euro 5 cents prints as €1,05
        return "€%d,%02d".formatted(euros(), cents());
    }
}
Comparing value objects
The purpose of the == operator in Java is to test whether two referenced objects are indistinguishable. If two references are ==, the JVM can freely replace one object with the other, and no code will be able to tell the difference.
For identity objects, the == operator works the same in JDK NN as in 1.0: it checks whether two references are to the same object, at the same location in memory.
For value objects, the == operator checks for statewise equivalence. This means the two references are to objects with the same field values. Two value objects are statewise equivalent if:
- They are instances of the same value class;
- Their primitive-typed fields store the same bit patterns; and
- Their reference-typed fields are ==: either two null references, or two references to the same identity object, or two references to statewise-equivalent value objects.
== and equals will often produce the same results for value objects. However, for some value classes, instances may be interchangeable (so equals) even if their field values are different (so not ==). Developers who want to test whether two value objects represent the same domain value should use the equals method, and class authors should define equals in a way that always returns true for interchangeable domain values.
An example where == and equals may differ for value objects involves the LazySubstring value class below. It represents a substring of a string lazily, without allocating a new char[] in memory. The internal state of a LazySubstring instance is a source string and two coordinates, while the domain value represented by the instance is a character sequence produced by toString. Accordingly, two instances may model the same character sequence (so equals) even though their internal state is different (so not ==).
value class LazySubstring {
    private String str;
    private int start, end;
    public LazySubstring(String s, int i, int j) {
        str = s; start = i; end = j;
    }
    public String toString() {
        return str.substring(start, end);
    }
    public boolean equals(Object o) {
        return o instanceof LazySubstring &&
            toString().equals(o.toString());
    }
    public int hashCode() {
        return Objects.hash(LazySubstring.class, toString());
    }
}
jshell> LazySubstring sub1 = new LazySubstring("ringing", 1, 4);
sub1 ==> ing
jshell> LazySubstring sub2 = new LazySubstring("ringing", 4, 7);
sub2 ==> ing
jshell> sub1.equals(sub2)
$3 ==> true
jshell> sub1 == sub2
$4 ==> false
Another scenario where == and equals may differ is where value objects have fields that refer to identity objects. Even if the identity objects are interchangeable according to equals, they may not be statewise equivalent according to ==, so the value objects will not be statewise equivalent even if they are interchangeable.
jshell> value record Country(String code) {}
| created record Country
jshell> Country c1 = new Country("SWE")
c1 ==> Country[code=SWE]
jshell> Country c2 = new Country("SWEDEN".substring(0,3))
c2 ==> Country[code=SWE]
jshell> c1.equals(c2) // The equals method of a record class compares using equals
$8 ==> true
jshell> c1 == c2
$9 ==> false
Yet another situation where == and equals may differ is where value objects have fields that are float or double. The primitive floating-point types support multiple encodings of NaN using different bit patterns. These NaN values are treated as interchangeable by most floating-point operations, but because each bit pattern is distinct, value objects that wrap different encodings of NaN are not statewise equivalent according to ==. The value class author must decide whether the distinction is meaningful for the equals method. For example, the default behavior of equals in a value record class does not consider NaN encodings to be a meaningful distinction.
jshell> value record Length(float val) {}
| created record Length
jshell> Length l1 = new Length(Float.intBitsToFloat(0x7ff80000))
l1 ==> Length[val=NaN]
jshell> Length l2 = new Length(Float.intBitsToFloat(0x7ff80001))
l2 ==> Length[val=NaN]
jshell> l1.equals(l2)
$13 ==> true
jshell> l1 == l2
$14 ==> false
jshell> Float.floatToRawIntBits(l1.val())
$15 ==> 2146959360
jshell> Float.floatToRawIntBits(l2.val())
$16 ==> 2146959361
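The difference between the two views of NaN can be reproduced on a JDK without preview features using only java.lang.Float. The following is a small sketch; NaNEncodings and its two helper methods are names chosen here for illustration. Float.floatToRawIntBits preserves the encoding, which mirrors the bit-pattern rule for ==, while Float.floatToIntBits and Float.compare collapse all NaN encodings, which mirrors the record equals behavior shown above.

```java
final class NaNEncodings {
    /** Raw comparison: distinguishes NaN encodings, as statewise equivalence does. */
    static boolean sameRawBits(float a, float b) {
        return Float.floatToRawIntBits(a) == Float.floatToRawIntBits(b);
    }

    /** Canonicalizing comparison: floatToIntBits maps every NaN to 0x7fc00000. */
    static boolean sameCanonicalBits(float a, float b) {
        return Float.floatToIntBits(a) == Float.floatToIntBits(b);
    }

    public static void main(String[] args) {
        float n1 = Float.intBitsToFloat(0x7ff80000);
        float n2 = Float.intBitsToFloat(0x7ff80001);
        System.out.println(sameRawBits(n1, n2));        // false
        System.out.println(sameCanonicalBits(n1, n2));  // true
        System.out.println(Float.compare(n1, n2) == 0); // true
    }
}
```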
Note that == performs a "deep" comparison of nested references to other value objects. The number of comparisons is unbounded. In the following example, two deep nests of Box objects require a full traversal to determine whether the objects are statewise equivalent.
jshell> value record Box(Object val) {}
| created record Box
jshell> var b1 = new Box(new Box(new Box(new Box(sub1))))
b1 ==> Box[val=Box[val=Box[val=Box[val=ing]]]]
jshell> var b2 = new Box(new Box(new Box(new Box(sub2))))
b2 ==> Box[val=Box[val=Box[val=Box[val=ing]]]]
jshell> b1.equals(b2)
$20 ==> true
jshell> b1 == b2
$21 ==> false
Constructors of value classes are constrained (discussed later) so that the recursive application of == to value objects will never cause an infinite loop.
Value classes and subclassing
Every value class belongs to a class hierarchy with java.lang.Object at its root, just like every identity class. There is no java.lang.Value superclass of all value classes.
By default, a value class extends java.lang.Object and can implement interfaces. This means variables declared with Object, or with interfaces, can store references to both value objects and identity objects.
jshell> Object o = LocalDate.of(1996, 1, 23)
o ==> 1996-01-23
jshell> Objects.hasIdentity(o)
$2 ==> false
jshell> Comparable<?> comp = 123
comp ==> 123
jshell> Objects.hasIdentity(comp)
$2 ==> false
jshell> comp = "abc"
comp ==> "abc"
jshell> Objects.hasIdentity(comp)
$4 ==> true
By default, a value class is implicitly final and cannot be extended. However, a value class may be declared abstract, allowing it to be extended by other classes. These subclasses may be value classes or identity classes.
Thus, a value class can extend either java.lang.Object or an abstract value class.
The rules for abstract value class declarations are the same as for concrete value class declarations. For example, all instance fields of an abstract value class are implicitly final.
Many existing abstract classes are good candidates to be value classes. Applying the value modifier to an abstract class indicates that the class has no need for identity but does not restrict subclasses from having identity. For example, the abstract class Number has no fields, nor any code that depends on identity-sensitive features, so it can be safely migrated to an abstract value class.
abstract value class Number implements Serializable {
    public abstract int intValue();
    public abstract long longValue();
    public byte byteValue() { return (byte) intValue(); }
    ...
}
Both the value class Integer and the identity class java.math.BigInteger extend Number.
jshell> Number num = 123
num ==> 123
jshell> Objects.hasIdentity(num)
$6 ==> false
jshell> num = BigInteger.valueOf(123)
num ==> 123
jshell> Objects.hasIdentity(num)
$8 ==> true
An abstract value class can be sealed to limit who can extend the class.
sealed abstract value class UserID
        permits EmailID, PhoneID, UsernameID {
    ...
}
value record EmailID(String name, String domain) { ... }
value record PhoneID(String digits) { ... }
value record UsernameID(String name) { ... }
Safe construction
Constructors initialize newly-created objects, including setting the values of the objects' fields. Because value objects do not have identity, their initialization requires special care.
An object being constructed is "larval"—it has been created, but it is not yet fully-formed. Larval objects must be handled carefully, because the expected properties and invariants of the object may not yet hold—for example, the fields of a larval object may not be set. If a larval object is shared with outside code, that code may even observe the mutation of a final field!
Traditionally, a constructor begins the initialization process by invoking a superclass constructor, super(...). After the superclass returns, the subclass then proceeds to set its declared instance fields and perform other initialization tasks. This pattern exposes a completely uninitialized subclass to any larval object leakage occurring in a superclass constructor.
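The traditional hazard can be reproduced with ordinary identity classes on any current JDK. The sketch below is illustrative only; LarvalLeakDemo, Base, Named, and onConstruct are hypothetical names. The superclass constructor calls an overridable method, so subclass code runs before the subclass has set its final field.

```java
final class LarvalLeakDemo {
    static class Base {
        Base() {
            onConstruct(); // leaks the larval object to overridable code
        }
        void onConstruct() { }
    }

    static final class Named extends Base {
        private final String name;
        private String seenDuringSuper; // what the superclass constructor observed

        Named(String name) {
            super();          // runs Base(), which calls onConstruct() below
            this.name = name; // too late: the field was already read
        }

        @Override
        void onConstruct() {
            seenDuringSuper = String.valueOf(name); // name is still null here
        }

        String name() { return name; }
        String seenDuringSuper() { return seenDuringSuper; }
    }

    public static void main(String[] args) {
        Named n = new Named("Ada");
        System.out.println(n.seenDuringSuper()); // prints "null"
        System.out.println(n.name());            // prints "Ada"
    }
}
```

The same final field is observed with two different values, which is the mutation of a final field that the preceding paragraphs warn about.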
The Flexible Constructor Bodies feature enables an alternative approach to initialization, in which fields can be set and other code executed before the super(...) invocation. There is a two-phase initialization process: early construction before the super(...) invocation, and late construction afterwards.
During the early construction phase, larval object leakage is impossible: the constructor may set the fields of the larval object, but may not invoke instance methods or otherwise make use of this. Fields that are initialized in the early phase are set before they can ever be read, even if a superclass leaks the larval object. Final fields, in particular, can never be observed to mutate.
In a value class, all constructor and initializer code normally occurs in the early construction phase. This means that attempts to invoke instance methods or otherwise use this will fail:
value class Name {
    String name;
    int length;
    Name(String n) {
        name = n;
        length = computeLength(); // error!
    }
    private int computeLength() {
        return name.length();
    }
}
Fields that are declared with initializers get set at the start of the constructor (as usual), but any implicit super() call gets placed at the end of the constructor body.
When a constructor includes code that needs to work with this, an explicit super(...) or this(...) call can be used to mark the transition to the late phase. But all fields must be initialized before the super(...) call, without reference to this:
value class Name {
    String name;
    int length;
    Name(String n) {
        name = n;
        length = computeLength(name); // ok
        super(); // all fields must be set at this point
        System.out.println("Name: " + this);
    }
    // refactored to be static:
    private static int computeLength(String n) {
        return n.length();
    }
}
For convenience, the early construction rules are relaxed by this JEP to allow the class's fields to be read as well as written: both references to the name field in the above constructor are legal. It continues to be illegal to refer to inherited fields, invoke instance methods, or share this with other code until the late construction phase.
Instance initializer blocks (a rarely-used feature) continue to run in the late phase, and so may not assign to value class instance fields.
This scheme is also appropriate for identity records, so this JEP modifies the language rules for records such that their constructors always run in the early construction phase. This is not a source-compatible language change, but is not expected to be disruptive.
In the rare case that a record constructor needs to access this, an explicit super() can be inserted, but the record's fields must be set beforehand. The following record declaration will fail to compile when preview features are enabled, because it now makes reference to this in the early construction phase.
record Node(String label, List<Node> edges) {
    public Node {
        validateNonNull(this, label); // error!
        validateNonNull(this, edges); // error!
    }
    static void validateNonNull(Object o, Object val) {
        if (val == null) {
            throw new IllegalArgumentException(
                "null arg for " + o);
        }
    }
}
(Note that this attempt to provide useful diagnostics by sharing this is misguided anyway: in a record's compact constructor, the fields are not set until the end of the constructor body; before they are set, the toString result will always be Node[label=null, edges=null].)
Finally, in normal identity classes, we think developers should write constructors and initializers that avoid the risk of larval object leakage by generally adopting the early construction constraints: read and write the declared fields of the class, but otherwise avoid any dependency on this, and where a dependency is necessary, mark it as deliberate by putting it after an explicit super(...) or this(...) call. To encourage this style, javac provides lint warnings indicating this dependencies in normal identity class constructors. (In the future, we anticipate that normal identity classes will have a way to adopt the constructor timing of value classes and records. A class that compiles without warning will likely be able to cleanly make that transition.)
Inherited methods of java.lang.Object
By default, a value class inherits equals, hashCode, and toString from java.lang.Object.
- The inherited implementation of Object.equals uses == to compare objects. For value objects, this tests for statewise equivalence, based on the objects' field values. This might be the right equals behavior for a value class, but if it isn't then the class author must override equals.
- The inherited implementation of Object.hashCode computes a hash from the object's field values. (This value can also be computed via System.identityHashCode.) As usual, the hashCode method should be overridden by a value class whenever it overrides equals.
- The default behavior of Object.toString is to return a string of the form "ClassName@hashCode". Since the default hashCode of a value object is derived from its field values, not its identity, most value class authors will want to override toString to more legibly convey the domain value represented by the object.
In a value record, as for all records, the default equals, hashCode, and toString behavior is to recursively apply the same operations to the record components.
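For comparison, the inherited defaults and the record-generated methods can be observed with ordinary identity classes on a JDK without preview features. This sketch uses hypothetical names (ObjectMethodsDemo, Plain, Pair, defaultForm); Plain is an identity class here, so its inherited equals compares identity rather than field values as it would for a value class.

```java
final class ObjectMethodsDemo {
    /** Identity class that overrides nothing inherited from Object. */
    static final class Plain {
        final int x;
        Plain(int x) { this.x = x; }
    }

    /** Record: equals, hashCode, and toString are derived from the components. */
    record Pair(int x, String label) { }

    /** Rebuilds the documented default form of Object.toString. */
    static String defaultForm(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Plain a = new Plain(1), b = new Plain(1);
        System.out.println(a.equals(b));                         // false: Object.equals uses ==
        System.out.println(a.toString().equals(defaultForm(a))); // true
        Pair p = new Pair(1, "a"), q = new Pair(1, "a");
        System.out.println(p.equals(q));                         // true
        System.out.println(p.hashCode() == q.hashCode());        // true
    }
}
```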
A few other methods of Object interact with value objects in interesting ways:
- For a Cloneable value class, the Object.clone method produces a value object that is indistinguishable from the original; the usual expectation that x.clone() != x is not meaningful for value objects. Value classes that store references to identity objects may wish to override clone and perform a "deep copy" of these identity objects.
- The wait and notify methods require that the object be locked in the current thread; since it is impossible to synchronize on a value object, attempts to call these methods will always fail with an IllegalMonitorStateException.
- The finalize method of a value object will never be invoked by the garbage collector.
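The locking precondition behind the wait and notify behavior already applies to identity objects, and can be seen on a JDK without preview features. The sketch below uses hypothetical names (MonitorDemo and its two methods): calling notify without holding the object's monitor throws IllegalMonitorStateException, and since a value object's monitor can never be held, that is the only possible outcome for value objects.

```java
final class MonitorDemo {
    /** Returns true if notify() is rejected because the monitor is not held. */
    static boolean notifyFailsWithoutLock(Object o) {
        try {
            o.notify();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    /** Returns true once notify() completes while the monitor is held. */
    static boolean notifySucceedsWithLock(Object o) {
        synchronized (o) {
            o.notify();
            return true;
        }
    }

    public static void main(String[] args) {
        Object lock = new Object(); // an identity object
        System.out.println(notifyFailsWithoutLock(lock)); // true
        System.out.println(notifySucceedsWithLock(lock)); // true
    }
}
```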
Limitations of value classes
Value classes, and especially value records, are useful tools for modeling immutable domain values that are interchangeable when two instances represent the same value.
As a general rule, if a class with immutable state doesn't need identity, it should be a value class. This includes abstract classes, which often have no state at all and shouldn't impose an identity requirement on their subclasses.
For final classes with final fields, applying or removing the value keyword is a binary-compatible change. It is also source-compatible in most respects, except where a program attempts to apply a synchronized statement to an expression of the class's type.
Before applying the value keyword, class authors should be aware of some limitations of value classes:
Incompatible behavior. Developers who maintain published identity classes should decide whether any users are likely to depend on identity-sensitive behavior. For example, if these classes have not overridden equals and hashCode, those methods' behavior is identity-sensitive.
Classes with public constructors are particularly at risk, because in the past users could count on the new operation producing a unique identity that the user "owned". Subsequent uses of == or synchronization may depend on that assumption of unique ownership.
In anticipation of changing behavior, the value classes in the standard library have long been marked as value-based, warning that users should not depend on the unique identities of instances. Most of these value classes have allowed object creation only through factory methods. Since Java 16, JEP 390 (Warnings for Value-Based Classes) has discouraged the use of synchronization with these JDK classes.
Exposure of sensitive state. Because the == operator and System.identityHashCode depend on a value object's state, a malicious user could use those operations to try to infer the internal state of an object. Value classes are not designed to protect sensitive data against such attacks.
Limited serialization support. Some classes that model domain values are likely to implement Serializable. Traditional object deserialization in the JDK does not safely initialize the class's fields, and so is incompatible with value classes. Attempts to serialize or deserialize a value object will generally fail unless it is a value record or a JDK class instance.
Value class authors can work around this limitation with a serialization proxy, using the writeReplace and readResolve methods. (A value record may be a good candidate for a proxy class.) In the future, enhancements to the serialization mechanism are anticipated that will allow value classes to be serialized and deserialized directly.
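The serialization proxy workaround might look like the following. This is a sketch under stated assumptions, not a prescribed pattern: Temperature, TemperatureProxy, and SerializationProxyDemo are hypothetical names, and Temperature is written as an ordinary final class (no value modifier) so that the example runs without preview features. The proxy is a plain record here; the text above suggests a value record could play the same role.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical domain class standing in for a value class.
final class Temperature implements Serializable {
    private static final long serialVersionUID = 1L;
    private final double kelvin;

    Temperature(double kelvin) {
        if (kelvin < 0) throw new IllegalArgumentException("negative kelvin");
        this.kelvin = kelvin;
    }

    double kelvin() { return kelvin; }

    // Serialization writes the proxy in place of this object.
    private Object writeReplace() {
        return new TemperatureProxy(kelvin);
    }
}

// Record proxy: it is deserialized through its canonical constructor, then
// replaced by a Temperature built with the normal, validating constructor.
record TemperatureProxy(double kelvin) implements Serializable {
    private Object readResolve() {
        return new Temperature(kelvin);
    }
}

class SerializationProxyDemo {
    // Serializes and deserializes in memory; checked exceptions are wrapped.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Temperature copy = (Temperature) roundTrip(new Temperature(293.15));
        System.out.println(copy.kelvin());
    }
}
```

The design choice is that the domain class's fields are only ever set by its own constructor; the serialization machinery constructs and populates the proxy instead.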
No garbage collector interaction. Value class authors give up the ability to manually interact with memory management and garbage collection via finalize methods and the java.lang.ref API. Code in finalize methods will never be run. Attempts to create Reference objects throw an IdentityException at run time, and javac produces identity warnings about uses of the API at compile time.
Since JDK 25, javac has produced identity warnings about the java.lang.ref API being applied to value-based classes.
No deep reflection. Third-party tools that modify final fields via Field.setAccessible are incompatible with safe initialization and will not be able to modify value class fields. These tools will need to initialize instances of a value class using the class's constructors, not indirectly with deep reflection.
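Records on current JDKs already have a comparable restriction, which gives a feel for what tools should expect. The sketch below is an analogy only: it uses a hypothetical record Money rather than a value class, and it assumes JDK 16 or later, where Field.set refuses to write the final fields of a record even after setAccessible(true). How a preview build reports the failure for value classes is not shown here.

```java
import java.lang.reflect.Field;

// Hypothetical record used as a stand-in; it is not a value class.
record Money(long cents) {}

class DeepReflectionDemo {
    // Returns true if reflection was able to overwrite the named field.
    static boolean tryOverwrite(Object target, String fieldName, Object newValue) {
        try {
            Field f = target.getClass().getDeclaredField(fieldName);
            f.setAccessible(true);
            // For a record's final field this call throws IllegalAccessException.
            f.set(target, newValue);
            return true;
        } catch (IllegalAccessException e) {
            return false;
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        Money m = new Money(100);
        System.out.println(tryOverwrite(m, "cents", 5L));
        // A tool that needs a different value constructs a new instance instead.
        System.out.println(new Money(5));
    }
}
```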
Run-time optimizations for value objects
At run time, the JVM will typically optimize the use of value object references by avoiding traditional heap object allocation as much as possible, preferring reference encodings that refer to data stored on the stack, or that embed the data in the reference itself.
As we saw earlier, an array of LocalDate references might be flattened so that the array stores the objects' data directly. (The details of flattened encodings will vary, of course, at the discretion of the JVM implementation.)
+--------------+
| LocalDate[5] |
+--------------+
| 1|1996|01|23 |
| 1|1996|01|23 |
| 1|2026|01|23 |
| 0|0000|00|00 |
| 1|1996|01|23 |
+--------------+
An array of boxed Integer objects can be similarly flattened, in this case by simply concatenating a null flag to each int value.
+--------------+
| Integer[5]   |
+--------------+
| 1|1996       |
| 1|2006       |
| 1|1996       |
| 1|1          |
| 1|23         |
+--------------+
The layout of this array is not significantly different from that of a plain int array, except that it requires some extra bits for each null flag (in practice, this probably means that each reference takes up 64 bits).
Flattening can be applied to fields as well: a LocalDateTime on the heap could store flattened LocalDate and LocalTime references directly in its object layout.
+----------------------+
| LocalDateTime        |
+----------------------+
| date=1|2026|01|23    |
| time=1|09|00|00|0000 |
+----------------------+
Heap flattening must maintain the integrity of object data. For example, the flattened reference must be read and written atomically, or it could become corrupted. On common platforms, this limits the size of most flattened references to no more than 64 bits. So while it would theoretically be possible to flatten LocalDateTime references too, in practice they would probably be too big. In the future, 128-bit flattened encodings may be used on platforms that support atomic reads and writes of that size. And the Null-Restricted Value Class Types JEP will enable heap flattening for even larger value classes if the programmer is willing to opt out of atomicity guarantees.
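To make the 64-bit budget concrete, the following sketch packs a nullable LocalDate into a single long by hand. It is purely illustrative: the bit layout, the class name FlatDateSketch, and the use of AtomicLongArray as a stand-in for an atomically accessed flattened array are all assumptions, and nothing here reflects how HotSpot actually encodes flattened references.

```java
import java.time.LocalDate;
import java.util.concurrent.atomic.AtomicLongArray;

class FlatDateSketch {
    // Assumed layout (49 bits used): bit 48 = non-null flag,
    // bits 16..47 = year, bits 8..15 = month, bits 0..7 = day.
    static long encode(LocalDate d) {
        if (d == null) return 0L; // the all-zero word stands for null
        return (1L << 48)
                | ((d.getYear() & 0xFFFFFFFFL) << 16)
                | ((long) d.getMonthValue() << 8)
                | d.getDayOfMonth();
    }

    static LocalDate decode(long word) {
        if ((word >>> 48 & 1L) == 0) return null;
        int year = (int) (word >>> 16);       // low 32 bits after the shift
        int month = (int) (word >>> 8 & 0xFF);
        int day = (int) (word & 0xFF);
        return LocalDate.of(year, month, day);
    }

    public static void main(String[] args) {
        // Each slot is one 64-bit word that is read and written atomically.
        AtomicLongArray flat = new AtomicLongArray(5);
        flat.set(0, encode(LocalDate.of(1996, 1, 23)));
        System.out.println(decode(flat.get(0)));
        System.out.println(decode(flat.get(3))); // untouched slot decodes to null
    }
}
```

A LocalDateTime would need the date fields plus hour, minute, second, and nanosecond components, which is why the same trick does not fit in one 64-bit word.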
When these flattened references are read from heap storage, they need to be re-encoded in a form that the JVM can readily work with. One strategy is to store each field of the flattened reference in a separate local variable. This set of local variables constitutes a scalarized encoding of the value object reference.
In HotSpot, scalarization is a JIT compilation technique, affecting the representation of references to value objects in the bodies and signatures of JIT-compiled methods.
The following code reads a LocalDate from an array and invokes the plusYears method. A simplified version of the plusYears method is included for reference.
LocalDate d = arr[0];
arr[0] = d.plusYears(30);

public LocalDate plusYears(long yearsToAdd) {
    // avoid overflow:
    int newYear = YEAR.checkValidIntValue(this.year + yearsToAdd);
    // (simplification, skipping leap year adjustment)
    return new LocalDate(newYear, this.month, this.day);
}
In pseudo-code, the JIT-compiled code might look like the following, where the notation { ... } refers to a vector of multiple values. (Importantly, this is purely notational: there is no wrapper at run time.)
{ d_null, d_year, d_month, d_day } = $decode(arr[0]);
arr[0] = $encode($plusYears(d_null, d_year, d_month, d_day, 30));

static { boolean, int, byte, byte }
    $plusYears(boolean this_null, int this_year,
               byte this_month, byte this_day,
               long yearsToAdd) {
    if (this_null) throw new NullPointerException();
    int newYear = YEAR.checkValidIntValue(this_year + yearsToAdd);
    return { false, newYear, this_month, this_day };
}
Notice that this code never interacts with a pointer to a heap-allocated LocalDate: a flattened reference is converted to a scalarized reference, a new scalarized reference is created, and then that reference is converted to another flattened reference.
Unlike heap flattening, scalarization is not constrained by the size of the data, because local variables being operated on in the stack are not at risk of data races. A scalarized encoding of a LocalDateTime reference might consist of a null flag, four components for the LocalDate reference, and five components for the LocalTime reference.
JVMs have used similar techniques to scalarize identity objects in local code when the JVM is able to prove that an object's identity is never used. But scalarization of value objects is more predictable and far-reaching, even across non-inlinable method invocation boundaries.
One limitation of both heap flattening and scalarization is that they are not typically applied to a variable whose type is a supertype of a value class type. Notably, this includes method parameters of generic code whose erased type is Object. Instead, when an assignment to a supertype occurs, a scalarized value object reference may be converted to an ordinary heap object reference. But this allocation occurs only when necessary, and as late as possible.
Scope of changes
Value classes and objects have a broad and deep impact on the Java Platform. This JEP includes preview language, VM, and library features, summarized as follows.
In the Java language:
-
The value class modifier, with associated compilation rules, opts in to the semantics of value classes.
-
Safe construction rules are enforced for record classes.
In the JVM:
-
The ACC_IDENTITY flag indicates that a class is an identity class. It is left unset for value classes and interfaces. This flag replaces the ACC_SUPER flag, which has been unused by the JVM since Java 8.
-
The LoadableDescriptors attribute lists the names of value classes appearing in the field or method descriptors of the current class. This attribute authorizes the JVM to load the named value classes early enough that it can optimize the layouts of references to instances from the current class.
-
Heap storage and JIT-compiled code are engineered to optimize the handling of value object references.
(Additionally, compiled value classes use the features of Strict Field Initialization in the JVM (Preview) to guarantee that the class's fields are properly initialized.)
In the Java platform API:
-
The full list of classes in the JDK that are treated as value classes when preview features are enabled is as follows:
In java.lang: Integer, Long, Float, Double, Byte, Short, Boolean, Character, Number, and Record
In java.util: Optional, OptionalInt, OptionalLong, and OptionalDouble
In java.time: LocalDate, Period, Year, YearMonth, MonthDay, LocalTime, Instant, Duration, LocalDateTime, OffsetTime, OffsetDateTime, and ZonedDateTime
In java.time.chrono: HijrahDate, JapaneseDate, MinguoDate, ThaiBuddhistDate, and ChronoLocalDateImpl
-
The methods Objects.hasIdentity and Objects.requireIdentity, and the IdentityException class, are (reflective?) preview APIs.
-
The java.lang.Object, java.lang.ref, System.identityHashCode, and serialization APIs are modified to give special handling to value objects.
Future Work
Null-Restricted Value Class Types (Preview) will build on this JEP, allowing programmers to manage the storage of nulls and enable more dense heap flattening in fields and arrays.
Enhanced Primitive Boxing (Preview) will enhance the language's use of primitive types, taking advantage of the lighter-weight characteristics of boxing to value objects.
JVM class and method specialization (JEP 218, with revisions) will allow generic classes and methods to specialize field, array, and local variable layouts when parameterized by value class types.
Alternatives
As discussed, JVMs have long performed escape analysis to identify objects that never rely on identity throughout their lifespan and can be scalarized. These optimizations are somewhat unpredictable, and do not help with objects that escape the scope of the optimization, including storage in fields and arrays.
Hand-coded optimizations via primitive values are possible to improve performance, but as noted in the "Motivation" section, these techniques require giving up valuable abstractions.
The C language and its relatives support flattened storage for structs and similar class-like abstractions. For example, the C# language has value types. Unlike value objects, instances of these abstractions have identity, meaning they support operations such as field mutation. As a result, the semantics of copying on assignment, invocation, etc., must be carefully specified, leading to a more complex user model and less flexibility for runtime implementations. We prefer an approach that leaves these low-level details to the discretion of JVM implementations.
Risks and Assumptions
The feature makes significant changes to the Java object model. Developers may be surprised by, or encounter bugs due to, changes in the behavior of operations such as == and synchronized. We expect such disruptions to be rare and tractable.
Some changes could potentially affect the performance of identity objects. The if_acmpeq test, for example, typically only costs one instruction cycle, but will now need an additional check to detect value objects. But the identity class case can be optimized as a fast path, and we believe we have minimized any performance regressions.
There is a security risk that == and hashCode can indirectly expose private field values. Further, two large trees of value objects can take unbounded time to compute ==. Developers need to understand these risks.
Dependencies
Strict Field Initialization in the JVM (Preview) provides the JVM mechanism necessary to require, through verification, that value class instance fields are initialized during early construction