JEP 513: Flexible Constructor Bodies
Author | Archie Cobbs & Gavin Bierman |
Owner | Gavin Bierman |
Type | Feature |
Scope | SE |
Status | Candidate |
Component | specification / language |
Discussion | amber dash dev at openjdk dot org |
Relates to | JEP 492: Flexible Constructor Bodies (Third Preview) |
Reviewed by | Alex Buckley, Brian Goetz |
Created | 2024/11/21 12:03 |
Updated | 2025/04/22 11:49 |
Issue | 8344702 |
Summary
In the body of a constructor, allow statements to appear before an explicit constructor invocation, i.e., super(...) or this(...). Such statements cannot reference the object under construction, but they can initialize its fields and perform other safe computations. This change allows many constructors to be expressed more naturally. It also allows fields to be initialized before they become visible to other code in the class, such as methods called from a superclass constructor, thereby improving safety.
History
Flexible constructor bodies were first proposed as a preview feature by JEP 447 (JDK 22), under a different title. They were revised and re-previewed by JEP 482 (JDK 23) and then previewed again, without change, by JEP 492 (JDK 24). We here propose to finalize the feature in JDK 25, without change.
Goals
- Remove unnecessary restrictions on code in constructors, so that arguments can easily be validated before calling superclass constructors.
- Provide additional guarantees that the state of a new object is fully initialized before any code can use it.
- Reimagine the process of how constructors interact with each other to create a fully initialized object.
Motivation
The constructors of a class are responsible for creating valid instances of the class. Typically, a constructor validates and transforms its arguments and then initializes the fields declared in its class to legitimate values. In the presence of subclassing, constructors of superclasses and subclasses share responsibility for creating valid instances.
For example, consider a Person class with an Employee subclass. Every Employee constructor will invoke, either implicitly or explicitly, a Person constructor, and the two constructors should work together to construct a valid instance. The Employee constructor is responsible for the fields declared in the Employee class, while the Person constructor is responsible for the fields declared in the Person class. Since code in the Employee constructor can refer to fields declared in the Person class, it is only safe for the Employee constructor to access those fields after the Person constructor has finished assigning values to them.
The Java language ensures construction of valid instances by running constructors from the top down: A constructor in a superclass runs before a constructor in a subclass. To achieve this, the language requires the first statement in a constructor to be a constructor invocation, i.e., super(...) or this(...). If no such statement exists then the Java compiler inserts a superclass constructor invocation, i.e., super().
Since the superclass constructor runs first, fields declared in the superclass are initialized before fields declared in the subclass. Thus the Person constructor runs in its entirety before the Employee constructor validates its arguments, which means that the Employee constructor can assume that the Person constructor has properly initialized the fields declared in Person.
Constructors are too restrictive
The top-down rule for constructors helps ensure that newly-constructed instances are valid, but it outlaws some familiar and reasonable programming patterns. Developers are often frustrated that they cannot write code in constructors that is perfectly safe.
For example, suppose that our Person class has an age field, but that employees are required to be between the ages of 18 and 67 years old. In the Employee constructor, we would like to validate an age argument before passing it to the Person constructor, but the constructor invocation must come first. We can validate the argument afterwards, but that means doing the potentially unnecessary work of invoking the superclass constructor:
class Person {
    ...
    int age;

    Person(..., int age) {
        if (age < 0)
            throw new IllegalArgumentException(...);
        ...
        this.age = age;
    }
}

class Employee extends Person {
    Employee(..., int age) {
        super(..., age);        // Potentially unnecessary work
        if (age < 18 || age > 67)
            throw new IllegalArgumentException(...);
    }
}
It would be better to declare an Employee constructor that fails fast, by validating its argument before invoking the Person constructor. This is clearly safe, but since the constructor invocation must come first the only way to do this is to call an auxiliary method in-line, as part of the constructor invocation:
class Employee extends Person {
    private static int verifyAge(int value) {
        // Check the parameter itself; no variable named age is in scope here
        if (value < 18 || value > 67)
            throw new IllegalArgumentException(...);
        return value;
    }

    Employee(..., int age) {
        super(..., verifyAge(age));
    }
}
The requirement that the superclass constructor must come first causes trouble in other scenarios, too. For example, we might need to perform some non-trivial computation to prepare the arguments for a superclass constructor invocation. Or, we might need to prepare a complex value to be shared among several arguments of a superclass constructor invocation.
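As a rough illustration of the last scenario, the sketch below shows the kind of workaround the current rule forces when one computed value must feed several superclass constructor arguments. The value is computed in a static helper and threaded through a private auxiliary constructor. The classes Interval and SymmetricInterval and the helper parseRadius are hypothetical names invented for this example.

```java
// Hypothetical classes, for illustration only.
class Interval {
    final int lo;
    final int hi;

    Interval(int lo, int hi) {
        this.lo = lo;
        this.hi = hi;
    }
}

class SymmetricInterval extends Interval {
    // Entry point: the radius must be parsed once and used for both
    // superclass arguments, so it is routed through an auxiliary constructor.
    SymmetricInterval(String spec) {
        this(parseRadius(spec));
    }

    // Auxiliary constructor that exists only to give the shared value a name.
    private SymmetricInterval(int radius) {
        super(-radius, radius);
    }

    // Static helper: the only place where statements can run before super(...).
    private static int parseRadius(String spec) {
        int radius = Integer.parseInt(spec.trim());
        if (radius < 0)
            throw new IllegalArgumentException("negative radius: " + radius);
        return radius;
    }
}
```

The extra constructor and helper carry no meaning of their own; they exist only because no statement may precede the constructor invocation.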
Superclass constructors can violate the integrity of subclasses
Each class has a specification, either expressed or assumed, of the valid states of its own fields. The class’s implementation, if written correctly, establishes and preserves only valid states. It does so regardless of the actions of its superclasses, subclasses, and all other classes in the program. In other words, every class is intended to have integrity. An instance has integrity insofar as its class and all its superclasses have integrity.
The top-down rule ensures that a superclass constructor always runs before the subclass constructor, ensuring that the fields of the superclass are initialized properly. Unfortunately, the rule is not sufficient to ensure the integrity of the new instance as a whole. The superclass constructor can, indirectly, access fields of the subclass before the subclass constructor initializes them. For example, suppose the Employee class has an officeID field, and the constructor in Person calls a method which is overridden in Employee:
class Person {
    ...
    int age;

    void show() {
        System.out.println("Age: " + this.age);
    }

    Person(..., int age) {
        if (age < 0)
            throw new IllegalArgumentException(...);
        ...
        this.age = age;
        show();
    }
}

class Employee extends Person {
    String officeID;

    @Override
    void show() {
        System.out.println("Age: " + this.age);
        System.out.println("Office: " + this.officeID);
    }

    Employee(..., int age, String officeID) {
        super(..., age);        // Potentially unnecessary work
        if (age < 18 || age > 67)
            throw new IllegalArgumentException(...);
        this.officeID = officeID;
    }
}
What does new Employee(42, "CAM-FORA") print? You might expect it to print Age: 42, and perhaps additionally Office: CAM-FORA, but actually it prints Age: 42 and Office: null! This is because the Person constructor runs before the officeID field is initialized by the Employee constructor. The Person constructor calls the show method, causing the overriding show method in Employee to run, all before the Employee constructor initializes the officeID field to "CAM-FORA". As a result, the show method prints the default value of the officeID field, which is null.
This behavior violates the integrity of the Employee class, which requires that its fields not be accessed before they are initialized to valid states by its constructor. Even final fields in Employee can be accessed before they are initialized to their final values, thus the values of final fields can be observed to change!
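The effect on final fields can be reproduced with a small self-contained program. The sketch below is a simplified variant of the Person and Employee classes with the elided parameters dropped. It assumes a hypothetical wrapper class FinalFieldDemo and records output in a list named observed instead of printing directly, so that the sequence of values is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo wrapper; the nested classes mirror Person and Employee.
class FinalFieldDemo {
    // Collects what show() reports, in order.
    static final List<String> observed = new ArrayList<>();

    static class Person {
        final int age;

        Person(int age) {
            this.age = age;
            show();     // Overridable call made while construction is in progress
        }

        void show() {
            observed.add("Age: " + age);
        }
    }

    static class Employee extends Person {
        final String officeID;

        Employee(int age, String officeID) {
            super(age);                 // show() runs in here, before the next line
            this.officeID = officeID;
        }

        @Override
        void show() {
            observed.add("Age: " + age);
            observed.add("Office: " + officeID);
        }
    }

    public static void main(String[] args) {
        Employee e = new Employee(42, "CAM-FORA");
        e.show();       // Second call, after construction has finished
        observed.forEach(System.out::println);
    }
}
```

Running main reports Office: null from the call made during construction and Office: CAM-FORA from the call made afterwards, even though officeID is declared final.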
This particular example is troublesome because constructors can invoke overridable methods. While doing so is considered bad practice (Item 19 of Effective Java advises that "Constructors must not invoke overridable methods"), it is not uncommon and is a source of subtle real-world bugs. But this is just one example of such behavior. For another example, there is nothing to stop a superclass constructor from passing the current instance to another method that accesses subclass fields before they are assigned values by the subclass constructor.
Toward more expressiveness and safety
In sum, the top-down rule often limits the expressiveness of constructors. There is, moreover, little that a class can do to defend itself against violations of its integrity by its own superclasses or by other code. We need a solution to both problems.
Description
We propose to remove the simplistic syntactic top-down rule, enforced since the Java language was created, that every constructor body begin, either explicitly or implicitly, with a constructor invocation, i.e., super(...) or this(...).
This change allows us to write readable constructors that validate their arguments before invoking superclass constructors. For example, we can write our Employee constructor directly and more clearly to fail fast:
class Employee extends Person {
    String officeID;

    Employee(..., int age, String officeID) {
        if (age < 18 || age > 67)
            // Now fails fast!
            throw new IllegalArgumentException(...);
        super(..., age);
        this.officeID = officeID;
    }
}
This change also enables us to ensure that subclass constructors establish integrity by initializing their fields before invoking superclass constructors. For example, we can further revise the Employee constructor to initialize the officeID field before invoking the superclass constructor:
class Employee extends Person {
    String officeID;

    Employee(..., int age, String officeID) {
        if (age < 18 || age > 67)
            // Now fails fast!
            throw new IllegalArgumentException(...);
        this.officeID = officeID;   // Initialize before calling superclass constructor!
        super(..., age);
    }
}
Now, new Employee(42, "CAM-FORA") prints Age: 42 and Office: CAM-FORA, as expected. The integrity of the Employee class is maintained.
A new model for constructor bodies
Dropping the top-down rule represents a new semantic model for constructor bodies. A constructor body now has two distinct phases: The prologue is the code before the invocation of the next constructor, and the epilogue is the code after that invocation.
To illustrate, consider this class hierarchy:
class Object {
    Object() {
        // Object constructor body
    }
}

class A extends Object {
    A() {
        super();
        // A constructor body
    }
}

class B extends A {
    B() {
        super();
        // B constructor body
    }
}

class C extends B {
    C() {
        super();
        // C constructor body
    }
}

class D extends C {
    D() {
        super();
        // D constructor body
    }
}
Currently, when creating a new instance of class D, via new D(), the invocations of the constructors and the executions of their bodies flow like this:
D
--> C
    --> B
        --> A
            --> Object constructor body
        --> A constructor body
    --> B constructor body
--> C constructor body
D constructor body
That is, the constructors are invoked bottom-up, starting at the bottom of the hierarchy, but the constructor bodies run top-down, starting at the top of the hierarchy with the class Object and moving down, one by one, through the subclasses.
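The top-down order of constructor bodies is easy to confirm empirically. The following sketch assumes a hypothetical wrapper class ConstructorOrderDemo with a list named trace; each constructor body appends a label to the list when it runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo wrapper; A through D mirror the hierarchy described above.
class ConstructorOrderDemo {
    // Labels are appended in the order in which constructor bodies execute.
    static final List<String> trace = new ArrayList<>();

    static class A {
        A() {
            super();    // Invokes the Object constructor
            trace.add("A constructor body");
        }
    }

    static class B extends A {
        B() {
            super();
            trace.add("B constructor body");
        }
    }

    static class C extends B {
        C() {
            super();
            trace.add("C constructor body");
        }
    }

    static class D extends C {
        D() {
            super();
            trace.add("D constructor body");
        }
    }

    public static void main(String[] args) {
        new D();
        trace.forEach(System.out::println);
    }
}
```

Creating a D records the bodies of A, B, C, and D in that order, although D's constructor is the first to be invoked.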
When constructor bodies have both a prologue and an epilogue, we can generalize the class declarations:
class Object {
    Object() {
        // Object constructor body
    }
}

class A extends Object {
    A() {
        // A prologue
        super();
        // A epilogue
    }
}

class B extends A {
    B() {
        // B prologue
        super();
        // B epilogue
    }
}

class C extends B {
    C() {
        // C prologue
        super();
        // C epilogue
    }
}

class D extends C {
    D() {
        // D prologue
        super();
        // D epilogue
    }
}
The invocations of the constructors and the executions of the prologues and epilogues flow like this:
D prologue
--> C prologue
    --> B prologue
        --> A prologue
            --> Object constructor body
        --> A epilogue
    --> B epilogue
--> C epilogue
D epilogue
That is, the prologues run bottom-up and then the epilogues run top-down. We can create valid instances by writing prologues which ensure, from the bottom up, that the fields of each subclass are assigned valid values. That, in turn, enables us to write epilogues secure in the knowledge that the state which they observe is valid, so they can freely reference the instance under construction.
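The bottom-up-then-top-down ordering can be approximated even on compilers that predate this feature, because the argument list of an explicit constructor invocation is evaluated before the invoked constructor runs. The sketch below uses that as a stand-in for a prologue; it is not the new syntax itself. The wrapper class PrologueOrderDemo, the list trace, the helper note, and the dummy int parameters are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: argument-list evaluation stands in for prologue code.
class PrologueOrderDemo {
    static final List<String> trace = new ArrayList<>();

    // Records an event and returns a dummy value to pass as an argument.
    static int note(String event) {
        trace.add(event);
        return 0;
    }

    static class A {
        A(int ignored) {
            note("A constructor body");
        }
    }

    static class B extends A {
        B(int ignored) {
            super(note("B prologue"));  // Evaluated before A's constructor runs
            note("B epilogue");
        }
    }

    static class C extends B {
        C() {
            super(note("C prologue"));  // Evaluated before B's constructor runs
            note("C epilogue");
        }
    }

    public static void main(String[] args) {
        new C();
        trace.forEach(System.out::println);
    }
}
```

Creating a C records C prologue, B prologue, A constructor body, B epilogue, and C epilogue, in that order.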
Syntax
We revise the grammar of constructor bodies to allow statements before explicit constructor invocations; that is, from:
ConstructorBody:
    { [ExplicitConstructorInvocation] [BlockStatements] }
to:
ConstructorBody:
    { [BlockStatements] ExplicitConstructorInvocation [BlockStatements] }
    { [BlockStatements] }
Eliding some details, an ExplicitConstructorInvocation is either a superclass constructor invocation, i.e., super(...), or an alternate constructor invocation, i.e., this(...).
The statements that appear before an explicit constructor invocation constitute the prologue of the constructor body.
The statements that appear after an explicit constructor invocation constitute the epilogue of the constructor body.
A constructor body need not contain an explicit constructor invocation. In that case the prologue is empty, an invocation of the constructor of the direct superclass that takes no arguments, i.e., super(), is considered to implicitly appear at the beginning of the constructor body, and all of the statements in the constructor body constitute the epilogue.
A return statement is permitted in the epilogue of a constructor body, but it must not include an expression. That is, return is permitted but return e is not. It is a compile-time error for a return statement to appear in the prologue of a constructor body.
Throwing an exception in the prologue or epilogue of a constructor body is permitted. Throwing an exception in the prologue will be typical in fail-fast scenarios.
Early construction contexts
Currently, code that appears in the argument list of an explicit constructor invocation is said to appear in a static context. This means that the arguments to the explicit constructor invocation are treated as if they were code in a static method, in which no instance is available. The technical restrictions of a static context are stronger than necessary, however, and they prevent code that is useful and safe from appearing as constructor arguments.
Rather than revise the concept of a static context, we introduce the concept of an early construction context that covers both the argument list of an explicit constructor invocation and any statements that appear before it in the constructor body, i.e., in the prologue. Code in an early construction context must not use the instance under construction, except to initialize fields that do not have their own initializers.
In other words, code in an early construction context must not use this, either explicitly or implicitly, to refer to the current instance or access fields or invoke methods of the current instance. The only exception to this rule is that such code may use simple assignment statements to fields declared in the same class, provided that the declarations of those fields do not have initializers.
For example:
class X {
    int i;
    String s = "hello";

    X() {
        System.out.print(this);  // Error - explicitly refers to the current instance
        var x = this.i;          // Error - explicitly refers to field of the current instance
        this.hashCode();         // Error - explicitly refers to method of the current instance
        var y = i;               // Error - implicitly refers to field of the current instance
        hashCode();              // Error - implicitly refers to method of the current instance
        i = 42;                  // OK - assignment to an uninitialized declared field
        s = "goodbye";           // Error - assignment to an initialized declared field
        super();
    }
}
A further restriction is that code in an early construction context must not use super to access fields or invoke methods of the superclass:
class Y {
    int i;
    void m() { ... }
}

class Z extends Y {
    Z() {
        var x = super.i;         // Error
        super.m();               // Error
        super();
    }
}
Records
Constructors of record classes are already subject to more restrictions than constructors of normal classes. In particular,
- Canonical record constructors must not contain an explicit constructor invocation, and
- Non-canonical record constructors must contain an alternate constructor invocation, i.e., this(...), and not a superclass constructor invocation, i.e., super(...).
These restrictions remain. Otherwise, record constructors benefit from the changes described above, primarily because non-canonical record constructors may now contain statements before the alternate constructor invocation.
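A minimal sketch of how this might look in a record follows. It requires a compiler that supports flexible constructor bodies (JDK 25, or JDK 22 through 24 with preview features enabled). The record Range, its "lo..hi" text format, and the local names parts, low, and high are hypothetical and exist only for this example.

```java
// Hypothetical record, for illustration only.
record Range(int lo, int hi) {

    // Canonical constructor (compact form): no explicit constructor invocation.
    Range {
        if (lo > hi)
            throw new IllegalArgumentException("lo > hi");
    }

    // Non-canonical constructor: must still delegate via this(...),
    // but statements may now appear before that invocation.
    Range(String spec) {
        String[] parts = spec.split("\\.\\.");
        if (parts.length != 2)
            throw new IllegalArgumentException("expected lo..hi but got: " + spec);
        int low = Integer.parseInt(parts[0].trim());
        int high = Integer.parseInt(parts[1].trim());
        this(low, high);
    }
}
```

Without statements before this(...), the parsing would have to be split across static helper methods called from the argument list, once per component.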
Enums
Constructors of enum classes may contain alternate constructor invocations, but not superclass constructor invocations. Just like record classes, enum classes benefit from the changes described above, primarily because their constructors may now contain statements before the alternate constructor invocation.
Nested classes
When class declarations are nested, the code of an inner class can refer to the instance of an enclosing class. This is because the instance of the enclosing class is created before the instance of the inner class. The code of the inner class (including constructor bodies) can access fields and invoke methods of the enclosing instance, using either simple names or qualified this expressions. Accordingly, operations on an enclosing instance are permitted in early construction contexts.
In the code below, the declaration of Inner is nested in the declaration of Outer, so every instance of Inner has an enclosing instance of Outer. In the constructor of Inner, code in the early construction context can refer to the enclosing instance and its members, either via simple names or via Outer.this.
class Outer {
    int i;

    void hello() { System.out.println("Hello"); }

    class Inner {
        int j;

        Inner() {
            var x = i;             // OK - implicitly refers to field of enclosing instance
            var y = Outer.this.i;  // OK - explicitly refers to field of enclosing instance
            hello();               // OK - implicitly refers to method of enclosing instance
            Outer.this.hello();    // OK - explicitly refers to method of enclosing instance
            super();
        }
    }
}
By contrast, in the constructor of Outer shown below, code in the early construction context cannot instantiate the Inner class with new Inner(). This expression is really this.new Inner(), meaning that it uses the current instance of Outer as the enclosing instance for the Inner instance. Per the earlier rule, code in an early construction context must not use this, either explicitly or implicitly, to refer to the current instance.
class Outer {
    class Inner {}

    Outer() {
        var x = new Inner();       // Error - implicitly refers to the current instance of Outer
        var y = this.new Inner();  // Error - explicitly refers to the current instance of Outer
        super();
    }
}
Testing
- We will test the compiler changes with existing unit tests, unchanged except for those tests that verify changed behavior, plus new positive and negative test cases as appropriate.
- We will compile all JDK classes using the previous and new versions of the compiler and verify that the resulting bytecode is identical.
- No platform-specific testing should be required.
Risks and Assumptions
The changes we propose above are both source- and behavior-compatible. They strictly expand the set of legal Java programs while preserving the meaning of all existing Java programs.
These changes, though modest in themselves, represent a significant change in how constructors participate in safe object initialization. They relax the long-standing requirement that a constructor invocation, if present, must always be the first statement in a constructor body. This requirement is deeply embedded in code analyzers, style checkers, syntax highlighters, development environments, and other tools in the Java ecosystem. As with any language change, there may be a period of pain as tools are updated.
Dependencies
The Java Virtual Machine
Flexible constructor bodies in the Java language depend on the ability of the JVM to verify and execute arbitrary code that appears before constructor invocations in constructors, so long as that code does not reference the instance under construction except to initialize uninitialized fields.
Fortunately, the JVM already supports a more flexible treatment of constructor bodies:
- Multiple constructor invocations may appear in a constructor body provided that, on any code path, there is exactly one invocation;
- Arbitrary code may appear before constructor invocations so long as that code does not reference the instance under construction except to assign fields; and
- Explicit constructor invocations may not appear within a try block, i.e., within a bytecode exception range.
The JVM's rules still ensure safe object initialization:
- Superclass initialization always happens exactly once, either directly via a superclass constructor invocation or indirectly via an alternate constructor invocation; and
- Uninitialized instances are off-limits except for field assignments, which do not affect outcomes, until superclass initialization is complete.
As a result, this proposal does not require any changes to the Java Virtual Machine Specification, only to the Java Language Specification.
The existing mismatch between the JVM, which allows flexible constructor bodies, and the Java language, which does not, is an accident of history. Originally the JVM was more restrictive, but this led to issues with the initialization of compiler-generated fields for new language features such as inner classes. To accommodate compiler-generated code, we relaxed the JVM Specification many years ago, but we never revised the Java Language Specification to leverage this new flexibility.
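The effect of that relaxation can be observed from ordinary source code. In the sketch below, which assumes the behavior of javac on current JDKs, an inner class overrides a method that its superclass constructor calls. The override reads a field of the enclosing instance, which works only because the compiler-generated reference to the enclosing instance is assigned before the superclass constructor is invoked. The names OuterFieldDemo, Base, hook, and seenDuringSuper are hypothetical.

```java
// Hypothetical demo of a compiler-generated field being set before super().
class OuterFieldDemo {

    static class Base {
        Base() {
            hook();     // Runs before any subclass constructor body
        }

        void hook() { }
    }

    static class Outer {
        final String name;
        String seenDuringSuper;     // What Inner.hook() observed

        Outer(String name) {
            this.name = name;
        }

        class Inner extends Base {
            @Override
            void hook() {
                // Uses the enclosing Outer instance while Base() is still running.
                seenDuringSuper = name;
            }
        }
    }

    public static void main(String[] args) {
        Outer outer = new Outer("outer");
        outer.new Inner();
        System.out.println(outer.seenDuringSuper);
    }
}
```

If the reference to the enclosing instance were assigned only after the superclass constructor returned, the call to hook() would fail with a NullPointerException instead of recording the name.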
Value classes
JEP 401, from Project Valhalla, proposes value classes and builds upon this work. When the constructor of a value class does not contain an explicit constructor invocation, a constructor invocation is considered to appear implicitly at the end of the constructor body, rather than at the beginning. Thus all of the statements in such a constructor constitute its prologue, and its epilogue is empty.