JEP draft: Null-Restricted and Nullable Types (Preview)
Owner | Dan Smith |
Type | Feature |
Scope | SE |
Status | Draft |
Component | tools / javac |
Discussion | valhalla dash dev at openjdk dot org |
Effort | L |
Duration | M |
Created | 2023/02/23 01:23 |
Updated | 2024/08/20 20:19 |
Issue | 8303099 |
Summary
Support nullness markers on Java types to indicate that a type rejects or deliberately allows nulls. This is a preview language feature.
Goals
- Enhance Java's reference types to let programmers express whether null references are expected as values of the type
- Support conversions between types with different nullness properties, accompanied by warnings about possibly-mishandled null values
- Compatibly interoperate with traditional Java code that makes no assertions about the null compatibility of its types, and support gradual adoption of these new features without introducing source or binary incompatibilities
- Ensure that variables with types that reject null are initialized before they are first read
- Enforce types that reject null at run time, even when classes are compiled separately
- Provide the metadata and integrity guarantees necessary for run-time optimizations (such as the flattening of value objects) to rely on types that claim to exclude null
Non-Goals
- It is not a goal to automatically re-interpret existing code; use of these features should be optional and explicitly opted in to (future work will explore mechanisms to request a bulk opt-in without needing to change individual types)
- It is not a goal to require programs to explicitly account for all null values that might occur; unaccounted-for null values may cause compile-time warnings, but not compile-time errors
- It is not a goal to make any changes to the primitive types, such as adding support for a nullable int type
- It is not a goal (at this time) to apply the language enhancements to the standard libraries
Motivation
In a Java program, a variable of type String may hold either a reference to a String object or the special value null. In some cases, the author intends that the variable will always hold a reference to a String object; in other cases, the author expects null as a meaningful value. Unfortunately, there is no way to formally express in the language which of these alternatives is intended, leading to confusion and bugs.
For example, programs often make a blanket assumption that no null values will be present. But it takes extra care to state this expectation in Javadoc specifications and to reliably enforce it in implementation code. If extra care isn't taken, and someone fails to follow the assumed protocol, then null values may end up flowing freely through implementation code, eventually triggering an exception at some point far removed from the bug.
This situation can be greatly improved by giving developers tools to assert, as part of a type, that either (1) null values are not supported and will be rejected, or (2) null values are expected and should be properly accounted for.
By default, the language can't reasonably assume either of these interpretations. It needs developers to explicitly indicate their intent.
Given a clear expression of intent, the language could then introduce both compile-time feedback and run-time checks to help developers detect unexpected nulls earlier.
In the Valhalla project, a variable of a value class type can be optimized with a flattened representation of its values. But this flattened representation may need extra bits to encode null, negatively impacting memory footprint and sometimes making it impossible to optimize the storage at all. If the developer could exclude null from the domain of the variable, a better encoding could be achieved.
In the Amber project, the nullness of a pattern match candidate may influence whether a switch should be considered exhaustive, and the nullness of a type pattern may influence whether the pattern matches null. It would be useful for developers to be able to control these behaviors.
Description
The features described below are preview features, enabled with the --enable-preview compile-time and run-time flags.
Nullness properties and markers
A reference type may optionally express nullness: whether null is intended to be included in the value set of the type.
In Java syntax, a nullness marker is used to indicate this property.
The type Foo! is null-restricted: the value set excludes null.
The type Foo? is nullable: the value set deliberately includes null.
By default, the nullness of the type Foo is unspecified: a null may occur, but we don't know whether its presence is deliberate.
The nullness of a type is an intrinsic part of the type. In other words, Foo? and Foo are different types, because they have different nullness. However, as outlined later, most language rules are defined in a way that either ignores nullness or helpfully adapts (perhaps with a warning) between types with different nullness.
Array types and their array component types may both have nullness markers. Foo?[]! is a null-restricted array type whose components have the nullable type Foo?. Null markers for multi-dimensional arrays may occur after each bracket pair, and by convention are interpreted outermost to innermost, from left to right.
Parameterized types and their type arguments may similarly both have nullness markers. Predicate!<Foo?> is a null-restricted Predicate with the nullable type Foo? as a type argument. The interpretation of type arguments is described in more detail later.
In this JEP, nullness markers are explicit: to express a null-restricted or nullable type, the ! or ? symbol must appear in source. In the future, it may be useful, for example, for a programmer to somehow express that every type in a class or compilation unit is to be interpreted as null-restricted, unless a ? symbol is present. The details of any such feature will be explored separately.
Field and array initialization
Most variables in Java must be initialized with a value of the variable's type before they can be used. Local variables are checked for definite assignment before they can be referenced; method parameters get initial values from a method invocation expression; pattern variables are bound in the process of pattern matching; etc.
Traditionally, fields and array components get special treatment: because they can be accessed by multiple program components immediately upon creation of a class or object, they are automatically initialized "at birth" with a default value, which programmers will typically overwrite in the course of program execution.
The default value of a reference type is null. But this is an unsuitable initial value for a null-restricted field or array component: if someone reads the variable before it has been written, they will observe a value that is not of the variable's type.
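The traditional "at birth" defaults can be observed in today's Java, without any preview features. The following is a minimal sketch; the class and field names are illustrative and do not come from this JEP.

```java
class DefaultValuesDemo {
    // Hypothetical class: its field is never assigned by any constructor.
    static class Person {
        String name;
    }

    public static void main(String[] args) {
        Person p = new Person();
        String[] labels = new String[3];

        // Both variables were initialized "at birth" with the default value.
        System.out.println(p.name == null);      // prints "true"
        System.out.println(labels[0] == null);   // prints "true"
    }
}
```

If name were declared as String!, or labels as String![], these reads would observe a value outside the variable's type, which is the situation the initialization rules are meant to rule out.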
Thus, fields and arrays with null-restricted types behave differently than other fields and arrays: they must always be initialized by the program before they can be read. This is enforced as follows:
- A null-restricted instance field without an initializer must be definitely assigned before the (explicit or implicit) super(...) call in each of the class's constructors. The Flexible Constructor Bodies JEP allows the necessary initialization code to be written at the start of a constructor. In this early construction context, the initialization logic is not allowed to refer to this (other than to assign fields) or risk any attempts to read the uninitialized field.
    class Person {
        private String! name;
        public Person(String name) {
            this.name = name;
            super();
        }
    }
- If a null-restricted instance field has an initializer, the initializer is executed at the start of each constructor, before the super(...) call. (Constructors that call this(...) are a special case and, as usual, do not execute initializers at all.) Again, this means that the initialization logic of the field occurs in an early construction context and may not refer to this or risk any reads of the uninitialized field.
- A null-restricted static field must be definitely assigned by the end of all static initializers and initializer blocks of the class. Note, however, that this rule does not prevent some other class from trying to read the field during the class initialization process; in that case, a run-time check detects the early read attempt and throws an exception.
  In the following example, the initialization code of field Foo.s has a circular dependency. Traditionally, the circular reference to Foo.s would produce the default value of s, null; with a null-restricted field type, that is impossible and so an exception is thrown alerting the developer to the bug.
    class Foo {
        public static String! s = Bar.getString();
    }
    class Bar {
        static String! getString() {
            return Foo.s; // may throw an exception
        }
    }
- An array with a null-restricted component type must provide an initializer for each component in the array creation expression. This can be achieved by explicitly listing each initial value in an array initializer, or by using a new shorthand form (syntax TBD).
    String![] labels;
    labels = new String![]{ "x", "y", "z" };
    labels = new String![100]{ "" }; // strawman syntax
    labels = new String![100]{ i -> "x"+i }; // strawman syntax
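For comparison, the traditional behavior of a circular static initialization can be reproduced in today's Java. This sketch reuses the names Foo and Bar from the example above but drops the nullness markers; the nesting inside CircularInitDemo is only there to keep the example in one file.

```java
class CircularInitDemo {
    static class Foo {
        static String s = Bar.getString();   // runs during Foo's initialization
    }

    static class Bar {
        static String getString() {
            return Foo.s;   // Foo is still being initialized: reads the default value
        }
    }

    public static void main(String[] args) {
        System.out.println(Foo.s == null);   // prints "true"
    }
}
```

No exception is thrown here: the early read silently yields null, and the bug surfaces only when Foo.s is later used. With a null-restricted field, the run-time check described above would report the early read instead.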
Expression nullness and conversions
As part of type checking, the Java compiler is responsible for determining the nullness of every expression.
- The nullness of a variable reference is given by the referenced variable's declaration (but TBD whether the Java compiler will further observe that, due to previous uses of the variable, a null is known not to be present).
- The nullness of a method invocation is given by the referenced method's return type.
- The nullness of a cast expression is given explicitly by the type named in the cast (but, again, TBD whether the Java compiler takes other information into account).
- A null literal is nullable (of course).
- Most other reference-typed expressions are null-restricted. These include literals, string concatenations, this, class instance and array creations, method references, and lambda expressions.
A nullness conversion allows an expression with one kind of nullness to be treated as having a different nullness. Nullness conversions are permitted in all assignment, invocation, and casting contexts.
The following are widening nullness conversions:
- Foo! to Foo?
- Foo! to unspecified Foo
- Foo? to unspecified Foo
- unspecified Foo to Foo?
And these are narrowing nullness conversions:
- Foo? to Foo!
- unspecified Foo to Foo!
Narrowing nullness conversions are analogous to unboxing conversions: the compiler performs them automatically, while at run time they impose a dynamic check, possibly causing a NullPointerException.
In fact, the boxing and unboxing conversions can now be simplified: boxing converts int to Integer!, potentially followed by a widening nullness conversion to Integer; unboxing converts Integer! to int, possibly preceded by a narrowing nullness conversion from Integer to Integer!.
It is a compile-time error to attempt to directly convert a null literal to a null-restricted type.
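The unboxing analogy can be seen in today's Java: the conversion is written implicitly, and the failure shows up only at run time. A minimal sketch with illustrative names (nothing here uses the preview syntax):

```java
class UnboxingDemo {
    // Implicit unboxing: the compiler inserts a call to boxed.intValue().
    static int unbox(Integer boxed) {
        return boxed;
    }

    static boolean throwsNpe(Integer boxed) {
        try {
            unbox(boxed);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(unbox(42));          // prints "42"
        System.out.println(throwsNpe(null));    // prints "true"
    }
}
```

A narrowing nullness conversion from Integer to Integer! would behave like the null case here: accepted by the compiler (possibly with a warning) and checked dynamically.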
Run-time null checking
At run time, if a null value undergoes a narrowing nullness conversion to a null-restricted type, a NullPointerException is thrown.
String? id(String! arg) { return arg; }
String s = null;
Object! o1 = s; // NPE
Object o2 = id(s); // NPE
Object o3 = (String!) s; // NPE
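The checks in the example above can be approximated in today's Java with explicit calls to Objects.requireNonNull. This is only an emulation under stated assumptions: where the compiler would actually place each check is not specified in this document, and the helper throwsNpe is hypothetical.

```java
import java.util.Objects;

class NarrowingCheckDemo {
    // Stand-in for: String? id(String! arg) { return arg; }
    static String id(String arg) {
        Objects.requireNonNull(arg);   // emulates the conversion to String!
        return arg;
    }

    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = null;
        // Emulates: Object! o1 = s;
        System.out.println(throwsNpe(() -> Objects.requireNonNull(s)));   // prints "true"
        // Emulates: Object o2 = id(s);
        System.out.println(throwsNpe(() -> id(s)));                       // prints "true"
        // A non-null value passes every check.
        System.out.println(throwsNpe(() -> id("ok")));                    // prints "false"
    }
}
```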
Some narrowing nullness conversions are not apparent in the source code, but occur implicitly as part of run time execution. These include:
- An array that was allocated with a null-restricted component type may be given a less specific type in the source code, but will still reject null values during the usual array store check. The failed conversion results in an ArrayStoreException.
- Similarly, a field that was not null-restricted at compile time but was later separately compiled to be null-restricted will reject null values during a new field store check. The failed conversion results in a FieldStoreException.
- When one method overrides another, the argument to an invocation of the superclass method undergoes conversion to the parameter's invocation type, followed by conversion to the type of the overriding method's parameter. As discussed below, the nullness of these two parameter types may not be the same.
- Similarly, the return value of a method undergoes conversion to the declared method return type, followed by conversion to the invocation's expected return type.
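The "usual array store check" mentioned in the first item already exists in today's Java: an array remembers the component type it was allocated with, regardless of the static type it is viewed through. A minimal sketch with illustrative names:

```java
class ArrayStoreDemo {
    static boolean storeFails(Object[] array, Object value) {
        try {
            array[0] = value;   // checked against the array's actual component type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object[] objects = new String[1];   // allocated as String[], typed as Object[]
        System.out.println(storeFails(objects, "ok"));   // prints "false"
        System.out.println(storeFails(objects, 1));      // prints "true"
        System.out.println(storeFails(objects, null));   // prints "false"
    }
}
```

Today the null store succeeds. Under this proposal, an array allocated with a null-restricted component type would reject it during the same check.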
Nullness of type variables
Like other types, a type-variable type (that is, a use of a type variable) may express nullness. T! is a null-restricted type, and T? is a nullable type.
Null-restricted and nullable type variable types (T! and T?) assert a specific nullness within the generic code. This may be appropriate if the generic code directly interacts with null.
class Box<T> {
    boolean set;
    T? val; // nullable field
    public Box() { set = false; }
    public void set(T val) { this.val = val; set = true; }
    public T? getOrNull() { // nullable result
        return set ? val : null;
    }
    public T! getNonNull(T! alt) { // null-restricted result
        return (set && val != null) ? (T!) val : alt;
    }
}
Types used as type arguments may express nullness; null markers on type-variable types override whatever nullness was asserted by the type argument.
Box<String!> b1 = new Box<String!>();
b1.getOrNull(); // nullable result
Box<String?> b2 = new Box<String?>();
b2.set(null);
b2.getNonNull(""); // null-restricted result
Of course, null restrictions cannot be enforced within the erased implementation of a generic API. But the usual implicit casts that occur at the boundaries of generic APIs will enforce null-restricted type arguments at run time.
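The "usual implicit casts" at generic boundaries can be observed in today's Java, where they enforce the class of a type argument rather than its nullness. In this sketch (illustrative names, raw types used deliberately to bypass compile-time checking), the erased ArrayList accepts a wrongly typed element, and the error is caught only where a value crosses back into code that expects a String.

```java
import java.util.ArrayList;
import java.util.List;

class GenericBoundaryDemo {
    // The erased implementation of ArrayList cannot reject the Integer.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String> polluted() {
        List raw = new ArrayList();
        raw.add(1);
        return raw;
    }

    // The compiler inserts a cast to String on the result of get(0).
    static String first(List<String> list) {
        return list.get(0);
    }

    static boolean readFails(List<String> list) {
        try {
            first(list);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readFails(polluted()));      // prints "true"
        System.out.println(readFails(List.of("ok")));   // prints "false"
    }
}
```

The text above suggests that a null-restricted type argument would be enforced at the same boundary, with a null check accompanying the cast.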
Type arguments and bounds
As illustrated above, type arguments may express nullness, which influences the substituted nullness of the API wherever parametric type variables occur.
For interoperability, nullness in type arguments is not strongly enforced, and unchecked nullness conversions allow modifications to the nullness of type arguments. For example, Predicate<String!> can be converted to Predicate<String> or Predicate<String?>. These conversions may cause warnings (see "Compiler warnings", below).
Similarly, unchecked nullness conversions allow modifications to the nullness of array component types. TBD under what conditions these conversions are checked at run time.
A type variable declaration or wildcard may have nullness markers on its bounds. A type may satisfy the bounds via nullness conversion, though, so again these nullness markers are not strongly enforced, but may cause warnings.
Method overriding and type argument inference
Nullness is ignored when determining whether two methods have the same signature. One method may override another even if the nullness of their parameter and return types does not match.
class A {
    String? lookup(String! arg) { ... }
}
class B extends A {
    String lookup(String arg) { ... }
}
Such mismatches will be common as different APIs adopt nullness markers independently.
Formally, two methods are considered to have the same signature if each parameter type and type parameter bound can be converted to the other via nullness and unchecked conversions.
Similarly, the return type of an overriding method must be convertible to the overridden return type via a widening reference conversion, possibly followed by nullness and unchecked conversions.
If a method is generic, any parametric uses of its type parameters in the method signature will influence the inferred nullness of type arguments. Nullness does not influence method applicability and cannot cause type argument inference to fail, but it can influence the nullness inferred for the method's return type. (Details of the inference algorithm TBD.)
Compiler warnings
As described above, making a type null-restricted may cause new compile-time errors if a field or array of the type is left uninitialized, or if an attempt is made to convert a null literal to the type. It may also be a compile-time error to compare a null literal to an expression with a null-restricted type.
In other situations, nullness analysis is supplementary and does not cause compile-time errors. However, javac will provide warnings to help programmers avoid run-time errors. IDEs and other analysis tools are encouraged to do the same. Possible sources of warnings include:
- Narrowing nullness conversions, especially from ? types
- ?-typed expressions used in member accesses or other null-hostile operations
- Type arguments whose nullness is inconsistent with their bounds
- Method parameters or returns with nullness that doesn't match an overridden method
- Unchecked conversions that change the nullness of a type
Compilation & class file representation
Most uses of null markers are erased in class files, with the accompanying run-time conversions being expressed directly in bytecode.
Signature attributes have an updated grammar to allow ! and ? in types, as appropriate. Nullness is not encoded in method and field descriptors.
However, to prevent pollution of fields, a new NullRestricted attribute allows a field to indicate that it does not allow null values. This has the following effects:
- The field must also be marked ACC_STRICT, which indicates that it must be "strictly-initialized". The verifier ensures that all strictly-initialized instance fields have been assigned to at the point that a constructor makes a super(...) call.
- All attempts to write to the field check for a null value and, if found, throw a FieldStoreException.
Null-restricted array creation is not supported by the anewarray instruction, and must be accomplished with a call to the reflection API (see below). All attempts to write to a null-restricted array component reject null values during the usual array store check.
Core reflection
There are no Foo!.class or Foo?.class literals, and no associated instances of java.lang.Class. These types are derived from a class declaration, but do not represent distinct classes. (Compare List<String> and List<Integer>.)
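The comparison with List<String> and List<Integer> can be checked in today's Java: both parameterizations are backed by a single Class object. A minimal sketch with illustrative names:

```java
import java.util.ArrayList;
import java.util.List;

class SharedClassDemo {
    static boolean sameClass(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> integers = new ArrayList<>();
        System.out.println(sameClass(strings, integers));   // prints "true"
    }
}
```

In the same way, Foo! and Foo? would share the one Class object that represents Foo.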
However, a new RuntimeType API describes the set of types that are enforced by array and field store checks at run time, including a null-restricted variant of every class and interface type. (This is a superset of the linkage types, which can appear in descriptors and are represented by the Class API.)
The Field API supports querying a field's RuntimeType, which may not be the same as its getType result.
The Array API supports variations of newInstance that allow a component type to be expressed with a RuntimeType. These variations also allow the caller to provide initial values for the array components, and will reject attempts to create null-restricted arrays without initial values. Another new method reflects the RuntimeType of an array's components.
Supplementary changes
Traditional deserialization is not compatible with null-restricted fields and arrays. A separate JEP will provide a new mechanism to support serialization without exposing uninitialized null-restricted fields and arrays.
Documentation generated by javadoc will include nullness markers.
The java.lang.reflect.Type and javax.lang.model APIs will encode nullness in their representation of types.
Alternatives
A variety of development tools in the Java ecosystem have implemented their own compile-time tracking of nulls. These tools don't change the Java language and so naturally have some limitations, particularly in the syntax they can use (annotations) and the behavior they can affect (compile-time checks).
Other programming languages track nullness in their type systems. Many are null-restricted by default. Many also consider it an error to assign to a null-restricted type without an explicit null check. In the case of Java, it's important that the feature be optional, and be something that programmers can incrementally make use of without a monolithic migration effort.
Run-time enforcement of nullness can be implemented with explicit checks or calls to the Objects.requireNonNull standard API. But consistently applying these checks is tedious, necessitates additional documentation work, and makes programs harder to read. There's no way to apply these checks directly to variable storage, particularly fields and arrays.
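A sketch of this alternative in today's Java, with a hypothetical Person class: every entry point must repeat the check by hand, and the array storage itself remains unguarded.

```java
import java.util.Objects;

class ManualChecksDemo {
    static final class Person {
        private String name;                              // intended never to hold null
        private final String[] aliases = new String[2];   // components start out null

        Person(String name) {
            this.name = Objects.requireNonNull(name, "name");
        }

        void setName(String name) {
            this.name = Objects.requireNonNull(name, "name");
        }

        String name() { return name; }

        String[] aliases() { return aliases; }
    }

    static boolean rejectsNullName() {
        try {
            new Person(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsNullName());          // prints "true"
        Person p = new Person("Ada");
        p.aliases()[1] = null;                          // nothing checks this store
        System.out.println(p.aliases()[0] == null);     // prints "true"
    }
}
```

The checks protect the name field only as long as every writer remembers to call them; the aliases array accepts null from any code that can reach it.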
Dependencies
Prerequisites:
- Flexible Constructor Bodies (Second Preview) allows constructors to execute statements before a super(...) call and allows assignments to instance fields in this context. These changes facilitate the initialization requirements of null-restricted fields.
Future work:
- Null-Restricted Value Class Types (Preview) will optimize the encodings of null-restricted, value-class-typed fields and arrays, and may allow some value classes to declare their own default values.
- JEP 402: Enhanced Primitive Boxing (Preview) will track nullness as it makes wider use of implicit boxing conversions in the language.
- JVM class and method specialization (JEP 218, with revisions) will allow generic classes and methods to reify and enforce the nullness of (at least some of) their type arguments.
Other possible future enhancements building on this JEP may include:
- Applying nullness markers to certain parts of the standard APIs.
- Enhancing the JVM to provide a concise, minimal-footprint way to express a null check in bytecode.
- Enhancing the JVM to provide stronger low-level enforcement of null-restricted method parameters.
- Providing a mechanism in the language to assert that all types in a certain context are implicitly null-restricted, without requiring the programmer to use explicit ! symbols.