JEP draft: Primitive types in patterns, instanceof, and switch
Owner | Angelos Bimpoudis |
Type | Feature |
Scope | SE |
Status | Submitted |
Component | specification / language |
Discussion | amber dash dev at openjdk dot org |
Effort | M |
Duration | M |
Reviewed by | Alex Buckley |
Created | 2022/06/15 10:05 |
Updated | 2023/03/21 19:28 |
Issue | 8288476 |
Summary
Enhance pattern matching by allowing primitive type patterns to be used in all pattern contexts. Align the semantics of primitive type patterns with instanceof. Extend switch to allow primitive constants as case labels. This is a preview language feature.
Goals
- Enable uniform data exploration by allowing type patterns to match values of any type (primitive or reference). Align primitive type patterns with safe casting.
- Allow pattern matching to use primitive type patterns in both a nested context and at top-level.
- Following the enhancements to switch in Java 5 (enum switch) and Java 7 (string switch), allow switch to process values of any primitive type (primitive switch).
- Provide easy-to-use constructs that eliminate the risk of losing information due to unsafe casts.
Non-Goals
- It is not a goal to create any new types of conversions or any new conversion contexts.
Motivation
Record classes and record patterns work together to streamline data processing in Java, because records make it easy to freely aggregate components, and dually record patterns decompose that aggregate using pattern matching.
For example, a JSON model can be encoded with a sealed hierarchy according to its specification as follows:
sealed interface JsonValue {
record JsonString(String s) implements JsonValue { }
record JsonNumber(double d) implements JsonValue { }
record JsonNull() implements JsonValue { }
record JsonBoolean(boolean b) implements JsonValue { }
record JsonArray(List<JsonValue> values) implements JsonValue { }
record JsonObject(Map<String, JsonValue> pairs) implements JsonValue { }
}
It is noteworthy that in JSON the number type represents either integers or floating-point numbers: the double data type is the widest possible primitive type that can represent both. Assuming a JSON payload of { "name":"John", "age":30 }, a Java developer could represent the same information in the JsonValue domain by running the following code:
var json = new JsonObject(Map.of(
    "name", new JsonString("John"),
    "age", new JsonNumber(30)));
The previous code initializes two key-value pairs, instantiating a record class for each value. For the first, the value "John" has the same type as the String s component. For the second, however, a widening primitive conversion is applied to 30: from int to the type of the record component, double. To disaggregate the json value, Java offers record patterns, but primitive type patterns in a nested context already expose a limitation. While it would be desirable to recover the int value provided to the JsonNumber constructor, Java developers can currently only extract the double value and must rely on lossy manual casts:
record Customer(String name, int age) {}
...
if (j instanceof JsonObject(var pairs)
&& pairs.get("name") instanceof JsonString(String name)
&& pairs.get("age") instanceof JsonNumber(double age)) {
int age2 = (int) age; // extraneous cast is unavoidable
if (age2 < 0 || age2 > 125) { /* ... error handling ... */ }
... orderIds ...
Customer c = new Customer(name, age2);
}
While developers can inline custom-validation logic, unfortunately the manual conversion is still unavoidable:
if (j instanceof JsonObject(var pairs)
&& pairs.get("name") instanceof JsonString(String name)
&& pairs.get("age") instanceof JsonNumber(double age) && ageValidation((int) age)) {
... orderIds ...
Customer c = new Customer(name, (int) age);
}
It would be ideal if developers could use int directly without relying on manual casts or conversions:
if (j instanceof JsonObject(var pairs)
&& pairs.get("name") instanceof JsonString(String name)
&& pairs.get("age") instanceof JsonNumber(int age) && ageValidation(age)) {
... orderIds ...
Customer c = new Customer(name, age);
}
Pattern matching automatically attempts such narrowing conversions for reference types. In the example below, new Box(new RedBall()) widens a RedBall to Object, while pattern matching with Box(RedBall r) ensures that if b is a Box that holds a RedBall, then o can be safely cast to RedBall, which is also the type of r inside the if:
record Box(Object o){}
Box b = new Box(new RedBall()); // automatic widening conversion
if (b instanceof Box(RedBall r)) { ... } // automatic narrowing conversion
Unfortunately, primitives are limited: if the type of the component is T, then the type in the primitive type pattern must be T as well (the type at the use-site is invariant). Being able to vary double to int in a primitive type pattern would reflect the ability of pattern matching to reject illegal state automatically, just as in the previous example. Returning to the JSON example, if the age element were a huge number, then instanceof JsonNumber(int age) would fail and the if-branch would not be taken. Pattern matching means that wherever there would be a potentially unsafe cast, it is made safe by turning match failures into control-flow decisions. Primitive type patterns should not mean something different from reference type patterns; they both mean "can this value be safely cast."
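Until such a pattern is available, the test that JsonNumber(int age) would perform can be approximated by hand. The following is a minimal sketch, not the JEP's implementation: the class ExactInt and the method fromDouble are hypothetical names, and the sketch assumes that a double matches int only when a round trip through int reproduces the same value (which rejects fractions, out-of-range values, NaN, and negative zero).

```java
import java.util.OptionalInt;

// Hypothetical helper for illustration; not part of the JDK or of this JEP.
final class ExactInt {
    private ExactInt() { }

    // Returns the int value of d only if converting d to int loses no information.
    static OptionalInt fromDouble(double d) {
        int candidate = (int) d; // may truncate a fraction or saturate at the int bounds
        // Double.compare distinguishes 0.0 from -0.0 and never equates NaN with a number,
        // so only values that survive the round trip unchanged are accepted.
        boolean exact = Double.compare((double) candidate, d) == 0;
        return exact ? OptionalInt.of(candidate) : OptionalInt.empty();
    }
}
```

With such a helper, the manual (int) age cast in the earlier listings could be replaced by a check like ExactInt.fromDouble(age).isPresent(), which fails for a value such as 30.5 or 1e10 instead of silently truncating it.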
Primitive type patterns are useful outside record patterns as well. The ability of a primitive type pattern to bind the primitive value that was matched is extremely helpful. For example, in the following switch, the primitive pattern int i is preferable to a default, because it names the matched value, and it avoids the need for a default:
switch (x.getStatus()) {
case 0 -> ...
case 1 -> ...
case 2 -> ...
case int i -> ...
}
In this switch, case int i handles every value not matched by the preceding constant labels.
Supporting primitive type patterns means that guards can also be used to further restrict the values matched by a case:
switch (x.getYearlyFlights()) {
case 0 -> ...
case 1 -> ...
case 5 -> issueDiscount();
case int i when i > 100 -> issueGoldCard();
case int i -> ...
}
Combining primitive type patterns and record patterns facilitates further opportunities for case analysis even within nested record patterns:
switch (x.order()) {
case NormalOrder(Product(int productCode)) -> ...
case BadOrder b -> switch (b.reason()) {
case MissingProduct q -> switch (q.code()) {
case 1 -> ...
case 2 -> ...
case int i -> ...
}
}
}
Before this JEP, code that involved float, double, long, and boolean had to rely on manual conversions once again. For example, in the following code the long value type can be inspected with a switch only if it is actually in the int range; switching on values outside that range is impossible:
long type = ...;
...
if (type >= Integer.MIN_VALUE && type <= Integer.MAX_VALUE) {
switch ((int) type) {
case 0x01 -> ...
case 0x02 -> ...
case int i -> ...
}
}
if (type == 10_000_000_000L) { ... }
if (type == 20_000_000_000L) { ... }
Consequently, it would make sense for case labels to allow constant expressions of any primitive type, including float, double, long, and boolean. For example:
long type = ...;
switch (type) {
case 0x01 -> ...
case 0x02 -> ...
case 10_000_000_000L -> ...
case 20_000_000_000L -> ...
case long l -> ... l ...
}
Similarly, current code that uses if-else chains to test floats:
float f = ...;
if (Float.isNaN(f)) {
...
}
else if (Float.isInfinite(f)) {
...
}
else { ... }
will be decluttered into:
float f = ...;
switch (f) {
case Float.NaN -> ...
case Float.POSITIVE_INFINITY -> ...
case Float.NEGATIVE_INFINITY -> ...
case float g -> ... g ...
}
A boolean switch would be a useful alternative to the conditional operator (?:) when making inline decisions. Unlike the conditional operator, a boolean switch expression can contain both expressions and statements in its true and false arms. For example, in the method call below, the second argument uses a boolean switch to encapsulate some business logic:
startProcessing(OrderStatus.NEW, switch (user.isLoggedIn()) {
case true -> user.id();
case false -> { log("Unrecognized user"); yield -1; }
});
It would be ideal if the primitive-supporting switch could automatically perform reasonable conversions between the type of its expression and the types of its case labels. For example, if the expression is of type float, then the case labels could be of type float, double, int, or long. However, the loss of precision and range that can occur with other automatic conversions is best avoided. In the following example, switch accepts a float but its case labels are integral values that (as described earlier) convert to the same float value; in other words, the cases are indistinguishable at run time, and the code would be rejected.
float f = ...;
switch (f) {
case 16_777_216 -> ...
case 16_777_217 -> ...
default -> ...
}
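The collision between the two labels in the switch above can be observed in today's Java. The following small sketch (the class name FloatLabelCollision and the method sameFloat are illustrative only) widens two int constants to float, as a float selector would require, and compares the results:

```java
// Illustrative only: shows that distinct int constants can widen to the same float.
final class FloatLabelCollision {
    private FloatLabelCollision() { }

    // Widens both int values to float and compares the resulting float values.
    static boolean sameFloat(int a, int b) {
        return (float) a == (float) b;
    }

    public static void main(String[] args) {
        // 2^24 + 1 is not representable as a float and rounds to 2^24.
        System.out.println(sameFloat(16_777_216, 16_777_217)); // prints "true"
        // 2^24 + 2 is representable, so it stays distinct from 2^24.
        System.out.println(sameFloat(16_777_216, 16_777_218)); // prints "false"
    }
}
```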
Turning to the instanceof type comparison operator, its semantics can be naturally derived from pattern matching. Assuming the type pattern String s, pattern matching ensures that o has the correct run-time type before it is cast to String:
Object o = ...
if (o instanceof String s) { // type pattern
... s.isEmpty() ... // will execute without error
}
Pattern matching the String s type pattern means safeguarding the casting conversion to String. The previous code denotes that if instanceof succeeds, then casting o to the reference type String will succeed, and the resulting object will be non-null:
Object o = ...
if (o instanceof String) { // type comparison
String s = (String) o;
... s.isEmpty() ...
}
Lifting the restrictions on type patterns means that instanceof must now be able to safeguard any casting conversion supported by Java, since a type pattern byte b would imply a casting conversion of a primitive type int to byte:
int i = ...
if (i instanceof byte) {
byte b = (byte) i;
... b ...
}
As with casting among references, many casting conversions between primitives are unsafe too. However, unlike a cast between reference types, a cast between primitives never fails; instead, it accommodates a representation mismatch with potential information loss. As a result, applying a cast may lose information about, for example, magnitude and sign, likely causing bugs. For example, if the int variable i holds 1000 then the value of (byte) i is -24. Pattern matching, and occasionally Java developers, must safeguard casts by checking, for example, that a 32-bit int can be represented by an 8-bit byte:
int i = ...;
if (i >= -128 && i <= 127) {
byte b = (byte) i;
... b ...
}
instanceof is in principle about asking whether an upcoming cast of a value to a type would succeed without loss of information or error. When instanceof returns true, the program has gained information: a value can be safely cast and the program knows a sharper type for that value than previously known. It would be ideal to remove the restrictions from instanceof and extend those safeguarding and sharpening semantics to conversions involving primitive types as well. instanceof for a primitive type would succeed if a conversion exists and can be performed without loss of magnitude, sign, precision, or range, thus defending against lossy casts between primitive types.
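To make the int-to-byte case concrete, here is a minimal sketch in today's Java. The class IntToByte and the method isExact are hypothetical names; the sketch assumes that "safe" means the value survives a round trip through byte, which for int is the same condition as the manual range check -128 <= i <= 127.

```java
// Illustrative only: a hand-written stand-in for the check that `i instanceof byte` would perform.
final class IntToByte {
    private IntToByte() { }

    // True when i is unchanged by a round trip through byte, i.e. -128 <= i <= 127.
    static boolean isExact(int i) {
        return (byte) i == i;
    }

    public static void main(String[] args) {
        int i = 1000;
        System.out.println((byte) i);    // prints "-24": the cast silently discards the high bits
        System.out.println(isExact(i));  // prints "false"
        System.out.println(isExact(42)); // prints "true"
    }
}
```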
In summary, primitive types in instanceof, and in type patterns for instanceof and switch, would increase program reliability and enable more uniform data exploration with pattern matching. This JEP removes the following restrictions:
- primitive type patterns could only be used on a match target of the exact same type,
- primitive type patterns were only allowed in a nested context and not at top-level,
- switch and constant case labels were restricted to support only a subset of primitive types, and
- instanceof was restricted to reference types only.
Description
Primitive Type Patterns
Type patterns currently do not allow primitive types when they are top-level, only when they appear in a nested pattern list of a record pattern. We lift that restriction so that primitive types are allowed at top-level as well.
The semantics of primitive type patterns (and reference type patterns on targets of primitive type) are derived from casting conversions.
A type pattern T t is applicable to a target of type U if a U could be cast to T without an unchecked warning.
A type pattern T t is unconditional on a target of type U if all values of U can be exactly cast to T. This includes widening from one reference type to another, widening from one integral type to another, widening from one floating-point type to another, widening from byte, short, or char to a floating-point type, widening int to double, and boxing.
A set of patterns containing a type pattern T t is exhaustive on a target of type U if T t is unconditional on U or if there is an unboxing conversion from U to T.
A type pattern T t dominates a type pattern U u, or a record pattern U(...), if T t would be unconditional on a target of type U.
A type pattern T t that does not resolve to any pattern matches a target u if u instanceof T.
With pattern labels involving record patterns, some patterns are allowed to be exhaustive even when they are not unconditional. For example, the following switch is considered exhaustive on Box<Box<String>>, even though it will not match new Box(null):
Box<Box<String>> bbs = ...
switch (bbs) {
case Box(Box(String s)): ...
}
The pathological value new Box(null) is considered "remainder", and is handled by a synthetic default clause that throws MatchException. Unboxing follows the same philosophy, being allowed even when there are pathological values that cannot be converted (a null boxed value), because it would be burdensome to require a null check every time we want to unbox. Similarly, novel subtypes (those not known at compile time) of sealed types are considered "remainder" at run time. This accommodation is made because requiring users to specify all possible combinations of pathological values would be tedious and impractical.
Analogously, a type pattern int x is considered exhaustive on Integer, so the following switch is considered exhaustive on Box<Integer> for the same reason:
Box<Integer> bi = ...
switch (bi) {
case Box(int i): ...
}
Primitive Types in instanceof
As of Java 16, the instanceof operator is either a type comparison operator or a pattern match operator, depending on its syntactic form.
When instanceof is a type comparison operator, support for primitive types is realized by removing the restrictions that (1) the type of the left-hand operand must be a reference type, and (2) the right-hand operand must name a reference type. The form of a type comparison operator becomes:
InstanceofExpression:
RelationalExpression instanceof Type
...
Before this JEP, the result of a type comparison operator was false if the value was the null reference, true if the value could be cast to the right-hand operand without raising a ClassCastException, and false otherwise.
This JEP generalizes an expression e instanceof T as if asking whether a value e of static type S can be converted to the given primitive or reference type T in a casting context (JLS 5.5) without error or loss of information. This makes instanceof the precondition test for safe casting in general.
Under this generalization, the instanceof type comparison operator is defined for all pairs of types that are allowed to be converted in a casting context. Before this JEP, a compile-time error occurred for pairs of reference types with no such conversion. Under this JEP, type-checking instanceof continues to follow the rules of casting conversions: for any pair of types, reference or primitive, that is not supported, a compile-time error occurs. The examples given earlier rely on conversions allowed in a casting context, so they can be rewritten to use instanceof directly:
int i = 1000;
if (i instanceof byte) { // false
byte b = (byte) i;
... b ...
}
byte b = 42;
if (b instanceof int) { // true
int i = (int) b;
... i ...
}
int i = 16_777_216; // 2^24
if (i instanceof float) { // true
float f = (float) i;
... f ...
}
int i = 16_777_217; // 2^24+1
if (i instanceof float) { // false
float f = (float) i;
... f ...
}
This JEP does not add any conversions to the casting context, nor does it create any new conversion contexts. Whether instanceof is applicable to a given expression and type is determined entirely by whether there is already a conversion allowed by the casting context. The conversions permitted in a casting context are as follows:
- identity conversions (JLS 5.1.1)
- widening primitive conversions (JLS 5.1.2)
- narrowing primitive conversions (JLS 5.1.3)
- widening and narrowing primitive conversions (JLS 5.1.4)
- boxing conversions (JLS 5.1.7)
- unboxing conversions (JLS 5.1.8)
and specified combinations of these:
- an identity conversion (JLS 5.1.1)
- a widening reference conversion (JLS 5.1.5)
- a widening reference conversion followed by an unboxing conversion
- a widening reference conversion followed by an unboxing conversion, then followed by a widening primitive conversion
- a narrowing reference conversion (JLS 5.1.6)
- a narrowing reference conversion followed by an unboxing conversion
- an unboxing conversion (JLS 5.1.8)
- an unboxing conversion followed by a widening primitive conversion
The following tables present all the pairs where instanceof is defined. This JEP does not propose any changes to those tables.
- When the left-hand operand is an expression of a primitive type:
To → | byte | short | char | int | long | float | double | boolean |
---|---|---|---|---|---|---|---|---|
From ↓ | ||||||||
byte | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |
short | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |
char | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |
int | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |
long | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |
float | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |
double | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | - |
boolean | - | - | - | - | - | - | - | ✓ |
- When the left-hand operand is an expression of a reference type:
To → | byte | short | char | int | long | float | double | boolean |
---|---|---|---|---|---|---|---|---|
From ↓ | ||||||||
Byte | ✓ | ✓ | - | ✓ | ✓ | ✓ | ✓ | - |
Short | - | ✓ | - | ✓ | ✓ | ✓ | ✓ | - |
Character | - | - | ✓ | ✓ | ✓ | ✓ | ✓ | - |
Integer | - | - | - | ✓ | ✓ | ✓ | ✓ | - |
Long | - | - | - | - | ✓ | ✓ | ✓ | - |
Float | - | - | - | - | - | ✓ | ✓ | - |
Double | - | - | - | - | - | - | ✓ | - |
Boolean | - | - | - | - | - | - | - | ✓ |
Object | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
- When the right-hand operand, a type T, is a reference type, instanceof is similarly defined as in Table 5.5-B (JLS 5.5):
To → | Byte | Short | Character | Integer | Long | Float | Double | Boolean | Object |
---|---|---|---|---|---|---|---|---|---|
From ↓ | |||||||||
byte | ✓ | - | - | - | - | - | - | - | ✓ |
short | - | ✓ | - | - | - | - | - | - | ✓ |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Byte | ✓ | - | - | - | - | - | - | - | ✓ |
Short | - | ✓ | - | - | - | - | - | - | ✓ |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Object | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Consider the following examples. All of the following are allowed because the left-hand operand of instanceof, an expression e, can be converted to the specified type in a casting context:
int i = ...
i instanceof byte
i instanceof float
boolean b = ...
b instanceof Boolean
Short s = ...
s instanceof int
s instanceof long
long l = ...
l instanceof float
l instanceof double
Long ll = ...
ll instanceof float
ll instanceof double
However, all of the following examples raise a compile-time error, since they do not correspond to a pre-existing casting conversion:
boolean b = ...
b instanceof char // error
Byte bb = ...
bb instanceof char // error
Integer ii = ...
ii instanceof byte // error
ii instanceof short // error
Long ll = ...
ll instanceof int // error
ll instanceof Float // error
ll instanceof Double // error
If e has a reference type and its value is null, instanceof continues to evaluate to false.
Exactness of Conversions
A conversion is exact if no loss of information occurs. Whether a conversion is exact depends on the pair of types involved and potentially on the input value:
- For some pairs, the conversion from the first type to the second type is guaranteed not to lose information for any value, and requires no action at run time. The conversion is said to be unconditionally exact. Examples include int to int and int to long.
- For other pairs, a run-time test is needed to check whether the value can be converted from the first type to the second type without loss of information. Examples include long to int and int to float; both of these conversions detect loss of precision by relying on the notion of "representation equivalence" in java.lang.Double.
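The run-time tests for the two conditional examples above can be sketched as round-trip comparisons. This is an illustration under stated assumptions, not the specified implementation: the class ExactnessChecks and its method names are hypothetical, and the sketch assumes that a conversion is exact precisely when converting to the target type and back yields the original value.

```java
// Illustrative only: hand-written round-trip tests for two conditionally exact conversions.
final class ExactnessChecks {
    private ExactnessChecks() { }

    // long -> int is exact when narrowing and widening back reproduces the value.
    static boolean isLongToIntExact(long n) {
        return n == (long) (int) n;
    }

    // int -> float is exact when the float holds the same integer value.
    // Integer.MAX_VALUE needs special care: it rounds up to 2^31 as a float, and casting
    // 2^31 back to int saturates at Integer.MAX_VALUE, so the round trip alone would
    // wrongly report it as exact.
    static boolean isIntToFloatExact(int n) {
        return n == (int) (float) n && n != Integer.MAX_VALUE;
    }
}
```

By contrast, an unconditionally exact pair such as int to long needs no test at all, since every int value has an identical long counterpart.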
Adopting the notation from JLS 5.5, the following table marks the primitive conversions that are unconditionally exact with the symbol ɛ. For completeness, the other symbols are: - (no conversion allowed), ≈ (identity conversion), ω (widening primitive conversion), η (narrowing primitive conversion), and ωη (widening and narrowing primitive conversion):
To → | byte | short | char | int | long | float | double | boolean |
---|---|---|---|---|---|---|---|---|
From ↓ | ||||||||
byte | ≈ | ɛ | ωη | ɛ | ɛ | ɛ | ɛ | - |
short | η | ≈ | η | ɛ | ɛ | ɛ | ɛ | - |
char | η | η | ≈ | ɛ | ɛ | ɛ | ɛ | - |
int | η | η | η | ≈ | ɛ | ω | ɛ | - |
long | η | η | η | η | ≈ | ω | ω | - |
float | η | η | η | η | η | ≈ | ɛ | - |
double | η | η | η | η | η | η | ≈ | - |
boolean | - | - | - | - | - | - | - | ≈ |
Consider the following examples. The unconditionally exact conversions are marked with (ε); they always evaluate to true regardless of the value. The remaining results are obtained via a run-time check:
byte b = 42;
b instanceof int; // true (ε)
int i = 1000;
i instanceof byte; // false
int i = 42;
i instanceof byte; // true
int i = 16_777_217; // 2^24+1
i instanceof float; // false
i instanceof double; // true (ε)
i instanceof Integer; // true (ε)
i instanceof Number; // true (ε)
float f = 1000.0f;
f instanceof byte; // false
f instanceof int; // true
f instanceof double; // true (ε)
double d = 1000.0d;
d instanceof byte; // false
d instanceof int; // true
d instanceof float; // true
Integer ii = 1000;
ii instanceof int; // true
ii instanceof float; // true
ii instanceof double; // true
Integer ii = 16_777_217;
ii instanceof float; // false
ii instanceof double; // true
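The results for the boxed operand ii can be reproduced by hand in today's Java. The sketch below is an assumption-laden illustration, not the specified behavior: the class BoxedChecks and its methods are hypothetical, and it assumes the check is composed of a null test, an unboxing conversion, and then the exactness test for the resulting primitive conversion.

```java
// Illustrative only: emulates `ii instanceof float` and `ii instanceof double` for an Integer.
final class BoxedChecks {
    private BoxedChecks() { }

    static boolean integerMatchesFloat(Integer ii) {
        if (ii == null) {
            return false;             // a null reference never matches
        }
        int n = ii;                   // unboxing conversion
        // int -> float is only conditionally exact; Integer.MAX_VALUE is excluded because
        // the cast back from float saturates and would hide the rounding.
        return n == (int) (float) n && n != Integer.MAX_VALUE;
    }

    static boolean integerMatchesDouble(Integer ii) {
        // int -> double is unconditionally exact, so only the null test remains.
        return ii != null;
    }
}
```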
Constant Expressions in case labels
Turning to constant expressions in the case labels of a switch, the primitive types long, float, double, boolean, and their boxes can be associated with a switch block as long as the type of the selector expression (which can be a primitive type or a boxed reference type) is the same as the type of the constant expression.
For example, the constant expression 0f can only be used when the selector expression's type is float or Float:
float f = ...
switch (f) {
case 0f -> 5f + 0f;
case Float fi when fi == 1f -> 6f + fi;
case Float fi -> 7f + fi;
}
Per IEEE 754, two finite floating-point values are the same if their sign, exponent, and significand components are the same. For that reason, representation equivalence defines how switch labels are selected in the presence of non-integral or boolean values. The same definition is used to signal duplicate-label errors when a developer writes a switch such as the following:
float f = ...
switch (f) {
case 1.0f -> ...
case 0.999999999f -> ...
default -> ...
}
While 1.0f is exactly representable as a float, 0.999999999f is not. The latter is rounded up to 1.0f, so both labels denote the same value, which results in a compile-time error.
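The rounding can be checked directly. In the sketch below, the class FloatLabels and the method sameLabel are hypothetical names, and comparing the results of Float.floatToIntBits is used as an approximation of representation equivalence (it is the comparison that Float.equals is documented to perform); how a compiler actually detects duplicate labels is not shown here.

```java
// Illustrative only: approximates "same label" for float constants by comparing bit patterns.
final class FloatLabels {
    private FloatLabels() { }

    // floatToIntBits canonicalizes NaN and distinguishes 0.0f from -0.0f.
    static boolean sameLabel(float a, float b) {
        return Float.floatToIntBits(a) == Float.floatToIntBits(b);
    }

    public static void main(String[] args) {
        System.out.println(sameLabel(1.0f, 0.999999999f)); // prints "true": the literal rounds to 1.0f
        System.out.println(sameLabel(0.0f, -0.0f));        // prints "false": the sign bit differs
        System.out.println(sameLabel(Float.NaN, Float.NaN)); // prints "true", although NaN != NaN
    }
}
```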
Since boolean (and its box) has only two distinct values, a switch that lists both the true and false cases is considered exhaustive:
boolean b = ...
switch (b) {
case true -> ...
case false -> ...
// Alternatively: case true, false -> ...
}
It is a compile-time error for that switch to include a default clause.
Risks and Assumptions
Outside pattern matching and instanceof, lossy assignment is endemic in Java source code. For example, if a method returns int then its result can be assigned to a float variable without casting:
int getSalary() { ... }
float salary = getSalary();
The risk is that Java developers do not realize the possible loss of precision that can occur at this assignment, because it is silent.
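A small sketch makes the silent loss visible. The class SilentWidening and the method widen are hypothetical stand-ins for the getSalary assignment above, and the salary figure is an arbitrary value chosen because it is not representable as a float:

```java
// Illustrative only: an int is assigned to a float without any cast or warning.
final class SilentWidening {
    private SilentWidening() { }

    // The return statement performs an implicit widening primitive conversion.
    static float widen(int salary) {
        return salary;
    }

    public static void main(String[] args) {
        float salary = widen(123_456_789);
        // The nearest float to 123456789 is 123456792, so three units were added silently.
        System.out.println((int) salary); // prints "123456792"
    }
}
```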
We assume that developers of static analysis tools will recognize the new role of instanceof and avoid flagging code that uses converted data without a prior manual range check when that code is already safeguarded by the extended instanceof.