Patterns: Exhaustiveness, Unconditionality, and Remainder
As the switch
construct has been made steadily more expressive (first to support switch
expressions, and later to support patterns in switch
), it has become important to provide compile-time checking for whether a particular switch
is exhaustive for its selector type. All switch
expressions, and any switch
statement that uses a pattern label, must be exhaustive, or a compilation error will occur. For example, given:
enum Color { RED, YELLOW, GREEN }
int numLetters = switch (color) { // Error - not exhaustive
case RED -> 3;
case GREEN -> 5;
}
we would like to get a compile-time error that tells us that this switch is not exhaustive, because the anticipated input YELLOW
is not covered. Which raises the question: what does “exhaustive” mean?
A switch with a match-all label (a default
label, or a case null, default
label, or a case
label with a type pattern that matches every value of the selector expression) is clearly exhaustive; for every possible value of the selector, one of the labels will definitely be selected - the match-all label! It is tempting to try to define exhaustiveness for a switch
without a match-all label as meaning “if a match-all label were added, it would never be selected” As it turns out, this definition is too strong, and even if we were to adopt this definition, we probably wouldn’t enjoy programming in the resulting language.
Switching over enums
Many of the tensions in defining exhaustiveness are visible even in the simple case of exhaustive switch expressions over enum types. If we complete our switch over colors by handling all the enum constants, is it exhaustive?
int numLetters = switch (color) { // Exhaustive!
case RED -> 3;
case GREEN -> 5;
case YELLOW -> 6;
}
We would surely like for this switch
to be exhaustive — for multiple reasons. It would definitely be cumbersome to have to write a match-all clause which probably just throws an exception, since we have already handled all the cases:
int numLetters = switch (color) {
case RED -> 3;
case GREEN -> 5;
case YELLOW -> 6;
default -> throw new ArghThisIsIrritatingException(color.toString());
}
Manually writing a default
clause in this situation is not only irritating but actually pernicious, since the compiler can do a better job of checking exhaustiveness without one. (The same is true of any other match-all clause such as case null, default
, or an unconditional type pattern.) If we omit the default
clause, then we will discover at compile time if we have forgotten a case
label, rather than finding out at run time — and maybe not even then.
More importantly, what happens if someone later adds another constant to the Color
enum? If we have an explicit match-all clause then we will only discover the new constant value if it shows up at run time. But if we code the switch
to cover all the constants known at compile time and omit the match-all clause, then we will find out about this change the next time we recompile the class containing the switch
— and can then choose how to handle it. A match-all clause risks sweeping exhaustiveness errors under the rug.
In conclusion: An exhaustive switch
without a match-all clause is better than an exhaustive switch
with one, when possible.
Looking to run time, what happens if a new Color
constant is added, and the class containing the switch
is not recompiled? There is a risk that the new constant will be exposed to our switch
. Because this risk is always present with enums, if an exhaustive enum switch
does not have a match-all clause, then the compiler will actually synthesize a default
clause that throws an exception. This guarantees that the switch
cannot complete normally without selecting one of the clauses. (Given that the compiler will insert a synthetic default
clause, it is tempting to ask whether we should outlaw an explicit match-all clause in an otherwise exhaustive switch. But, this would be taking it too far, as the user may want to provide customized error handling code.)
Unconditionality, exhaustiveness, and remainder
Even the simple case of switching over enums illustrates that exhaustiveness is more subtle than it first appears. We want to say that having case
labels for RED
, YELLOW
, and GREEN
means that the switch is exhaustive for Color
, but it isn’t really, or at least not completely; there are possible run time values of Color
that are not matched by any of these labels. This isn’t a bug in our definition of exhaustiveness; “strengthening” the compile-time checking (which, in this case, would mean requiring a match-all clause) would be both inconvenient for users and result in worse type checking (and therefore less reliable programs). The reality is that the compile-time notion of “exhaustive enough” and true run time exhaustiveness are not the same thing. Similarly, we would not want to weaken the run time checking by omitting the synthetic default
, as this could create surprising results. So we need both concepts, which we call unconditionality (for the strong run time version) and exhaustiveness (for the compile-time version.)
A pattern is unconditional for a candidate type if we can prove at compile time that it will always match all possible run time values of that type. (An unconditional pattern thus requires no run time checks.) Unconditionality is a strong condition; the only patterns currently supported that are unconditional are inferred type patterns (var x
), unnamed patterns (_
, which is sugar for var _
), and type patterns where the type of the match candidate is a subtype of the type named in the pattern (e.g. the type pattern CharSequence cs
is unconditional for the type CharSequence
and also for String
but not for the type Object
.)
Exhaustiveness for a type is a property not of a single pattern, but more properly of a set of patterns (or case
labels). A set of enum case
labels is exhaustive for the corresponding enum type if the set of case
labels contains all of the enum constants of that type. As we will see, there are other ways for a set of patterns to be exhaustive as well.
If a set of patterns is exhaustive for a type, we call the run time values that are not matched by any pattern in the set the remainder of the set. In our enum switch example, the remainder includes any novel enum constants, as well as null
. (That null
is part of the remainder for our enum switch is less obvious, because the innate null-hostility of switches hides this in the simple cases, but this will become important when we get to nested patterns.)
Sealed types
Switching over a selector whose type names a sealed
class is similar to switching over enums, just lifted from the term level to the type level. If all of the permitted subtypes of the sealed type are handled by the cases, we would like to be able to omit a match-all clause just as with enum switches.
sealed interface Container<T> permits Box, Bag { }
switch (container) {
case Box<T> box: ...
case Bag<T> bag: ...
}
For the same reasons as with enum switches, we would like for this switch to be considered exhaustive on Container<T>
. And, just as with enum switches, it is possible for a novel subtype of Container
to show up at run time, so the compiler similarly inserts a synthetic default
clause that throws. A set of patterns is exhaustive on a sealed type if it is exhaustive on every permitted subtype, and for an exhaustive set of patterns on a sealed type, any novel subtypes are considered part of the remainder. As with enum switches, for an exhaustive set of patterns on a sealed type, the remainder contains the null
value.
Record (and deconstruction) patterns
Given a record:
record IntBox(int i) { }
should the following switch be considered exhaustive?
IntBox ib = ...
switch (ib) {
case IntBox(int i): ...
}
The pattern IntBox(int i)
is not unconditional on IntBox
, because it doesn’t match null
(which is a valid run time value of IntBox
). But the above switch certainly should be considered exhaustive; requiring a match-all clause, or even a case null
label here would help no one. Given a record class R
with an x
component, the record pattern R(p)
matches a value if the value is instanceof R
and (recursively) if the record value’s x
component value matches the pattern p
. The pattern R(p)
is considered exhaustive for R
if p
is exhaustive for the type of the x
component of R
.
The pattern IntBox(int i)
cannot be considered to match null
, even though null
can be cast to IntBox
. This is because, in general, to extract a record value’s component value, after casting the value to the record type, we invoke the component accessor method on the resulting reference; if the candidate is null
, this invocation will fail. Just as with the earlier cases, the remainder for any record pattern always includes null
.
Nested patterns
Nested patterns are also a source of remainder. Say we have:
record Box<T>(T t) { }
and a switch:
Box<Box<String>> bbs = ...
switch (bbs) {
case Box(Box(String s)): ...
}
First, is this switch exhaustive? By the above definition, Box(Box(String s))
is exhaustive for the type Box<Box<String>>
if the nested pattern Box(String s)
is exhaustive for Box<String>
; by the same rule, this is exhaustive if String s
is exhaustive for String
, which it is, by virtue of being unconditional. So the switch is exhaustive.
So, what is the remainder? This is where it gets interesting, and why we’ve bothered to include null
in the remainder for the previous examples even when it might appear to be irrelevant because of the null
hostility of switch. The remainder here clearly includes null
, but it also includes a Box
value with a null
component value! More generally, for a record pattern R(p)
where r
is in the remainder of p
for the component type of R
, the remainder includes an R
value with r
as its component value.
Nested patterns and sealed types
Let’s put together nested patterns with sealed types.
sealed interface Fruit permits Apple, Orange { }
final class Apple implements Fruit { }
final class Orange implements Fruit { }
Box<Fruit> bf = ...
switch (bf) {
case Box(Apple a): ...
case Box(Orange o): ...
}
Is this switch exhaustive on Box<Fruit>
? Together, the patterns Apple a
and Orange o
are exhaustive on Fruit
, with remainder containing null
and any novel subtype of Fruit
. So Box(Apple a)
and Box(Orange o)
are exhaustive on Box<Fruit>
. The remainder includes a Box
value with a null
component value, and also Box
values whose component value’s type is a novel subtype of Fruit
. (The rules get more complicated when the record has more than one component.)
It’s complicated, but at its heart it’s simple
This may all seem complicated but, at its heart, it derives from some straightforward rules about how patterns are exhaustive for a type:
- A set of patterns containing a type pattern
T t
is exhaustive forT
and any of its subtypes; - A set of patterns is exhaustive for a
sealed
type if it is exhaustive for its permitted direct subtypes; - A set of patterns containing a record pattern
R(P)
is exhaustive forR
ifP
is exhaustive for the component type ofR
; and - A set of patterns containing the record patterns
R(P0) .. R(Pn)
is exhaustive forR
if the set of patternsP0..Pn
is exhaustive forR
’s component type.
The remainder of an exhaustive set of patterns is the set of values that do not match any pattern in the set. These are the ‘weird’ values that might appear at run time, but for which it would be unreasonable or even counter-productive to require that they be explicitly handled every time in every switch
.