Skip to content

Latest commit

 

History

History
299 lines (235 loc) · 12.3 KB

3573-is.md

File metadata and controls

299 lines (235 loc) · 12.3 KB

Summary

Introduce an is operator in Rust 2024, to test if an expression matches a pattern and bind the variables in the pattern.

Motivation

This RFC introduces an is operator that tests if an expression matches a pattern, and if so, binds the variables bound by the pattern and evaluates to true. This operator can be used as part of any boolean expression, and combined with boolean operators.

Previous discussions around let-chains have treated the is operator as an alternative on the basis that they serve similar functions, rather than proposing that they can and should coexist. This RFC proposes that we allow let-chaining and add the is operator.

if-let provides a natural extension of let for fallible bindings, and highlights the binding by putting it on the left, like a let statement. Allowing developers to chain multiple let operations and other expressions in the if condition provides a natural extension that simplifies what would otherwise require complex nested conditionals. As the let-chains RFC notes, this is a feature people already expect to work.

The is operator similarly allows developers to chain multiple match-and-bind operations and simplify what would otherwise require complex nested conditionals. However, the is operator allows writing and reading a pattern match from left-to-right, which reads more naturally in many circumstances. For instance, consider an expression like x is Some(y) && y > 5; that boolean expression reads more naturally from left-to-right than let Some(y) = x && y > 5.

This is even more true at the end of a longer expression chain, such as x.method()?.another_method().await? is Some(y). Rust method chaining and ? and .await all encourage writing code that reads in operation order from left to right, and is fits naturally at the end of such a sequence.

Having an is operator would also help to reduce the proliferation of methods on types such as Option and Result, by allowing prospective users of those methods to write a condition using is instead. While any such condition could equivalently be expressed using let-chains, the binding would then move further away from the condition expression referencing the binding, which would result in a less natural reading order for the expression.

Consider the following examples:

if expr_producing_option().is_some_and(|v| condition(v))

if let Some(v) = expr_producing_option() && condition(v)

if expr_producing_option() is Some(v) && condition(v)

The condition using is is a natural translation from the is_some_and method, whereas the if-let construction requires reversing the binding of v and the expression producing the option. This seems sufficiently cumbersome in some cases that the absence of is would motivate continued use and development of helper methods.

Guide-level explanation

Rust provides an is operator, which can be used in any expression: EXPR is PATTERN

This operator tests if the value of EXPR matches the specified PATTERN; see https://doc.rust-lang.org/reference/patterns.html for details on patterns.

If the EXPR matches the PATTERN, the is expression evaluates to true, and additionally binds any bindings specified in PATTERN in the current scope for code subsequently executed along the path where the is expression is known to be true.

For example:

if an_option is Some(x) && x > 3 {
    println!("{x}");
}

The bindings in the pattern are not bound along any code path potentially reachable where the expression did not match:

if (an_option is Some(x) && x > 3) || (more_conditions /* x is not bound here*/) {
    // x is not bound here
} else {
    // x is not bound here
}
// x is not bound here

The pattern may use alternation (within parentheses), but must have the same bindings in every alternative:

if color is (RGB(r, g, b) | RGBA(r, g, b, _)) && r == b && g < 10 {
    println!("condition met")
}

// ERROR: `a` is not bound in all alternatives of the pattern
if color is (RGB(r, g, b) | RGBA(r, g, b, a)) && r == b && g < 10 {
    println!("condition met")
}

is may appear anywhere a boolean expression is accepted:

func(x is Some(y) && y > 3);

The is operator may not appear as a statement, because the bindings won't be usable after the end of the statement, and because the boolean return value is ignored. The compiler will issue a deny-by-default lint in that case, and suggest using let if the user wants to bind a pattern for the rest of the scope.

// deny-by-default lint: the binding `x` isn't usable after the statement, and the value of `is` is ignored.
// Suggestion: use `let` to introduce a binding in this scope: `let x = an_expression();`.
an_expression() is x;

Reference-level explanation

Add a new operator expression, IsExpression:

Syntax
IsExpression :
   Expression is PatternNoTopAlt

Add is to the operator precedence table, at the same precedence level as ==, and likewise non-associative (requiring parentheses).

Detect is appearing as a top-level statement and produce an error, with a rustfix suggestion to use let instead.

Drawbacks

Introducing both the is operator and let-chains would provide two different ways to do a pattern match as part of a condition. Having more than one way to do something could lead people to wonder if there's a difference; we would need to clearly communicate that they serve similar purposes.

An is operator will produce a name conflict with the is method on dyn Any in the standard library, and with the (relatively few) methods named is in the ecosystem. This will not break any existing Rust code, as the operator will only exist in the Rust 2024 edition and newer. The Rust standard library and any other library that wants to avoid requiring the use of r#is in Rust 2024 and newer could provide aliases of these methods under a new name; for instance, the standard library could additionally provide Any::is under a new name is_type.

Rationale and alternatives

As noted in the motivation section, adding the is operator allows writing pattern matches from left-to-right, which reads more naturally in some conditionals and fits well with method chains and similar. As noted under prior art, other languages such as C# already have this exact operator for this exact purpose.

We could choose not to add this operator, and have only let-chains. This would provide equivalent functionality, semantically; however, it would force pattern-matches to be written with the pattern on the left, which won't read as naturally in some expressions. Notably, this seems unlikely to do as effective a job of reducing the desire for is_variant() methods and helpers like is_some_and(...).

We could choose to add the is operator and not add let-chains. However, many people already expect let-chains to work as an obvious extrapolation from seeing if let/while let syntax.

We could add this operator using punctuation instead (e.g. ~). However, there is no "natural" operator that conveys "pattern match" to people (the way that + is well-known as addition). Using punctuation also seems likely to make the language more punctuation-heavy, less obvious, and less readable.

We could add this operator using a different name. Most other names, however, seem likely to be longer and to fit people's expectations less well, given the widespread precedent of is_xyz() methods. The most obvious choice would be something like matches.

We could permit top-level alternation in the pattern. However, this seems likely to produce visual and semantic ambiguity. This is technically a one-way door, in that x is true|false would parse differently depending on our decision here; however, the use of a pattern-match for a boolean here seems unlikely, redundant, and in poor style. In any case, the compiler could easily detect most attempts at top-level alternation and suggest adding parentheses.

Prior art

let-chains provide prior art for having this functionality in the language.

The matches! macro similarly provides precedent for having pattern matches in boolean expressions. is would likely be a natural replacement for most uses of matches!.

Many Rust enums provide is_variant() functions:

  • is_some() and is_none() for Option
  • is_ok() and is_err() for Result
  • is_eq() and is_lt() and is_gt() for Ordering
  • is_ipv4() and is_ipv6() for SocketAddr
  • is_break() and is_continue() for ControlFlow
  • is_borrowed() and is_owned() for Cow
  • is_pending() and is_ready() for Poll

These functions serve as precedent for using the word is for this purpose. (Note that this RFC does not propose to encourage people to switch away from these methods. Among other things, users may wish to use e.g. Option::is_some as a function pointer rather than having to write a closure using is.)

There's extensive prior art in Rust for having more than one way to accomplish the same thing. You can write a for loop or you can write iterator code. You can use combinators or write a match or write an if let. You can write let-else or use a match. You can write x > 3 or 3 < x. You can write x + 3 or 3 + x. Rust does not normatively require one alternative. We do, in general, avoid adding constructs that are entirely redundant with each other. However, this RFC proposes that the constructs are not redundant: some code will be more readable with let-chains, and some code will be more readable with is.

Kotlin has a similar is operator for casts to a type, which are similarly flow-sensitive: in the code path where the is test has succeeded, subsequent code can use the tested value as that type.

C# has an is operator for type-matching and pattern matching, which supports the same style of chaining as the proposed is operator for Rust. For instance, the following are valid C# code:

if (expr is int x && other_expr is int y)
{
    func(x - y);
}

if (bounding_box is { P1.X: 0 } or { P2.Y: 0 })
{
    check(bounding_box);
}

if (GetData() is var data
    && data.Field == value
    && data.OtherField is [2, 4, 6])
{
    show(data);
}

Unresolved questions

Can we make x is 10..=20 work without requiring the user to parenthesize the pattern, or would that not be possible with our precedence? We could potentially make this work over an edition boundary, but would it be worth the churn?

Pattern types propose using is for a different purpose, in types rather than in expressions: u32 is 1.. would be a u32 that can never be 0, and Result<T, E> is Err(_) would be a Result<T, E> that can never be the Ok variant. Can we introduce the is operator in expressions without conflicting with its potential use in pattern types? We could require the use of parentheses in v as u32 is 1.., to force it to be parsed as either (v as u32) is 1.. or v as (u32 is 1..) (assuming that pattern types can be used in as in the first place).

What new method name should we add to dyn Any as an alias for is? (This does not need to be settled in this RFC, but should be handled contemporaneously with the implementation, and tracked in the tracking issue.)

Should we add a rustc lint or clippy lint (with rustfix suggestion) to turn matches! into is?

Future possibilities

As with let-chains, we could potentially allow cases involving || to bind the same patterns, such as expr1 is V1(value) || expr2 is V2(value). This RFC does not propose allowing that syntax to bind variables, to avoid confusing code.

If in a future edition we decide to allow this for let chains, we should similarly allow it for is. This RFC does not make or recommend such a future proposal.

We could choose to offer a clippy lint or even a rustc lint, with a rustfix suggestion, to turn matches! into is.