Patterns

Syntax
Pattern :
      |? PatternNoTopAlt ( | PatternNoTopAlt )*

PatternNoTopAlt :
      PatternWithoutRange
   | RangePattern

PatternWithoutRange :
      LiteralPattern
   | IdentifierPattern
   | WildcardPattern
   | RestPattern
   | ObsoleteRangePattern
   | ReferencePattern
   | StructPattern
   | TupleStructPattern
   | TuplePattern
   | GroupedPattern
   | SlicePattern
   | PathPattern
   | MacroInvocation

Patterns are used to match values against structures and to, optionally, bind variables to values inside these structures. They are also used in variable declarations and parameters for functions and closures.

The pattern in the following example does four things:

  • Tests if person has the car field filled with something.
  • Tests if the person's age field is between 13 and 19, and binds its value to the person_age variable.
  • Binds a reference to the name field to the variable person_name.
  • Ignores the rest of the fields of person. The remaining fields can have any value and are not bound to any variables.
# struct Car;
# struct Computer;
# struct Person {
#     name: String,
#     car: Option<Car>,
#     computer: Option<Computer>,
#     age: u8,
# }
# let person = Person {
#     name: String::from("John"),
#     car: Some(Car),
#     computer: None,
#     age: 15,
# };
if let
    Person {
        car: Some(_),
        age: person_age @ 13..=19,
        name: ref person_name,
        ..
    } = person
{
    println!("{} has a car and is {} years old.", person_name, person_age);
}

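Patterns are used in:

  • let declarations
  • Function and closure parameters
  • match expressions
  • if let expressions
  • while let expressions
  • for expressions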
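
As a rough illustration of the contexts in which patterns appear, the sketch below uses one small pattern in each of them. The names `add_pair` and `pattern_contexts` are hypothetical and exist only for this example.

```rust
// Hypothetical helper: the parameter itself is a tuple pattern.
fn add_pair((a, b): (i32, i32)) -> i32 {
    a + b
}

// Hypothetical function that records one line per pattern context.
fn pattern_contexts() -> Vec<String> {
    let mut log = Vec::new();

    // `let` declaration with a tuple pattern.
    let (x, y) = (1, 2);
    log.push(format!("let: {} {}", x, y));

    // Function parameter pattern.
    log.push(format!("fn: {}", add_pair((x, y))));

    // Closure parameter pattern.
    let swap = |(a, b): (i32, i32)| (b, a);
    log.push(format!("closure: {:?}", swap((x, y))));

    // `match` expression with a literal and a wildcard pattern.
    let label = match x {
        1 => "one",
        _ => "other",
    };
    log.push(format!("match: {}", label));

    // `if let` expression with a refutable pattern.
    if let Some(n) = Some(3) {
        log.push(format!("if let: {}", n));
    }

    // `while let` expression: loops until the pattern stops matching.
    let mut stack = vec![1, 2];
    while let Some(top) = stack.pop() {
        log.push(format!("while let: {}", top));
    }

    // `for` expression: the loop variable is a pattern.
    for (i, ch) in "ab".chars().enumerate() {
        log.push(format!("for: {} {}", i, ch));
    }

    log
}

fn main() {
    for line in pattern_contexts() {
        println!("{}", line);
    }
}
```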

Destructuring

Patterns can be used to destructure structs, enums, and tuples. Destructuring breaks up a value into its component pieces. The syntax used is almost the same as when creating such values. In a pattern whose scrutinee expression has a struct, enum or tuple type, a placeholder (_) stands in for a single data field, whereas a wildcard .. stands in for all the remaining fields of a particular variant. When destructuring a data structure with named (but not numbered) fields, it is allowed to write fieldname as a shorthand for fieldname: fieldname.

# enum Message {
#     Quit,
#     WriteString(String),
#     Move { x: i32, y: i32 },
#     ChangeColor(u8, u8, u8),
# }
# let message = Message::Quit;
match message {
    Message::Quit => println!("Quit"),
    Message::WriteString(write) => println!("{}", &write),
    Message::Move{ x, y: 0 } => println!("move {} horizontally", x),
    Message::Move{ .. } => println!("other move"),
    Message::ChangeColor { 0: red, 1: green, 2: _ } => {
        println!("color change, red: {}, green: {}", red, green);
    }
};

Refutability

A pattern is said to be refutable when it has the possibility of not being matched by the value it is being matched against. Irrefutable patterns, on the other hand, always match the value they are being matched against. Examples:

let (x, y) = (1, 2);               // "(x, y)" is an irrefutable pattern

if let (a, 3) = (1, 2) {           // "(a, 3)" is refutable, and will not match
    panic!("Shouldn't reach here");
} else if let (a, 4) = (3, 4) {    // "(a, 4)" is refutable, and will match
    println!("Matched ({}, 4)", a);
}

Literal patterns

Syntax
LiteralPattern :
      BOOLEAN_LITERAL
   | CHAR_LITERAL
   | BYTE_LITERAL
   | STRING_LITERAL
   | RAW_STRING_LITERAL
   | BYTE_STRING_LITERAL
   | RAW_BYTE_STRING_LITERAL
   | -? INTEGER_LITERAL
   | -? FLOAT_LITERAL

Literal patterns match exactly the same value as what is created by the literal. Since negative numbers are not literals, literal patterns also accept an optional minus sign before the literal, which acts like the negation operator.

Floating-point literals are currently accepted, but due to the complexity of comparing them, they are going to be forbidden on literal patterns in a future version of Rust (see issue #41620).

Literal patterns are always refutable.

Examples:

for i in -2..5 {
    match i {
        -1 => println!("It's minus one"),
        1 => println!("It's a one"),
        2|4 => println!("It's either a two or a four"),
        _ => println!("Matched none of the arms"),
    }
}

Identifier patterns

Syntax
IdentifierPattern :
      ref? mut? IDENTIFIER (@ Pattern ) ?

Identifier patterns bind the value they match to a variable. The identifier must be unique within the pattern. The variable will shadow any variables of the same name in scope. The scope of the new binding depends on the context of where the pattern is used (such as a let binding or a match arm).

Patterns that consist of only an identifier, possibly with a mut, match any value and bind it to that identifier. This is the most commonly used pattern in variable declarations and parameters for functions and closures.

let mut variable = 10;
fn sum(x: i32, y: i32) -> i32 {
    x + y
}

To bind the matched value of a pattern to a variable, use the syntax variable @ subpattern. For example, the following binds the value 2 to e (not the entire range: the range here is a range subpattern).

let x = 2;

match x {
    e @ 1 ..= 5 => println!("got a range element {}", e),
    _ => println!("anything"),
}

By default, identifier patterns bind a variable to a copy of the matched value, or move the matched value into the variable, depending on whether the matched value implements Copy. This can be changed to bind to a reference by using the ref keyword, or to a mutable reference using ref mut. For example:

# let a = Some(10);
match a {
    None => (),
    Some(value) => (),
}

match a {
    None => (),
    Some(ref value) => (),
}

In the first match expression, the value is copied (or moved). In the second match, a reference to the same memory location is bound to the variable value. This syntax is needed because in destructuring subpatterns the & operator can't be applied to the value's fields. For example, the following is not valid:

# struct Person {
#    name: String,
#    age: u8,
# }
# let value = Person { name: String::from("John"), age: 23 };
if let Person { name: &person_name, age: 18..=150 } = value { }

To make it valid, write the following:

# struct Person {
#    name: String,
#    age: u8,
# }
# let value = Person { name: String::from("John"), age: 23 };
if let Person {name: ref person_name, age: 18..=150 } = value { }

Thus, ref is not something that is being matched against. Its objective is exclusively to make the matched binding a reference, instead of potentially copying or moving what was matched.

Path patterns take precedence over identifier patterns. It is an error if ref or ref mut is specified and the identifier shadows a constant.
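
The precedence rule can be seen in a small sketch. The constant `LIMIT` and the function `classify` are hypothetical names chosen for illustration: because `LIMIT` resolves to a constant, the first arm is a path pattern that compares against the constant instead of introducing a new binding.

```rust
// Hypothetical constant, used only for illustration.
const LIMIT: i32 = 10;

fn classify(n: i32) -> String {
    match n {
        // `LIMIT` resolves to the constant above, so this is a path
        // pattern: it compares `n` against 10 and binds nothing.
        LIMIT => String::from("at the limit"),
        // `other` names no constant in scope, so it is an identifier
        // pattern that matches any value and binds it.
        other => format!("some other value: {}", other),
    }
}

fn main() {
    println!("{}", classify(10));
    println!("{}", classify(3));
}
```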

Identifier patterns are irrefutable if the @ subpattern is irrefutable or the subpattern is not specified.

Binding modes

To improve ergonomics, patterns operate in different binding modes that make it easier to bind references to values. When a reference value is matched by a non-reference pattern, the value is automatically dereferenced and the bindings in the pattern are treated as ref or ref mut bindings. Example:

let x: &Option<i32> = &Some(3);
if let Some(y) = x {
    // y was converted to `ref y` and its type is &i32
}

Non-reference patterns include all patterns except bindings, wildcard patterns (_), const patterns of reference types, and reference patterns.

If a binding pattern does not explicitly have ref, ref mut, or mut, then it uses the default binding mode to determine how the variable is bound. The default binding mode starts in "move" mode which uses move semantics. When matching a pattern, the compiler starts from the outside of the pattern and works inwards. Each time a reference is matched using a non-reference pattern, it will automatically dereference the value and update the default binding mode. References will set the default binding mode to ref. Mutable references will set the mode to ref mut unless the mode is already ref in which case it remains ref. If the automatically dereferenced value is still a reference, it is dereferenced and this process repeats.
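
The following sketch walks through the three cases described above: a shared reference, a mutable reference, and a mutable reference reached through a shared one. The function names are hypothetical, and the type annotations in the comments reflect the rules as stated in this section.

```rust
// Matching `&mut Option<i32>` with the non-reference pattern `Some(n)`
// switches the default binding mode to `ref mut`, so `n` is `&mut i32`.
fn bump(slot: &mut Option<i32>) {
    if let Some(n) = slot {
        *n += 1;
    }
}

// Matching `&(String, u32)` with a tuple pattern switches the default
// binding mode to `ref`, so `name` is `&String` and nothing is moved.
fn name_len(pair: &(String, u32)) -> usize {
    let (name, _count) = pair;
    name.len()
}

// The outer `&` sets the mode to `ref`. The inner `&mut` is then
// dereferenced too, but the mode stays `ref`, so `n` is `&i32`.
fn peek(v: &Option<&mut (i32,)>) -> i32 {
    match v {
        Some((n,)) => {
            let shared: &i32 = n;
            *shared
        }
        None => 0,
    }
}

fn main() {
    let mut slot = Some(1);
    bump(&mut slot);
    println!("{:?}", slot);

    println!("{}", name_len(&(String::from("abc"), 7)));

    let mut cell = (5,);
    println!("{}", peek(&Some(&mut cell)));
}
```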

Move bindings and reference bindings can be mixed together in the same pattern. Doing so results in a partial move of the object bound to, and the object cannot be used as a whole afterwards. This applies only if the type cannot be copied.

In the example below, name is moved out of person. Trying to use person as a whole, or person.name, would result in an error because of the partial move.

Example:

# struct Person {
#    name: String,
#    age: u8,
# }
# let person = Person{ name: String::from("John"), age: 23 };
// `name` is moved from person and `age` referenced
let Person { name, ref age } = person;

Wildcard pattern

Syntax
WildcardPattern :
   _

The wildcard pattern (an underscore symbol) matches any value. It is used to ignore values when they don't matter. Inside other patterns it matches a single data field (as opposed to the .. which matches the remaining fields). Unlike identifier patterns, it does not copy, move or borrow the value it matches.

Examples:

# let x = 20;
let (a, _) = (10, x);   // the x is always matched by _
# assert_eq!(a, 10);

// ignore a function/closure param
let real_part = |a: f64, _: f64| { a };

// ignore a field from a struct
# struct RGBA {
#    r: f32,
#    g: f32,
#    b: f32,
#    a: f32,
# }
# let color = RGBA{r: 0.4, g: 0.1, b: 0.9, a: 0.5};
let RGBA{r: red, g: green, b: blue, a: _} = color;
# assert_eq!(color.r, red);
# assert_eq!(color.g, green);
# assert_eq!(color.b, blue);

// accept any Some, with any value
# let x = Some(10);
if let Some(_) = x {}

The wildcard pattern is always irrefutable.
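
That the wildcard neither moves nor borrows can be observed directly with a non-Copy type. In this sketch (the function name is hypothetical), the values matched by `_` remain usable afterwards, whereas a named binding in the same position would have moved them.

```rust
fn wildcard_keeps_value() -> String {
    let s = String::from("still here");
    // `_` does not bind, so `s` is not moved by this statement.
    // Writing `let t = s;` instead would move `s` into `t`.
    let _ = s;

    let pair = (String::from("left"), String::from("right"));
    // Only `pair.1` is moved into `right`; `pair.0` is matched by `_`
    // and therefore stays in place.
    let (_, right) = pair;

    format!("{} / {} / {}", s, pair.0, right)
}

fn main() {
    println!("{}", wildcard_keeps_value());
}
```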

Rest patterns

Syntax
RestPattern :
   ..

The rest pattern (the .. token) acts as a variable-length pattern which matches zero or more elements that have not already been matched by the patterns before and after it. It may only be used in tuple, tuple struct, and slice patterns, and may only appear once as one of the elements in those patterns. It is also allowed in an identifier pattern for slice patterns only.

The rest pattern is always irrefutable.

Examples:

# let words = vec!["a", "b", "c"];
# let slice = &words[..];
match slice {
    [] => println!("slice is empty"),
    [one] => println!("single element {}", one),
    [head, tail @ ..] => println!("head={} tail={:?}", head, tail),
}

match slice {
    // Ignore everything but the last element, which must be "!".
    [.., "!"] => println!("!!!"),

    // `start` is a slice of everything except the last element, which must be "z".
    [start @ .., "z"] => println!("starts with: {:?}", start),

    // `end` is a slice of everything but the first element, which must be "a".
    ["a", end @ ..] => println!("ends with: {:?}", end),

    rest => println!("{:?}", rest),
}

if let [.., penultimate, _] = slice {
    println!("next to last is {}", penultimate);
}

# let tuple = (1, 2, 3, 4, 5);
// Rest patterns may also be used in tuple and tuple struct patterns.
match tuple {
    (1, .., y, z) => println!("y={} z={}", y, z),
    (.., 5) => println!("tail must be 5"),
    (..) => println!("matches everything else"),
}

Range patterns

Syntax
RangePattern :
   RangePatternBound ..= RangePatternBound

ObsoleteRangePattern :
   RangePatternBound ... RangePatternBound

RangePatternBound :
      CHAR_LITERAL
   | BYTE_LITERAL
   | -? INTEGER_LITERAL
   | -? FLOAT_LITERAL
   | PathInExpression
   | QualifiedPathInExpression

Range patterns match values that are within the closed range defined by their lower and upper bounds. For example, a pattern 'm'..='p' will match only the values 'm', 'n', 'o', and 'p'. The bounds can be literals or paths that point to constant values.

A pattern a ..= b must always have a ≤ b. It is an error to have a range pattern 10..=0, for example.

The ... syntax is kept for backwards compatibility.

Range patterns only work on scalar types. The accepted types are:

  • Integer types (u8, i8, u16, i16, usize, isize, etc.).
  • Character types (char).
  • Floating point types (f32 and f64). This is being deprecated and will not be available in a future version of Rust (see issue #41620).

Examples:

# let c = 'f';
let valid_variable = match c {
    'a'..='z' => true,
    'A'..='Z' => true,
    'α'..='ω' => true,
    _ => false,
};

# let ph = 10;
println!("{}", match ph {
    0..=6 => "acid",
    7 => "neutral",
    8..=14 => "base",
    _ => unreachable!(),
});

// using paths to constants:
# const TROPOSPHERE_MIN : u8 = 6;
# const TROPOSPHERE_MAX : u8 = 20;
#
# const STRATOSPHERE_MIN : u8 = TROPOSPHERE_MAX + 1;
# const STRATOSPHERE_MAX : u8 = 50;
#
# const MESOSPHERE_MIN : u8 = STRATOSPHERE_MAX + 1;
# const MESOSPHERE_MAX : u8 = 85;
#
# let altitude = 70;
#
println!("{}", match altitude {
    TROPOSPHERE_MIN..=TROPOSPHERE_MAX => "troposphere",
    STRATOSPHERE_MIN..=STRATOSPHERE_MAX => "stratosphere",
    MESOSPHERE_MIN..=MESOSPHERE_MAX => "mesosphere",
    _ => "outer space, maybe",
});

# pub mod binary {
#     pub const MEGA : u64 = 1024*1024;
#     pub const GIGA : u64 = 1024*1024*1024;
# }
# let n_items = 20_832_425;
# let bytes_per_item = 12;
if let size @ binary::MEGA..=binary::GIGA = n_items * bytes_per_item {
    println!("It fits and occupies {} bytes", size);
}

# trait MaxValue {
#     const MAX: u64;
# }
# impl MaxValue for u8 {
#     const MAX: u64 = (1 << 8) - 1;
# }
# impl MaxValue for u16 {
#     const MAX: u64 = (1 << 16) - 1;
# }
# impl MaxValue for u32 {
#     const MAX: u64 = (1 << 32) - 1;
# }
// using qualified paths:
println!("{}", match 0xfacade {
    0 ..= <u8 as MaxValue>::MAX => "fits in a u8",
    0 ..= <u16 as MaxValue>::MAX => "fits in a u16",
    0 ..= <u32 as MaxValue>::MAX => "fits in a u32",
    _ => "too big",
});

Range patterns for integer types (other than usize and isize) and char types are irrefutable when they span the entire set of possible values of a type. For example, 0u8..=255u8 is irrefutable. The range of values for an integer type is the closed range from its minimum to its maximum value. The range of values for a char type consists of precisely the ranges containing all Unicode Scalar Values: '\u{0000}'..='\u{D7FF}' and '\u{E000}'..='\u{10FFFF}'.
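
A related consequence is that a match whose range patterns together cover every value of the type needs no wildcard arm. The sketch below (function names are hypothetical) shows this for u8 and for the two char ranges listed above.

```rust
// The two ranges together cover every `u8`, so no `_` arm is needed.
fn half(b: u8) -> &'static str {
    match b {
        0..=127 => "low",
        128..=255 => "high",
    }
}

// The two ranges cover every Unicode Scalar Value; the surrogate code
// points between them are not valid `char` values.
fn side_of_gap(c: char) -> &'static str {
    match c {
        '\u{0000}'..='\u{D7FF}' => "below the surrogate gap",
        '\u{E000}'..='\u{10FFFF}' => "above the surrogate gap",
    }
}

fn main() {
    println!("{}", half(200));
    println!("{}", side_of_gap('a'));
}
```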

Reference patterns

Syntax
ReferencePattern :
   (&|&&) mut? PatternWithoutRange

Reference patterns dereference the pointers that are being matched and, thus, borrow them.

For example, these two matches on int_reference: &i32 are equivalent:

let int_reference = &3;

let a = match *int_reference { 0 => "zero", _ => "some" };
let b = match int_reference { &0 => "zero", _ => "some" };

assert_eq!(a, b);

The grammar production for reference patterns has to match the token && to match a reference to a reference because it is a token by itself, not two & tokens.

Adding the mut keyword dereferences a mutable reference. The mutability must match the mutability of the reference.

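A reference pattern is refutable when its subpattern is refutable. For example, &0 in the match above is refutable, whereas a reference pattern with an irrefutable subpattern, such as &x, is irrefutable.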
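
The `&mut` and `&&` forms can be sketched as follows. The function names are hypothetical; each pattern dereferences the reference it is matched against and binds the i32 behind it by copy.

```rust
// `&mut v` matches a mutable reference; `v` is a copy of the `i32`.
fn take_copy(slot: &mut i32) -> i32 {
    let &mut v = slot;
    v
}

// `&&v` is lexed as the single token `&&` and peels two references.
fn through_two(r: &&i32) -> i32 {
    let &&v = r;
    v
}

// A reference pattern with a literal subpattern, used as a test.
fn is_zero(r: &i32) -> bool {
    matches!(r, &0)
}

fn main() {
    let mut n = 7;
    println!("{}", take_copy(&mut n));
    println!("{}", through_two(&&9));
    println!("{}", is_zero(&0));
}
```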

Struct patterns

Syntax
StructPattern :
   PathInExpression {
      StructPatternElements ?
   }

StructPatternElements :
      StructPatternFields (, | , StructPatternEtCetera)?
   | StructPatternEtCetera

StructPatternFields :
   StructPatternField (, StructPatternField) *

StructPatternField :
   OuterAttribute *
   (
         TUPLE_INDEX : Pattern
      | IDENTIFIER : Pattern
      | ref? mut? IDENTIFIER
   )

StructPatternEtCetera :
   OuterAttribute *
   ..

Struct patterns match struct values that match all criteria defined by their subpatterns. They are also used to destructure a struct.

On a struct pattern, the fields are referenced by name, index (in the case of tuple structs) or ignored by use of ..:

# struct Point {
#     x: u32,
#     y: u32,
# }
# let s = Point {x: 1, y: 1};
#
match s {
    Point {x: 10, y: 20} => (),
    Point {y: 10, x: 20} => (),    // order doesn't matter
    Point {x: 10, ..} => (),
    Point {..} => (),
}

# struct PointTuple (
#     u32,
#     u32,
# );
# let t = PointTuple(1, 2);
#
match t {
    PointTuple {0: 10, 1: 20} => (),
    PointTuple {1: 10, 0: 20} => (),   // order doesn't matter
    PointTuple {0: 10, ..} => (),
    PointTuple {..} => (),
}

If .. is not used, it is required to match all fields:

# struct Struct {
#    a: i32,
#    b: char,
#    c: bool,
# }
# let mut struct_value = Struct{a: 10, b: 'X', c: false};
#
match struct_value {
    Struct{a: 10, b: 'X', c: false} => (),
    Struct{a: 10, b: 'X', ref c} => (),
    Struct{a: 10, b: 'X', ref mut c} => (),
    Struct{a: 10, b: 'X', c: _} => (),
    Struct{a: _, b: _, c: _} => (),
}

The ref and/or mut IDENTIFIER syntax matches any value and binds it to a variable with the same name as the given field.

# struct Struct {
#    a: i32,
#    b: char,
#    c: bool,
# }
# let struct_value = Struct{a: 10, b: 'X', c: false};
#
let Struct{a: x, b: y, c: z} = struct_value;          // destructure all fields
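
The field shorthand with `ref` and `mut` might look like the sketch below. The `Job` struct and `summarize` function are hypothetical; each shorthand binds a variable named after the field.

```rust
// Hypothetical struct for illustration.
struct Job {
    name: String,
    retries: u32,
    urgent: bool,
}

fn summarize(job: Job) -> String {
    // `ref name` is shorthand for `name: ref name` (borrows the String),
    // `mut retries` for `retries: mut retries` (a mutable copy), and
    // `urgent` for `urgent: urgent` (a plain copy).
    let Job { ref name, mut retries, urgent } = job;
    retries += 1;
    format!("{} (attempt {}, urgent: {})", name, retries, urgent)
}

fn main() {
    let job = Job { name: String::from("build"), retries: 2, urgent: false };
    println!("{}", summarize(job));
}
```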

A struct pattern is refutable when one of its subpatterns is refutable.

Tuple struct patterns

Syntax
TupleStructPattern :
   PathInExpression ( TupleStructItems? )

TupleStructItems :
   Pattern ( , Pattern )* ,?

Tuple struct patterns match tuple struct and enum values that match all criteria defined by their subpatterns. They are also used to destructure a tuple struct or enum value.

A tuple struct pattern is refutable when one of its subpatterns is refutable.
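
A minimal sketch of tuple struct patterns, using hypothetical `Meters` and `Shape` types: the pattern on a single-variant tuple struct is irrefutable and can be used in a parameter, while patterns on the variants of `Shape` are refutable and belong in a match.

```rust
// Hypothetical types for illustration.
struct Meters(u32);

enum Shape {
    Square(u32),
    Rect(u32, u32),
}

// Irrefutable tuple struct pattern in parameter position.
fn raw(Meters(value): Meters) -> u32 {
    value
}

// Tuple struct patterns on enum variants, destructuring their fields.
fn area(shape: &Shape) -> u32 {
    match shape {
        Shape::Square(side) => side * side,
        Shape::Rect(w, h) => w * h,
    }
}

// A literal subpattern makes the tuple struct pattern more selective.
fn is_unit_square(shape: &Shape) -> bool {
    matches!(shape, Shape::Square(1))
}

fn main() {
    println!("{}", raw(Meters(12)));
    println!("{}", area(&Shape::Rect(2, 3)));
    println!("{}", is_unit_square(&Shape::Square(1)));
}
```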

Tuple patterns

Syntax
TuplePattern :
   ( TuplePatternItems? )

TuplePatternItems :
      Pattern ,
   | RestPattern
   | Pattern (, Pattern)+ ,?

Tuple patterns match tuple values that match all criteria defined by their subpatterns. They are also used to destructure a tuple.

The form (..) with a single RestPattern is a special form that does not require a comma, and matches a tuple of any size.

The tuple pattern is refutable when one of its subpatterns is refutable.

An example of using tuple patterns:

let pair = (10, "ten");
let (a, b) = pair;

assert_eq!(a, 10);
assert_eq!(b, "ten");

Grouped patterns

Syntax
GroupedPattern :
   ( Pattern )

Enclosing a pattern in parentheses can be used to explicitly control the precedence of compound patterns. For example, a reference pattern next to a range pattern such as &0..=5 is ambiguous and is not allowed, but can be expressed with parentheses.

let int_reference = &3;
match int_reference {
    &(0..=5) => (),
    _ => (),
}

Slice patterns

Syntax
SlicePattern :
   [ SlicePatternItems? ]

SlicePatternItems :
   Pattern (, Pattern)* ,?

Slice patterns can match both arrays of fixed size and slices of dynamic size.

// Fixed size
let arr = [1, 2, 3];
match arr {
    [1, _, _] => "starts with one",
    [a, b, c] => "starts with something else",
};
// Dynamic size
let v = vec![1, 2, 3];
match v[..] {
    [a, b] => { /* this arm will not apply because the length doesn't match */ }
    [a, b, c] => { /* this arm will apply */ }
    _ => { /* this wildcard is required, since the length is not known statically */ }
};

Slice patterns are irrefutable when matching an array as long as each element is irrefutable. When matching a slice, it is irrefutable only in the form with a single .. rest pattern or identifier pattern with the .. rest pattern as a subpattern.
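
The irrefutable forms can be illustrated with let statements, which accept only irrefutable patterns. This is a sketch with hypothetical function names; the array cases rely on the length being known statically, and the slice case uses an identifier pattern with a rest subpattern.

```rust
// Irrefutable on an array: the length is part of the type.
fn array_sum(arr: [i32; 3]) -> i32 {
    let [a, b, c] = arr;
    a + b + c
}

// A rest subpattern on an array binds the remaining elements as an array.
fn split_first(arr: [i32; 3]) -> (i32, [i32; 2]) {
    let [first, rest @ ..] = arr;
    (first, rest)
}

// On a slice, only a lone rest pattern (optionally bound) is irrefutable.
// Here `all` is a `&[i32]` covering the whole slice.
fn whole_len(s: &[i32]) -> usize {
    let [all @ ..] = s;
    all.len()
}

fn main() {
    println!("{}", array_sum([1, 2, 3]));
    println!("{:?}", split_first([1, 2, 3]));
    println!("{}", whole_len(&[4, 5, 6, 7]));
}
```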

Path patterns

Syntax
PathPattern :
      PathInExpression
   | QualifiedPathInExpression

Path patterns are patterns that refer either to constant values or to structs or enum variants that have no fields.

Unqualified path patterns can refer to:

  • enum variants
  • structs
  • constants
  • associated constants

Qualified path patterns can only refer to associated constants.

Constants cannot be a union type. Struct and enum constants must have #[derive(PartialEq, Eq)] (not merely implemented).

Path patterns are irrefutable when they refer to structs, to an enum variant of an enum that has only one variant, or to a constant whose type is irrefutable. They are refutable when they refer to refutable constants or to enum variants of enums with multiple variants.
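
The kinds of path patterns listed above can be sketched together. All item names here (`Marker`, `Light`, `ZERO`, `Limits`) are hypothetical and serve only as illustration.

```rust
// Hypothetical items for illustration.
struct Marker; // a struct with no fields

enum Light {
    Red,
    Green,
}

const ZERO: u8 = 0;

struct Limits;

impl Limits {
    const MAX: u8 = 100;
}

// Constants and associated constants are refutable path patterns.
fn describe(n: u8) -> &'static str {
    match n {
        ZERO => "zero",
        Limits::MAX => "max",
        _ => "in between",
    }
}

// Enum variants without fields; refutable because there are two variants.
fn signal(light: Light) -> &'static str {
    match light {
        Light::Red => "stop",
        Light::Green => "go",
    }
}

// A path pattern that refers to a struct is irrefutable, so it is
// accepted in a `let` statement.
fn unit(m: Marker) -> &'static str {
    let Marker = m;
    "marker"
}

fn main() {
    println!("{}", describe(100));
    println!("{}", signal(Light::Green));
    println!("{}", unit(Marker));
}
```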

Or-patterns

Or-patterns are patterns that match on one of two or more sub-patterns (e.g. A | B | C). They can nest arbitrarily. Syntactically, or-patterns are allowed in any of the places where other patterns are allowed (represented by the Pattern production), with the exceptions of let-bindings and function and closure arguments (represented by the PatternNoTopAlt production).
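
A short sketch of or-patterns at the top level of a match arm and nested inside other patterns. The `Event` enum and the function names are hypothetical. In `payload`, both alternatives bind `c` with the same type, which is what allows the arm body to use it.

```rust
// Hypothetical enum for illustration.
enum Event {
    Key(char),
    Paste(char),
    Click(i32, i32),
}

// Or-patterns nested inside a tuple struct pattern.
fn kind(v: Option<u8>) -> &'static str {
    match v {
        Some(0 | 1) => "tiny",
        Some(2..=9 | 100) => "selected",
        Some(_) | None => "other",
    }
}

// Both alternatives introduce the same binding `c` of the same type.
fn payload(e: &Event) -> Option<char> {
    match e {
        Event::Key(c) | Event::Paste(c) => Some(*c),
        Event::Click(..) => None,
    }
}

fn main() {
    println!("{}", kind(Some(100)));
    println!("{:?}", payload(&Event::Paste('x')));
}
```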

Static semantics

  1. Given a pattern p | q at some depth for some arbitrary patterns p and q, the pattern is considered ill-formed if:

    • the type inferred for p does not unify with the type inferred for q, or
    • the same set of bindings is not introduced in p and q, or
    • the types of any two bindings with the same name in p and q do not unify with respect to types or binding modes.

    In all of the cases above, unification of types is exact and implicit type coercions do not apply.

  2. When type checking an expression match e_s { a_1 => e_1, ... a_n => e_n }, for each match arm a_i which contains a pattern of the form p_i | q_i, the pattern p_i | q_i is considered ill-formed if, at the depth d where it exists, the type of the fragment of e_s at depth d does not unify with p_i | q_i.

  3. With respect to exhaustiveness checking, a pattern p | q is considered to cover p as well as q. For some constructor c(x, ..) the distributive law applies such that c(p | q, ..rest) covers the same set of values as c(p, ..rest) | c(q, ..rest) does. This can be applied recursively until there are no more nested patterns of the form p | q other than those that exist at the top level.

    Note that by "constructor" we do not refer to tuple struct patterns, but rather we refer to a pattern for any product type. This includes enum variants, tuple structs, structs with named fields, arrays, tuples, and slices.

Dynamic semantics

  1. The dynamic semantics of pattern matching a scrutinee expression e_s against a pattern c(p | q, ..rest) at depth d where c is some constructor, p and q are arbitrary patterns, and rest is optionally any remaining potential factors in c, is defined as being the same as that of c(p, ..rest) | c(q, ..rest).

Precedence with other undelimited patterns

As shown elsewhere in this chapter, there are several types of patterns that are syntactically undelimited, including identifier patterns, reference patterns, and or-patterns. Or-patterns always have the lowest precedence. This allows us to reserve syntactic space for a possible future type ascription feature and also to reduce ambiguity. For example, x @ A(..) | B(..) will result in an error that x is not bound in all patterns, and &A(x) | B(x) will result in a type mismatch between x in the different subpatterns.
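
When the or-pattern is meant to be the subpattern, a grouped pattern expresses that intent. The sketch below uses a hypothetical `Token` enum: parentheses make `@` and `&` apply to the whole alternation rather than only to its first alternative.

```rust
// Hypothetical enum for illustration.
#[derive(Debug)]
enum Token {
    A(i32),
    B(i32),
    C,
}

// `whole @ (A(..) | B(..))` binds `whole` in both alternatives.
fn label(t: Token) -> String {
    match t {
        whole @ (Token::A(..) | Token::B(..)) => format!("kept {:?}", whole),
        Token::C => String::from("dropped"),
    }
}

// `&(A(x) | B(x))` dereferences once, then tries both alternatives,
// so `x` is an `i32` in each of them.
fn inner(t: &Token) -> Option<i32> {
    match t {
        &(Token::A(x) | Token::B(x)) => Some(x),
        &Token::C => None,
    }
}

fn main() {
    println!("{}", label(Token::A(1)));
    println!("{:?}", inner(&Token::B(2)));
}
```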