PHP 8.0 introduces a new type of control structure: match expressions.
Ref: https://wiki.php.net/rfc/match_expression_v2
function matchWithDefault($i) {
    return match ($i) {
        1 => 1,
        2 => 2,
        default => 'default',
    };
}
I've been working on adding support for these to PHP_CodeSniffer and believe I am nearly finished but, no surprise, I'm left with three questions on which I'd like a second opinion from @gsherwood and anyone else who has a view.
Double arrow handling/tokenizing
As can be seen in the above code sample, match expressions introduce yet another use for the double arrow.
I've run some tests, and code like the below appears to be supported:
$array = array(
    match ($test) { 1 => 'a', 2 => 'b' },
);
$array = [
    match ($test) { 1 => 'a', 2 => 'b' } => 'dynamic keys, woho!',
];
While I honestly and truly hope that I will never in my lifetime come across code such as the second sample in a real-life codebase, it is valid in PHP 8.0+.
So, that brings me to my question: to prevent issues with array arrow alignment sniffs, should the double arrow be tokenized differently when used in match expressions? I imagine a custom T_MATCH_ARROW token could be used.
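To make the question concrete, here is a rough sketch of how the arrows could be told apart by tracking the innermost open bracket. It assumes PHP 8.0+ and uses the native PhpToken class rather than the PHP_CodeSniffer tokenizer; the function name classifyDoubleArrows() and the labels it returns are made up for illustration and are not part of any existing API.

```php
<?php
// Sketch only: requires PHP 8.0+ (T_MATCH, T_FN, T_ATTRIBUTE and PhpToken).

/**
 * Hypothetical helper: label each `=>` as 'match', 'array', 'fn' or 'other'.
 *
 * @return string[] One label per double arrow, in source order.
 */
function classifyDoubleArrows(string $code): array
{
    $stack    = [];   // Open brackets, innermost last: 'matchcond', 'match', 'array' or 'other'.
    $fnDepths = [];   // Bracket depth at which each unresolved `fn` keyword was seen.
    $next     = null; // Context to give to the next opening bracket, if any.
    $labels   = [];

    foreach (PhpToken::tokenize($code) as $token) {
        if ($token->isIgnorable()) {
            continue; // Whitespace and comments never change the context.
        }

        $pending = $next;
        $next    = null;
        $id      = $token->id;

        if ($id === T_MATCH) {
            $next = 'matchcond'; // The next `(` opens the match condition.
        } elseif ($id === T_ARRAY) {
            $next = 'array';     // The next `(`, if any, opens a long array.
        } elseif ($id === T_FN) {
            $fnDepths[] = count($stack);
        } elseif ($id === ord('(')) {
            $stack[] = in_array($pending, ['matchcond', 'array'], true) ? $pending : 'other';
        } elseif ($id === ord(')')) {
            if (array_pop($stack) === 'matchcond') {
                $next = 'match'; // The `{` that follows opens the match body.
            }
        } elseif ($id === ord('[')) {
            $stack[] = 'array';  // Short arrays and short lists alike.
        } elseif (in_array($id, [ord('{'), T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES], true)) {
            $stack[] = ($pending === 'match') ? 'match' : 'other';
        } elseif ($id === T_ATTRIBUTE) {
            $stack[] = 'other';  // `#[` is closed by a plain `]`.
        } elseif ($id === ord(']') || $id === ord('}')) {
            array_pop($stack);
        } elseif ($id === T_DOUBLE_ARROW) {
            $top = $stack ? $stack[array_key_last($stack)] : null;
            if ($fnDepths && $fnDepths[array_key_last($fnDepths)] === count($stack)) {
                array_pop($fnDepths); // First arrow at the depth of a pending `fn`.
                $labels[] = 'fn';
            } else {
                $labels[] = in_array($top, ['match', 'array'], true) ? $top : 'other';
            }
        }
    }

    return $labels;
}

// The "dynamic keys" sample from above, on one line.
$labels = classifyDoubleArrows(
    '<?php $array = [ match ($test) { 1 => "a", 2 => "b" } => "dynamic keys, woho!" ];'
);
// $labels is ['match', 'match', 'array'].
```

The sketch labels every arrow directly inside a match body as a match arrow and every arrow directly inside square brackets or array() as an array arrow. Whether the real tokenizer would do this in one pass or in a later retokenizing step is left open.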
Mind: if so, code like the below would need to tokenize correctly:
$array = [
    // In order: match arrow, array arrow, match arrow, array arrow, match arrow, array arrow, match arrow.
    match ($test) { 1 => [ 1 => 'a'], 2 => 'b' } => match ($test) { 1 => [ 1 => 'a'], 2 => 'b' },
];
Scope closer sharing with arrow functions
function matchInArrowFunction($x) {
    $fn = fn($x) => match(true) {
        1, 2, 3, 4, 5 => 'foo',
        default => 'bar',
    };
    return $fn;
}
For the above code sample, which would be valid in PHP 8, the scope closer of the arrow function and the scope closer of the match expression would both be the }, as things are at the moment.
So what should be the scope_condition for the close curly? And what should be in the conditions array for the code between the match curly braces?
I would personally find it more intuitive for the match to be the scope owner when writing sniffs.
So, what if, for the arrow function, the ; after the match expression were treated as the scope closer (if available)?
That would line up with simpler arrow function code like the below, where the ; is also the scope closer for the arrow function.
$fn = fn($x) => 10 * $x;
Default case scope start/end
The cases in a match expression have no break or return statement, yet default is treated as a scope and the tokenizer tries to assign it a scope opener and closer, so this becomes interesting.
The scope opener is clearly the T_DOUBLE_ARROW (or T_MATCH_ARROW, if it gets its own token). The scope closer is either a T_COMMA or the T_CLOSE_CURLY_BRACKET shared with T_MATCH. In both cases, we need to make sure the scope closer doesn't get confused with commas or curly braces belonging to, for instance, a short array being returned, a nested match expression, or another construct which also uses curly braces.
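As a rough illustration of that search, the sketch below walks forward from a case's arrow and skips anything nested inside brackets, so that only a comma or closing curly at the case's own level ends it. It assumes PHP 8.0+ and the native PhpToken class rather than the PHP_CodeSniffer token array; findMatchArmEnd() is a hypothetical name, not an existing function.

```php
<?php
// Sketch only: requires PHP 8.0+ (PhpToken, T_ATTRIBUTE).

/**
 * Hypothetical helper: find the token that ends the body of one match case.
 *
 * @param PhpToken[] $tokens     Token stream from PhpToken::tokenize().
 * @param int        $arrowIndex Index of the case's `=>` token.
 *
 * @return int|null Index of the closing `,` or `}`, or null if none is found.
 */
function findMatchArmEnd(array $tokens, int $arrowIndex): ?int
{
    $openers = [ord('('), ord('['), ord('{'), T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES, T_ATTRIBUTE];
    $closers = [ord(')'), ord(']'), ord('}')];
    $depth   = 0; // Nesting depth relative to the case body.

    for ($i = $arrowIndex + 1, $count = count($tokens); $i < $count; $i++) {
        $id = $tokens[$i]->id;

        if (in_array($id, $openers, true)) {
            $depth++;
        } elseif (in_array($id, $closers, true)) {
            if ($depth === 0) {
                // An unmatched closer ends the case only if it is the match's own `}`.
                return ($id === ord('}')) ? $i : null;
            }
            $depth--;
        } elseif ($id === ord(',') && $depth === 0) {
            return $i; // A comma at the case's own level.
        }
    }

    return null;
}
```

Given the index of a default => arrow whose body is a short array or a nested match expression, the function returns the comma or closing curly at the outer level and ignores the ones inside the body.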
A related question is whether the T_DEFAULT in a T_MATCH should have its own token, so that the scope setting for a "normal" switch default case doesn't get confused by the different scope openers/closers allowed for the switch default and the match default.
Either way, I'd appreciate some input on this so I can finish off the PR.