-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Match tuple formatting #18147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Match tuple formatting #18147
Conversation
to have consistent formatting with Black. Previously, we only checked if the first element is a Bracket, "[", which gives false positives for tuples containing lists as lists are wrapped in brackets. Tuples not necissarily.
|
by replacing "scan" and "filter" with "try_fold" which allows early stopping and is generally more succinct.
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I've a small suggestion which may make this less fragile.
| if source[pattern.range()].starts_with('[') { | ||
| SequenceType::List | ||
| // A top-level comma indicates a tuple with a leading list, not a list | ||
| let is_list = | ||
| SimpleTokenizer::new(source, TextRange::new(pattern.start(), pattern.end())) | ||
| .skip_trivia() | ||
| .try_fold(0, |depth, token| match token.kind() { | ||
| SimpleTokenKind::LBracket => Ok(depth + 1), | ||
| SimpleTokenKind::RBracket => Ok(depth - 1), | ||
| SimpleTokenKind::Comma if depth == 0 => Err(()), | ||
| _ => Ok(depth), | ||
| }); | ||
| match is_list { | ||
| Err(()) => SequenceType::TupleNoParens, | ||
| Ok(_) => SequenceType::List, | ||
| } | ||
| } else if source[pattern.range()].starts_with('(') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the proper fix here is to only look at the text of the outer pattern by using:
let text = &source[TextRange::new(pattern.start(), pattern.values.first().map(Ranged::end).unwrap_or(pattern.end())];This will return `` case [], []: because the `[` belongs to the inner pattern (and not the outer.
We can then use the same text for the elif on line 97
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if my understanding is correct.
On the on the first example of the issue it gives
[crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs:82:29] &source[TextRange::new(pattern.start(),
pattern.patterns.first().map(Ranged::end).unwrap_or(pattern.end()),)] = "[]"Do you mean something like this:
pub(crate) fn from_pattern(pattern: &PatternMatchSequence, source: &str) -> SequenceType {
let first_element = dbg!(&source[TextRange::new(
pattern.start(),
pattern
.patterns
.first()
.map(Ranged::end)
.unwrap_or(pattern.end()),
)]);
if first_element.starts_with("[") {
if first_element == &source[pattern.range()] {
SequenceType::List
} else {
SequenceType::TupleNoParens
}
} else if first_element.starts_with('(') {(this breaks format and black compatibility tests, though).
We cannot check for a leading [ because that way we cannot discriminate between a tuple that starts with a list, [],_ and a list, [_].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I messed up the code example. We should take the start of the first pattern, not the end
let text = &source[TextRange::new(pattern.start(), pattern.values.first().map(Ranged::start).unwrap_or(pattern.end())];There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation
pub(crate) fn from_pattern(pattern: &PatternMatchSequence, source: &str) -> SequenceType {
let text = &source[TextRange::new(
pattern.start(),
pattern
.patterns // Note: I use `.patterns` as `.values` doesn't exist.
.first()
.map(Ranged::start)
.unwrap_or(pattern.end()),
)];
if text.starts_with('[') {
SequenceType::List
} else if text.starts_with('(') {
...fails for nested lists. E.g. the following Black consistency test:
65 │- case [
57 │+ case (
66 58 │ [[5], (6)],
67 59 │ [7],
68 │- ]:
60 │+ ):
69 61 │ pass
70 62 │ case _:The rationale behind counting top level commata was they are the defining characteristic of a tuple.
Or am I still misunderstanding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is good! It reduces another incompatibility with black. But I do think that we also need to account for a trailing comma like this
Index: crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs b/crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs
--- a/crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs (revision 660375d429c41878c9a8866c383d5f7ec060c229)
+++ b/crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs (date 1747634249453)
@@ -79,9 +79,27 @@
impl SequenceType {
pub(crate) fn from_pattern(pattern: &PatternMatchSequence, source: &str) -> SequenceType {
- if source[pattern.range()].starts_with('[') {
+ let before_first_pattern = &source[TextRange::new(
+ pattern.start(),
+ pattern
+ .patterns
+ .first()
+ .map(Ranged::start)
+ .unwrap_or(pattern.end()),
+ )];
+
+ let after_last_pattern = &source[TextRange::new(
+ pattern
+ .patterns
+ .last()
+ .map(Ranged::end)
+ .unwrap_or(pattern.start()),
+ pattern.end(),
+ )];
+
+ if before_first_pattern.starts_with('[') && !after_last_pattern.ends_with(',') {
SequenceType::List
- } else if source[pattern.range()].starts_with('(') {
+ } else if before_first_pattern.starts_with('(') {
// If the pattern is empty, it must be a parenthesized tuple with no members. (This
// branch exists to differentiate between a tuple with and without its own parentheses,
// but a tuple without its own parentheses must have at least one member.)To correctly hanlde
match more := (than, one), indeed,:
case [[5], (6)],:
pass
case _:
passbut we should add a test for this. We should also add tests that this change doesn't the formatting of any already formatted code (where Ruff added the extra [ ] because we otherwise need to gate this change behind preview mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allright, i've adapted the code according to your suggestions in fdcebbc
The updates to the Black compatibility are in 0ac564f -- thanks for pointing out that the changes are desired. I was very much in a red things = bad mindset. Took some time to learn about how you do this after. :)
The previously applied formatting is not changed back. So i don't think preview gating is necessary. I added a test for it in 019e5c0.
|
Nice, thank you |
Closes #17969
Summary
Add check to sequence type determination of
matchformatting to distinguish between a list and a tuple whose leading element is a list.Previously, we only checked for an opening bracket,
[, now we additionally check for top level commata to determine if it is a tuple and return the according type.The resulting formatting behaviour is consistent with Black.
Test Plan
Added a test case to ruff's tests.
Note: this case is not covered in tests imported from Black.
Manually compared Black, and this fix on the example from the issue.