Skip to content

Conversation

@maxmynter
Copy link
Contributor

@maxmynter maxmynter commented May 17, 2025

Closes #17969

Summary

Add check to sequence type determination of match formatting to distinguish between a list and a tuple whose leading element is a list.

Previously, we only checked for an opening bracket, [, now we additionally check for top level commata to determine if it is a tuple and return the according type.

The resulting formatting behaviour is consistent with Black.

Test Plan

Added a test case to ruff's tests.
Note: this case is not covered in tests imported from Black.

Manually compared Black, and this fix on the example from the issue.

maxmynter added 2 commits May 17, 2025 00:44
to have consistent formatting with Black.

Previously, we only checked if the first element is a Bracket, "[",
which gives false positives for tuples containing lists as lists
are wrapped in brackets. Tuples not necissarily.
@maxmynter maxmynter requested a review from MichaReiser as a code owner May 17, 2025 04:51
@github-actions
Copy link
Contributor

github-actions bot commented May 17, 2025

ruff-ecosystem results

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

maxmynter added 3 commits May 17, 2025 01:15
by replacing "scan" and "filter" with "try_fold" which allows
early stopping and is generally more succinct.
@MichaReiser MichaReiser added the formatter Related to the formatter label May 17, 2025
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I've a small suggestion which may make this less fragile.

Comment on lines 82 to 97
if source[pattern.range()].starts_with('[') {
SequenceType::List
// A top-level comma indicates a tuple with a leading list, not a list
let is_list =
SimpleTokenizer::new(source, TextRange::new(pattern.start(), pattern.end()))
.skip_trivia()
.try_fold(0, |depth, token| match token.kind() {
SimpleTokenKind::LBracket => Ok(depth + 1),
SimpleTokenKind::RBracket => Ok(depth - 1),
SimpleTokenKind::Comma if depth == 0 => Err(()),
_ => Ok(depth),
});
match is_list {
Err(()) => SequenceType::TupleNoParens,
Ok(_) => SequenceType::List,
}
} else if source[pattern.range()].starts_with('(') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the proper fix here is to only look at the text of the outer pattern by using:

let text = &source[TextRange::new(pattern.start(), pattern.values.first().map(Ranged::end).unwrap_or(pattern.end())];

This will return `` case [], []: because the `[` belongs to the inner pattern (and not the outer.

We can then use the same text for the elif on line 97

Copy link
Contributor Author

@maxmynter maxmynter May 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if my understanding is correct.

On the on the first example of the issue it gives

[crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs:82:29] &source[TextRange::new(pattern.start(),
pattern.patterns.first().map(Ranged::end).unwrap_or(pattern.end()),)] = "[]"

Do you mean something like this:

    pub(crate) fn from_pattern(pattern: &PatternMatchSequence, source: &str) -> SequenceType {
        let first_element = dbg!(&source[TextRange::new(
            pattern.start(),
            pattern
                .patterns
                .first()
                .map(Ranged::end)
                .unwrap_or(pattern.end()),
        )]);
        if first_element.starts_with("[") {
            if first_element == &source[pattern.range()] {
                SequenceType::List
            } else {
                SequenceType::TupleNoParens
            }
        } else if first_element.starts_with('(') {

(this breaks format and black compatibility tests, though).

We cannot check for a leading [ because that way we cannot discriminate between a tuple that starts with a list, [],_ and a list, [_].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I messed up the code example. We should take the start of the first pattern, not the end

let text = &source[TextRange::new(pattern.start(), pattern.values.first().map(Ranged::start).unwrap_or(pattern.end())];

Copy link
Contributor Author

@maxmynter maxmynter May 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation

    pub(crate) fn from_pattern(pattern: &PatternMatchSequence, source: &str) -> SequenceType {
        let text = &source[TextRange::new(
            pattern.start(),
            pattern
                .patterns // Note: I use `.patterns` as `.values` doesn't exist.
                .first()
                .map(Ranged::start)
                .unwrap_or(pattern.end()),
        )];
        if text.starts_with('[') {
            SequenceType::List
        } else if text.starts_with('(') {
...

fails for nested lists. E.g. the following Black consistency test:

   65       │-    case [
         57 │+    case (
   66    58 │         [[5], (6)],
   67    59 │         [7],
   68       │-    ]:
         60 │+    ):
   69    61 │         pass
   70    62 │     case _:

The rationale behind counting top level commata was they are the defining characteristic of a tuple.

Or am I still misunderstanding?

Copy link
Member

@MichaReiser MichaReiser May 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is good! It reduces another incompatibility with black. But I do think that we also need to account for a trailing comma like this

Index: crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs b/crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs
--- a/crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs	(revision 660375d429c41878c9a8866c383d5f7ec060c229)
+++ b/crates/ruff_python_formatter/src/pattern/pattern_match_sequence.rs	(date 1747634249453)
@@ -79,9 +79,27 @@
 
 impl SequenceType {
     pub(crate) fn from_pattern(pattern: &PatternMatchSequence, source: &str) -> SequenceType {
-        if source[pattern.range()].starts_with('[') {
+        let before_first_pattern = &source[TextRange::new(
+            pattern.start(),
+            pattern
+                .patterns
+                .first()
+                .map(Ranged::start)
+                .unwrap_or(pattern.end()),
+        )];
+
+        let after_last_pattern = &source[TextRange::new(
+            pattern
+                .patterns
+                .last()
+                .map(Ranged::end)
+                .unwrap_or(pattern.start()),
+            pattern.end(),
+        )];
+
+        if before_first_pattern.starts_with('[') && !after_last_pattern.ends_with(',') {
             SequenceType::List
-        } else if source[pattern.range()].starts_with('(') {
+        } else if before_first_pattern.starts_with('(') {
             // If the pattern is empty, it must be a parenthesized tuple with no members. (This
             // branch exists to differentiate between a tuple with and without its own parentheses,
             // but a tuple without its own parentheses must have at least one member.)

To correctly hanlde

match more := (than, one), indeed,:
    case [[5], (6)],:
        pass
    case _:
        pass

but we should add a test for this. We should also add tests that this change doesn't the formatting of any already formatted code (where Ruff added the extra [ ] because we otherwise need to gate this change behind preview mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allright, i've adapted the code according to your suggestions in fdcebbc

The updates to the Black compatibility are in 0ac564f -- thanks for pointing out that the changes are desired. I was very much in a red things = bad mindset. Took some time to learn about how you do this after. :)

The previously applied formatting is not changed back. So i don't think preview gating is necessary. I added a test for it in 019e5c0.

@maxmynter maxmynter requested a review from MichaReiser May 21, 2025 15:30
@MichaReiser
Copy link
Member

Nice, thank you

@MichaReiser MichaReiser added the bug Something isn't working label May 22, 2025
@MichaReiser MichaReiser merged commit bdf4884 into astral-sh:main May 22, 2025
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working formatter Related to the formatter

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inconsistent formatting of match-case on [] and _

2 participants