Handle pattern parentheses in FormatPattern #6800

Merged (2 commits) on Aug 25, 2023


@charliermarsh charliermarsh commented Aug 23, 2023

Summary

This PR fixes the duplicate-parenthesis problem that's visible in the tests from #6799. The issue is that we might have parentheses around the entire match-case pattern, like in (1) here:

match foo:
    case (1):
        y = 0

In this case, the inner expression (1) will think it's parenthesized, but we'll also detect the parentheses at the case level -- so they get rendered by the case, then again by the expression. Instead, if we detect parentheses at the case level, we can force-off the parentheses for the pattern using a design similar to the way we handle parentheses on expressions.
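
The double rendering can be reproduced with a toy model. The sketch below is not Ruff's code: `render_pattern`, `render_case`, and this `Parentheses` enum are hypothetical Python stand-ins for the formatter's Rust types, assuming each level only knows whether the source had parentheses around the pattern.

```python
from enum import Enum


class Parentheses(Enum):
    """Hypothetical stand-in for the formatter's parentheses option."""

    PRESERVE = "preserve"  # render parentheses found in the source
    NEVER = "never"  # the parent already handled them; do not render


def render_pattern(text: str, parenthesized_in_source: bool, parentheses: Parentheses) -> str:
    """Render a pattern, adding parentheses unless the parent forced them off."""
    if parenthesized_in_source and parentheses is Parentheses.PRESERVE:
        return f"({text})"
    return text


def render_case(text: str, parenthesized_in_source: bool, force_off: bool) -> str:
    """Render `case <pattern>:`; the case level also sees the parentheses."""
    option = Parentheses.NEVER if force_off else Parentheses.PRESERVE
    pattern = render_pattern(text, parenthesized_in_source, option)
    if parenthesized_in_source:
        pattern = f"({pattern})"  # parentheses rendered at the case level
    return f"case {pattern}:"


print(render_case("1", True, force_off=False))  # case ((1)):  (both levels render)
print(render_case("1", True, force_off=True))  # case (1):  (only the case renders)
```

Forcing the option off from the parent keeps the decision in one place, so the pattern never has to guess whether someone above it already emitted the parentheses.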

Closes #6753.

Test Plan

cargo test

@charliermarsh charliermarsh added the formatter Related to the formatter label Aug 23, 2023

github-actions bot commented Aug 23, 2023

PR Check Results

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00      4.7±0.05ms     8.7 MB/sec    1.00      4.7±0.04ms     8.6 MB/sec
formatter/numpy/ctypeslib.py               1.02    999.6±5.17µs    16.7 MB/sec    1.00   983.7±10.99µs    16.9 MB/sec
formatter/numpy/globals.py                 1.01     95.4±1.12µs    30.9 MB/sec    1.00     94.8±1.33µs    31.1 MB/sec
formatter/pydantic/types.py                1.01  1943.6±18.59µs    13.1 MB/sec    1.00  1915.0±32.64µs    13.3 MB/sec
linter/all-rules/large/dataset.py          1.00     11.9±0.17ms     3.4 MB/sec    1.01     12.0±0.14ms     3.4 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.01      3.2±0.02ms     5.2 MB/sec    1.00      3.2±0.04ms     5.2 MB/sec
linter/all-rules/numpy/globals.py          1.00    456.0±6.73µs     6.5 MB/sec    1.00    456.6±4.04µs     6.5 MB/sec
linter/all-rules/pydantic/types.py         1.02      6.3±0.06ms     4.0 MB/sec    1.00      6.2±0.08ms     4.1 MB/sec
linter/default-rules/large/dataset.py      1.00      6.4±0.06ms     6.4 MB/sec    1.00      6.4±0.07ms     6.4 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1411.5±13.50µs    11.8 MB/sec    1.04  1464.2±83.98µs    11.4 MB/sec
linter/default-rules/numpy/globals.py      1.00    165.5±2.07µs    17.8 MB/sec    1.00    165.6±2.57µs    17.8 MB/sec
linter/default-rules/pydantic/types.py     1.00      2.9±0.02ms     8.8 MB/sec    1.00      2.9±0.03ms     8.8 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.03      6.4±0.29ms     6.4 MB/sec    1.00      6.2±0.21ms     6.6 MB/sec
formatter/numpy/ctypeslib.py               1.02  1283.4±57.25µs    13.0 MB/sec    1.00  1257.8±48.67µs    13.2 MB/sec
formatter/numpy/globals.py                 1.00    115.2±5.20µs    25.6 MB/sec    1.01    116.5±8.42µs    25.3 MB/sec
formatter/pydantic/types.py                1.01      2.5±0.11ms    10.1 MB/sec    1.00      2.5±0.09ms    10.1 MB/sec
linter/all-rules/large/dataset.py          1.00     17.9±0.76ms     2.3 MB/sec    1.03     18.5±0.61ms     2.2 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      4.8±0.16ms     3.4 MB/sec    1.02      4.9±0.18ms     3.4 MB/sec
linter/all-rules/numpy/globals.py          1.03   615.4±44.17µs     4.8 MB/sec    1.00   599.2±22.48µs     4.9 MB/sec
linter/all-rules/pydantic/types.py         1.00      9.2±0.38ms     2.8 MB/sec    1.01      9.2±0.36ms     2.8 MB/sec
linter/default-rules/large/dataset.py      1.00      9.9±0.27ms     4.1 MB/sec    1.00      9.9±0.31ms     4.1 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.01      2.1±0.10ms     8.0 MB/sec    1.00      2.1±0.07ms     8.1 MB/sec
linter/default-rules/numpy/globals.py      1.03   266.6±22.51µs    11.1 MB/sec    1.00   257.9±10.99µs    11.4 MB/sec
linter/default-rules/pydantic/types.py     1.01      4.5±0.15ms     5.7 MB/sec    1.00      4.4±0.10ms     5.8 MB/sec

Comment on lines -136 to +120
+ case ["go", NOT_YET_IMPLEMENTED_PatternMatchOf | (y)]:
+ case ["go", (NOT_YET_IMPLEMENTED_PatternMatchOf | (y))]:
Member

Regarding the addition of parentheses around unions, I believe that will be handled in MatchOr?

Comment on lines 374 to 373
case (
4 as d,
5 as e,
NOT_YET_IMPLEMENTED_PatternMatchOf | (y) as g,
*NOT_YET_IMPLEMENTED_PatternMatchStar,
):
case 4 as d, (5 as e), (
NOT_YET_IMPLEMENTED_PatternMatchOf | (y) as g
), *NOT_YET_IMPLEMENTED_PatternMatchStar:
Member

Unrelated, but shouldn't we respect the magic trailing comma and keep each pattern on a separate line?

Member

We probably should

Comment on lines 42 to 46
parenthesized(
"(",
&pattern.format().with_options(Parentheses::Never),
")",
)
Member

Do we actually need to preserve the parentheses? Black does, but for other nodes we don't:

while (True):
	pass

Gets formatted to:

while True:
	pass
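
As a rough illustration of why dropping these parentheses is safe, the snippet below uses CPython's standard `ast` module (not Ruff's parser, and not the formatter's logic) to show that the redundant parentheses do not survive parsing: the condition is a plain constant, and source regenerated from the tree omits them.

```python
import ast

source = "while (True):\n\tpass\n"
tree = ast.parse(source)

# The parsed condition is a plain constant; the parentheses are not an AST node.
condition = tree.body[0].test
print(type(condition).__name__, condition.value)

# Regenerating source from the tree therefore drops them.
print(ast.unparse(tree))
```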

Member

Our rules for expressions are:

  • Top-level expressions (inside a statement): remove unnecessary parentheses, see your while (True) example.
  • Nested expressions: preserve the parentheses (except for await, where Black removes them).

IMO the behavior should be the same for Patterns. Remove unnecessary parentheses around the outermost pattern but preserve them for nested patterns. See #6753

Meaning, we should use parenthesize_if_expands here instead of preserving the parentheses.
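
A minimal sketch of that rule on a toy pattern tree, under the assumption that each node records whether the source had parentheses around it. `Pattern` and `render` are hypothetical names for illustration, and the line-width-driven part of parenthesize_if_expands is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    """Toy pattern node; `parenthesized` records parentheses seen in the source."""

    text: str = ""
    children: list["Pattern"] = field(default_factory=list)
    parenthesized: bool = False


def render(pattern: Pattern, top_level: bool = True) -> str:
    """Drop redundant parentheses on the outermost pattern, keep nested ones."""
    if pattern.children:
        inner = ", ".join(render(child, top_level=False) for child in pattern.children)
        body = f"[{inner}]"
    else:
        body = pattern.text
    if pattern.parenthesized and not top_level:
        return f"({body})"
    return body


# case (1):  becomes  case 1:
print(render(Pattern("1", parenthesized=True)))  # 1

# case ([(1), 2]):  becomes  case [(1), 2]:
nested = Pattern(children=[Pattern("1", parenthesized=True), Pattern("2")], parenthesized=True)
print(render(nested))  # [(1), 2]
```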

Member Author

I'll try out parenthesize_if_expands.

Member Author

I think this also requires knowing whether the pattern has its own parentheses (e.g., for sequence patterns).

Member

@MichaReiser MichaReiser left a comment

Does this address #6753? If not, what's missing? Could we implement the required changes to match the expected output?

crates/ruff_python_formatter/src/pattern/mod.rs: 2 outdated threads, resolved

@@ -39,9 +39,13 @@ impl FormatNodeRule<MatchCase> for FormatMatchCase {
write!(f, [text("case"), space()])?;

if is_match_case_pattern_parenthesized(item, pattern, f.context())? {
Member

Can you explain why calling pattern.format isn't sufficient? Why can the pattern and the MatchCase both have trailing opening-parenthesis comments?

Member Author

Consider:

match thing:
    case ( # outer
        [  # inner
            1
        ]
    ):
        pass

The # outer is attached to the MatchCase, but the # inner is part of the pattern itself. (Not all patterns have this behavior.)

Member

Hmm, so it depends on whether the pattern has its own parentheses? How does this work if you have a nested pattern, which doesn't go through the match case formatting?

match thing:
	case [
		( # outer
			[ # inner
				1, 
			]
		)
	]: ... 

Member Author

Both of these cases work as expected as of the following branch (#6801):

match thing:
    case [
        ( # outer
            [ # inner
                1,
            ]
        )
    ]: ...
    case [ # outer
        ( # inner outer
            [ # inner
                1,
            ]
        )
    ]: ...

# inner outer gets assigned as a leading end-of-line comment via our standard parenthesized comment handling, which then gets rendered as an open parenthesis comment.

Member Author

Actually, let me confirm that that's why it works, hang on.

Member Author

Err, I think it works because it's a leading end-of-line comment which we always treat as an open parenthesis comment. (And the comments on the inner square brackets work because they're correctly marked as dangling for the sequence pattern type.)
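
To make the placement distinction concrete, here is a lexical approximation using Python's standard `tokenize` module. It is only a sketch, not how Ruff attaches comments: `classify_comments` and its labels are hypothetical, and the only assumption is that a comment sitting on the same line as an opening bracket, directly after it, counts as an open-parenthesis comment.

```python
import io
import tokenize

OPENERS = {"(", "[", "{"}
# Layout tokens that should not count as "the token before the comment".
SKIPPED = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}

SOURCE = """\
match thing:
    case [ # outer
        # own line
        ( # inner outer
            [ # inner
                1,
            ]
        )
    ]:
        pass
"""


def classify_comments(source: str) -> list[tuple[str, str]]:
    """Label each comment by what precedes it on the same line."""
    labels = []
    previous = None  # last token that is not a comment or layout token
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type == tokenize.COMMENT:
            same_line = previous is not None and previous.end[0] == token.start[0]
            if same_line and previous.string in OPENERS:
                kind = "open-parenthesis"
            elif same_line:
                kind = "trailing"
            else:
                kind = "own-line"
            labels.append((token.string, kind))
        elif token.type not in SKIPPED:
            previous = token
    return labels


for comment, kind in classify_comments(SOURCE):
    print(f"{comment!r}: {kind}")
```

Working on tokens rather than the syntax tree keeps the sketch independent of the Python version, since the tokenizer does not need to understand `match` statements.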

Member Author

I guess we kind of get this wrong:

    case [ # outer
        # own line
        ( # inner outer
            [ # inner
                1,
            ]
        )
    ]:
        pass

It renders as:

    case [  # outer
        (
            # own line
            # inner outer
            [  # inner
                1,
            ]
        )
    ]:

Member Author

We might be able to remove this, I'm not sure yet, I'll explore it.

Member Author

(I'd thought we wanted to preserve the parentheses here always which complicated this a bit.)

Member Author

Ok, I was able to remove the dangling comment from match_case.

Base automatically changed from charlie/value to main August 23, 2023 14:01
@charliermarsh charliermarsh force-pushed the charlie/paren-value branch 3 times, most recently from 5d9d616 to 161e3c5 August 23, 2023 15:37
@charliermarsh
Member Author

Does this address #6753 ? If not, what's missing?

Yes, I believe it should (sorry, I missed that issue; I stumbled upon this independently). Added a test and linked the issue.

@charliermarsh
Member Author

I still need to figure out the parenthesize_if_expands, working on that now.

@charliermarsh
Member Author

Okay @MichaReiser I think this is ready for review based on your feedback.

OptionalParentheses::Never => {
pattern.format().with_options(Parentheses::Never).fmt(f)?;
}
}
Member Author

This is relatively similar to maybe_parenthesize_expression...
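
The shape of such a dispatch might look like the sketch below. It is an illustration in Python, not the Rust implementation: only the `Never` case appears in the snippet above, so the `ALWAYS` and `IF_EXPANDS` variants, the `maybe_parenthesize` helper, and the width check are all assumptions made for the example.

```python
from enum import Enum, auto


class OptionalParentheses(Enum):
    """Hypothetical answers a pattern gives to "do you need parentheses?"."""

    NEVER = auto()  # corresponds to the `OptionalParentheses::Never` arm
    ALWAYS = auto()  # assumed variant, for illustration only
    IF_EXPANDS = auto()  # assumed variant: parenthesize only when too wide


def maybe_parenthesize(text: str, needs: OptionalParentheses, width: int = 88) -> str:
    """Wrap `text` in parentheses according to the pattern's own answer."""
    if needs is OptionalParentheses.ALWAYS:
        return f"({text})"
    if needs is OptionalParentheses.IF_EXPANDS and len(text) > width:
        # Too wide for one line: break inside parentheses.
        return f"(\n    {text}\n)"
    return text


print(maybe_parenthesize("1", OptionalParentheses.NEVER))
print(maybe_parenthesize("4 as d, 5 as e", OptionalParentheses.IF_EXPANDS, width=10))
```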

Member

@MichaReiser MichaReiser left a comment

Nice, I love it that we can reuse the NeedsParentheses concept

@charliermarsh charliermarsh enabled auto-merge (squash) August 25, 2023 03:39
@charliermarsh charliermarsh merged commit 6f23469 into main Aug 25, 2023
2 checks passed
@charliermarsh charliermarsh deleted the charlie/paren-value branch August 25, 2023 03:45

Successfully merging this pull request may close these issues.

Preserve pattern parentheses
4 participants