-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[red-knot] Consider all definitions after terminal statements unreachable #15676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| SIM108 | 10 | 5 | 5 | 0 | 0 |
Linter (preview)
ℹ️ ecosystem check detected linter changes. (+5 -5 violations, +0 -0 fixes in 2 projects; 53 projects unchanged)
bokeh/bokeh (+4 -4 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL
+ examples/server/api/flask_embed.py:26:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block - examples/server/api/flask_embed.py:26:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block + examples/server/api/flask_gunicorn_embed.py:41:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block - examples/server/api/flask_gunicorn_embed.py:41:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block + examples/server/api/standalone_embed.py:18:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block - examples/server/api/standalone_embed.py:18:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block + examples/server/api/tornado_embed.py:29:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f"{new}D").mean()` instead of `if`-`else`-block - examples/server/api/tornado_embed.py:29:9: SIM108 Use ternary operator `data = df if new == 0 else df.rolling(f'{new}D').mean()` instead of `if`-`else`-block
zulip/zulip (+1 -1 violations, +0 -0 fixes)
ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL
+ scripts/lib/sharding.py:65:21: SIM108 Use ternary operator `host = shard if "." in shard else f"{shard}.{external_host}"` instead of `if`-`else`-block - scripts/lib/sharding.py:65:21: SIM108 Use ternary operator `host = shard if "." in shard else f'{shard}.{external_host}'` instead of `if`-`else`-block
Changes by rule (1 rules affected)
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| SIM108 | 10 | 5 | 5 | 0 | 0 |
|
There are a couple of new diagnostics in the benchmark that don't look correct to me. I need to see if I can minimize that into an mdtest to diagnose. |
carljm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic!! Love to see a feature that is easier than anticipated :)
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
* main: [red-knot] MDTests: Do not depend on precise public-symbol type inference (#15691) [red-knot] Make `infer.rs` unit tests independent of public symbol inference (#15690) Tidy knot CLI tests (#15685) [red-knot] Port comprehension tests to Markdown (#15688) Create Unknown rule diagnostics with a source range (#15648) [red-knot] Port 'deferred annotations' unit tests to Markdown (#15686) [red-knot] Support custom typeshed Markdown tests (#15683) Don't run the linter ecosystem check on PRs that only touch red-knot crates (#15687) Add `rules` table to configuration (#15645) [red-knot] Make `Diagnostic::file` optional (#15640) [red-knot] Add test for nested attribute access (#15684) [red-knot] Anchor relative paths in configurations (#15634) [`pyupgrade`] Handle multiple base classes for PEP 695 generics (`UP046`) (#15659) [`pyflakes`] Treat arguments passed to the `default=` parameter of `TypeVar` as type expressions (`F821`) (#15679) Upgrade zizmor to the latest version in CI (#15649) [`pyupgrade`] Add rules to use PEP 695 generics in classes and functions (`UP046`, `UP047`) (#15565) [red-knot] Ensure a gradual type can always be assigned to itself (#15675)
|
I don't think you need to change anything for this PR, but just so it's on your radar: def f():
x = 1
while True:
try:
break
finally:
x = 2
reveal_type(x) # revealed: Literal[2] (it gives a revealed type of |
|
Yeah, we can handle |
Co-authored-by: Carl Meyer <[email protected]>
* main: Add `check` command (#15692) [red-knot] Use itertools to clean up `SymbolState::merge` (#15702) [red-knot] Add `--ignore`, `--warn`, and `--error` CLI arguments (#15689) Use `uv init --lib` in tutorial (#15718) [red-knot] Use `Unknown | T_inferred` for undeclared public symbols (#15674) [`ruff`] Parenthesize fix when argument spans multiple lines for `unnecessary-round` (`RUF057`) (#15703) [red-knot] Rename `TestDbBuilder::typeshed` to `.custom_typeshed` (#15712) Honor banned top level imports by TID253 in PLC0415. (#15628) Apply `AIR302`-context check only in `@task` function (#15711) [`airflow`] Update `AIR302` to check for deprecated context keys (#15144) Remove test rules from JSON schema (#15627) Add two missing commits to changelog (#15701) Fix grep for version number in docker build (#15699) Bump version to 0.9.3 (#15698) Preserve raw string prefix and escapes (#15694) [`flake8-pytest-style`] Rewrite references to `.exception` (`PT027`) (#15680)
dcreager
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, the following test fails on this branch:
Thanks @dylwil3! I added that as a failing test case. I'm going to poke at it briefly to see if it's easy to add for this PR
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
* main: Run `cargo update` (#15769) [red-knot] Document public symbol type inferece (#15766) Update dawidd6/action-download-artifact action to v8 (#15760) Update NPM Development dependencies (#15758) Update pre-commit dependencies (#15756) Update dependency ruff to v0.9.3 (#15755) Update dependency mdformat-mkdocs to v4.1.2 (#15754) Update Rust crate uuid to v1.12.1 (#15753) Update Rust crate unicode-ident to v1.0.15 (#15752) Fix docstring in ruff_annotate_snippets (#15748) Update Rust crate insta to v1.42.1 (#15751) Update Rust crate clap to v4.5.27 (#15750) Add references to `trio.run_process` and `anyio.run_process` (#15761) [`ruff`] Do not emit diagnostic when all arguments to `zip()` are variadic (`RUF058`) (#15744) [red-knot] Ensure differently ordered unions are considered equivalent when they appear inside tuples inside top-level intersections (#15743) [red-knot] Ensure differently ordered unions and intersections are understood as equivalent even inside arbitrarily nested tuples (#15740) [red-knot] Promote the `all_type_pairs_are_assignable_to_their_union` property test to stable (#15739) [`pylint`] Do not trigger `PLR6201` on empty collections (#15732) Improve the file watching failure error message (#15728) Speed symbol state merging back up (#15731)
Co-authored-by: Carl Meyer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I was curious how the interplay between statically-known branches and terminal statements worked
That's a good example @sharkdp! Before I was also adding a ~AlwaysTrue visibility constraint when we encountered a terminal statement, which (edit: I think) would handle your example. I removed it because it seemed to be interacting incorrectly with continue and break. (The new visibility constraint should apply to the rest of the current flow, but should not apply when we jump back to the beginning of the loop.) But @carljm and I convinced each other in Discord that the issue with continue is that we haven't implemented the jump back to top-of-loop yet (pending fixpoint support in salsa) — and I think that would solve the visibility constraint issue too...
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
| ## Early returns and list comprehensions | ||
|
|
||
| ```py | ||
| def f(x: str) -> int: | ||
| y = [x for i in range(len(x))] | ||
| return 4 | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a minimal reproduction of an error I was getting in the tomllib benchmark test. I thought to put it here to catch it earlier in the CI process, but since it's redundant with the tomllib test I'll remove it. (Maybe a better way to handle this is to move the assertions out of the benchmark and into a new test case that also analyzes tomllib? That way the benchmark is only concerned with performance, and the test with correctness.)
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
| // Unreachable snapshots should not be merged: If the current snapshot is unreachable, it | ||
| // should be completely overwritten by the snapshot we're merging in. If the other snapshot | ||
| // is unreachable, we should return without merging. | ||
| if !snapshot.reachable { | ||
| return; | ||
| } | ||
| if !self.reachable { | ||
| self.restore(snapshot); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but it seems a little odd that we keep the visible definitions state from
selfin that case
If we want this, I think it would be best to add an invariant that marking a flow as unreachable clears out all of its definitions, and update mark_unreachable and restore to maintain that invariant. Then this code in merge would do the right thing as you describe.
I guess it will make a difference to the types we reveal in the following unreachable code?
Yes, that's exactly right. (In the sense that that's what the code does, not necessarily that that's what we want it to do 😅) For now, I was punting on this, because this PR isn't currently addressing what we want to do for unreachable code. (Note that in the mdtests I've tried to not put in any reveal_types in unreachable positions.)
I think there are a couple of issues at play here. One is that, not even considering the merge, what do we want to report in the unreachable code within the same block after a terminal statement?
x = 2
return
reveal_type(x) # ???Should it be an unresolved-reference error? Or should it act as if the terminal statement weren't there, and show what x would be if control flow could somehow make it to that point? Or should we silence all diagnostics completely in unreachable code?
Whatever we choose, we should have the same result for
if cond:
x = 2
return
else:
x = 3
return
reveal_type(x) # ???If we decide that we want the first case to reveal Literal[2], then we'd want this case to reveal Literal[2, 3] — which means that we actually do want to merge all of the flows, even if they're unreachable, and it's just the visibility of the relevant symbols that needs to be tracked/adjusted somehow.
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
carljm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments on the new tests, but the behavior looks good for this PR!
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
| reveal_type(x) # revealed: Literal["else"] | ||
| reveal_type(x) # revealed: Literal["else"] | ||
| except ValueError: | ||
| # TODO: Literal["raise"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this TODO is accurate, since reveal_type is a call, and I don't think we'd special-case it to assume it can't raise? So at the very least "else" is a possible value here.
I think it's accurate to say that "before" is not possible here, but only if we understand that boolean-testing a value of type bool is a special case that doesn't execute a __bool__ method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this TODO is accurate, since
reveal_typeis a call, and I don't think we'd special-case it to assume it can't raise? So at the very least"else"is a possible value here.
Ah, I was actually thinking we would try to do that somehow! But per above, that deserves discussion, so I'll back out the assumption that we'd try to do that.
I think it's accurate to say that
"before"is not possible here, but only if we understand that boolean-testing a value of typeboolis a special case that doesn't execute a__bool__method.
I removed the TODO entirely, leaving "before" as a potential possibility here too, so that we're not making any assumptions about how we might make exception tracking less approximate.
| TODO: We are not currently implementing the "jump" behavior correctly for `raise` statements. The | ||
| false positives in this section are because of that, and not our terminal statement support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT from the TODOs below, it looks like the only problem you're referring to here is our over-approximation of the possible location where an exception could be raised. Let's describe this a bit more clearly, to save future us from wondering what we meant here. (Also re-wording to avoid making it specific to raise statements, since it's really about too many jumps from places that aren't raise statements at all, and to avoid describing it as incorrect, since technically (given KeyboardInterrupt) the current behavior is correct, just likely not preferable.
| TODO: We are not currently implementing the "jump" behavior correctly for `raise` statements. The | |
| false positives in this section are because of that, and not our terminal statement support. | |
| Currently we assume that an exception could be raised anywhere within a `try` block; the TODOs below reflect | |
| cases where we could implement a more precise understanding of where exceptions (barring `KeyboardInterrupt` | |
| and `MemoryError`) can and cannot actually be raised. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of points, though the tl;dr is that I like your edit:
-
The part about the jump behavior was less about "exceptions can come from anywhere" and more about "a
raisedefinitely doesn't execute theelseclause". The latter should be something we can model regardless of how approximate our exception tracking is. But we're actually giving the correct result below in theelsereveal_type, so you're right that this isn't accurately aTODO! That said, I think it's coincidence that we're giving the correct result in theelseclause —"raise"isn't included because we're treatingraisethe same asreturn, not because we know thatraiseskips theelseclause. (And"raise"is included in theexceptclauses not because we know theraisestatement jumps there — with this PR we think theraiseskips everything since it's terminal! — but because our approximation thinks an unrelated exception might occur just after the assignment.) -
I had written this (and the TODOs below) describing the goal of a less approximate exception tracking strategy. But that deserves discussion about what we'd want that to look like, so I like your suggestion to describe this in terms of what we're currently doing instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the first bullet point here is something I hadn't fully understood! It sort of seems like the current "right behavior by accident" might suffice until/unless we implement tighter understandings of where exceptions can be raised, at which point we might also need better understanding of what raise actually does. Certainly wouldn't object to adding some text to record this context for future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I tried to write up an edit describing this, and I kind of ended up concluding that we may never need to implement any special understanding of where raise can jump to? Even if we tighten up our understanding of where exceptions can be raised, it seems like the only thing we'll need to do is maintain two things: 1) understanding raise as terminal, as we do already in this PR, and 2) still understanding raise as "a point where an exception can be raised", as we do in this PR.
(2) seems unlikely to be something we'd miss in that future where we're adding more understanding of exception points, so I'm thinking maybe we don't need to document this any more than it is already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we tighten up our understanding of where exceptions can be raised, it seems like the only thing we'll need to do is maintain two things: 1) understanding
raiseas terminal, as we do already in this PR, and 2) still understandingraiseas "a point where an exception can be raised", as we do in this PR.
Ah yes, that sounds right!
so I'm thinking maybe we don't need to document this any more than it is already.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still understanding
raiseas "a point where an exception can be raised", as we do in this PR.
The only place where this might fall over is that a raise can only raise an exception, whereas a call can but doesn't have to raise one. So calls could jump to except or else, while raise could only jump to except.
No, wait! Calls can jump to except or flow through to the next statement, and the end of the try block flows to else. So yes, you're right, raise being terminal within the try block would correctly encode that it can't "jump" to else. (Nothing actually "jumps" there, in fact.)
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
Co-authored-by: Carl Meyer <[email protected]>
* main: [red-knot] Extend instance-attribute tests (#15808) Fix formatter warning message for `flake8-quotes` option (#15788) [`flake8-bugbear`] Exempt `NewType` calls where the original type is immutable (`B008`) (#15765) Add missing config docstrings (#15803) [`refurb`] Do not emit diagnostic when loop variables are used outside loop body (`FURB122`) (#15757) [`ruff`] Check for shadowed `map` before suggesting fix (`RUF058`) (#15790) [red-knot] Do not use explicit `knot_extensions.Unknown` declaration (#15787) Preserve quotes in generated byte strings (#15778) [minor] Simplify some `ExprStringLiteral` creation logic (#15775) Preserve quote style in generated code (#15726) Rename internal helper functions (#15771) [`airflow`] Extend airflow context parameter check for `BaseOperator.execute` (`AIR302`) (#15713) Implement tab autocomplete for `ruff config` (#15603)
carljm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor edit, but looks land-ready to me!
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
| TODO: We are not currently implementing the "jump" behavior correctly for `raise` statements. The | ||
| false positives in this section are because of that, and not our terminal statement support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I tried to write up an edit describing this, and I kind of ended up concluding that we may never need to implement any special understanding of where raise can jump to? Even if we tighten up our understanding of where exceptions can be raised, it seems like the only thing we'll need to do is maintain two things: 1) understanding raise as terminal, as we do already in this PR, and 2) still understanding raise as "a point where an exception can be raised", as we do in this PR.
(2) seems unlikely to be something we'd miss in that future where we're adding more understanding of exception points, so I'm thinking maybe we don't need to document this any more than it is already.
…ements.md Co-authored-by: Carl Meyer <[email protected]>
…ranches (#15817) This example from @sharkdp shows how terminal statements can appear in statically known branches: #15676 (comment) ```py def _(cond: bool): x = "a" if cond: x = "b" if True: return reveal_type(x) # revealed: "a", "b"; should be "a" ``` We now use visibility constraints to track reachability, which allows us to model this correctly. There are two related changes as a result: - New bindings are not assumed to be visible; they inherit the current "scope start" visibility, which effectively means that new bindings are visible if/when the current flow is reachable - When simplifying visibility constraints after branching control flow, we only simplify if none of the intervening branches included a terminal statement. That is, earlier unaffected bindings are only _actually_ unaffected if all branches make it to the merge point.
| visibility_constraints: VisibilityConstraints::default(), | ||
| scope_start_visibility: ScopedVisibilityConstraintId::ALWAYS_TRUE, | ||
| bindings_by_use: IndexVec::new(), | ||
| definitions_by_definition: FxHashMap::default(), |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
| reachable: bool, | ||
| } | ||
|
|
||
| #[derive(Debug)] |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
FlowSnapshotnow tracks areachablebool, which indicates whether we have encountered a terminal statement on that control flow path. When merging flow states together, we skip any that have been marked unreachable. This ensures that bindings that can only be reached through unreachable paths are not considered visible.Test Plan
The new mdtests failed (with incorrect
reveal_typeresults, and spuriouspossibly-unresolved-referenceerrors) before adding the new visibility constraints.