[ty] Use TDD-based narrowing constraints and support NoReturn narrowing#23109
Conversation
|
(Probably not ready for review yet, just wanted to see the ecosystem results.) |
6db3494 to
970a77d
Compare
Typing conformance resultsNo changes detected ✅ |
|
Replace sorted-list-based narrowing constraints (AND-only) with TDD (Ternary Decision Diagram) nodes that support AND, OR, and NOT. This changes the merge operation from intersection to OR, so narrowing from non-terminal branches is properly preserved after if/elif/else. Additionally, when a branch contains a NoReturn call (e.g. sys.exit()), gate the narrowing constraints by the ReturnsNever predicate. During the narrowing TDD walk, non-narrowing predicates are evaluated to determine reachability, so NoReturn branches contribute Never (absorbed by union) rather than polluting the merged type. Fixes astral-sh/ty#690 Fixes astral-sh/ty#685 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
970a77d to
2f7e96e
Compare
|
Hard to know exactly how to verify the mypy_primer results, but they appear to be overwhelmingly getting rid of "no attribute on X | None" which is basically what you'd expect form better narrowing propagation. |
We've currently got some non-determinism we're working to fix, so the mypy-primer results are particularly difficult to parse right now -- some projects (sympy, prefect) will have a lot of totally unrelated changes in the diffs. The results look very good to me overall. |
|
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
possibly-missing-attribute |
14 | 118 | 21 |
invalid-argument-type |
4 | 98 | 33 |
invalid-return-type |
0 | 32 | 4 |
not-subscriptable |
0 | 20 | 0 |
unsupported-operator |
0 | 17 | 3 |
invalid-assignment |
0 | 19 | 0 |
no-matching-overload |
0 | 11 | 0 |
unresolved-attribute |
0 | 10 | 0 |
call-non-callable |
0 | 7 | 0 |
not-iterable |
0 | 5 | 1 |
type-assertion-failure |
1 | 0 | 5 |
unused-type-ignore-comment |
2 | 0 | 0 |
invalid-type-form |
0 | 1 | 0 |
unused-ignore-comment |
1 | 0 | 0 |
| Total | 22 | 338 | 67 |
You can also open a draft PR and you'll still get the CI results. Lmk if you want an early review on this now, if not maybe mark it draft until you do. Super exciting, thank you for taking this on! |
|
An early review would be appreciated, thanks! Particularly if there's test cases you think it's missing. |
|
This looks really great, thank you! I only did a coarse review as I understand that this is not quite finished. It's fantastic to see that we can make use of the existing TDD for narrowing constraints as well. I think we should eventually rename some types/functions (reachability constraints => some new term), if we go through with this, but it's probably better to do this in a follow-up to keep the diff small.
One test case that I found that is still not handled with this PR is the following: def _(x: int | None):
if 1 + 1 == 2:
if x is None:
return
reveal_type(x) # revealed: int
reveal_type(x) # revealed: `int | None`, should ideally be `int`Here, |
|
Interesting, in general those sorts of compile time values feel like they though to be basically the same as |
That did used to be true in ty (they were all handled via the reachability-constraints mechanism), but then we added special-cased earlier handling of "simple" statically-known conditions that don't require type inference to resolve. We did that because it was a significant performance boost -- but it does mean that it's now easy to accidentally introduce different behavior for the two cases. |
There was a problem hiding this comment.
Similar to @sharkdp 's suggested test, but slightly different (though I suspect with the same fix) is the OP example in astral-sh/ty#685 but modified to use a condition that (currently) requires type inference:
def _(val: int | None):
if val is None:
if 1+1+2:
return
reveal_type(val) # revealed: int | None, should be just `int`(Of course "requires type inference" is slippery here -- we could certainly fix this by implementing direct support for more constexprs like this in semantic indexing. The real example here would be something like importing a Final constant from another module; we can't resolve that in semantic indexing, we need a reachability constraint to be resolved later.)
|
|
||
| # Only the if-branch (A) and elif-branch (B) flow through. | ||
| # The else-branch returned, so its narrowing doesn't participate. | ||
| reveal_type(x) # revealed: B | (A & ~B) |
There was a problem hiding this comment.
I'm a bit confused about why this isn't A | (B & ~A). If we flow through the if we should just get A, if we flow through the elif we have B & ~A -- adding reveal_type(x) inside the branches above confirms. So it's not clear to me why we don't end up with the union of those here.
I guess those are equivalent types (and also equivalent to just A | B?) -- so maybe this is an acceptable (if slightly surprising) transformation that happens due to the TDD implementation?
There was a problem hiding this comment.
I guess those are equivalent types (and also equivalent to just
A | B?) -- so maybe this is an acceptable (if slightly surprising) transformation that happens due to the TDD implementation?
Yes, the root cause of this is the reversed ordering of TDD nodes that we introduced in #20098.
This didn't have any observable side effects for reachability constraints (which evaluate to a ternary answer), but it does play a role for narrowing constraints.
Removing that .reverse() changes the result here to A | (B & ~A). But it also breaks ~20 other tests. Some of them are just ordering related, but in some other cases, some union-builder simplifications do not seem to trigger which is a bit concerning. I have not investigated more closely, as we probably still need that optimization.
| return | ||
|
|
||
| # Only the elif-B and elif-C branches flow through. | ||
| reveal_type(x) # revealed: (C & ~A) | (B & ~A & ~C) |
There was a problem hiding this comment.
Similarly here I'd naively expect (B & ~A) | (C & ~A & ~B) -- but they are equivalent. I'm just curious why it happens.
|
We could also add some tests using |
|
Took a pass on the tests -- going to have @sharkdp take the lead on review here. |
Record if-statement conditions as narrowing constraints for all places in scope, not just those directly narrowed by the condition. This ensures that unreachable branches (e.g., the else-branch of `if 1 + 1 == 2:`) contribute `Never` rather than diluting narrowing from reachable branches. Uses the same ScopedPredicateId for both specific-places and all-places recording, so TDD atoms are shared via hash-consing. For places already narrowed by the condition, AND(atom, atom) = atom — no duplication. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1783c0a to
ada1084
Compare
Add tests for: - Always-true condition nested inside a narrowing branch - Match statement with terminal default branch (class patterns) - Match statement with reassignment and terminal default (issue astral-sh#1384) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merging this PR will not alter performance
Comparing Footnotes
|
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
Revert the all-places narrowing propagation introduced in the previous two commits, as it causes exponential blowup on long if/elif chains (e.g. black/nodes.py with 16 conditions). Keep the test cases but update expectations to reflect the current behavior, with TODO comments for future improvement. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Ok, ended up rolling back the fix for |
|
I think this is ready for review:
|
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
|
Amazing 😍 |
sharkdp
left a comment
There was a problem hiding this comment.
Thank you very much for working on this.
I feel like I now have an understanding of how this approach works, so I'm submitting a first review with some higher-level questions, before doing another final pass.
| /// Uses TDD-level negation (`add_not_constraint`) rather than creating a new predicate atom | ||
| /// for the negated predicate. This ensures that `atom(P) OR NOT(atom(P))` simplifies to | ||
| /// `ALWAYS_TRUE` in the TDD, so narrowing is correctly cancelled out after complete | ||
| /// if/else blocks. |
There was a problem hiding this comment.
So this makes me wonder if we can completely replace the is_positive flag in predicates and replace it with NOT(…) from the TDD? With the change here, we now use P with is_positive: false for some situations and NOT(P) for other situations.
There was a problem hiding this comment.
I think yes, it is. But I don't want to make this PR too large, it's already annoyingly large :-( And I don't think the inverse is true, I don't think we could use is_positive for this.
There was a problem hiding this comment.
Ok, I'll look into this in a follow-up.
crates/ty_python_semantic/resources/mdtest/narrow/post_if_statement.md
Outdated
Show resolved
Hide resolved
crates/ty_python_semantic/resources/mdtest/narrow/post_if_statement.md
Outdated
Show resolved
Hide resolved
crates/ty_python_semantic/resources/mdtest/narrow/post_if_statement.md
Outdated
Show resolved
Hide resolved
| /// The "ambiguous" branch in the TDD is not followed for narrowing purposes, because | ||
| /// narrowing constraints record which predicates hold along the control flow path. | ||
| /// The predicates may be statically ambiguous (we can't determine their truthiness | ||
| /// at analysis time), but they still hold dynamically at runtime and should be used | ||
| /// for narrowing. |
There was a problem hiding this comment.
I think this means that a BDD would in principle be sufficient for the narrowing constraints?
There was a problem hiding this comment.
I think so, we're just piggy backing on the TDD from the constraints.
crates/ty_python_semantic/src/semantic_index/reachability_constraints.rs
Show resolved
Hide resolved
crates/ty_python_semantic/src/semantic_index/reachability_constraints.rs
Outdated
Show resolved
Hide resolved
| // True branch: predicate holds → accumulate positive narrowing | ||
| let true_accumulated = | ||
| accumulate_constraint(db, accumulated.clone(), pos_constraint); | ||
| let true_ty = self.narrow_by_constraint_inner( | ||
| db, | ||
| predicates, | ||
| node.if_true, | ||
| base_ty, | ||
| place, | ||
| true_accumulated, | ||
| ); | ||
|
|
||
| // False branch: predicate doesn't hold → accumulate negative narrowing | ||
| let neg_predicate = Predicate { | ||
| node: predicate.node, | ||
| is_positive: !predicate.is_positive, | ||
| }; | ||
| let neg_constraint = infer_narrowing_constraint(db, neg_predicate, place); | ||
| let false_accumulated = accumulate_constraint(db, accumulated, neg_constraint); | ||
| let false_ty = self.narrow_by_constraint_inner( | ||
| db, | ||
| predicates, | ||
| node.if_false, | ||
| base_ty, | ||
| place, | ||
| false_accumulated, | ||
| ); | ||
|
|
||
| UnionType::from_elements(db, [true_ty, false_ty]) |
There was a problem hiding this comment.
It feels like this would benefit from some fast-paths for AlwaysFalse nodes? Otherwise, we'll build a lot of unions with Never.
There was a problem hiding this comment.
Looks like this helped a lot with performance — all regressions are basically gone now.
- Use A/B/C class aliases in sequential if-statement tests for consistency - Remove distracting `y` assignments, use `pass` instead - Rename `val` to `x` for symmetry with adjacent test - Replace wasteful `infer_narrowing_constraint` calls with direct `matches!(predicate.node, PredicateNode::ReturnsNever(_))` check - Add fast-paths to skip ALWAYS_FALSE branches in narrow_by_constraint_inner - Mark Ambiguous arm as unreachable for ReturnsNever predicates Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Memory usage reportSummary
Significant changesClick to expand detailed breakdownprefect
sphinx
trio
flake8
|
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
|
Whoo! Thanks so much! |
Replace sorted-list-based narrowing constraints (AND-only) with TDD (Ternary Decision Diagram) nodes that support AND, OR, and NOT. This changes the merge operation from intersection to OR, so narrowing from non-terminal branches is properly preserved after if/elif/else.
Additionally, when a branch contains a NoReturn call (e.g. sys.exit()), gate the narrowing constraints by the ReturnsNever predicate. During the narrowing TDD walk, non-narrowing predicates are evaluated to determine reachability, so NoReturn branches contribute Never (absorbed by union) rather than polluting the merged type.
Fixes astral-sh/ty#690
Fixes astral-sh/ty#685