-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ty] Some more simplifications when rendering constraint sets #21009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Pushing this up for initial CI checks and to checkpoint my work. Will add some mdtests highlighting the rendering change tomorrow |
Diagnostic diff on typing conformance testsNo changes detected when running ty on typing conformance tests ✅ |
|
4c937b5 to
671582a
Compare
| reveal_type(t1 | t2) | ||
| # revealed: ty_extensions.ConstraintSet[(U@_ = bool) ∨ (U@_ = str)] | ||
| reveal_type(u1 | u2) | ||
| # revealed: ty_extensions.ConstraintSet[((T@_ = bool) ∧ (U@_ = bool)) ∨ ((T@_ = bool) ∧ (U@_ = str)) ∨ ((T@_ = str) ∧ (U@_ = bool)) ∨ ((T@_ = str) ∧ (U@_ = str))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this PR, this rendered as
((T@_ = bool) ∧ (U@_ = bool) ∧ (U@_ ≠ str)) ∨ ((T@_ = bool) ∧ (U@_ = str)) ∨ ((T@_ = str) ∧ (T@_ ≠ bool) ∧ (U@_ = bool) ∧ (U@_ ≠ str)) ∨ (T@_ = str)
and the display_graph rendering for this BDD is
(T@_ = str)
┡━₁ (U@_ = str)
│ ┡━₁ always
│ └─₀ (U@_ = bool)
│ ┡━₁ always
│ └─₀ never
└─₀ (T@_ = bool)
┡━₁ (U@_ = str)
│ ┡━₁ always
│ └─₀ (U@_ = bool)
│ ┡━₁ always
│ └─₀ never
└─₀ never
| // For a positive and negative constraint, the ranges have to be disjoint, and the | ||
| // positive range implies the negative range. | ||
| // | ||
| // |---------------|...self...|---| | ||
| // ..|---other---|................| | ||
| ( | ||
| ConstraintAssignment::Positive(self_constraint), | ||
| ConstraintAssignment::Negative(other_constraint), | ||
| ) => self_constraint.intersect(db, other_constraint).is_none(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the big one that lets us simplify T = int ∧ T ≠ str to T = int when displaying.
AlexWaygood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you!
crates/ty_python_semantic/resources/mdtest/type_properties/constraints.md
Outdated
Show resolved
Hide resolved
|
|
||
| ## Other simplifications | ||
|
|
||
| When rendering a constraint set, we transform the internal BDD representation into a DNF formula |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I might have stumbled across this terminology in one of your PRs before 😅
When you say "render" here, are you just talking about "display rendering"? I.e., this "rendering" should have no impact on semantics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I mean "render" in the sense of "create a visual representation of", like you'd render an image. It doesn't change the semantics at all, just how the constraint sets are displayed. I can change this to "display"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I sort of have the same question as Micha in #21009 (review) in that case. If we do this simplification only when displaying these constraints, isn't there a risk that we'll be seeing something different to what is "actually happening behind the scenes" when we try to debug these constraints? The simplification logic isn't trivial! It feels like it would be safer to always apply these simplifications, so that we can be sure that what we see when we debug these constraints is actually what's being used in our subtyping/assignability logic by ty in production
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that we're talking about two completely different representations, that are used for different purposes.
As an example, one of the display simplifications that this PR adds would simplify T = int ∧ T ≠ str into T = int. If you know that T is an int, it doesn't add any information to say that it is also not a str.
We could create a BDD for T = int ∧ T ≠ str. Depending on what ordering Salsa assigns IDs, the BDD structure you get is either
(T = int)
┡━₁ (T = str)
│ ┡━₁ never
│ └─₀ always
└─₀ never
or
(T = str)
┡━₁ never
└─₀ (T = int)
┡━₁ always
└─₀ never
The equivalent transformation that you'd perform on the BDD would be something like "for any path through the BDD where T = int is true, you don't need to check T = str, because it must always be false".
We actually do already perform that transformation! But it's not necessarily going to simplify the BDD. It might actually make it more complex! For instance, if the BDD is checking other stuff as well, there might be other reasons we need a node that checks T = str. Since every node must have two outgoing edges, it might not be possible to remove it even if we wanted to.
(And going further into the weeds as an aside, in the BDD structure, we actually don't really ever want to throw information away, since it makes it easier to compare two BDDs without having to know in advance which particular constraints they check. The better analogue of the transformation we perform is "it's impossible for T = int to be true and T = str to be false, so when comparing two BDDs ignore any path with those assignments".)
So even though we have performed these transformations on the BDD already, we might still end up with constraints in the rendered formula that are redundant. And so I wanted to be able to print out the simplest formula that faithfully captures the BDD.
that we can be sure that what we see when we debug these constraints is actually what's being used in our subtyping/assignability logic by ty in production
The other possibility here is that we would avoid performing any display substitutions at all. Then the formula that we print out would be a more direct representation of the BDD structure. But I was finding that very confusing, because even simple cases end up looking more complex than they need to. (A prime example is that half of the time x ∨ y will display as x ∨ y, and half the time as (x ∧ ¬y) ∨ y.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. That makes sense to me :-)
Co-authored-by: Alex Waygood <[email protected]>
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit out of my comfort zone here, so please forgive any dumb questions 😅
This PR adds another useful simplification when rendering constraint sets: T = int instead of T = int ∧ T ≠ str. (The "smaller" constraint T = int implies the "larger" constraint T ≠ str. Constraint set clauses are intersections, and if one constraint in a clause implies another, we can throw away the "larger" constraint.)
What's the reason why we only do this during rendering and don't apply this as a general simplification?
Co-authored-by: Alex Waygood <[email protected]>
Co-authored-by: Alex Waygood <[email protected]>
| /// for a single typevar are guaranteed to be adjacent in the BDD structure. There are several | ||
| /// simplifications that we perform that operate on constraints with the same typevar, and this | ||
| /// ensures that we can find all candidate simplifications more easily. | ||
| fn ordering(self, db: &'db dyn Db) -> impl Ord { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clever :)
|
* main: (65 commits) [ty] Some more simplifications when rendering constraint sets (#21009) [ty] Make `attributes.md` mdtests faster (#21030) [ty] Set `INSTA_FORCE_PASS` and `INSTA_OUTPUT` environment variables from mdtest.py (#21029) [ty] Fall back to `Divergent` for deeply nested specializations (#20988) [`ruff`] Autogenerate TypeParam nodes (#21028) [ty] Add assertions to ensure that we never call `KnownClass::Tuple.to_instance()` or similar (#21027) [`ruff`] Auto generate ast Pattern nodes (#21024) [`flake8-simplify`] Skip `SIM911` when unknown arguments are present (#20697) Render a diagnostic for syntax errors introduced in formatter tests (#21021) [ty] Support goto-definition on vendored typeshed stubs (#21020) [ty] Implement go-to for binary and unary operators (#21001) [ty] Avoid ever-growing default types (#20991) [syntax-errors] Name is parameter and global (#20426) [ty] Disable panicking mdtest (#21016) [ty] Fix completions at end of file (#20993) [ty] Fix out-of-order semantic token for function with regular argument after kwargs (#21013) [ty] Fix auto import for files with `from __future__` import (#20987) [`fastapi`] Handle ellipsis defaults in FAST002 autofix (`FAST002`) (#20810) [`ruff`] Skip autofix for keyword and `__debug__` path params (#20960) [`flake8-bugbear`] Skip `B905` and `B912` if <2 iterables and no starred arguments (#20998) ...
This PR adds another useful simplification when rendering constraint sets:
T = intinstead ofT = int ∧ T ≠ str. (The "smaller" constraintT = intimplies the "larger" constraintT ≠ str. Constraint set clauses are intersections, and if one constraint in a clause implies another, we can throw away the "larger" constraint.)While we're here, we also normalize the bounds of a constraint, so that we equate e.g.
T ≤ int | strwithT ≤ str | int, and change the ordering of BDD variables so that all constraints with the same typevar are ordered adjacent to each other.Lastly, we also add a new
display_graphhelper method that prints out the full graph structure of a BDD.