[ty] Simplify union lower bounds and intersection upper bounds in constraint sets#21871

Merged
dcreager merged 4 commits into main from dcreager/die-die-intersections
Dec 10, 2025
@dcreager dcreager commented Dec 9, 2025

In a constraint set, it's not useful for an upper bound to be an intersection type, or for a lower bound to be a union type. Both of those can be rewritten as simpler BDDs:

T ≤ α & β  ⇒ (T ≤ α) ∧ (T ≤ β)
T ≤ α & ¬β ⇒ (T ≤ α) ∧ ¬(T ≤ β)
α | β ≤ T  ⇒ (α ≤ T) ∧ (β ≤ T)
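
As a sanity check on the first and third rewrites, here is a minimal sketch that models types as sets of values, so that `≤` becomes subset, `&` becomes set intersection and `|` becomes set union. The set model, the three-element universe and the helper names (`all_types`, `check_rewrites`) are assumptions made for illustration; this is not how ty represents types or constraint sets.

```python
from itertools import chain, combinations

# Hypothetical tiny universe of values; every subset of it plays the role of a type.
UNIVERSE = (0, 1, 2)

def all_types(universe=UNIVERSE):
    """Every subset of the universe, modelling every possible type."""
    subsets = chain.from_iterable(
        combinations(universe, size) for size in range(len(universe) + 1)
    )
    return [frozenset(s) for s in subsets]

def check_rewrites():
    """Return True if both rewrites hold as equivalences for all T, α, β."""
    types = all_types()
    for t in types:
        for a in types:
            for b in types:
                # T ≤ α & β  ⇔  (T ≤ α) ∧ (T ≤ β)
                if (t <= (a & b)) != ((t <= a) and (t <= b)):
                    return False
                # α | β ≤ T  ⇔  (α ≤ T) ∧ (β ≤ T)
                if ((a | b) <= t) != ((a <= t) and (b <= t)):
                    return False
    return True

print(check_rewrites())
```

In this model both rewrites hold in both directions, which is what a normalization step needs. The second rewrite (the one involving `¬β`) is deliberately left out of the check.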

We were seeing performance issues on #21551 when not performing this simplification. For instance, pandas was producing some constraint sets involving intersections of 8-9 different types. Our sequent map calculation was timing out calculating all of the different permutations of those types:

t1 & t2 & t3 → t1
t1 & t2 & t3 → t2
t1 & t2 & t3 → t3
t1 & t2 & t3 → t1 & t2
t1 & t2 & t3 → t1 & t3
t1 & t2 & t3 → t2 & t3

(and then imagine what that looks like for 9 types instead of 3...)
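
To get a feel for the growth, the listing above can be reproduced by enumerating every non-empty proper subset of the intersection's elements. This is a rough sketch that assumes one entry per such subset, as the listing suggests; the function name `proper_sub_intersections` is hypothetical, and the real sequent map may track more (or fewer) combinations than this.

```python
from itertools import combinations

def proper_sub_intersections(names):
    """List every non-empty, proper subset of the intersection's elements."""
    result = []
    for size in range(1, len(names)):
        for combo in combinations(names, size):
            result.append(" & ".join(combo))
    return result

three = proper_sub_intersections(["t1", "t2", "t3"])
nine = proper_sub_intersections([f"t{i}" for i in range(1, 10)])

print(len(three))  # 6, matching the six lines listed above
print(len(nine))   # 510, i.e. 2**9 - 2
```

Under that assumption the count is `2**n - 2` for a single intersection of `n` types, before any interaction between different intersections is considered.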

With this change, all of those permutations are now encoded in the BDD structure itself, which is very good at simplifying that kind of thing.

Pulling this out of #21551 for separate review.

@dcreager added the `internal` (An internal refactor or improvement) and `ty` (Multi-file analysis & type inference) labels Dec 9, 2025
astral-sh-bot bot commented Dec 9, 2025

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

astral-sh-bot bot commented Dec 9, 2025

mypy_primer results

Changes were detected when running on open source projects
beartype (https://github.com/beartype/beartype)
+ beartype/claw/_package/clawpkgtrie.py:66:29: warning[unsupported-base] Unsupported class base with type `<class 'dict[str, PackagesTrieBlacklist]'> | <class 'dict[str, Divergent]'>`
+ beartype/claw/_package/clawpkgtrie.py:247:29: warning[unsupported-base] Unsupported class base with type `<class 'dict[str, PackagesTrieWhitelist]'> | <class 'dict[str, Divergent]'>`
- Found 492 diagnostics
+ Found 494 diagnostics

scikit-build-core (https://github.com/scikit-build/scikit-build-core)
+ src/scikit_build_core/build/_pathutil.py:25:38: error[invalid-argument-type] Argument to function `__new__` is incorrect: Expected `str | PathLike[str]`, found `DirEntry[Path]`
+ src/scikit_build_core/build/_pathutil.py:27:24: error[invalid-argument-type] Argument to function `__new__` is incorrect: Expected `str | PathLike[str]`, found `DirEntry[Path]`
+ src/scikit_build_core/build/wheel.py:98:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 41 diagnostics
+ Found 44 diagnostics

No memory usage changes detected ✅

@dcreager dcreager merged commit c343e94 into main Dec 10, 2025
41 checks passed
@dcreager dcreager deleted the dcreager/die-die-intersections branch December 10, 2025 00:49
dcreager added a commit that referenced this pull request Dec 10, 2025
* origin/main: (33 commits)
  [ty] Simplify union lower bounds and intersection upper bounds in constraint sets (#21871)
  [ty] Collapse `never` paths in constraint set BDDs (#21880)
  Fix leading comment formatting for lambdas with multiple parameters (#21879)
  [ty] Type inference for `@asynccontextmanager` (#21876)
  Fix comment placement in lambda parameters (#21868)
  [`pylint`] Detect subclasses of builtin exceptions (`PLW0133`) (#21382)
  Fix stack overflow with recursive generic protocols (depth limit) (#21858)
  New diagnostics for unused range suppressions (#21783)
  [ty] Use default settings in completion tests
  [ty] Infer type variables within generic unions  (#21862)
  [ty] Fix overload filtering to prefer more "precise" match (#21859)
  [ty] Stabilize auto-import
  [ty] Fix reveal-type E2E test (#21865)
  [ty] Use concise message for LSP clients not supporting related diagnostic information (#21850)
  Include more details in Tokens 'offset is inside token' panic message (#21860)
  apply range suppressions to filter diagnostics (#21623)
  [ty] followup: add-import action for `reveal_type` too (#21668)
  [ty] Enrich function argument auto-complete suggestions with annotated types
  [ty] Add autocomplete suggestions for function arguments
  [`flake8-bugbear`] Accept immutable slice default arguments (`B008`) (#21823)
  ...
Comment on lines +503 to +505
// T ≤ α & β ⇒ (T ≤ α) ∧ (T ≤ β)
// T ≤ α & ¬β ⇒ (T ≤ α) ∧ ¬(T ≤ β)
// α | β ≤ T ⇒ (α ≤ T) ∧ (β ≤ T)
Contributor

You write these as implications, but don't we need those to be equivalences?

The one about negations looks interesting. I guess it could be simplified to T ≤ ¬β ⇒ ¬(T ≤ β), since the intersection part of it is already covered by the first rule. But more importantly, is this really correct? The left hand side seems to always be true for T = Never, whereas the right hand side seems to always be false for T = Never. That would mean it's not even true as an implication?
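
The `T = Never` counterexample can be made concrete in a small set model, where a type is a set of values, `≤` is subset, `Never` is the empty set and `¬β` is the complement within a fixed universe. The universe, the sample types `alpha` and `beta`, and the helper names are assumptions chosen for illustration only.

```python
# Hypothetical finite universe; ¬β is modelled as the complement within it.
UNIVERSE = frozenset({0, 1, 2})
NEVER = frozenset()

def negate(ty):
    """Model ¬ty as the set complement relative to UNIVERSE."""
    return UNIVERSE - ty

def lhs(t, a, b):
    """T ≤ α & ¬β"""
    return t <= (a & negate(b))

def rhs(t, a, b):
    """(T ≤ α) ∧ ¬(T ≤ β)"""
    return (t <= a) and not (t <= b)

alpha = frozenset({0, 1})
beta = frozenset({1, 2})

print(lhs(NEVER, alpha, beta))  # True: the empty set is a subset of anything
print(rhs(NEVER, alpha, beta))  # False: Never ≤ β holds, so its negation fails
```

Because the empty set is a subset of every set, the left-hand side holds for `T = Never` regardless of `alpha` and `beta`, while the right-hand side never does.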

Member Author

> The one about negations looks interesting. I guess it could be simplified to T ≤ ¬β ⇒ ¬(T ≤ β), since the intersection part of it is already covered by the first rule. But more importantly, is this really correct? The left hand side seems to always be true for T = Never, whereas the right hand side seems to always be false for T = Never. That would mean it's not even true as an implication?

Ah good catch! Negation doesn't distribute through the check like I wrote it. It should be something like

T ≤ ¬α & ¬β ⇒ (T ≤ ¬α) ∧ (T ≤ ¬β)

i.e. we can still separate out the negation elements of the intersection, but they should remain negated types and a positive ≤ check.
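
The corrected form can be checked exhaustively in the same kind of set model (types as subsets of a small universe, `≤` as subset, `¬` as complement). This is only a sketch under those modelling assumptions, with hypothetical helper names; it says nothing about how ty implements the check.

```python
from itertools import combinations

# Hypothetical finite universe used to model types as sets of values.
UNIVERSE = frozenset({0, 1, 2})

def all_types():
    """Every subset of UNIVERSE, standing in for every possible type."""
    items = sorted(UNIVERSE)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]

def negate(ty):
    """Model ¬ty as the complement relative to UNIVERSE."""
    return UNIVERSE - ty

def corrected_rule_holds():
    """T ≤ ¬α & ¬β  ⇔  (T ≤ ¬α) ∧ (T ≤ ¬β), for every T, α, β in the model."""
    types = all_types()
    return all(
        (t <= (negate(a) & negate(b))) == ((t <= negate(a)) and (t <= negate(b)))
        for t in types
        for a in types
        for b in types
    )

print(corrected_rule_holds())
```

The negated types stay on the right-hand side of a positive `≤` check, so this is just the plain intersection rule applied to elements that happen to be negations.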

I'll fix this in a follow-on PR.

> You write these as implications, but don't we need those to be equivalences?

They are equivalences, but we're using this as a normalization step, so we only want to apply them in the direction that I've written them. That is, the goal with this change is that the upper bound of a constraint will never be an intersection type anymore. (and ditto for the lower bound never being a union type)
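
A minimal sketch of what "apply them in one direction" could look like, assuming a toy type representation with `Atom`, `Intersection` and `Union` classes. All of these names, and the two `split_*` functions, are hypothetical and are not taken from ty's source.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    """A type treated as indivisible here, e.g. a class or a negated type."""
    name: str

@dataclass(frozen=True)
class Intersection:
    elements: tuple

@dataclass(frozen=True)
class Union:
    elements: tuple

def split_upper_bound(upper):
    """Bounds a, b, ... such that T ≤ upper becomes (T ≤ a) ∧ (T ≤ b) ∧ ..."""
    if isinstance(upper, Intersection):
        # Flatten nested intersections so no result is itself an intersection.
        return [part for element in upper.elements for part in split_upper_bound(element)]
    return [upper]  # unions and atoms are left untouched in an upper bound

def split_lower_bound(lower):
    """Bounds a, b, ... such that lower ≤ T becomes (a ≤ T) ∧ (b ≤ T) ∧ ..."""
    if isinstance(lower, Union):
        return [part for element in lower.elements for part in split_lower_bound(element)]
    return [lower]  # intersections and atoms are left untouched in a lower bound

upper = Intersection((Atom("A"), Intersection((Atom("B"), Atom("C")))))
print([part.name for part in split_upper_bound(upper)])  # ['A', 'B', 'C']
```

After this step no returned upper bound is an intersection and no returned lower bound is a union, which is the invariant described above. The reverse direction (recombining bounds into an intersection or union) is never applied.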

Contributor

> > You write these as implications, but don't we need those to be equivalences?
>
> They are equivalences, but we're using this as a normalization step, so we only want to apply them in the direction that I've written them. That is, the goal with this change is that the upper bound of a constraint will never be an intersection type anymore. (and ditto for the lower bound never being a union type)

👍

I'm being pedantic, but I think what I meant was: we need these to be equivalences, or otherwise, the structural simplification that we apply here (in one direction) might lead to a constraint set that is not equivalent to the original constraint set anymore. But even if my thinking is correct, there's no need to change anything. The arrows can also just represent the direction in which we're performing the simplification. I mainly wanted to understand.

Member Author

> we need these to be equivalences, or otherwise, the structural simplification that we apply here (in one direction) might lead to a constraint set that is not equivalent to the original constraint set anymore.

Got it! I think I got this meaning correct in the new comment in #21897

dcreager added a commit that referenced this pull request Dec 10, 2025
This fixes the logic error that @sharkdp
[found](#21871 (comment))
in the constraint set upper bound normalization logic I introduced in
#21871.

I had originally claimed that `(T ≤ α & ~β)` should simplify into `(T ≤
α) ∧ ¬(T ≤ β)`. But that also suggests that `T ≤ ~β` should simplify to
`¬(T ≤ β)` on its own, and that's not correct.

The correct simplification is that `~α` is an "atomic" type, not an
"intersection" for the purposes of our upper bound simplification. So `(T
≤ α & ~β)` should simplify to `(T ≤ α) ∧ (T ≤ ~β)`. That is, break apart
the elements of a (proper) intersection, regardless of whether each
element is negated or not.

This PR fixes the logic, adds a test case, and updates the comments to
be hopefully more clear and accurate.

3 participants