Skip to content

[ty] Consistent ordering of constraint set specializations, take 2#21983

Merged
dcreager merged 25 commits intomainfrom
dcreager/source-order-constraints
Dec 15, 2025
Merged

[ty] Consistent ordering of constraint set specializations, take 2#21983
dcreager merged 25 commits intomainfrom
dcreager/source-order-constraints

Conversation

@dcreager
Copy link
Member

@dcreager dcreager commented Dec 15, 2025

In #21957, we tried to use union_or_intersection_elements_ordering to provide a stable ordering of the union and intersection elements that are created when determining which type a typevar should specialize to. @AlexWaygood pointed out that this won't work, since that provides a consistent ordering within a single process run, but does not provide a stable ordering across runs.

This is an attempt to produce a proper stable ordering for constraint sets, so that we end up with consistent diagnostic and test output.

We do this by maintaining a new source_order field on each interior BDD node, which records when that node's constraint was added to the set. Several of the BDD operators (and, or, etc) now have _with_offset variants, which update each source_order in the rhs to be larger than any of the source_orders in the lhs. This is what causes that field to be in line with (a) when you add each constraint to the set, and (b) the order of the parameters you provide to and, or, etc. Then we sort by that new field before constructing the union/intersection types when creating a specialization.

@dcreager dcreager added the ty Multi-file analysis & type inference label Dec 15, 2025
@astral-sh-bot
Copy link

astral-sh-bot bot commented Dec 15, 2025

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

@astral-sh-bot
Copy link

astral-sh-bot bot commented Dec 15, 2025

mypy_primer results

Changes were detected when running on open source projects
pandas-stubs (https://github.com/pandas-dev/pandas-stubs)
+ pandas-stubs/_typing.pyi:1218:16: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- Found 5123 diagnostics
+ Found 5124 diagnostics

No memory usage changes detected ✅

@dcreager dcreager force-pushed the dcreager/source-order-constraints branch from 466fd1c to cba45ac Compare December 15, 2025 15:23
@dcreager dcreager marked this pull request as ready for review December 15, 2025 15:43
@dcreager dcreager requested a review from sharkdp as a code owner December 15, 2025 15:43
@AlexWaygood AlexWaygood added the internal An internal refactor or improvement label Dec 15, 2025
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! It's too bad how much extra work we have to do (rebuilding entire BDD subtrees) in order to preserve this ordering information, but at the moment I'm not seeing a better way.

@dcreager
Copy link
Member Author

Looks good! It's too bad how much extra work we have to do (rebuilding entire BDD subtrees) in order to preserve this ordering information, but at the moment I'm not seeing a better way.

Hmm, that's a good thought. For expediency I'm just going to add a TODO, but I wonder if we can store the other_offset as another field on InteriorNode and apply it lazily whenever we walk a BDD tree.

@dcreager dcreager merged commit 7d3b7c5 into main Dec 15, 2025
42 checks passed
@dcreager dcreager deleted the dcreager/source-order-constraints branch December 15, 2025 19:24
dcreager added a commit that referenced this pull request Dec 15, 2025
* origin/main:
  [ty] Consistent ordering of constraint set specializations, take 2 (#21983)
  [ty] Remove invalid statement-keyword completions in for-statements (#21979)
  [ty] Avoid caching trivial is-redundant-with calls (#21989)
dcreager added a commit that referenced this pull request Dec 19, 2025
We're seeing a lot of nondeterminism in the ecosystem tests at the
moment, which started (or at least got worse) once `Callable` inference
landed.

This PR attempts to remove this nondeterminism. We recently
(#21983) added a `source_order`
field to BDD nodes, which tracks when their constraint was added to the
BDD. Since we build up constraints based on the order that they appear
in the underlying source, that gives us a stable ordering even though we
use an arbitrary salsa-derived ordering for the BDD variables.

The issue (at least for some of the flakiness) is that we add "derived"
constraints when walking a BDD tree, and those derived constraints
inherit or borrow the `source_order` of the "real" constraint that
implied them. That means we can get multiple constraints in our
specialization that all have the same `source_order`. If we're not
careful, those "tied" constraints can be ordered arbitrarily.

The fix requires ~three~ ~four~ several steps:

- When starting to construct a sequent map (the data structure that
stores the derived constraints), we first sort all of the "real"
constraints by their `source_order`. That ensures that we insert things
into the sequent map in a stable order.
- During sequent map construction, derived facts are discovered by a
deterministic process applied to constraints in a (now) stable order. So
derived facts are now also inserted in a stable order.
- We update the fields of `SequentMap` to use `FxOrderSet` instead of
`FxHashSet`, so that we retain that stable insertion order.
- When walking BDD paths when constructing a specialization, we were
already sorting the constraints by their `source_order`. However, we
were not considering that we might get derived constraints, and
therefore constraints with "ties". Because of that, we need to make sure
to use a _stable_ sort, that retains the insertion order for those ties.

All together, this...should...fix the nondeterminism. (Unfortunately, I
haven't been able to effectively test this, since I haven't been able to
coerce local tests to flop into the other order that we sometimes see in
CI.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal An internal refactor or improvement ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments