
[ty] Improve union builder performance #22048

Merged
MichaReiser merged 4 commits into main from micha/union-builder-simplify
Dec 19, 2025

Conversation

@MichaReiser (Member) commented Dec 18, 2025

Summary

  • Lazily compute negated as it often isn't even needed
  • Remove a redundant early return check
  • Replace to_remove with eagerly removing elements (I don't feel 100% sure about this change, but there isn't a single failing test)
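
The first bullet can be illustrated with a minimal sketch. This is not ty's code: `expensive_negate` and `contains_negation` are hypothetical stand-ins, and integers stand in for types. It only shows the general pattern of deferring a costly computation with `Option::get_or_insert_with` so that it runs at most once, and not at all when no branch needs it.

```rust
/// Hypothetical stand-in for a costly operation such as negating a type.
/// `calls` records how often the work is actually performed.
fn expensive_negate(value: i64, calls: &mut u32) -> i64 {
    *calls += 1;
    -value
}

/// Returns whether any odd element equals the negation of `value`,
/// together with the number of times the negation was computed.
fn contains_negation(elements: &[i64], value: i64) -> (bool, u32) {
    let mut calls = 0;
    // Starts empty; filled on first use only.
    let mut negated: Option<i64> = None;
    for &element in elements {
        if element % 2 == 0 {
            // This branch never needs the negated value.
            continue;
        }
        let neg = *negated.get_or_insert_with(|| expensive_negate(value, &mut calls));
        if element == neg {
            return (true, calls);
        }
    }
    (false, calls)
}
```

If every element takes the branch that skips the comparison, the negation is never computed, which is the case the summary describes as common.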

MichaReiser added the internal (An internal refactor or improvement) and ty (Multi-file analysis & type inference) labels Dec 18, 2025
astral-sh-bot bot commented Dec 18, 2025

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

astral-sh-bot bot commented Dec 18, 2025

mypy_primer results

Changes were detected when running on open source projects
Tanjun (https://github.com/FasterSpeeding/Tanjun)
- tanjun/dependencies/data.py:347:12: error[invalid-return-type] Return type does not match returned value: expected `_T@cached_inject`, found `Coroutine[Any, Any, _T@cached_inject | Coroutine[Any, Any, _T@cached_inject]] | _T@cached_inject`
+ tanjun/dependencies/data.py:347:12: error[invalid-return-type] Return type does not match returned value: expected `_T@cached_inject`, found `_T@cached_inject | Coroutine[Any, Any, _T@cached_inject | Coroutine[Any, Any, _T@cached_inject]]`

prefect (https://github.com/PrefectHQ/prefect)
- src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:461:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, Unknown | None]` is not awaitable
+ src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:461:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, None | Unknown]` is not awaitable
- src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:535:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, Unknown | None]` is not awaitable
+ src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:535:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, None | Unknown]` is not awaitable
- src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:610:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, Unknown | None]` is not awaitable
+ src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:610:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, None | Unknown]` is not awaitable
- src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:685:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, Unknown | None]` is not awaitable
+ src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:685:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, None | Unknown]` is not awaitable
- src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:760:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, Unknown | None]` is not awaitable
+ src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:760:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, None | Unknown]` is not awaitable
- src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:835:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, Unknown | None]` is not awaitable
+ src/integrations/prefect-dbt/prefect_dbt/cli/commands.py:835:21: error[invalid-await] `Unknown | None | Coroutine[Any, Any, None | Unknown]` is not awaitable

xarray (https://github.com/pydata/xarray)
- xarray/core/dataarray.py:5737:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `T_Xarray@map_blocks | DataArray | Dataset`
+ xarray/core/dataarray.py:5737:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `DataArray | Dataset`
- xarray/core/dataset.py:8866:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `T_Xarray@map_blocks | DataArray | Dataset`
+ xarray/core/dataset.py:8866:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `DataArray | Dataset`

scikit-build-core (https://github.com/scikit-build/scikit-build-core)
+ src/scikit_build_core/build/wheel.py:98:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 43 diagnostics
+ Found 44 diagnostics

jax (https://github.com/google/jax)
+ jax/_src/tree_util.py:295:31: error[invalid-argument-type] Argument to bound method `register_node` is incorrect: Expected `(Hashable, Iterable[object], /) -> T@register_pytree_node`, found `(_AuxData@register_pytree_node, _Children@register_pytree_node, /) -> T@register_pytree_node`
+ jax/_src/tree_util.py:298:31: error[invalid-argument-type] Argument to bound method `register_node` is incorrect: Expected `(Hashable, Iterable[object], /) -> T@register_pytree_node`, found `(_AuxData@register_pytree_node, _Children@register_pytree_node, /) -> T@register_pytree_node`
+ jax/_src/tree_util.py:301:31: error[invalid-argument-type] Argument to bound method `register_node` is incorrect: Expected `(Hashable, Iterable[object], /) -> T@register_pytree_node`, found `(_AuxData@register_pytree_node, _Children@register_pytree_node, /) -> T@register_pytree_node`
- Found 2799 diagnostics
+ Found 2802 diagnostics

static-frame (https://github.com/static-frame/static-frame)
- static_frame/core/bus.py:671:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemLocReduces[Bus[Any], object_]`, found `InterGetItemLocReduces[Bus[Any] | Top[Index[Any]] | Top[Series[Any, Any]] | ... omitted 7 union elements, object_]`
+ static_frame/core/bus.py:671:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemLocReduces[Bus[Any], object_]`, found `InterGetItemLocReduces[Bus[Any] | TypeBlocks | Batch | ... omitted 7 union elements, object_]`
- static_frame/core/bus.py:675:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Bus[Any], object_]`, found `InterGetItemILocReduces[Bus[Any] | Top[Index[Any]] | TypeBlocks | ... omitted 7 union elements, generic[object]]`
+ static_frame/core/bus.py:675:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Bus[Any], object_]`, found `InterGetItemILocReduces[Bus[Any] | Top[Index[Any]] | TypeBlocks | ... omitted 7 union elements, object_ | Self@iloc]`
- static_frame/core/yarn.py:418:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Yarn[Any], object_]`, found `InterGetItemILocReduces[Yarn[Any] | TypeBlocks | Batch | ... omitted 7 union elements, generic[object]]`
+ static_frame/core/yarn.py:418:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Yarn[Any], object_]`, found `InterGetItemILocReduces[Yarn[Any] | Top[Index[Any]] | TypeBlocks | ... omitted 7 union elements, generic[object]]`

pandas-stubs (https://github.com/pandas-dev/pandas-stubs)
- pandas-stubs/_typing.pyi:1223:16: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- Found 5086 diagnostics
+ Found 5085 diagnostics

pydantic (https://github.com/pydantic/pydantic)
- pydantic/fields.py:943:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
+ pydantic/fields.py:943:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
- pydantic/fields.py:983:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
+ pydantic/fields.py:983:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
- pydantic/fields.py:1026:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
+ pydantic/fields.py:1026:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
- pydantic/fields.py:1066:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
+ pydantic/fields.py:1066:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
- pydantic/fields.py:1109:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
+ pydantic/fields.py:1109:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
- pydantic/fields.py:1148:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
+ pydantic/fields.py:1148:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
- pydantic/fields.py:1188:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
+ pydantic/fields.py:1188:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
- pydantic/fields.py:1567:13: error[invalid-argument-type] Argument is incorrect: Expected `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`, found `Top[dict[Unknown, Unknown]] | (((dict[str, Divergent], /) -> None) & ~Top[dict[Unknown, Unknown]]) | None`
+ pydantic/fields.py:1567:13: error[invalid-argument-type] Argument is incorrect: Expected `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`, found `Top[dict[Unknown, Unknown]] | (((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) & ~Top[dict[Unknown, Unknown]]) | None`

Memory usage changes were detected when running on open source projects
flake8 (https://github.com/pycqa/flake8)
-     struct fields = ~4MB
+     struct fields = ~3MB

trio (https://github.com/python-trio/trio)
- TOTAL MEMORY USAGE: ~167MB
+ TOTAL MEMORY USAGE: ~159MB
-     struct metadata = ~11MB
+     struct metadata = ~10MB
-     struct fields = ~12MB
+     struct fields = ~11MB

sphinx (https://github.com/sphinx-doc/sphinx)
- TOTAL MEMORY USAGE: ~301MB
+ TOTAL MEMORY USAGE: ~287MB
-     struct metadata = ~21MB
+     struct metadata = ~20MB
-     struct fields = ~21MB
+     struct fields = ~20MB

prefect (https://github.com/PrefectHQ/prefect)
-     struct fields = ~54MB
+     struct fields = ~52MB

Comment on lines -591 to -593
if element_td == ty_td {
    return;
}
Member Author

This check is redundant with the check on line 577

codspeed-hq bot commented Dec 18, 2025

CodSpeed Performance Report

Merging #22048 will improve performance by 19.12%

Comparing micha/union-builder-simplify (3848ada) with main (76854fd)

Summary

⚡ 8 improvements
✅ 14 untouched
⏩ 30 skipped [1]

Benchmarks breakdown

Mode        Benchmark                            BASE      HEAD      Change
WallTime    medium[pandas]                       66.7 s    63.2 s    +5.62%
WallTime    medium[static-frame]                 20.3 s    19.3 s    +4.69%
WallTime    small[tanjun]                        2.6 s     2.5 s     +4.31%
WallTime    large[sympy]                         52.8 s    50.2 s    +5.25%
WallTime    large[pydantic]                      128.5 s   107.9 s   +19.12%
WallTime    small[altair]                        5.5 s     5.3 s     +4.03%
Simulation  hydra-zen                            1.3 s     1.3 s     +4.04%
Simulation  ty_micro[many_string_assignments]    83.7 ms   79.8 ms   +4.77%

Footnotes

  1. 30 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@MichaReiser (Member, Author) commented Dec 18, 2025

The memory and performance improvements make me a little suspicious. But maybe it's because we now defer computing negated, which leads to fewer interned structs.

Comment on lines 557 to 564
let mut remove_element = |i: &mut usize, elements: &mut Vec<UnionElement<'db>>| {
    if inserted {
        elements.swap_remove(*i);
    } else {
        elements[*i] = UnionElement::Type(ty);
        *i += 1;
    }
    inserted = true;
Member Author

Hmm, too bad. This is actually incorrect. But I'm surprised that there isn't a single failing test!

The issue is that we now insert the element even in cases where it later turns out to be redundant?

We also end up removing existing elements if ty turns out to be redundant (although I'm not sure when this would happen, because it would mean another existing element was redundant too).

Member Author

Funny that this was the main performance improvement (up to 10%). It's sort of confusing why that would be.

if insertion_point.is_none() {
    insertion_point = Some(i);
} else {
    elements.swap_remove(i);
Member Author

Unlike before, the new implementation removes redundant elements before we have decided whether to add ty.

I think this might be okay because we only early-exit if the type is redundant with an existing type and, if that's the case, then the element that would be removed here must have been redundant too? But I'm not feeling 100% sure about this.

Member

I think this rationale makes sense, yes. And the fact that all our tests pass (plus no ecosystem impact) gives me confidence that this is correct.

Contributor

I think that's correct, as long as our is-redundant-with implementation obeys transitivity. A comment might be good.

Member Author

Turns out this assumption is incorrect, or there is indeed a bug in our is_redundant_with. I added assert statements to all our early returns, asserting insertion_point.is_none() (meaning we haven't inserted the element yet), and pep695_type_aliases.… - PEP 695 type aliases - Cyclic aliases - Recursive invariant now panics.
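
To make the argument in this thread concrete, here is a toy sketch of the eager-removal strategy with the assertion described above. It is an assumption-laden model, not the ty implementation: "types" are `u8` bitmasks, the hypothetical `is_redundant_with(a, b)` means "every bit of `a` is also set in `b`" (a transitive relation), and `push` is an illustrative name.

```rust
/// Toy redundancy relation: `a` adds nothing to a union that already holds `b`.
/// Subset-of-bits is transitive, which the reasoning in this thread relies on.
fn is_redundant_with(a: u8, b: u8) -> bool {
    (a & !b) == 0
}

/// Adds `ty` to a simplified union, eagerly dropping elements it subsumes.
fn push(elements: &mut Vec<u8>, ty: u8) {
    let mut insertion_point: Option<usize> = None;
    let mut i = 0;
    while i < elements.len() {
        let existing = elements[i];
        if is_redundant_with(ty, existing) {
            // Early return: `ty` adds nothing. If something had already been
            // marked or removed, the union would now be missing members.
            assert!(insertion_point.is_none(), "removed elements before early return");
            return;
        }
        if is_redundant_with(existing, ty) {
            if insertion_point.is_none() {
                // Reuse the first subsumed slot for `ty` later.
                insertion_point = Some(i);
                i += 1;
            } else {
                // Later subsumed elements are dropped immediately.
                elements.swap_remove(i);
            }
            continue;
        }
        i += 1;
    }
    match insertion_point {
        Some(index) => elements[index] = ty,
        None => elements.push(ty),
    }
}
```

With a transitive relation and a union whose elements are pairwise non-redundant, the assertion cannot fire in this toy model. The panic reported above suggests that one of those two conditions does not always hold in the real implementation.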

MichaReiser changed the title from "[ty] Small union builder nits" to "[ty] Improve union builder performance" Dec 18, 2025
MichaReiser added the performance (Potential performance improvement) label and removed the internal (An internal refactor or improvement) label Dec 18, 2025
}
if existing.is_subtype_of(self.db, ty) {
    to_remove = Some(index);
    continue;
Member Author

I think we should continue here, the same as in push_type?

Member

yes, I think this is correct

MichaReiser marked this pull request as ready for review December 18, 2025 08:51
@AlexWaygood (Member) left a comment

Very cool 🚀

  let mut found = None;
  let mut to_remove = None;
- let ty_negated = ty.negate(self.db);
+ let mut ty_negated = None;
Member

did you consider using a OnceCell here? It might make the logic more readable and less repetitive

Member Author

Not sure what the benefit of using OnceCell here is. The main advantage of OnceCell is that it doesn't require mut, but the mut isn't an issue here.

Member

Thanks, yeah, a OnceCell doesn't really make much sense here. The main thing I was wondering about was whether there was any way to reduce the repetition of having to do ty_negated.get_or_insert_with(|| ty.negate(db)) so many times. See #22082.
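
For readers comparing the two options discussed here, a small sketch follows. It is only an illustration under assumptions: `negate`, `with_option`, and `with_once_cell` are hypothetical names, and integers stand in for types. The first function repeats the `get_or_insert_with` call at each use and needs a `mut` binding. The second uses `std::cell::OnceCell::get_or_init`, which works through a shared reference, so the repeated call can be hidden behind a plain closure.

```rust
use std::cell::OnceCell;

/// Hypothetical stand-in for the costly negation.
fn negate(value: i32) -> i32 {
    -value
}

/// Lazy slot as a mutable `Option`; each use repeats `get_or_insert_with`.
fn with_option(value: i32, queries: usize) -> i32 {
    let mut negated: Option<i32> = None;
    let mut sum = 0;
    for _ in 0..queries {
        sum += *negated.get_or_insert_with(|| negate(value));
    }
    sum
}

/// Lazy slot as a `OnceCell`; no `mut` binding, and a closure wraps the lookup.
fn with_once_cell(value: i32, queries: usize) -> i32 {
    let negated: OnceCell<i32> = OnceCell::new();
    let get = || *negated.get_or_init(|| negate(value));
    let mut sum = 0;
    for _ in 0..queries {
        sum += get();
    }
    sum
}
```

Both variants compute the negation at most once. Which one reads better depends on how many call sites there are and whether the mutable borrow gets in the way.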

@@ -383,8 +383,10 @@ impl<'db> UnionBuilder<'db> {
}
if existing.is_subtype_of(self.db, ty) {
Member

I had to play around with this code a little to figure out how we even get to this branch, and why it's correct here for to_remove to be Option<usize> rather than Vec<usize>. It might be helpful to add a comment here (and similar comments to the if existing.is_subtype_of(db, ty) calls in the Type::IntLiteral() and Type::BytesLiteral branches below):

Suggested change
- if existing.is_subtype_of(self.db, ty) {
+ // e.g. `existing` could be `Literal[""] & Any`,
+ // and `ty` could be `Literal[""]`
+ if existing.is_subtype_of(self.db, ty) {

Contributor

Yes I agree those comments would be helpful.

I think the reason it's OK to just track one to_remove is also a bit subtle -- it's because there are a limited number of possible subtypes of a literal, and all the possible subtypes (e.g. Literal[1] & Any, Literal[1] & Unknown) are also redundant with each other, so it's not possible that we'd have more than one as an existing element.

@AlexWaygood (Member)

If you're looking for further improvements to union-building performance: we've known for a while that we need to do the same thing for enum-literal types that we do for bytes-literal, int-literal and string-literal types. Big unions of enum literals are currently very slow.

@MichaReiser (Member, Author)

I'd feel slightly more comfortable if @carljm at least skimmed over the change, given that our mypy primer results aren't as useful anymore (are there changes? I can't tell)

@carljm (Contributor) left a comment

Looks good, just one question I don't understand

Comment on lines 573 to 574
while i < self.elements.len() {
    let element = &mut self.elements[i];
Contributor

Why is this safe? Can't we remove elements and end up going out of bounds here? (It seems maybe we can't, given it doesn't happen in ecosystem -- but I don't understand why)

Member Author

Because of the check on the line above. It checks i against the current length of self.elements in each iteration
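
The bounds argument can be seen in isolation in the following sketch. It is a generic illustration rather than the code under review: `remove_redundant` and the `is_redundant` predicate are hypothetical, and `u32` values stand in for union elements.

```rust
/// Removes, in place, every element for which `is_redundant` returns true.
/// Element order is not preserved because `swap_remove` is used.
fn remove_redundant(elements: &mut Vec<u32>, is_redundant: impl Fn(u32) -> bool) {
    let mut i = 0;
    // `elements.len()` is re-evaluated on every iteration, so `i` stays in
    // bounds even though the vector shrinks inside the loop.
    while i < elements.len() {
        if is_redundant(elements[i]) {
            // The last element is moved into slot `i`; do not advance, so the
            // moved element is examined on the next iteration.
            elements.swap_remove(i);
        } else {
            i += 1;
        }
    }
}
```

A `for i in 0..elements.len()` loop would capture the length once up front and could index past the end after a removal, which is the hazard the question above is about.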

@AlexWaygood (Member)

Curious to see what the pydantic benchmark number is after rebasing on main, now that fa57253 has landed (I'm sure it's still an improvement, just curious how much of an improvement it is now).

@zanieb (Member) commented Dec 19, 2025

I did a local benchmark with a rebase on main and got a 21% improvement (#22052 (comment))

MichaReiser force-pushed the micha/union-builder-simplify branch from eecc160 to 3848ada December 19, 2025 07:08
@MichaReiser (Member, Author)

I reverted the to_remove change because I don't feel confident enough. The good news is that it isn't responsible for most of the perf improvement.

MichaReiser merged commit e177cc2 into main Dec 19, 2025
42 checks passed
MichaReiser deleted the micha/union-builder-simplify branch December 19, 2025 07:29
AlexWaygood added a commit that referenced this pull request Dec 19, 2025
AlexWaygood added a commit that referenced this pull request Dec 19, 2025
AlexWaygood added a commit that referenced this pull request Dec 19, 2025
Labels

performance (Potential performance improvement), ty (Multi-file analysis & type inference)

4 participants