Skip to content

[ty] Fix bug in union builder optimization#22085

Closed
zanieb wants to merge 3 commits intomicha/union-builder-sub-unionsfrom
zb/fix-sub-union
Closed

[ty] Fix bug in union builder optimization#22085
zanieb wants to merge 3 commits intomicha/union-builder-sub-unionsfrom
zb/fix-sub-union

Conversation

@zanieb
Copy link
Member

@zanieb zanieb commented Dec 19, 2025

Just poking around...

Applied to #22071

@astral-sh-bot
Copy link

astral-sh-bot bot commented Dec 19, 2025

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

@astral-sh-bot
Copy link

astral-sh-bot bot commented Dec 19, 2025

mypy_primer results

Changes were detected when running on open source projects
tornado (https://github.com/tornadoweb/tornado)
- tornado/gen.py:255:62: error[invalid-argument-type] Argument to bound method `__init__` is incorrect: Expected `None | Awaitable[Unknown] | list[Awaitable[Unknown]] | dict[Any, Awaitable[Unknown]] | Future[Unknown]`, found `_T@next | _VT@next | _T@next`
+ tornado/gen.py:255:62: error[invalid-argument-type] Argument to bound method `__init__` is incorrect: Expected `None | Awaitable[Unknown] | list[Awaitable[Unknown]] | dict[Any, Awaitable[Unknown]] | Future[Unknown]`, found `_T@next | _T@next | _VT@next`

Tanjun (https://github.com/FasterSpeeding/Tanjun)
- tanjun/dependencies/data.py:347:12: error[invalid-return-type] Return type does not match returned value: expected `_T@cached_inject`, found `Coroutine[Any, Any, _T@cached_inject | Coroutine[Any, Any, _T@cached_inject]] | _T@cached_inject`
+ tanjun/dependencies/data.py:347:12: error[invalid-return-type] Return type does not match returned value: expected `_T@cached_inject`, found `_T@cached_inject | Coroutine[Any, Any, _T@cached_inject | Coroutine[Any, Any, _T@cached_inject]]`

xarray (https://github.com/pydata/xarray)
- xarray/core/dataarray.py:5737:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `T_Xarray@map_blocks | DataArray | Dataset`
+ xarray/core/dataarray.py:5737:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `DataArray | Dataset`
- xarray/core/dataset.py:8866:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `T_Xarray@map_blocks | DataArray | Dataset`
+ xarray/core/dataset.py:8866:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `DataArray | Dataset`

jax (https://github.com/google/jax)
+ jax/_src/tree_util.py:295:31: error[invalid-argument-type] Argument to bound method `register_node` is incorrect: Expected `(Hashable, Iterable[object], /) -> T@register_pytree_node`, found `(_AuxData@register_pytree_node, _Children@register_pytree_node, /) -> T@register_pytree_node`
+ jax/_src/tree_util.py:298:31: error[invalid-argument-type] Argument to bound method `register_node` is incorrect: Expected `(Hashable, Iterable[object], /) -> T@register_pytree_node`, found `(_AuxData@register_pytree_node, _Children@register_pytree_node, /) -> T@register_pytree_node`
+ jax/_src/tree_util.py:301:31: error[invalid-argument-type] Argument to bound method `register_node` is incorrect: Expected `(Hashable, Iterable[object], /) -> T@register_pytree_node`, found `(_AuxData@register_pytree_node, _Children@register_pytree_node, /) -> T@register_pytree_node`
- Found 2799 diagnostics
+ Found 2802 diagnostics

static-frame (https://github.com/static-frame/static-frame)
- static_frame/core/bus.py:671:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemLocReduces[Bus[Any], object_]`, found `InterGetItemLocReduces[Bus[Any] | Top[Index[Any]] | Top[Series[Any, Any]] | ... omitted 7 union elements, object_]`
+ static_frame/core/bus.py:671:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemLocReduces[Bus[Any], object_]`, found `InterGetItemLocReduces[Bus[Any] | TypeBlocks | Batch | ... omitted 7 union elements, object_]`
- static_frame/core/series.py:772:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Series[Any, Any], TVDtype@Series]`, found `InterGetItemILocReduces[Series[Any, Any] | TypeBlocks | Batch | ... omitted 6 union elements, generic[object]]`
+ static_frame/core/series.py:772:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Series[Any, Any], TVDtype@Series]`, found `InterGetItemILocReduces[Series[Any, Any] | Top[Index[Any]] | TypeBlocks | ... omitted 6 union elements, generic[object]]`
- static_frame/core/series.py:4072:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[SeriesHE[Any, Any], TVDtype@SeriesHE]`, found `InterGetItemILocReduces[SeriesHE[Any, Any] | TypeBlocks | Batch | ... omitted 7 union elements, generic[object]]`
+ static_frame/core/series.py:4072:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[SeriesHE[Any, Any], TVDtype@SeriesHE]`, found `InterGetItemILocReduces[SeriesHE[Any, Any] | Top[Index[Any]] | TypeBlocks | ... omitted 7 union elements, generic[object]]`

No memory usage changes detected ✅

@zanieb zanieb closed this Dec 19, 2025
@zanieb
Copy link
Member Author

zanieb commented Dec 19, 2025

This looks wrong, I'll keep poking in the background.

@MichaReiser
Copy link
Member

It looks "correct" in the sense that it fixes the bug in my PR

@zanieb
Copy link
Member Author

zanieb commented Dec 19, 2025

Oh does it? It looked like it caused some new problems too :)

@zanieb zanieb reopened this Dec 19, 2025
@MichaReiser
Copy link
Member

Oh does it? It looked like it caused some new problems too :)

It does. So I think you might be on to something but I honestly don't know enough about those union builder flags (or when we should set which flag)

The optimization condition checked if `recursively_defined` flags matched,
which would apply when both are Yes. However, unions with recursively_defined=Yes
may have been built during cycle_recovery where simplification is incomplete.

Change the condition to require both builder and union have recursively_defined=No,
ensuring the optimization only applies to unions that are definitely fully simplified.
@mtshiba
Copy link
Collaborator

mtshiba commented Dec 19, 2025

UnionBuilder::cycle_recovery is a flag required due to the technical constraint that query cycles cannot occur within cycle recovery functions. When this is true, a cheap union operation is performed without using is_redundant_with. This is fine for use within cycle normalization.

UnionBuilder::recursively_defined was introduced in #21683. Setting it to Yes will perform widening earlier, which will result in faster fixed-point convergence. The rule for recursively_defined is that this attribute is infectious, i.e. if you find a sub-union with recursively_defined = Yes, you should also mark the final union type as recursively_defined = Yes.
If you are seeing a too-many-cycle panic due to unbounded growth of the literal type, you probably have recursively_defined set to No when it should be Yes.

There is no direct relationship between the two. recursively_defined = Yes should be annotated for union types of recursively defined expressions. Specifically, expressions with a type like T | Divergent are considered recursively defined (ref.

// `Divergent` in a union type does not mean true divergence, so we skip it if not nested.
// e.g. T | Divergent == T | (T | (T | (T | ...))) == T
if ty == &div {
builder = builder.recursively_defined(RecursivelyDefined::Yes);
continue;
}
).

@zanieb
Copy link
Member Author

zanieb commented Dec 19, 2025

@MichaReiser CI is green now 🤔

@MichaReiser
Copy link
Member

@MichaReiser CI is green now 🤔

Unfortunately, it doesn't remove the prefect diagnostics anymore.

@MichaReiser MichaReiser force-pushed the micha/union-builder-sub-unions branch from 7b6d0a9 to cf2259d Compare December 25, 2025 15:56
@zanieb zanieb closed this Jan 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants