Skip to content

Conversation

@carljm
Copy link
Contributor

@carljm carljm commented Jun 27, 2025

Summary

This just replaces one temporary solution to recursive protocols (the SelfReference mechanism) with another one (track seen types when recursively descending in normalize and replace recursive references with Any). But this temporary solution can handle mutually-recursive types, not just self-referential ones, and it's sufficient for the primer ecosystem and some other projects we are testing on to no longer stack overflow.

The follow-up here will be to properly handle these self-references instead of replacing them with Any.

We will also eventually need cycle detection on more recursive-descent type transformations and tests.

Test Plan

Existing tests (including recursive-protocol tests) and primer.

Added mdtest for mutually-recursive protocols that stack-overflowed before this PR.

@carljm carljm added the ty Multi-file analysis & type inference label Jun 27, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Jun 27, 2025

mypy_primer results

Changes were detected when running on open source projects
black (https://github.com/psf/black)
- TOTAL MEMORY USAGE: ~142MB
+ TOTAL MEMORY USAGE: ~129MB

strawberry (https://github.com/strawberry-graphql/strawberry)
- error[invalid-argument-type] strawberry/schema_codegen/__init__.py:199:34: Argument to function `_get_directives` is incorrect: Expected `HasDirectives`, found `FieldDefinitionNode | InputValueDefinitionNode`
- error[invalid-argument-type] strawberry/schema_codegen/__init__.py:387:34: Argument to function `_get_directives` is incorrect: Expected `HasDirectives`, found `ObjectTypeDefinitionNode | ObjectTypeExtensionNode | InterfaceTypeDefinitionNode | InputObjectTypeDefinitionNode`
- Found 366 diagnostics
+ Found 364 diagnostics

mkosi (https://github.com/systemd/mkosi)
- TOTAL MEMORY USAGE: ~117MB
+ TOTAL MEMORY USAGE: ~129MB
-     memo fields = ~97MB
+     memo fields = ~106MB

freqtrade (https://github.com/freqtrade/freqtrade)
-     memo fields = ~304MB
+     memo fields = ~276MB

paasta (https://github.com/yelp/paasta)
-     memo fields = ~171MB
+     memo fields = ~156MB

scikit-learn (https://github.com/scikit-learn/scikit-learn)
- TOTAL MEMORY USAGE: ~652MB
+ TOTAL MEMORY USAGE: ~717MB

@carljm carljm marked this pull request as ready for review June 27, 2025 22:35
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

@MichaReiser
Copy link
Member

Does this unblock the work on protocols?

@carljm
Copy link
Contributor Author

carljm commented Jun 30, 2025

Will go ahead and merge this, but post-land review of the TypeVisitor change is welcome.

(The callback accepted by TypeVisitor::visit has to itself take the visitor as argument, because otherwise it would have to be closed-over at the callsite, and this would cause conflicting mutable borrows of the visitor, one for the visitor.visit call itself and the other for the closure.)

@carljm carljm merged commit 2ae0bd9 into main Jun 30, 2025
36 checks passed
@carljm carljm deleted the cjm/cyclicp branch June 30, 2025 19:07
@sharkdp
Copy link
Contributor

sharkdp commented Jul 1, 2025

post-land review of the TypeVisitor change is welcome

Looks good, thank you!

@AlexWaygood
Copy link
Member

The callback accepted by TypeVisitor::visit has to itself take the visitor as argument, because otherwise it would have to be closed-over at the callsite, and this would cause conflicting mutable borrows of the visitor, one for the visitor.visit call itself and the other for the closure.

You might be able to get around that by having the seen field on the visitor struct be RefCell<FxOrderSet<Type<'db>>> rather than FxOrderSet<Type<'db>>. Then the visit method would only need to take &self rather than &mut self.

Not a big deal either way, though!

iyakushev pushed a commit to iyakushev/ruff that referenced this pull request Jul 1, 2025
## Summary

This just replaces one temporary solution to recursive protocols (the
`SelfReference` mechanism) with another one (track seen types when
recursively descending in `normalize` and replace recursive references
with `Any`). But this temporary solution can handle mutually-recursive
types, not just self-referential ones, and it's sufficient for the
primer ecosystem and some other projects we are testing on to no longer
stack overflow.

The follow-up here will be to properly handle these self-references
instead of replacing them with `Any`.

We will also eventually need cycle detection on more recursive-descent
type transformations and tests.

## Test Plan

Existing tests (including recursive-protocol tests) and primer.

Added mdtest for mutually-recursive protocols that stack-overflowed
before this PR.
@carljm
Copy link
Contributor Author

carljm commented Jul 2, 2025

You might be able to get around that by having the seen field on the visitor struct be RefCell<FxOrderSet<Type<'db>>> rather than FxOrderSet<Type<'db>>. Then the visit method would only need to take &self rather than &mut self.

Yes, I considered interior mutability, but I don't think that's a good tradeoff. It sacrifices some performance (by adding the runtime reference tracking overhead of RefCell) for a fairly minor ergonomic win.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants