-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ty] Remove Type::Tuple
#19669
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ty] Remove Type::Tuple
#19669
Conversation
17431f1 to
7f9a24a
Compare
Diagnostic diff on typing conformance testsChanges were detected when running ty on typing conformance tests--- old-output.txt 2025-08-11 20:53:33.906552447 +0000
+++ new-output.txt 2025-08-11 20:53:33.968552586 +0000
@@ -865,8 +865,6 @@
tuples_type_compat.py:127:13: error[type-assertion-failure] Argument does not have asserted type `@Todo`
tuples_type_compat.py:129:13: error[type-assertion-failure] Argument does not have asserted type `tuple[int | str, int]`
tuples_type_compat.py:130:13: error[type-assertion-failure] Argument does not have asserted type `@Todo`
-tuples_type_compat.py:149:5: error[type-assertion-failure] Argument does not have asserted type `Sequence[int | float | complex | list[int]]`
-tuples_type_compat.py:152:5: error[type-assertion-failure] Argument does not have asserted type `Sequence[int | str]`
tuples_type_compat.py:153:5: error[type-assertion-failure] Argument does not have asserted type `Sequence[Never]`
tuples_type_compat.py:157:1: error[invalid-assignment] Object of type `tuple[Literal[1], Literal[""], Literal[""]]` is not assignable to `tuple[int, str]`
tuples_type_compat.py:162:1: error[invalid-assignment] Object of type `tuple[Literal[1], Literal[1], Literal[""]]` is not assignable to `tuple[int, *tuple[str, ...]]`
@@ -886,4 +884,4 @@
typeddicts_operations.py:60:1: error[type-assertion-failure] Argument does not have asserted type `str | None`
typeddicts_type_consistency.py:101:1: error[invalid-assignment] Object of type `Unknown | None` is not assignable to `str`
typeddicts_usage.py:40:24: error[invalid-type-form] The special form `typing.TypedDict` is not allowed in type expressions. Did you mean to use a concrete TypedDict or `collections.abc.Mapping[str, object]` instead?
-Found 887 diagnostics
+Found 885 diagnostics |
|
CodSpeed Instrumentation Performance ReportMerging #19669 will not alter performanceComparing Summary
|
CodSpeed WallTime Performance ReportMerging #19669 will not alter performanceComparing Summary
|
f3574de to
6def9b4
Compare
|
What might give us more insight is if you could run ty with |
20cd78f to
83ea2df
Compare
6da9eb6 to
08470a8
Compare
08470a8 to
96f70ee
Compare
109d3e4 to
1e21124
Compare
25627b3 to
85c8e81
Compare
1e21124 to
ef1857e
Compare
85c8e81 to
bb6b05d
Compare
|
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
possibly-unbound-attribute |
0 | 73 | 0 |
invalid-argument-type |
4 | 8 | 4 |
invalid-assignment |
3 | 5 | 0 |
not-iterable |
0 | 5 | 0 |
invalid-return-type |
3 | 0 | 0 |
parameter-already-assigned |
0 | 3 | 0 |
unsupported-operator |
0 | 0 | 3 |
type-assertion-failure |
0 | 1 | 0 |
| Total | 10 | 95 | 7 |
ef1857e to
be789e2
Compare
9dbabef to
c88a350
Compare
I believe #19811 (which is stacked on top of this PR) buys back almost all the performance and memory regressions from this change, so I think @carljm's analysis in #19669 (comment) was correct: this PR just meant that we were eagerly constructing a lot more unions and |
df1357f to
1dc9a1b
Compare
677d378 to
3436d5d
Compare
3436d5d to
da3bc43
Compare
I've merged the changes from #19811 into this one. This PR now passes the codspeed CI check: there are still regressions on many benchmarks, but they are nearly all 1-2% regressions (pydantic and sympy show regressions of 3%). |
255f90a to
7db1dd2
Compare
| Type::NominalInstance(instance) => match instance.class(db).known(db) { | ||
| // enum.nonmember | ||
| Some(KnownClass::Nonmember) => return None, | ||
|
|
||
| // enum.member | ||
| Some(KnownClass::Member) => Some( | ||
| ty.member(db, "value") | ||
| .place | ||
| .ignore_possibly_unbound() | ||
| .unwrap_or(Type::unknown()), | ||
| ), | ||
|
|
||
| // enum.auto | ||
| Some(KnownClass::Auto) => { | ||
| auto_counter += 1; | ||
| Some(Type::IntLiteral(auto_counter)) | ||
| } | ||
|
|
||
| _ => None, | ||
| }, | ||
|
|
||
| _ => None, | ||
| }; | ||
|
|
||
| if let Some(special_case) = special_case { | ||
| special_case | ||
| } else { | ||
| let dunder_get = ty | ||
| .member_lookup_with_policy( | ||
| db, | ||
| "__get__".into(), | ||
| MemberLookupPolicy::NO_INSTANCE_FALLBACK, | ||
| ) | ||
| .place; | ||
|
|
||
| match dunder_get { | ||
| Place::Unbound | Place::Type(Type::Dynamic(_), _) => ty, | ||
|
|
||
| Place::Type(_, _) => { | ||
| // Descriptors are not considered members. | ||
| return None; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no semantic change here, I just micro-optimised this a bit to reduce the number of .class(db).is_known(db) calls,
7db1dd2 to
299a0c5
Compare
299a0c5 to
23b108f
Compare
carljm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome!!
|
|
||
| (Type::NominalInstance(left), Type::NominalInstance(right)) => left.class.cmp(&right.class), | ||
| (Type::NominalInstance(left), Type::NominalInstance(right)) => { | ||
| left.class(db).cmp(&right.class(db)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this might still cause us to materialize classes for tuples in many cases where we don't really need to? We could pretty easily do something a bit more involved here to avoid that...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... I wasn't sure about how far to go with that kind of optimisation generally. It felt like it made sense for methods already on NominalInstanceType, and it felt like it made sense to move commonly used methods like is_object() to NominalInstanceType (especially since for that one in particular, it also arguably improves ergonomics to move the method onto NominalInstanceType). But the premise underlying this PR is that we should basically be treating tuple-instance types the same as other NominalInstance types in 95% of cases -- adding new methods to NominalInstanceType that branch on the inner enum, purely for the purpose of optimisation, feels like it works against that principle to a certain extent?
TL;DR I don't plan to do this in this PR, but also wouldn't object to it being done if you feel like tackling it!
|
Final Codspeed report shows 1% speedups on micro and macrobenchmarks! 😃 https://codspeed.io/astral-sh/ruff/branches/alex%2Fremove-tuples?runnerMode=Instrumentation |
* main: Don't cache files with diagnostics (#19869) [ty] support recursive type aliases (#19805) [ty] Remove unsafe `salsa::Update` implementations in `tuple.rs` (#19880) [ty] Function argument inlay hints (#19269) [ty] Remove Salsa interning for `TypedDictType` (#19879) Fix `lint.future-annotations` link (#19876) [ty] Reduce memory usage of `TupleSpec` and `TupleType` (#19872) [ty] Track heap usage of salsa structs (#19790) Update salsa to pull in tracked struct changes (#19843) [ty] simplify CycleDetector::visit signature (#19873) [ty] use interior mutability in type visitors (#19871) [ty] Fix tool name is None when no ty path is given in ty_benchmark (#19870) [ty] Remove `Type::Tuple` (#19669) [ty] Short circuit `ReachabilityConstraints::analyze_single` for dynamic types (#19867)
(This PR is easiest to review commit-by-commit. The first commit achieves all the improvements to semantics; following commits are focussed on improving performance.)
Summary
This PR removes the
Tuplevariant from ourTypeenum. Instead,Tupletypes are now all represented asType::NominalInstances with a certainTupleSpecstored as part of their generic specialisation.Removing a
Typevariant leads to simpler code; the moreTypevariants we have, the more complex methods likeType::has_relation_toandType::is_disjoint_frombecome. The main motivation for making this change is not to do with code complexity, however. Instead, it's to refocus how we think about tuple types.Tuple types are obviously special in the Python type system in many ways. But one way in which they are exactly the same as all other
NominalInstancetype (and different toLiteraltypes such asStringLiteral,BytesLiteralandBooleanLiteral) is that the typetuple[int, str]does not only include runtime objects where the__class__is exactlytuple; it also includes instances of subclasses oftuple[int, str]. Explicit subclasses of specialised tuples are somewhat rare (though, as the ecosystem report on this PR shows, not as rare as you might think!) -- but tuple subclasses nonetheless appear an awful lot in Python code, sinceNamedTuples are popular, and everyNamedTupleclass is a tuple subclass. In order to achieve consistent and principled behaviour, all special-casing we apply totuple[int, str]should ideally also be applied to subclasses oftuple[int, str]; but this is not something we've done up till now, and it's highly awkward to achieve this if instances of the tuple subclass are represented internally asType::NominalInstances when the tuple supertype is represented internally as aType::Tuple. UsingType::NominalInstanceto represent both the tuple type and any subclasses of the tuple type means that it becomes easy to apply special casing to subclasses, and forces us to think explicitly about whether any special casing we add for tuples should apply to "just that tuple type" (though there's really no such thing, since the tuple type is a superset of all its subtypes) or to subtypes of that tuple type too.Removing
Type::Tupletherefore has the effect that all our tuple special casing is now applied to tuple subclasses as well as "exact tuple types":sys.version_info: rather than having dedicated branches in multiple places to handlesys.version_infocomparisons, we now just treatsys.version_infoas what it really is at runtime: an instance of a tuple subclass with a very particular tuple spec.Representing all tuple types using
Type::NominalInstancealso appears to have had the unexpected (but very welcome!) effect of improving our ability to solve TypeVars in several situations that showed up in the primer report (and for which I've added regression tests).#19560, which is stacked on top of this, extends our special-casing for tuples and tuple subclasses to named tuples by inserting precise tuple types into the MRO of
NamedTupleclasses (currently, we model allNamedTupleclasses as inheriting fromtuple[Unknown, ...], which avoids false positives but at the cost of many false negatives).Test Plan
Many mdtests added.
I haven't added any new tests specifically for precise comparison inference between instances of tuple subclasses, because this is already tested extensively (both explicitly and implicitly) by our support for
sys.version_infobranches. Our ability to infersys.version_info >= (3, 9)as eitherTrueorFalsenow just directly relies on us seeing that thesys._version_infoclass in typeshed is a tuple subclass (with a special-cased spec), and therefore taking the tuple-and-tuple-subclass path inTypeInferenceBuilder::infer_binary_type_comparison().Ecosystem analysis
See #19669 (comment) for a full analysis. Overall, I think it looks great. The typing conformance diff also shows that this PR brings us closer to conformance with the typing spec.
Performance impact
Earlier versions of this PR caused large performance regressions, and also regressed memory usage. The latest version of this PR still causes some performance regressions, but they are all 1-2%, which is much reduced. I think partly the remaining regressions ared due to the fact that we're just doing a lot more work than we used to. We're able to infer precise types in many instances where we previously inferred Unknown, and we're able to solve many TypeVars that we previously couldn't.
I've experimented with adding Salsa caching to the methods on
ClassTypefor figuring out what a class's tuple spec is. It appeared to speedupty_micro[many_string_assignments]a little, but it had negligible difference on the other benchmarks (and possibly regressed performance slightly on some benchmarks, including colour-science).