[red-knot] Add support for string annotations #14151

Merged
merged 5 commits into from
Nov 15, 2024

Conversation

dhruvmanila
Member

@dhruvmanila dhruvmanila commented Nov 7, 2024

Summary

This PR adds support for parsing and inferring types within string annotations.

Implementation (attempt 1)

This is preserved in 6217f48.

The implementation here would separate the inference of string annotations in the deferred query. This requires the following:

  • Two ways of evaluating the deferred definitions - lazily and eagerly.
    • An eager evaluation occurs right outside the definition query which in this case would be in binding_ty and declaration_ty.
    • A lazy evaluation occurs on demand like using the definition_expression_ty to determine the function return type and class bases.
  • The above point means that when trying to get the binding type for a variable in an annotated assignment, the definition query won't include the type. So, it'll require going through the deferred query to get the type.

This has the following limitations:

  • Nested string annotations, although not necessarily a useful feature, are difficult to implement unless we turn the implementation into an infinite loop.
  • Partial string annotations require a complex layout because the types for the stringified and non-stringified parts of the annotation are inferred in separate queries. This means we need to maintain additional information.
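To make the two limitations concrete, here is a small sketch using Python's standard `ast` module. It is not red-knot code, and the helper name `string_parts` is made up for illustration. It lists the string literals found inside an annotation, which shows how a partially stringified annotation mixes string and non-string parts, and how a nested string annotation needs another round of parsing.

```python
import ast


def string_parts(annotation_source: str) -> list[str]:
    """Return the string literals found inside an annotation expression."""
    tree = ast.parse(annotation_source, mode="eval")
    return [
        node.value
        for node in ast.walk(tree)
        if isinstance(node, ast.Constant) and isinstance(node.value, str)
    ]


# Partially stringified: only "Foo" is a string, `tuple` and `int` are not.
partial = string_parts('tuple[int, "Foo"]')  # ['Foo']

# Nested: the annotation text is "'Foo'", so the first parse yields another
# string literal that has to be parsed again to reach the name Foo.
nested_source = '''"'Foo'"'''
first_pass = string_parts(nested_source)   # ["'Foo'"]
second_pass = string_parts(first_pass[0])  # ['Foo']
```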

Implementation (attempt 2)

This is the final diff in this PR.

The implementation here does the complete inference of a string annotation in the same definition query by maintaining certain state while inferring the different parts of an expression and making decisions accordingly. The changes are:

  • Allow names that are part of a string annotation to not exist in the symbol table. For example, in x: "Foo", if the "Foo" symbol is not defined then it won't exist in the symbol table even though it's being used. This is a violation of an invariant that is allowed only for symbols in a string annotation.
  • Similarly, name lookup is updated to do the same: if the symbol doesn't exist, it's treated as unbound.
  • Store the final type of a string annotation on the string expression itself and not on any of the sub-expressions that are created after parsing. This is because those sub-expressions won't exist in the semantic index.
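The three points above can be sketched with a toy model in Python. This is an illustration only, under loud assumptions: the function `infer`, the `symbols` table, the `types` map, and the `"Unbound"` / `"Unknown"` placeholders are all hypothetical and do not mirror red-knot's actual data structures. It only shows the shape of the idea: a flag tracks whether we are inside a string annotation, missing names are tolerated there, and the result is recorded on the string node alone.

```python
import ast


def infer(node: ast.expr, symbols: dict[str, str], types: dict[ast.expr, str],
          in_string: bool = False) -> str:
    """Toy inference: return a type name and record it in `types`."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        # Parse the string contents and infer them in "string annotation" mode.
        inner = ast.parse(node.value, mode="eval").body
        ty = infer(inner, symbols, types, in_string=True)
    elif isinstance(node, ast.Name):
        if node.id in symbols:
            ty = symbols[node.id]
        elif in_string:
            # Inside a string, a missing symbol is tolerated and is unbound.
            ty = "Unbound"
        else:
            raise KeyError(f"{node.id} is missing from the symbol table")
    elif isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
        left = infer(node.left, symbols, types, in_string)
        right = infer(node.right, symbols, types, in_string)
        ty = f"{left} | {right}"
    else:
        ty = "Unknown"  # everything else is out of scope for this sketch

    # Sub-expressions parsed out of a string are never recorded; only the
    # string expression itself carries the final type.
    if not in_string:
        types[node] = ty
    return ty


annotation = ast.parse('"int | Foo"', mode="eval").body
recorded: dict[ast.expr, str] = {}
result = infer(annotation, {"int": "int"}, recorded)
```

Here `recorded` ends up with a single entry keyed by the string node, which matches the observation later in the thread that a type is available for the whole annotation but not for its individual parts.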

Design document: https://www.notion.so/astral-sh/String-Annotations-12148797e1ca801197a9f146641e5b71?pvs=4

Closes: #13796

Test Plan

@dhruvmanila dhruvmanila added the red-knot Multi-file analysis & type inference label Nov 7, 2024
@dhruvmanila dhruvmanila force-pushed the dhruv/string-annotation branch from 666e12e to 5872fa4 Compare November 13, 2024 07:05
Contributor

github-actions bot commented Nov 13, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@dhruvmanila dhruvmanila changed the title from "WIP: Add support for string annotations" to "[red-knot] Add support for string annotations" Nov 13, 2024
@dhruvmanila dhruvmanila force-pushed the dhruv/string-annotation branch 2 times, most recently from f1fb77c to f3ee666 Compare November 13, 2024 18:01
Comment on lines 333 to 343
deferred_state: DeferredState,

Member Author

TODO: add some documentation

Comment on lines -670 to -678
DefinitionKind::AnnotatedAssignment(_annotated_assignment) => {
// TODO self.infer_annotated_assignment_deferred(annotated_assignment.node());
}
Member Author

TODO: add a comment on why annotated assignment are not deferred

Contributor

I'm curious to read this comment!

Could we add some tests that annotations on annotated assignments in stub files, and when `from __future__ import annotations` is active, are in fact deferred? (That is, can contain forward references.)

Member Author

Added the test cases in assignment/annotations.md
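For readers unfamiliar with the behavior being tested, the following sketch shows what a deferred annotation on an annotated assignment looks like at runtime in CPython. It is not one of the test cases from the PR; the module source and the names `Node` and `Tree` are made up. The source is run through `exec()` so that the `__future__` import sits at the top of its own compilation unit.

```python
# Source of a hypothetical module. With the __future__ import active, the
# annotations are not evaluated, so they may refer to names defined later.
MODULE_SOURCE = '''
from __future__ import annotations

class Node:
    next: Node | None = None  # refers to Node while it is still being defined
    owner: Tree               # Tree is only defined further down

class Tree:
    root: Node
'''

namespace: dict = {}
exec(MODULE_SOURCE, namespace)

# The annotations are stored as strings instead of evaluated objects.
node_annotations = namespace["Node"].__annotations__
```

A type checker has to resolve `Tree` in `owner: Tree` as a forward reference in the same way it would resolve the explicit string form `owner: "Tree"`.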

Comment on lines -1071 to +1106
-        if self.are_all_types_deferred() {
-            for base in class.bases() {
-                self.infer_expression(base);
-            }
+        for base in class.bases() {
+            self.infer_expression(base);
Member Author

The diff is to remove the self.are_all_types_deferred() check because we're already in a deferred query here.

@dhruvmanila dhruvmanila marked this pull request as ready for review November 13, 2024 18:11
@dhruvmanila
Member Author

Although there are a couple of TODOs that I want to address mostly around documentation, I'd appreciate some initial feedback for any obvious issues that I've missed.

Contributor

@carljm carljm left a comment

Didn't complete review yet, but I have to run, so submitting the few comments I have so far in case I don't get back to it tonight.

@@ -387,7 +391,7 @@ impl<'db> TypeInferenceBuilder<'db> {

/// Are we currently inferring deferred types?
fn is_deferred(&self) -> bool {
-    matches!(self.region, InferenceRegion::Deferred(_))
+    matches!(self.region, InferenceRegion::Deferred(_)) || self.deferred_state.is_deferred()
Contributor

Not sure if I'm missing an issue here, but maybe inferring an InferenceRegion::Deferred should set self.deferred_state and then this check can be simplified?

You have a note above to add docs for deferred_state; I think how these two things relate (and why InferenceRegion::Deferred overrides self.deferred_state) is a good candidate for some clear docs.

Contributor

Ok after reading the whole diff more carefully, I think I understand this now, but yeah it would be great to document it.

@@ -1,9 +1,191 @@
# String annotations

## Simple
Contributor

These tests are fabulous! Thorough, clear, and correct, the testing trifecta :)

Contributor

@carljm carljm left a comment

This is excellent, great work! It turned out way less complicated than I thought it would be. I love that with a few small changes we can sidestep the whole issue of AST IDs and just store the resulting type on the string node. I think the result of this is that we won't be able to have hover types for individual parts of a string annotation, just the whole annotation, but personally I think that's fine; it is in fact just one expression, after all.

@MichaReiser
Member

I think the result of this is that we won't be able to have hover types for individual parts of a string annotation, just the whole annotation, but personally I think that's fine; it is in fact just one expression, after all.

Do we know how we would extend this design to support resolving types of inner expressions? Do we know if pyright supports hovering for stringified type annotations? Not supporting expression-level-types might be problematic long term because it doesn't just affect hovering:

  • No hover support
  • No go-to-definition support, e.g. I can't jump to MyType in a: "str | MyType" which is somewhat annoying
  • Possibly: types in stringified type annotations won't show up in find-all-references
  • Ruff's lint rules can't run on stringified type annotations. That would be a large regression feature-wise. For example, I believe it would make it more complicated to implement https://docs.astral.sh/ruff/rules/never-union/ because we can only ask for the outer type and are then stuck: we don't know if the Never type comes from the annotation itself (because it uses a Union) or is the result of a type alias, type var, or something else.

@dhruvmanila
Member Author

Do we know if pyright supports hovering for stringified type annotations?

Yes, Pyright does support hover, goto definitions, references, etc. in string annotations.

Looking a bit deeper into Pyright, I think the reason they're able to do this is that they parse string annotations during the parsing stage and store them on the AST. But the thing that I missed, and that is not in the design document, is that the Pyright parser also takes typing.Literal and typing.Annotated into account, with the limitation that it doesn't resolve the import and only recognizes them when they are imported directly from typing or typing_extensions (https://github.com/microsoft/pyright/blob/294b7afd2eaf23b0586bcc8563571bdff0c0d0a6/packages/pyright-internal/src/parser/parser.ts#L3610-L3621).
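To illustrate why a parser that handles string annotations needs to know about Literal and Annotated at all, here is a rough Python sketch. It is not Pyright's code: the helper names are hypothetical, it matches purely on names (similar in spirit to the limitation described above), and it assumes Python 3.9+ where a subscript's slice is the expression itself. A string inside `Literal[...]` is a literal value and the extra arguments of `Annotated[...]` are metadata, so neither should be parsed as a forward reference.

```python
import ast


def special_form_name(node: ast.expr):
    """Return 'Literal' or 'Annotated' if `node` names one of them, else None."""
    if isinstance(node, ast.Name):
        name = node.id
    elif (isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name)
          and node.value.id in {"typing", "typing_extensions"}):
        name = node.attr
    else:
        return None
    return name if name in {"Literal", "Annotated"} else None


def forward_reference_strings(annotation_source: str) -> list[str]:
    """Collect the strings in an annotation that should be parsed as types."""
    found: list[str] = []

    def visit(node: ast.AST) -> None:
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            found.append(node.value)
            return
        if isinstance(node, ast.Subscript):
            form = special_form_name(node.value)
            if form == "Literal":
                return  # strings here are literal values, not annotations
            if form == "Annotated":
                # Only the first argument is a type; the rest is metadata.
                args = node.slice
                first = args.elts[0] if isinstance(args, ast.Tuple) and args.elts else args
                visit(first)
                return
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(annotation_source, mode="eval").body)
    return found
```

For example, `forward_reference_strings('tuple[int, "Foo"]')` yields `['Foo']`, while `forward_reference_strings('Literal["Foo"]')` yields an empty list.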

Do we know how we would extend this design to support resolving types of inner expressions?

Micha and I talked about this in our 1:1 today, and I think it's fine to move forward with this implementation now and revisit it at a later stage. We might have to spend some additional time figuring out the LSP part, though. It might also be useful to understand the user impact of this feature if and when it needs to be implemented, to validate the time investment. Some of the ideas for the implementation would be (a) updating the parser / AST to accommodate the change, or (b) a semantic index that's specific to string annotations, plus an additional layer that connects the file's semantic index with these specific ones.

@carljm
Contributor

carljm commented Nov 14, 2024

Yes, I think (b) describes how I'd envisioned we could tackle this, if/when we need to. I think it might also be possible to do something even simpler that doesn't do a full semantic index of the AST from the stringified annotation, but just adds a new mechanism for attaching a type directly to a text range instead of a node?
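As a rough picture of the "attach something to a text range instead of a node" idea, here is a Python sketch. It is only an assumption about what such a mechanism could look like, not a design from this PR: the function name is invented, it records names instead of inferred types, and it assumes a single-line, ASCII, plain double-quoted string with no prefix or escape sequences, so that offsets inside the string map directly onto offsets in the file.

```python
import ast


def name_ranges_in_string_annotation(source: str) -> dict[tuple[int, int], str]:
    """Map (start, end) columns in `source` to the names inside the string
    annotation of a single annotated assignment such as `a: "str | MyType"`."""
    statement = ast.parse(source).body[0]
    assert isinstance(statement, ast.AnnAssign)
    string_node = statement.annotation
    assert isinstance(string_node, ast.Constant) and isinstance(string_node.value, str)

    # +1 skips the opening quote, so inner offsets line up with the file.
    base = string_node.col_offset + 1
    inner = ast.parse(string_node.value, mode="eval")
    return {
        (base + node.col_offset, base + node.end_col_offset): node.id
        for node in ast.walk(inner)
        if isinstance(node, ast.Name)
    }


source = 'a: "str | MyType"'
ranges = name_ranges_in_string_annotation(source)
```

A hover or go-to-definition request at a cursor position could then look up the enclosing range, without the sub-expressions of the string ever having node IDs in the file's semantic index.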

This separates the inference of string annotations into the deferred
region. However, this complicates annotations that are only partially
stringified, e.g., `tuple[int, "Foo"]` where "Foo" is a forward
reference.

This commit exists as a checkpoint in case some of the ideas explored
here are useful.

This is the second attempt at string annotations, which infers the
string annotation types in the same definition query. This has the added
advantage of avoiding going through two Salsa queries. It does this by
maintaining state on the builder and using it to make certain decisions
throughout the inference process.
@dhruvmanila dhruvmanila force-pushed the dhruv/string-annotation branch from 39d0884 to 2d26568 Compare November 15, 2024 03:32
@dhruvmanila dhruvmanila force-pushed the dhruv/string-annotation branch from 2d26568 to d07579c Compare November 15, 2024 04:05
@dhruvmanila dhruvmanila enabled auto-merge (squash) November 15, 2024 04:07
@dhruvmanila dhruvmanila merged commit 9ec690b into main Nov 15, 2024
18 checks passed
@dhruvmanila dhruvmanila deleted the dhruv/string-annotation branch November 15, 2024 04:10
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[red-knot] support stringified annotations
3 participants