[red-knot] Add support for string annotations #14151

dhruvmanila · 2024-11-07T12:06:02Z

Summary

This PR adds support for parsing and inferring types within string annotations.

Implementation (attempt 1)

This is preserved in 6217f48.

The implementation here would separate out the inference of string annotations into the deferred query. This requires the following:

- Two ways of evaluating the deferred definitions: lazily and eagerly.
  - An eager evaluation occurs right outside the definition query, which in this case would be in binding_ty and declaration_ty.
  - A lazy evaluation occurs on demand, for example when definition_expression_ty is used to determine the function return type and class bases.
- The above point means that when trying to get the binding type for a variable in an annotated assignment, the definition query won't include the type, so getting the type requires going through the deferred query.

This has the following limitations:

- Nested string annotations, although not necessarily a useful feature, are difficult to implement unless we convert the implementation into an infinite loop.
- Partial string annotations require a complex layout because the types for the stringified and non-stringified parts of the annotation are inferred in separate queries. This means we need to maintain additional information.
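
To make the two limitations concrete, here is a minimal sketch using Python's standard `ast` module. It is an illustration only, not code from this PR, and `string_parts` is a hypothetical helper. It shows that a partially stringified annotation mixes string and non-string parts in one expression, and that a nested string annotation needs a second round of parsing.

```python
import ast


def string_parts(annotation: str) -> list[str]:
    """Return the values of all string literals found in an annotation expression."""
    tree = ast.parse(annotation, mode="eval")
    return [
        node.value
        for node in ast.walk(tree)
        if isinstance(node, ast.Constant) and isinstance(node.value, str)
    ]


# Partial: only one element of the tuple is stringified.
partial = string_parts('tuple[int, "Foo"]')

# Nested: the annotation is a string whose content is itself a string,
# so it has to be parsed twice before a name appears.
first_round = string_parts('''"'Foo'"''')
second_round = string_parts(first_round[0])
```

Here `partial` contains only `Foo`, while the nested case yields the quoted text `'Foo'` after the first round and `Foo` after the second.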

Implementation (attempt 2)

This is the final diff in this PR.

The implementation here does the complete inference of a string annotation in the same definition query. It maintains certain state while inferring different parts of an expression and makes decisions accordingly. These are:

- Allow names that are part of a string annotation to not exist in the symbol table. For example, in x: "Foo", if the "Foo" symbol is not defined, then it won't exist in the symbol table even though it's being used. This exception to the invariant is allowed only for symbols in a string annotation.
- Similarly, name lookup is updated to do the same: if the symbol doesn't exist, then it's unbound.
- Store the final type of a string annotation on the string expression itself and not on any of the sub-expressions that are created after parsing. This is because those sub-expressions won't exist in the semantic index.
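
The three decisions above can be sketched as a toy model in Python. This is not red-knot's implementation (which is Rust). `ToyInference`, its symbol table of type names, and the `"Unbound"` / `"Unknown"` strings are all assumptions made for illustration.

```python
import ast


class ToyInference:
    """Toy model of annotation inference; every name here is hypothetical."""

    def __init__(self, symbols: dict[str, str]) -> None:
        self.symbols = symbols  # stand-in symbol table: name -> type name
        self.expression_types: dict[ast.AST, str] = {}  # types for nodes of the file's AST
        self.in_string_annotation = False  # state maintained during inference

    def lookup(self, name: str) -> str:
        if name in self.symbols:
            return self.symbols[name]
        if self.in_string_annotation:
            # Only names inside a string annotation may be missing.
            return "Unbound"
        raise KeyError(f"{name!r} should be in the symbol table")

    def infer(self, node: ast.expr) -> str:
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            ty = self.infer_string(node)
        elif isinstance(node, ast.Name):
            ty = self.lookup(node.id)
        elif isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
            ty = f"{self.infer(node.left)} | {self.infer(node.right)}"
        else:
            ty = "Unknown"
        # Sub-expressions parsed out of a string are not part of the file's
        # AST, so no type is recorded for them.
        if not self.in_string_annotation:
            self.expression_types[node] = ty
        return ty

    def infer_string(self, node: ast.Constant) -> str:
        previous = self.in_string_annotation
        self.in_string_annotation = True
        try:
            inner = ast.parse(node.value, mode="eval").body
            return self.infer(inner)
        finally:
            self.in_string_annotation = previous


annotation = ast.parse('x: "int | Foo"').body[0].annotation
toy = ToyInference({"int": "int"})
result = toy.infer(annotation)
```

After this runs, `toy.expression_types` holds a single entry, keyed by the string node itself, and the undefined `Foo` is reported as unbound instead of raising.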

Design document: https://www.notion.so/astral-sh/String-Annotations-12148797e1ca801197a9f146641e5b71?pvs=4

Closes: #13796

Test Plan

- Add various test cases in our markdown framework.
- Run red_knot on LibCST (contains a lot of string annotations, specifically https://github.com/Instagram/LibCST/blob/main/libcst/matchers/_matcher_base.py) and FastAPI (a good amount of annotated code, including typing.Literal), and compare against the main branch output.

github-actions · 2024-11-13T07:30:14Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

dhruvmanila · 2024-11-13T18:08:24Z

crates/red_knot_python_semantic/src/types/infer.rs

+    deferred_state: DeferredState,
+


TODO: add some documentation

dhruvmanila · 2024-11-13T18:08:51Z

crates/red_knot_python_semantic/src/types/infer.rs

-            DefinitionKind::AnnotatedAssignment(_annotated_assignment) => {
-                // TODO self.infer_annotated_assignment_deferred(annotated_assignment.node());
-            }


TODO: add a comment on why annotated assignments are not deferred

carljm · 2024-11-14T05:17:44Z

I'm curious to read this comment!

Could we add some tests showing that annotations on annotated assignments are in fact deferred (that is, can contain forward references) in stub files and when from __future__ import annotations is active?

dhruvmanila

Added the test cases in assignment/annotations.md
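
For context, the runtime behavior behind the `from __future__ import annotations` case can be sketched as follows. This is not one of the test cases added in assignment/annotations.md. The class names and the use of `compile` with the future flag (equivalent to placing the import at the top of the compiled source) are choices made for illustration, and stub files have no runtime counterpart to show.

```python
import __future__
import typing

SOURCE = '''
class Node:
    parent: Node   # refers to the class that is still being defined
    child: Later   # refers to a class defined further down

class Later: ...
'''

# Compile as if the source began with `from __future__ import annotations`.
code = compile(
    SOURCE,
    "<deferred>",
    "exec",
    flags=__future__.annotations.compiler_flag,
    dont_inherit=True,
)
namespace = {"__name__": "demo"}
exec(code, namespace)

# The annotations were never evaluated; they are stored as strings.
raw = namespace["Node"].__annotations__

# Resolving them later works because both classes exist by then.
resolved = typing.get_type_hints(namespace["Node"], globalns=namespace)
```

Because evaluation is deferred, both forward references are accepted when the class body runs and only resolve to `Node` and `Later` on demand.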

dhruvmanila · 2024-11-13T18:10:33Z

crates/red_knot_python_semantic/src/types/infer.rs

-        if self.are_all_types_deferred() {
-            for base in class.bases() {
-                self.infer_expression(base);
-            }
+        for base in class.bases() {
+            self.infer_expression(base);


This diff removes the self.are_all_types_deferred() check because we're already in a deferred query here.

dhruvmanila · 2024-11-13T18:11:46Z

Although there are a couple of TODOs that I want to address, mostly around documentation, I'd appreciate some initial feedback on any obvious issues that I've missed.

carljm

Didn't complete review yet, but I have to run, so submitting the few comments I have so far in case I don't get back to it tonight.

carljm · 2024-11-13T23:46:00Z

crates/red_knot_python_semantic/src/types/infer.rs

@@ -387,7 +391,7 @@ impl<'db> TypeInferenceBuilder<'db> {

    /// Are we currently inferring deferred types?
    fn is_deferred(&self) -> bool {
-        matches!(self.region, InferenceRegion::Deferred(_))
+        matches!(self.region, InferenceRegion::Deferred(_)) || self.deferred_state.is_deferred()


Not sure if I'm missing an issue here, but maybe inferring an InferenceRegion::Deferred should set self.deferred_state and then this check can be simplified?

You have a note above to add docs for deferred_state; I think how these two things relate (and why InferenceRegion::Deferred overrides self.deferred_state) is a good candidate for some clear docs.

carljm · 2024-11-14T05:40:56Z

OK, after reading the whole diff more carefully, I think I understand this now, but yeah, it would be great to document it.

carljm · 2024-11-13T23:46:30Z

crates/red_knot_python_semantic/resources/mdtest/annotations/string.md

@@ -1,9 +1,191 @@
 # String annotations

+## Simple


These tests are fabulous! Thorough, clear, and correct, the testing trifecta :)

carljm

This is excellent, great work! It turned out way less complicated than I thought it would be. I love that with a few small changes we can sidestep the whole issue of AST IDs and just store the resulting type on the string node. I think the result of this is that we won't be able to have hover types for individual parts of a string annotation, just the whole annotation, but personally I think that's fine; it is in fact just one expression, after all.

MichaReiser · 2024-11-14T06:44:15Z

> I think the result of this is that we won't be able to have hover types for individual parts of a string annotation, just the whole annotation, but personally I think that's fine; it is in fact just one expression, after all.

Do we know how we would extend this design to support resolving types of inner expressions? Do we know if pyright supports hovering for stringified type annotations? Not supporting expression-level types might be problematic long term because it doesn't just affect hovering:

- No hover support.
- No go-to-definition support, e.g. I can't jump to MyType in a: "str | MyType", which is somewhat annoying.
- Possibly: types in stringified type annotations won't show up in find-all-references.
- Ruff's lint rules can't run on stringified type annotations. That would be a large regression feature-wise. For example, I believe it would make it more complicated to implement https://docs.astral.sh/ruff/rules/never-union/ because we can only ask for the outer type but then are stuck: we don't know if the Never type comes from the annotation itself (because it uses a Union) or is the result of a type alias, type var, or something else.

dhruvmanila · 2024-11-14T11:33:48Z

> Do we know if pyright supports hovering for stringified type annotations?

Yes, Pyright does support hover, go-to-definition, find-references, etc. in string annotations.

Looking a bit deeper into Pyright, I think the reason they're able to do this is that they parse string annotations during the parsing stage and store them on the AST. But the thing that I missed, and that is not in the design document, is that the Pyright parser also takes typing.Literal and typing.Annotated into account, with the limitation that it doesn't resolve the import and only considers them when they are imported directly from typing or typing_extensions (https://github.com/microsoft/pyright/blob/294b7afd2eaf23b0586bcc8563571bdff0c0d0a6/packages/pyright-internal/src/parser/parser.ts#L3610-L3621).
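
To illustrate why a parser that handles string annotations eagerly has to know about Literal and Annotated, here is a rough Python sketch using the standard `ast` module. It is not Pyright's logic: `forward_reference_strings` is a hypothetical helper, it assumes Python 3.9+ AST shapes, and it matches the special forms purely by name, which is cruder than the import check described above.

```python
import ast
from typing import Optional


def _name(node: ast.AST) -> Optional[str]:
    """Last component of a (possibly dotted) name, e.g. `typing.Literal` -> `Literal`."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return node.attr
    return None


def forward_reference_strings(annotation: str) -> list[str]:
    """Collect the string literals that should be parsed as annotations."""
    found: list[str] = []

    def visit(node: ast.AST) -> None:
        if isinstance(node, ast.Constant):
            if isinstance(node.value, str):
                found.append(node.value)
            return
        if isinstance(node, ast.Subscript):
            special = _name(node.value)
            if special == "Literal":
                return  # strings inside Literal[...] are values, not types
            if special == "Annotated":
                # Only the first argument of Annotated[...] is a type.
                args = node.slice.elts if isinstance(node.slice, ast.Tuple) else [node.slice]
                visit(args[0])
                return
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(annotation, mode="eval"))
    return found


in_list = forward_reference_strings('list["Foo"]')
in_literal = forward_reference_strings('Literal["Foo"]')
in_annotated = forward_reference_strings('Annotated["Foo", "metadata"]')
```

The string in `list["Foo"]` is a forward reference, the one in `Literal["Foo"]` is not, and in `Annotated["Foo", "metadata"]` only the first argument counts.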

> Do we know how we would extend this design to support resolving types of inner expressions?

Micha and I talked about this in our 1:1 today, and I think it's fine to move forward with this implementation for now and revisit it at a later stage. We might have to spend some additional time figuring out the LSP part, though. If and when this needs to be implemented, it might also be useful to understand the user impact of the feature to validate the time investment. Some ideas for the implementation would be (a) updating the parser / AST to accommodate the change, or (b) a semantic index that's specific to string annotations, plus an additional layer that connects the file's semantic index with these specific ones.

carljm · 2024-11-14T15:19:19Z

Yes, I think (b) describes how I'd envisioned we could tackle this, if/when we need to. I think it might also be possible to do something even simpler that doesn't do a full semantic index of the AST from the stringified annotation, but just adds a new mechanism for attaching a type directly to a text range instead of a node?
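
As a rough illustration of the text-range idea, the following Python sketch maps each name inside a string annotation back to its column range in the original source line. It is only a sketch under strong assumptions: `name_ranges` is a hypothetical helper, and the offset arithmetic only holds for a single-line ASCII string with a one-character quote, no prefix, and no escape sequences.

```python
import ast


def name_ranges(source: str) -> list[tuple[str, int, int]]:
    """Return (name, start, end) columns for names inside a simple string annotation."""
    string_node = ast.parse(source).body[0].annotation
    inner = ast.parse(string_node.value, mode="eval")
    # +1 skips the opening quote of the string literal.
    base = string_node.col_offset + 1
    return [
        (node.id, base + node.col_offset, base + node.end_col_offset)
        for node in ast.walk(inner)
        if isinstance(node, ast.Name)
    ]


source = 'a: "str | MyType"'
ranges = name_ranges(source)
spans = {name: source[start:end] for name, start, end in ranges}
```

A mechanism like this would let a type be attached to the range covering `MyType` without the inner nodes being part of the file's AST.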

This separates the inference of string annotations into the deferred region. However, this complicates annotations that are only partially stringified, e.g. `tuple[int, "Foo"]` where "Foo" is a forward reference. This commit exists as a checkpoint in case some of the ideas explored here are useful.

This is the second attempt at string annotations, which infers the string annotation types in the same definition query. This has the added advantage of avoiding going through two salsa queries. It does this by maintaining state on the builder and using it to make certain decisions throughout the inference process.

dhruvmanila added the red-knot Multi-file analysis & type inference label Nov 7, 2024

dhruvmanila force-pushed the dhruv/string-annotation branch from 666e12e to 5872fa4 Compare November 13, 2024 07:05

dhruvmanila changed the title ~~WIP: Add support for string annotations~~ [red-knot] Add support for string annotations Nov 13, 2024

dhruvmanila force-pushed the dhruv/string-annotation branch 2 times, most recently from f1fb77c to f3ee666 Compare November 13, 2024 18:01

dhruvmanila marked this pull request as ready for review November 13, 2024 18:11

dhruvmanila requested review from carljm, MichaReiser, AlexWaygood and sharkdp as code owners November 13, 2024 18:11

MichaReiser reviewed Nov 13, 2024

MichaReiser reviewed Nov 13, 2024

carljm reviewed Nov 13, 2024

dhruvmanila mentioned this pull request Nov 14, 2024

[flake8-type-checking] Skip quoting annotation if it becomes invalid syntax (TCH001) #14285

Merged

carljm approved these changes Nov 14, 2024

dhruvmanila added 4 commits November 15, 2024 08:41

Add tests for string annotations

1be577c

Address review feedback

16f9c09

dhruvmanila force-pushed the dhruv/string-annotation branch from 39d0884 to 2d26568 Compare November 15, 2024 03:32

Update from rebase

d07579c

dhruvmanila force-pushed the dhruv/string-annotation branch from 2d26568 to d07579c Compare November 15, 2024 04:05

dhruvmanila enabled auto-merge (squash) November 15, 2024 04:07

dhruvmanila merged commit 9ec690b into main Nov 15, 2024
18 checks passed

dhruvmanila deleted the dhruv/string-annotation branch November 15, 2024 04:10
