Use the new f-string tokens in string formatting #7586
Merged
dhruvmanila merged 1 commit into dhruv/pep-701 on Sep 22, 2023
Member
Author
Hmm, I've rebased the stack on the latest main so not sure why does
Member
Author
(Looking into the formatter ecosystem failures)
Member
Hm, I wonder if this is related to #7538?
Member
Author
Are you referring to the ecosystem checks or pre-commit (
Member
The
Member
Author
Oh, my bad. A few files got added when I was exploring
dd95717 to
5e35a55
Compare
486a48d to
bf47707
Compare
Member
Author
This was referenced Sep 22, 2023
CodSpeed Performance Report: Merging #7586 will not alter performance.
MichaReiser
approved these changes
Sep 22, 2023
bf47707 to
c9cf545
Compare
dhruvmanila
added a commit
that referenced
this pull request
Sep 22, 2023
## Summary
This PR updates the string formatter to account for the new f-string
tokens.
The formatter uses the full lexer to handle comments around implicitly
concatenated strings. It needs the lexer because the AST merges the
parts into a single node, so the boundaries between them aren't preserved.
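As a rough illustration of why the token stream is needed, the same effect can be seen with CPython's standard `ast` and `tokenize` modules. This is only an analogy and not Ruff's lexer or AST; the sample `source` string is made up for the example.

```python
import ast
import io
import tokenize

# Two implicitly concatenated string parts with a comment between them.
source = '("first"  # comment about the first part\n "second")\n'

# The AST merges both parts into a single constant node.
node = ast.parse(source).body[0].value
merged_value = node.value

# The token stream still exposes each part and its position separately.
string_tokens = [
    (tok.string, tok.start, tok.end)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.STRING
]

print(merged_value)
for text, start, end in string_tokens:
    print(text, start, end)
```

The merged node carries no information about where the first part ends, so a comment sitting between the parts can only be placed correctly by looking at the tokens.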
F-strings add some complexity now that they are no longer represented
as a single `String` token. A single f-string emits at least 3 tokens
(`FStringStart`, `FStringMiddle`, `FStringEnd`), and if it contains
expressions, it also emits the respective tokens for them. In our
case, we're currently only interested in the outermost f-string range,
for which I've introduced a new `FStringRangeBuilder` that builds
the outermost f-string range by considering the start and end tokens and
the nesting level.
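The idea of tracking the nesting level can be sketched in Python as follows. This is a minimal illustration, not the Rust implementation from this PR: the `(kind, start, end)` token tuples, the `feed` method, and the `outermost_fstring_ranges` helper are assumptions made for the example, and only the token kind names come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical token shape: (kind, start offset, end offset).
Token = Tuple[str, int, int]


@dataclass
class FStringRangeBuilder:
    """Collects the range of the outermost f-string from a token stream."""

    start: Optional[int] = None
    nesting: int = 0

    def feed(self, token: Token) -> Optional[Tuple[int, int]]:
        kind, tok_start, tok_end = token
        if kind == "FStringStart":
            if self.nesting == 0:
                # Only the outermost start token sets the range start.
                self.start = tok_start
            self.nesting += 1
        elif kind == "FStringEnd" and self.nesting > 0:
            self.nesting -= 1
            if self.nesting == 0:
                # The matching outermost end token closes the range.
                result = (self.start, tok_end)
                self.start = None
                return result
        return None


def outermost_fstring_ranges(tokens: Iterable[Token]) -> Iterator[Tuple[int, int]]:
    builder = FStringRangeBuilder()
    for token in tokens:
        finished = builder.feed(token)
        if finished is not None:
            yield finished


# Hand-written tokens for the source text: f"a {f'b'} c"
tokens = [
    ("FStringStart", 0, 2),
    ("FStringMiddle", 2, 4),
    ("Lbrace", 4, 5),
    ("FStringStart", 5, 7),
    ("FStringMiddle", 7, 8),
    ("FStringEnd", 8, 9),
    ("Rbrace", 9, 10),
    ("FStringMiddle", 10, 12),
    ("FStringEnd", 12, 13),
]
ranges = list(outermost_fstring_ranges(tokens))
print(ranges)
```

The inner f-string only raises and lowers the nesting counter, so a single range covering the whole outer f-string is reported.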
Note that this doesn't support nested f-strings in any way; that is out
of scope for this PR. This means that if there are nested f-strings,
especially ones using the same quote, the formatter will escape the
inner quotes:
```python
f"hello world {
x
+
f\"nested {y}\"
}"
```
## Test plan
```
cargo test --package ruff_python_formatter
```
dhruvmanila
added a commit
that referenced
this pull request
Sep 26, 2023
dhruvmanila
added a commit
that referenced
this pull request
Sep 27, 2023
dhruvmanila
added a commit
that referenced
this pull request
Sep 27, 2023
## Summary
This PR updates the `Q000` and `Q001` rules to consider the new f-string tokens. The docstring rule (`Q002`) doesn't need to be updated because f-strings cannot be used as docstrings.
I tried implementing nested f-string support, but there are still some edge cases in my current implementation, so I've decided to pause it for now and pick it up sometime soon. So, for now, this doesn't support nested f-strings.
### Implementation
The implementation uses the same `FStringRangeBuilder` introduced in #7586 to build up the outermost f-string range. It reuses that implementation because this is a temporary solution until we add support for nested f-strings.
## Test Plan
`cargo test`
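To make the role of the outermost range concrete, here is a simplified Python sketch of an inline-quote check applied to one such range. It is not Ruff's rule implementation: the `bad_inline_quote` function, its parameters, and the default preferred quote are assumptions for illustration, and nested f-strings are ignored just as in the PR.

```python
def bad_inline_quote(source: str, start: int, end: int, preferred: str = '"') -> bool:
    """Return True if the string literal at source[start:end] uses the
    non-preferred quote and could be switched without adding escapes."""
    text = source[start:end]
    # Drop the string prefix (f, rf, b, u, ...) to reach the opening quote.
    body = text.lstrip("fFrRbBuU")
    quote = body[0]
    if body.startswith(quote * 3):
        # Triple-quoted strings are left to a separate multiline check.
        return False
    if quote == preferred:
        return False
    # If the preferred quote appears inside, switching would need escapes.
    return preferred not in body[1:-1]


source = "x = f'hello {name}'"
print(bad_inline_quote(source, 4, len(source)))
```

Because the check looks at the whole outermost range, an f-string is treated like a plain string literal for the purpose of choosing quotes.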
dhruvmanila
added a commit
that referenced
this pull request
Sep 28, 2023
dhruvmanila
added a commit
that referenced
this pull request
Sep 28, 2023
dhruvmanila
added a commit
that referenced
this pull request
Sep 29, 2023
dhruvmanila
added a commit
that referenced
this pull request
Sep 29, 2023
dhruvmanila
added a commit
that referenced
this pull request
Sep 29, 2023
dhruvmanila
added a commit
that referenced
this pull request
Sep 29, 2023