Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace verbatim text with NOT_YET_IMPLEMENTED #4904

Merged
merged 1 commit into from
Jun 7, 2023

Conversation

MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Jun 6, 2023

Summary

This PR replaces the verbatim_text builder with a not_yet_implemented builder that emits NOT_YET_IMPLEMENTED_<NodeKind> for not yet implemented nodes.

The motivation for this change is that partially formatting compound statements can result in incorrectly indented code, which is a syntax error:

def func_no_args():
  a; b; c
  if True: raise RuntimeError
  if False: ...
  for i in range(10):
    print(i)
    continue

Get's reformatted to

def func_no_args():
    a; b; c
    if True: raise RuntimeError
    if False: ...
    for i in range(10):
    print(i)
    continue

because our formatter does not yet support for statements and just inserts the text from the source.

Downsides

Using an identifier will not work in all situations. For example, an identifier is invalid in an Arguments position. That's why I kept verbatim_text around and e.g. use it in the Arguments formatting logic where incorrect indentations are impossible (to my knowledge). Meaning, verbatim_text we can opt in to verbatim_text when we want to iterate quickly on nodes that we don't want to provide a full implementation yet and using an identifier would be invalid.

Upsides

Running this on main discovered stability issues with the newline handling that were previously "hidden" because of the verbatim formatting. I guess that's an upside :)

Test Plan

None?

@MichaReiser
Copy link
Member Author

MichaReiser commented Jun 6, 2023

@MichaReiser MichaReiser added internal An internal refactor or improvement formatter Related to the formatter labels Jun 6, 2023
x = 0O777 .real
x = 0.000000006 .hex()
x = -100.0000J
NOT_YET_IMPLEMENTED_StmtAssign
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An added benefit of this change is that it is easier to spot what formatting is implemented and what is incorrectly formatted because it is not yet supported.

@MichaReiser MichaReiser marked this pull request as draft June 6, 2023 18:16
@MichaReiser MichaReiser force-pushed the replace-verbatim-with-not-yet-implemented branch from a7c1782 to a55ccd8 Compare June 6, 2023 18:23
@MichaReiser MichaReiser changed the base branch from main to comments-newline-handling June 6, 2023 18:23
@MichaReiser MichaReiser marked this pull request as ready for review June 6, 2023 18:23


three_leading_newlines = 80
NOT_YET_IMPLEMENTED_StmtAssign
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this is stupid... but it is how it is.

@github-actions
Copy link
Contributor

github-actions bot commented Jun 6, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00      6.0±0.05ms     6.7 MB/sec    1.00      6.0±0.02ms     6.8 MB/sec
formatter/numpy/ctypeslib.py               1.00   1177.2±2.60µs    14.1 MB/sec    1.00   1173.2±1.40µs    14.2 MB/sec
formatter/numpy/globals.py                 1.00    132.5±0.44µs    22.3 MB/sec    1.00    132.7±1.14µs    22.2 MB/sec
formatter/pydantic/types.py                1.00      2.6±0.00ms     9.8 MB/sec    1.00      2.6±0.01ms     9.8 MB/sec
linter/all-rules/large/dataset.py          1.00     14.8±0.05ms     2.7 MB/sec    1.01     15.0±0.07ms     2.7 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      3.6±0.02ms     4.6 MB/sec    1.00      3.6±0.02ms     4.6 MB/sec
linter/all-rules/numpy/globals.py          1.00    365.1±2.05µs     8.1 MB/sec    1.00    365.5±1.67µs     8.1 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.2±0.01ms     4.1 MB/sec    1.01      6.2±0.02ms     4.1 MB/sec
linter/default-rules/large/dataset.py      1.00      7.3±0.01ms     5.6 MB/sec    1.00      7.3±0.01ms     5.5 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00   1529.5±4.95µs    10.9 MB/sec    1.00   1530.3±4.98µs    10.9 MB/sec
linter/default-rules/numpy/globals.py      1.00    165.2±0.32µs    17.9 MB/sec    1.00    165.0±0.62µs    17.9 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.3±0.00ms     7.8 MB/sec    1.01      3.3±0.03ms     7.7 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.01      6.8±0.07ms     6.0 MB/sec    1.00      6.7±0.07ms     6.1 MB/sec
formatter/numpy/ctypeslib.py               1.00  1305.2±20.25µs    12.8 MB/sec    1.00  1302.9±21.46µs    12.8 MB/sec
formatter/numpy/globals.py                 1.01    144.0±4.69µs    20.5 MB/sec    1.00    143.2±3.99µs    20.6 MB/sec
formatter/pydantic/types.py                1.00      2.9±0.05ms     8.7 MB/sec    1.00      2.9±0.04ms     8.7 MB/sec
linter/all-rules/large/dataset.py          1.02     17.0±0.18ms     2.4 MB/sec    1.00     16.6±0.18ms     2.4 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.01      4.2±0.05ms     4.0 MB/sec    1.00      4.1±0.04ms     4.0 MB/sec
linter/all-rules/numpy/globals.py          1.00    486.7±4.86µs     6.1 MB/sec    1.01   491.7±11.42µs     6.0 MB/sec
linter/all-rules/pydantic/types.py         1.01      7.0±0.11ms     3.6 MB/sec    1.00      7.0±0.09ms     3.6 MB/sec
linter/default-rules/large/dataset.py      1.00      8.3±0.07ms     4.9 MB/sec    1.00      8.3±0.14ms     4.9 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1753.1±20.63µs     9.5 MB/sec    1.00  1752.9±18.94µs     9.5 MB/sec
linter/default-rules/numpy/globals.py      1.01    198.1±3.59µs    14.9 MB/sec    1.00    195.8±3.19µs    15.1 MB/sec
linter/default-rules/pydantic/types.py     1.01      3.8±0.04ms     6.8 MB/sec    1.00      3.7±0.03ms     6.9 MB/sec

@MichaReiser
Copy link
Member Author

@MichaReiser started a stack merge that includes this pull request via Graphite.

Base automatically changed from comments-newline-handling to main June 7, 2023 12:49
@MichaReiser
Copy link
Member Author

Graphite rebased this pull request as part of a merge.

@MichaReiser MichaReiser force-pushed the replace-verbatim-with-not-yet-implemented branch from 59c3e35 to 48ce57f Compare June 7, 2023 12:49
@MichaReiser MichaReiser merged commit bcf745c into main Jun 7, 2023
@MichaReiser MichaReiser deleted the replace-verbatim-with-not-yet-implemented branch June 7, 2023 12:57
@MichaReiser
Copy link
Member Author

@MichaReiser merged this pull request with Graphite.

konstin pushed a commit that referenced this pull request Jun 13, 2023
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR replaces the `verbatim_text` builder with a `not_yet_implemented` builder that emits `NOT_YET_IMPLEMENTED_<NodeKind>` for not yet implemented nodes. 

The motivation for this change is that partially formatting compound statements can result in incorrectly indented code, which is a syntax error:

```python
def func_no_args():
  a; b; c
  if True: raise RuntimeError
  if False: ...
  for i in range(10):
    print(i)
    continue
```

Get's reformatted to

```python
def func_no_args():
    a; b; c
    if True: raise RuntimeError
    if False: ...
    for i in range(10):
    print(i)
    continue
```

because our formatter does not yet support `for` statements and just inserts the text from the source. 

## Downsides

Using an identifier will not work in all situations. For example, an identifier is invalid in an `Arguments ` position. That's why I kept `verbatim_text` around and e.g. use it in the `Arguments` formatting logic where incorrect indentations are impossible (to my knowledge). Meaning, `verbatim_text` we can opt in to `verbatim_text` when we want to iterate quickly on nodes that we don't want to provide a full implementation yet and using an identifier would be invalid. 

## Upsides

Running this on main discovered stability issues with the newline handling that were previously "hidden" because of the verbatim formatting. I guess that's an upside :)

## Test Plan

None?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatter Related to the formatter internal An internal refactor or improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants