Pragma comment handling #6670

MichaReiser · 2023-08-18T07:27:08Z

MichaReiser
Aug 18, 2023
Maintainer

Formatting Pragma Comments

Various static analysis tools use pragma comments to annotate code. One such pragma comment is Ruff's noqa comment. Formatting pragma comments faces the same challenge as regular comments: no grammar rule specifies what a comment applies to, so a formatter can only rely on heuristics. While good heuristics are important for regular comments to preserve their context, pragma comments also carry semantic meaning that may change when a comment is moved. The main challenges when it comes to placing pragma comments are:

What they apply to depends on the comment's semantics
```
(a + call() # noqa: XXX
 	+ c)
```
Whether noqa: XXX suppresses a violation in a or in call() depends on the details of the rule XXX. The only safe (and reasonably fast) assumption the formatter can make is that the pragma comment suppresses all code on the same line. Type hint comments (type: T) and coverage exclusion comments (pragma: no cover) have different semantics: they annotate the preceding statement or clause.
There's the tension between: 1) precisely preserving the pragma comment's semantics, and 2) formatting the code and respecting the preferred line width. Perfectly solving for 1) means that the formatter needs to preserve the line breaks of affected lines, meaning they unnecessarily split over multiple lines or exceed the configured line width. Perfectly solving for 2) means that pragma comments may move and change (or lose) their semantics.

Thought: I believe there are a few reasons why misplaced pragma comments annoy people more than regular comments:

I need them in situations where other tooling falls short. I may already have spent significant time figuring out how to fix the type error, but it is taking too much time, and I run out of patience. That's why I fall back to the last tool at my disposal and slap a suppression comment at the end of the line. But now the line is too long, and the formatter reformats the whole line, moving my carefully placed suppression comment. That's it, I'm running out of patience. I desperately move the suppression comment. Dear formatter, pretty please, don't mess with my suppression comment again. I can't handle it anymore. I just want to move on with my life. The worst outcome is if the formatter prevents me from placing a pragma comment in a specific position, but the other tool doesn't allow it in any other. I then have to refactor my code for the sake of the tooling that's supposed to make me more productive.

Almost all pragma comments in the Python ecosystem are trailing line comments, whereas I prefer using (statement level) own line comments almost exclusively for documentation. Own line comments are straightforward to place for formatters. Unfortunately, we can't fix this (yet).

I often don't feel as opinionated about the placement of documentation comments as long as they preserve context. I also have the option to make it an own line comment or decide not to care if I disagree with the formatter's decision. I don't have this luxury with pragma comments. They need to be in a specific position, or some other tool won't stop annoying me.

Principles

Suppression Range

Prefer reducing the range of pragma comments rather than extending it, because tools report missing suppressions. Some tools (including Ruff) even support adding missing suppressions and removing unnecessary ones. Extended suppression ranges go undetected and can suppress real issues in the future that may lead to production incidents (the very thing our users try to avoid by using our static analysis tooling).

# Input
def test(
    a, # noqa: XXX
    b
)

# Ruff
def test(
    a, # noqa: XXX
    b,
)

# Black
def test(a, b) # noqa: XXX

The way to achieve this is by eagerly splitting if a node has a trailing (suppression) comment. This requires more vertical space than collapsing the logical line when it fits into the line width. It also prevents multiple suppression comments from collapsing onto one line, which not all tools support. In the collapsed version below, the type: ignore comment no longer applies because a # noqa comment precedes it.

a = (
    b # noqa 
    + c # type: ignore
) 

# Avoid
a = b + c # noqa type: ignore

Black handles type: comments to prevent this specific behavior. However, it may be a problem for other pragma comments.

Another way to phrase this principle is that we want to respect the user's comment placements. The user deliberately added the noqa comment after b. We should respect that decision and preserve it.
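To make the "suppression range" idea concrete, here is a minimal sketch that uses the standard library's tokenize module to list the identifiers that share a physical line with a noqa comment, and then compares the split layout with the collapsed one. The helper name names_under_noqa is made up for this illustration, and the "same physical line" rule is the simplifying assumption from the introduction, not the exact semantics of any particular rule.

```python
import io
import keyword
import tokenize


def names_under_noqa(source: str) -> set[str]:
    """Return the identifiers that share a physical line with a `# noqa` comment."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Physical lines that end with a noqa comment.
    noqa_rows = {
        tok.start[0]
        for tok in tokens
        if tok.type == tokenize.COMMENT and "noqa" in tok.string
    }
    return {
        tok.string
        for tok in tokens
        if tok.type == tokenize.NAME
        and tok.start[0] in noqa_rows
        and not keyword.iskeyword(tok.string)
    }


split = "def test(\n    a,  # noqa: XXX\n    b,\n):\n    pass\n"
collapsed = "def test(a, b):  # noqa: XXX\n    pass\n"

# Identifiers that are only suppressed in the collapsed layout.
widened = names_under_noqa(collapsed) - names_under_noqa(split)
print(sorted(widened))
```

In the split layout only a is covered; collapsing the header additionally covers test and b, which is exactly the silent widening this principle tries to avoid.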

Formatting

Favor manual comment placement over disabling or limiting formatting.

People use formatters to avoid discussing formatting decisions. Disabling or reducing formatting resurfaces the very problem that people try to avoid by using a formatter. This doesn't mean that we don't care about pragma comments or that we rule out pragma-comment-specific formatting. It only means that we don't want to use solutions that disable or limit formatting in a significant way.

Proposal

Eagerly split when a line contains end-of-line comments to ensure (pragma) comments remain close to the node they attribute. This prevents increasing suppression ranges and respects the user's comment placement decisions. Ruff already does this today for all comments.
Allow pragma comments to exceed the line width. This is similar to Pyink and is a deviation from Black. It can result in differences from Black-formatted code. The motivation is that adding a pragma comment shouldn't result in formatting changes because it is frustrating (see the thought section in the intro or this Black issue from Guido). IMO, this doesn't conflict with improving readability by limiting the line width because pragma comments are mainly intended for tooling rather than programmers. Note: Collapsed comments of the form # documentation # noqa: XXX fully count towards the line width, but # noqa: XXX # documentation will not, because counting only the documentation part would require the formatter to understand the comment's semantics.
Don't implement advanced heuristics to avoid moving suppression comments after the closing parentheses as part of our first release (see example 3). Special casing pragma comments can confuse users (why did my comment move up?), result in comment reordering, and add significant complexity to the implementation. We may want to explore more advanced heuristics in the future.
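The width rule from the second point can be sketched as follows. This is only an illustration under stated assumptions: the prefix list PRAGMA_PREFIXES and the helper names is_pragma and measured_widths are hypothetical (the proposal does not enumerate which comments count as pragmas), and width is measured as a plain character count.

```python
import io
import tokenize

# Hypothetical list of prefixes; the proposal does not define the exact set.
PRAGMA_PREFIXES = ("noqa", "type:", "pyright:", "pylint:", "pragma:")


def is_pragma(comment: str) -> bool:
    """A comment counts as a pragma only if it *starts* with a known prefix."""
    return comment.lstrip("#").strip().startswith(PRAGMA_PREFIXES)


def measured_widths(source: str) -> dict[int, int]:
    """Map each line number to the width the formatter would count for it."""
    lines = source.splitlines()
    widths = {number: len(text) for number, text in enumerate(lines, start=1)}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and is_pragma(tok.string):
            row, col = tok.start
            # Ignore the trailing pragma comment and the spaces before it.
            widths[row] = len(lines[row - 1][:col].rstrip())
    return widths


source = (
    "x = call()  # noqa: XXX\n"
    "y = call()  # explanation # noqa: XXX\n"
    "z = call()  # noqa: XXX # explanation\n"
)
widths = measured_widths(source)
print(widths)
```

Lines 1 and 3 are measured without their comment, while line 2 counts in full because its comment starts with documentation rather than a pragma, which matches the note above.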

Alternatives

Disable line breaks for type: comments similar to Black [PR]. I understand its motivation but believe that limiting formatting for entire lines removes the advantage of using a formatter. Not counting pragma comments towards the line width balances "reducing the frustration when placing pragma comments" with "the benefits of using a formatter" better IMO. I admit that this approach doesn't solve the issue when you format a new project or extend the width of an existing line. But moving the comment manually is a reasonable ask in these situations.

Examples

Ruff's default behavior is to keep end-of-line comments on the same line as the node they annotate. In-between operator comments have different formatting, and Ruff can move them to another line.

Case 1: Suppress single argument/expression

Ruff's philosophy on trailing same-line comments is to keep them close to the node they annotate. This prevents increasing the scope of pragma comments.

# Input
def test(
    a,
    b, # noqa: RUFXXX
    c,
    d
): pass

# Ruff
def test(
    a,
    b,  # noqa: RUFXXX
    c,
    d,
):
    pass

# Black (Same formatting for `type: ignore` comment)
def test(a, b, c, d):  # noqa: RUFXXX
    pass

Black's formatting has the disadvantage that it extends the suppression from b to all arguments. This can be worked around by manually adding a magic trailing comma and moving the comment back. But this requires more manual work and may go unnoticed when formatting the whole project.

Black issue

Case 2: Breaking overlong lines

# Input
def test(aaaaaaaaaaaa, bbbbbbbbbbbbbbbbbbb, ccccccccccccccccccc, ddddddddddddddddddddddddd): # noqa: RUFXXX
    pass

# Ruff
def test(
    aaaaaaaaaaaa, bbbbbbbbbbbbbbbbbbb, ccccccccccccccccccc, ddddddddddddddddddddddddd
):  # noqa: RUFXXX
    pass

# Black
def test(
    aaaaaaaaaaaa, bbbbbbbbbbbbbbbbbbb, ccccccccccccccccccc, ddddddddddddddddddddddddd
):  # noqa: RUFXXX
    pass

def test(aaaaaaaaaaaa, bbbbbbbbbbbbbbbbbbb, ccccccccccccccccccc, ddddddddddddddddddddddddd): # type: ignore
    pass

Both formatters move the noqa suppression to the end. The result is that the noqa comment no longer suppresses any violations because it only applies to ):.

Black special-cases type: ignore at the end of physical lines to prevent these lines from splitting. However, this doesn't seem to work for case 3 or the following example.

def test(
    a, b, # type: ignore
    c): pass

# gets formatted to 
def test(a, b, c):  # type: ignore
    pass

I believe it is because Black only tests for the pragma comment at the end of logical lines, the end after having split the line. But that means you must know Black's internal formatting to understand in which positions Black preserves type: ignore comments.

Implementation Note: I think it would be possible to implement similar handling in specific locations by removing all soft_line_breaks and replacing soft_line_break_or_space with a regular space. It could be more complicated for binary-like expressions (and other expression-like nodes) where the suppression should apply to all nodes on the same line, making it less predictable for users.

Moving the noqa comment past the ): is admittedly the worst possible placement because the comment never suppresses anything there (this may not be true for other pragma comments). It's difficult to place the comment better for function headers, but the comment could be moved to a more meaningful position, as proposed in this Black issue comment:

check_call(
    program,
    lambda: lib._some_private_module(param1, param2, to_make_it_long), # pylint: disable=protected-access
    arg2)

# formatted
check_call(
    program,
    lambda: lib._some_private_module(  # pylint: disable=protected-access
        param1, param2, to_make_it_long
    ),  
    arg2,
)

This is only a heuristic, and we may need to exclude type: ignore comments because they must come at the statement's end.

Implementation Note: We could implement this for all parenthesized expressions by formatting any trailing pragma comment twice: Once as trailing ] comment if the group fits and once as trailing [ comment if the group breaks.

Case 3: Overlong with multiple items

# Input 
value = (
    checked()
    + ignored() + ignored()  # noqa: RUFXXX
    + checked()
    + checked()
    + checked()
)

# Ruff
value = (
    checked()
    + ignored()
    + ignored()  # noqa: RUFXXX
    + checked()
    + checked()
    + checked()
)


# Black (same for type: ignore)
value = (
    checked()
    + ignored()
    + ignored()  # noqa: RUFXXX
    + checked()
    + checked()
    + checked()
)

Both formatters break the binary expression over multiple lines and keep the noqa attached to the second ignored() call, which excludes the first ignored() call from the suppression.

Case 4: Suppression after open parentheses

# Input
blah_blah_blah = [ # pyright: ignore
1, 2, 3, 4, 5, ...
]

# Ruff
blah_blah_blah = [  # pyright: ignore
    1,
    2,
    3,
    4,
    5,
    ...
]

# Black
blah_blah_blah = [  # pyright: ignore
    1,
    2,
    3,
    4,
    5,
    ...
]

Both formatters preserve the comment position.

Case 5: Ruff Code fix

Edge case of using --add-noqa with the formatter:

# Before Fix (function fits)
def test(aaaaaaaaaaa, bbbbbbbbbbbbb, ccccccccccccccc, dddddddddddddddddd): 
    pass

# Ruff adds suppression comment
def test(aaaaaaaaaaa, bbbbbbbbbbbbb, ccccccccccccccc, dddddddddddddddddd): # noqa: RUFXXX 
    pass

# Formatter
def test(
    aaaaaaaaaaa, bbbbbbbbbbbbb, ccccccccccccccc, dddddddddddddddddd
): # noqa: RUFXXX 
    pass

# Rerun Ruff
def test(
    aaaaaaaaaaa, bbbbbbbbbbbbb, ccccccccccccccc, dddddddddddddddddd # noqa: RUFXXX 
): 
    pass

# Rerun formatter
def test(
    aaaaaaaaaaa, 
    bbbbbbbbbbbbb, 
    ccccccccccccccc, 
    dddddddddddddddddd, # noqa: RUFXXX 
): 
    pass

# Run ruff one more time
def test(
    aaaaaaaaaaa, 
    bbbbbbbbbbbbb, # noqa: RUFXXX 
    ccccccccccccccc, 
    dddddddddddddddddd 
): 
    pass

We can avoid this (and similar experiences where users manually add noqa comments) by not counting noqa comments toward the line width. Related Dartfmt issue. They decided against it because they can't feasibly implement this in their formatter.

Another solution would be for Ruff to apply the noqa comment to the smallest range necessary and to split lines where allowed. Unfortunately, this isn't an easy task because inserting a line break may a) not be allowed or b) require parenthesizing the outer expression.

Other Tools

Handling of line-based pragma comments is something all formatters I know struggle with. Rust doesn't have this problem because its suppression comments are node directed, allowing the formatter to reason about their semantics.

Prettier: Recommends using leading own line comments for suppressions but gives no guarantees
Black: See above: special handling for type comments.
Dartfmt: It doesn't seem to have any special handling for pragma comments
Rust: Doesn't have this problem because they use syntax-aware suppression comments.

Pragma comments in the Python ecosystem

Type comments

`type: T`

Type comments have the form type: T and were introduced by PEP 484. They annotate the type of a variable.

Type comments should be put on the statement's last line containing the variable definition. They can also be placed on with statements and for statements, right after the colon.

x = []                # type: List[Employee]
x, y, z = [], [], []  # type: List[int], List[int], List[str]
x, y, z = [], [], []  # type: (List[int], List[int], List[str])
a, b, *c = range(5)   # type: float, float, List[float]
x = [1, 2]            # type: List[int]

with frobnicate() as foo:  # type: int
  # Here foo is an int
  ...

for x, y in points:  # type: float, float
  # Here x and y are floats
  ...

Type comments are specific to assignment, with, and for statements.

Type comments are not evaluated at runtime, but type checkers and other tools that parse the source read them. Changing the placement of any of these comments may change how those tools interpret the program.
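One way to see where a type comment attaches is the standard library's ast.parse with type_comments=True, which stores the comment on the statement node it belongs to. The snippet below is a small sketch; the source string and the names make, b, and points are placeholders invented for the example (the code is only parsed, never executed).

```python
import ast

source = (
    "x = []  # type: List[int]\n"
    "a = make(\n"
    "    b\n"
    ")  # type: str\n"
    "for p, q in points:  # type: float, float\n"
    "    pass\n"
)

# With type_comments=True the parser records PEP 484 type comments on the
# statement nodes that accept them (assignments, for, with, function defs).
module = ast.parse(source, type_comments=True)

found = {
    node.lineno: node.type_comment
    for node in ast.walk(module)
    if getattr(node, "type_comment", None)
}
print(found)
```

The multiline assignment keeps its type comment only because the comment follows the closing parenthesis on the statement's last line, which is the placement a formatter has to preserve.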

Multiline annotations

This seems to be the correct annotation according to PyCharm:

a = non_existing( 
  bcd
) # type: str

The following doesn't provide the right type hint:

```
a = non_existing( # type: str
      bcd
)
```

`type: ignore`

PEP 484 further introduces type: ignore comments.

The # type: ignore comment should be put on the line that the error refers to:

import http.client
errors = {
	'not_found': http.client.NOT_FOUND  # type: ignore
}

They have to be placed at the end of a physical line.

In some cases, linting tools or other comments may be needed on the same line as a type comment. In these cases, the type comment should be before other comments and linting markers:

# type: ignore # <comment or other marker>
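Because type: ignore is tied to a physical line, moving it changes which line is silenced. As a rough illustration, ast.parse with type_comments=True reports the line of every type: ignore comment in the module's type_ignores list. The two source strings below are hand-written stand-ins for "before" and "after" a formatter collapses the dictionary.

```python
import ast

expanded = (
    "import http.client\n"
    "errors = {\n"
    "    'not_found': http.client.NOT_FOUND  # type: ignore\n"
    "}\n"
)
collapsed = (
    "import http.client\n"
    "errors = {'not_found': http.client.NOT_FOUND}  # type: ignore\n"
)


def ignored_lines(source: str) -> list[int]:
    """Return the physical lines that carry a `# type: ignore` comment."""
    module = ast.parse(source, type_comments=True)
    return [ignore.lineno for ignore in module.type_ignores]


print(ignored_lines(expanded), ignored_lines(collapsed))
```

In the expanded form the ignore sits on the line of the dictionary entry; after collapsing it sits on the line of the whole assignment, so everything in that statement shares the suppression.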

Pyright

File level comments

Own-line comments that control the strictness of Pyright. These shouldn't be a problem because they must appear on their own line. The documentation isn't specific about where file-level comments are valid but assuming they are only used in-between statements (e.g. not between expressions) seems reasonable.

Line-Level Suppression comments

Pyright supports type: ignore and pyright: ignore. Pyright ignore supports suppressing specific errors.

Pyright also supports a # pyright: ignore comment at the end of a line to suppress all Pyright diagnostics on that line. This can be useful if you use multiple type checkers on your source base and want to limit suppression of diagnostics to Pyright only.

Noqa

Noqa comments are special because Ruff uses them itself.

The flake8 documentation isn't very specific about their semantics. That's why I rely on Charlie's understanding.

They suppress violations on the same (physical) line
Except for multiline strings, where they apply for the whole string.
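The two rules above can be sketched with tokenize. This is an approximation of the described semantics, not Ruff's or Flake8's implementation: the helper noqa_line_ranges is a made-up name, it only looks at a plain string literal that directly precedes the comment, and it does not handle f-strings or implicit concatenation.

```python
import io
import tokenize

# Tokens that carry no code and must not count as "the preceding token".
LAYOUT = {tokenize.NL, tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}


def noqa_line_ranges(source: str) -> list[tuple[int, int]]:
    """Return (first_line, last_line) covered by each noqa comment."""
    ranges = []
    previous = None  # last code token seen before the current comment
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            if tok.string.lstrip("#").strip().lower().startswith("noqa"):
                row = tok.start[0]
                first = row
                # A multiline string ending on this line extends the range
                # back to the line where the string starts.
                if (
                    previous is not None
                    and previous.type == tokenize.STRING
                    and previous.end[0] == row
                ):
                    first = previous.start[0]
                ranges.append((first, row))
        elif tok.type not in LAYOUT:
            previous = tok
    return ranges


source = (
    "x = 1  # noqa: E501\n"
    'text = """first\n'
    'second"""  # noqa: E501\n'
    "y = 2\n"
)
print(noqa_line_ranges(source))
```

The first comment covers only its own line, while the one after the triple-quoted string covers both lines of the string.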

Pylint

Yes, this feature has been added in Pylint 0.11. This may be done by adding “#pylint: disable=some-message,another-one” at the desired block level or at the end of the desired line of code
source

Coverage.py

Coverage.py will look for comments marking clauses for exclusion. In this code, the “if debug” clause is excluded from reporting:

Clause-based ignore comments are unproblematic because we preserve them.

Resources

Related issues:
- pyright: ignore
- noqa handling
- unstable formatting
- Pylint inline comments
- Type ignore
- Overlong lines because of type ignore comments
- dartfmt (Labeled as meh e)
- Prettier comments recommendation
- Prettier issue
Relevant Black code and here

MichaReiser · 2023-08-18T07:41:10Z

MichaReiser
Aug 18, 2023
Maintainer Author

@roshanjrajan-zip I would love your reaction to this proposal since you expressed that you ran into issues with Black's pragma comment handling.

1 reply

roshanjrajan-zip Sep 1, 2023

This is great! I am really excited about the pragma comment solution! Was wondering if we needed to extend the pyright behavior to mypy or other pragma comments users (pylint is also in the same boat but ideally ruff would replace this...).

zanieb · 2023-08-18T18:21:08Z

zanieb
Aug 18, 2023
Maintainer

Not only is this an amazing overview, I happen to agree with you entirely and approve this proposal 🥳

0 replies

konstin · 2023-08-21T20:32:30Z

konstin
Aug 21, 2023
Maintainer

main concern: how do we teach this? i.e. how would a user know why ruff suddenly decided to bail on enforcing the line length limit on this specific line?

1 reply

MichaReiser Aug 22, 2023
Maintainer Author

I mainly see this as an issue for users that migrate from Black to Ruff because Ruff may collapse clause headers with pragma comments that Black didn't. We should document how Ruff handles comments and why we exclude pragma comments. The good thing is that the rules are simple enough.

I'm not as much concerned about the day-to-day usage because:

I found no issue in the Black repository about overlong lines caused by type: ignore comments, meaning users don't seem to notice.
There are multiple issues that ask for excluding other pragma comments from the line length because they got annoyed by the reformatting. Therefore, I assume this proposal captures and matches the user's expectations.

There are many other cases where I often struggle to understand why Black doesn't break an expression even if it could:

```
class Test:
    def test():
        if False:
            if True:
                if False:
                    a = aLongIdentifierChain # with some comment that explains the assignment but exceeds the line width
```
The comment here exceeds the line width, but Black won't parenthesize the assignment. This is okay because Black only tries to respect the line width. It doesn't guarantee that no line exceeds the line width. My point is that Black's formatting rules aren't perfectly clear, and users shouldn't need to understand each formatting rule as long as the result roughly matches their expectations.

charliermarsh · 2023-08-22T03:21:18Z

charliermarsh
Aug 22, 2023
Maintainer

This is such a good write-up. It's really impressive. I like the proposal a lot.

The flake8 documentation isn't very specific about their semantics. That's why I rely on Charlie's understanding.

I believe that your description is correct. (There is at least one deviation between Ruff and Flake8: Ruff requires that a # noqa: E501 line-length suppression be at the end of a multiline string, whereas Flake8 will respect it either at the end or on the overlong line within the multiline string.)

Separately: are there any lessons we should take away for any suppression comment redesigns we explore in the future? (For Ruff's suppression comments, at least.)

1 reply

MichaReiser Aug 22, 2023
Maintainer Author

I believe that your description is correct. (There is at least one deviation between Ruff and Flake8: Ruff requires that a # noqa: E501 line-length suppression be at the end of a multiline string, whereas Flake8 will respect it either at the end or on the overlong line within the multiline string.)

Do you have an example? How can you have a comment inside of a multiline string? Or is it specific to implicit concatenated multiline strings?

Separately: are there any lessons we should take away for any suppression comment redesigns we explore in the future? (For Ruff's suppression comments, at least.)

Don't use trailing end-of-line comments. I don't have very strong evidence because I haven't looked as much into the other comments, but they seem the hardest to preserve. My impression (and Prettier's recommendation) is to use leading own-line comments, ideally ones that are node-based (they suppress the next node and not the next line). That said, node-based suppression comments have the downside of being less intuitive. How many users know the AST structure of their program? What happens if Ruff changes the AST structure?

JonathanPlasse · 2023-09-19T08:14:19Z

JonathanPlasse
Sep 19, 2023

There is also # pragma: no cover, used by coverage.py. Should it be included?

1 reply

MichaReiser Sep 19, 2023
Maintainer Author

The main purpose for excluding pragma comments from the width is to avoid the situation where a re-format misplaces the pragma comment, making it necessary for you to move it back into a valid position manually. This shouldn't happen for no cover because the comment is node-based and not line-based: It disables coverage for the whole clause-body and it is only important that it stays after the clause header's :.

# Input
def test(a, b, c, d, e):
    ...
# Manually adding the pragma
def test(a, b, c, d, e):  # pragma: no cover
    ...
# After save (assuming that the function now exceeds the line width so that all arguments must be formatted on their own line)
def test(
    a,
    b,
    c,
    d,
    e
):  # pragma: no cover
    ...

# Removing an argument and hitting save
def test(
    a, b, c, d,
):  # pragma: no cover
    ...

The pragma comment remains valid in all cases.
