Latest black changes ASTs #2150
Comments
Black does change the AST in a few places. This was so even before the most recent release; the previous release already made some changes to docstrings. We should update that sentence in the documentation. |
That's disappointing. The fact that I could trust Black to never change the behavior of a program was one of its greatest selling points, and made me feel comfortable applying Black indiscriminately without needing to check its work. Now that I know it may change my program's behavior, I'll be much more cautious applying it to new projects. |
We do still have the safety check (in the |
Well, I'm not sure what to say beyond that I'm very dismayed to learn this. The fact that Black would prove that it did not change my program's behavior was an important - probably the most important - feature to me, and I'm disappointed to hear that it has been lost. Tools like Sphinx read docstrings, and now I can't know whether running Black on my library has changed what Sphinx will generate without diffing and checking its work. That's a really serious loss. Better formatting is not a win if I can't trust that it didn't break anything in the process. |
That's a fair point! We at least should have updated the documentation, and I would consider adding an option to disable docstring formatting. |
Do you have a concrete problem with the changes to docstrings or are you here only to voice your disappointment and dismay? The changes made to docstrings are pretty conservative and have to do with indentation and leading/trailing whitespace. |
I came here because I noticed that the new version of Black had made a change that resulted in a different AST, which as far as I could tell from the docs and from a PyCon lightning talk I remembered, it's never supposed to do. I originally thought this was just a bug, but after finding the 21.4b0 changelog I saw that this was intentional, and @JelleZijlstra has clarified that this wasn't the first time - other AST changes have been allowed before, and When I learned that, I switched to expressing my disappointment and dismay at a feature I consider so important being quietly lost. I have been confident using and advocating for Black because I could trust that it would never change a module's behavior. This lets me apply it indiscriminately to any project without checking its work, since it would check for me that the AST was the same and therefore the behavior almost certainly will be as well. But above, I've got a 3 line program that exhibits different behavior before and after running Black on it. I find this scary.
I didn't say that this would break Sphinx, but I said that it could change what Sphinx will generate. After some experimentation, it doesn't seem to - it seems as though Sphinx is stripping the leading whitespace before rendering the docstring to HTML or to LaTeX. I didn't know that going into this. I'm still not entirely convinced that there isn't some Sphinx plugin whose behavior could change because it sees the docstring in an intermediate state before the leading whitespace has been stripped, though. |
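To illustrate the whitespace stripping described above: the standard library's `inspect.cleandoc` removes leading whitespace from the first line of a docstring, which is one way a consumer can end up seeing the same text before and after reformatting. This is only a standard-library analogy. Whether Sphinx or any given plugin applies the same cleaning step at the point where it reads the docstring is an assumption, not something verified here.

```python
import inspect

# Docstring text as originally written, and with the extra leading space
# that the thread describes being added.
original_doc = '"something" is what this does.'
reformatted_doc = ' "something" is what this does.'

# The raw strings differ...
raw_equal = original_doc == reformatted_doc

# ...but cleandoc() strips leading whitespace from the first line,
# so the cleaned forms compare equal.
cleaned_equal = inspect.cleandoc(original_doc) == inspect.cleandoc(reformatted_doc)

print(raw_equal, cleaned_equal)
```

A tool that reads the raw string before any cleaning step would still see the difference.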
@godlygeek "But above, I've got a 3 line program that exhibits different behavior before and after running Black on it. I find this scary." As of now you haven't demonstrated any different --> behaviour <--. If there's no case where additional leading whitespace changes anything, it means the behaviour is the same (even if the AST is not). |
The user-observable behavior of that program is that something is printed to stdout. The thing that's printed to stdout is different after running Black. |
Another example of somewhere where this might cause issues is PLY, which has a parsing DSL that uses docstrings as regular expressions to match tokens - the addition of an extra leading space to a PLY token regex like:
def t_DOUBLE_QUOTED_STRING(t):
    '"[^"]*"'
    t.value = t.value[1:-2]
    return t
would absolutely change the program's behavior. |
@godlygeek I could argue about the first point, but definitely not about the latter. But actually... it's not the case, because this docstring is not reformatted. And playing a bit more, it seems to be even stranger. For:
def t_DOUBLE_QUOTED_STRING(t):
    '"[^"]*"'
    t.value = t.value[1:-2]
    return t

def foo():
    '''"something" is what this does.'''
    pass

def foo2():
    ''' "something" '''
    pass

def foo3():
    """ 'something' is what this does."""
    pass
we get:
def t_DOUBLE_QUOTED_STRING(t):
    '"[^"]*"'
    t.value = t.value[1:-2]
    return t

def foo():
    """ "something" is what this does."""
    pass

def foo2():
    '''"something"'''
    pass

def foo3():
    """'something' is what this does."""
    pass
The Playground with the code is here. |
@godlygeek to be precise - your example should look this way (in yours we don't have docstrings at all):
def t_LEFT_QUOTED_STRING(t):
    """"[^"]*"""
    t.value = t.value[1:-2]
    return t
and output:
def t_LEFT_QUOTED_STRING(t):
    """ "[^"]*"""
    t.value = t.value[1:-2]
    return t
|
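The before and after snippets above can be compared directly with the standard `ast` module. This is a minimal sketch, not Black's own safety check: the two sources are hard-coded from the example (trimmed to the lines that matter) and `raw_docstring` is a hypothetical helper name.

```python
import ast

# Source before and after the reformatting shown above.
BEFORE = 'def t_LEFT_QUOTED_STRING(t):\n    """"[^"]*"""\n    return t\n'
AFTER = 'def t_LEFT_QUOTED_STRING(t):\n    """ "[^"]*"""\n    return t\n'


def raw_docstring(source: str) -> str:
    """Return the first function's docstring exactly as stored in the AST."""
    func = ast.parse(source).body[0]
    # clean=False keeps leading/trailing whitespace instead of stripping it.
    return ast.get_docstring(func, clean=False)


before_doc = raw_docstring(BEFORE)
after_doc = raw_docstring(AFTER)

# ast.dump() omits line/column attributes by default, so any difference
# here comes from the tree contents, in this case the docstring constant.
same_ast = ast.dump(ast.parse(BEFORE)) == ast.dump(ast.parse(AFTER))

print(repr(before_doc), repr(after_doc), same_ast)
```

The only node that differs is the string constant holding the docstring, which now carries one extra leading space.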
Well, we do have a docstring, just not a triple-quoted string - but fair enough; I didn't realize that single quoted strings wouldn't be affected by this, but should have. Thanks for the correction! |
Let's keep this issue focused on the fact that we're changing the AST for docstrings. Feel free to open new issues for any unexpected behavior in the docstring formatting logic. |
None of the examples above are bugs.
Note that string normalization doesn't replace single strings with triple strings. |
Focusing back on the question of adding these leading/trailing spaces to docstrings, Black does change:
def t_QUOTED_STRING(t):
    """"[^"]*\""""
    t.value = t.value[1:-1]
    return t
to
def t_QUOTED_STRING(t):
    """ "[^"]*\" """
    t.value = t.value[1:-1]
    return t
Which, in the context of PLY, definitely does result in different behavior for the program. Granted that case is contrived, but it's still an example of a place where modifying the docstring would change the behavior of a working program, and supports my argument that changes to docstrings are not necessarily safe. |
I'm somewhat sympathetic to @godlygeek despite the esoteric use case. I see why Black is not configurable. Given that docstring modifications are, most of the time, not only safe but actually very welcome, and that Black doesn't really provide many configuration options, the existing behavior is surely here to stay. That being said, I share the concern that modifying behavior even slightly is on a different level from strict formatting. This could be addressed at least in a few ways:
Again, having more configuration runs against Black's philosophy, but I think this could be a proper place to give in a bit, because it isn't really a formatting preference. Would this kind of non-default toggle suffice, Matt? Łukasz and others, would this be an acceptable choice for the library? |
I'd prefer if the unsafe stuff was opt-in, rather than needing to opt into extra safety - but I suspect I won't win that battle. So in the interest of a compromise that everyone can agree on, a way to opt out of changes that affect the AST seems like a reasonable middle ground. |
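One way to picture the difference between a strict check and one that tolerates docstring whitespace is sketched below. This is an illustration only and not Black's actual implementation; `dump` and `equivalent` are hypothetical names, and the tolerant mode simply strips docstrings before comparing.

```python
import ast

# Node types whose first statement can be a docstring.
DOC_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def dump(source: str, ignore_docstring_whitespace: bool = False) -> str:
    """Dump the AST, optionally stripping whitespace around docstrings."""
    tree = ast.parse(source)
    if ignore_docstring_whitespace:
        for node in ast.walk(tree):
            if not isinstance(node, DOC_OWNERS) or not node.body:
                continue
            first = node.body[0]
            if (
                isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)
            ):
                first.value.value = first.value.value.strip()
    return ast.dump(tree)


def equivalent(a: str, b: str, strict: bool = True) -> bool:
    """Compare two sources; strict=True treats docstrings like any other string."""
    return dump(a, not strict) == dump(b, not strict)


before = 'def f():\n    """"quoted" text"""\n    return 1\n'
after = 'def f():\n    """ "quoted" text"""\n    return 1\n'

strict_result = equivalent(before, after, strict=True)
tolerant_result = equivalent(before, after, strict=False)
print(strict_result, tolerant_result)
```

Under the strict comparison the two sources are reported as different, while the tolerant comparison accepts them. Which of the two a formatter runs by default is exactly the opt-in versus opt-out question raised here.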
But then (AST strict mode) Black wouldn't be able to, e.g., add the trailing comma when exploding a list (or remove it when the list fits on a single line again and magic trailing commas are turned off):
array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
Output:
array = [
    0,
    1,
    2,
    3,
    4,
    5,
    6,
    7,
    8,
    9,
    10,
    11,
    12,
    13,
    14,
    15,
    16,
    17,
    18,
    19,
    20,
    21,
    22,
]
So disabling all AST-unsafe operations with one flag just to avoid docstring manipulation doesn't seem practical. On the other hand, just skipping docstrings would need an additional option, and as mentioned, case by case, we would get further and further away from Black's original philosophy.
@ambv But why single quotes are not replaced with double quotes like in |
Changing the trailing comma does not change the AST. |
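This can be checked with the standard `ast` module. A minimal sketch, using a shortened list rather than the 23-element one from the comment above:

```python
import ast

# The same list written on one line and "exploded" with a trailing comma.
COMPACT = "array = [0, 1, 2]\n"
EXPLODED = "array = [\n    0,\n    1,\n    2,\n]\n"

# ast.dump() leaves out line and column attributes by default, so layout
# differences and the trailing comma do not show up in the comparison.
same_ast = ast.dump(ast.parse(COMPACT)) == ast.dump(ast.parse(EXPLODED))
print(same_ast)
```

The trailing comma and the line breaks are purely syntactic here, so both forms parse to the same list node.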
Oh right, of course it doesn't, stupid me. Sorry, no more comments before morning coffee ;) |
No, it does not. From https://ply.readthedocs.io/en/latest/ply.html#specification-of-tokens (emphasis mine):
|
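The effect of verbose-mode regex compilation on added whitespace can be sketched with the standard `re` module alone. The assumption here, taken from the PLY documentation linked above rather than verified against PLY itself, is that token patterns are compiled with `re.VERBOSE`; the patterns below mimic the quoted-string example from this thread.

```python
import re

# Token regex as written, and with the leading/trailing spaces that
# docstring reformatting would introduce.
original = '"[^"]*"'
reformatted = ' "[^"]*" '
text = '"hello"'

# Without VERBOSE, the extra spaces are literal and the match fails.
plain_before = re.match(original, text)
plain_after = re.match(reformatted, text)

# With VERBOSE, unescaped whitespace outside character classes is
# ignored, so the reformatted pattern still matches the same text.
verbose_after = re.match(reformatted, text, re.VERBOSE)

print(plain_before.group(), plain_after, verbose_after.group())
```

So whether the added spaces matter depends on how the consumer compiles the docstring, not on the docstring alone.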
While there still hasn't been any use case pointed out where modifying trailing/leading spaces in the docstring has resulted in a meaningful behaviour change... I do think it'd be nice if Black formatted as:
- """"[^"]*"""
+ '''"[^"]*'''
It's likely clearer this way, and doesn't require modifying the string in this case. It's edge-casey, but it'd be nice. In a similar vein, it'd be nice to have the following changed to the above style (note the additional space getting trimmed):
- """ "[^"]*"""
+ '''"[^"]*'''
This might be a separate request though. |
#2168 gives a concrete example where changing docstrings is problematic. |
Looks like the behaviour of adding a space is the problematic part, in both issues. |
For context, sphinx/autodoc uses empty docstrings to disable documenting specific functions. Thus |
What does Sphinx/autodoc do if there is no docstring at all? |
Is there anything left to do here? We added documentation that better explains Black's behavior here and we made a formatting change to allow for empty docstrings (since people brought up concrete problems with those). |
I think practically speaking, there isn't. If we are worried about other possible use cases and the maintainers would "OK" having some sort of AST strict mode for docstrings (or even all code [likely not though]), that could be discussed. But I doubt it's worth the effort without concrete problems. |
When using pdoc3 (and maybe other similar tools for generating documentation from docstrings), a double trailing whitespace at the end of a line is used for soft breaks (markdown syntax). As a result I am still using black 19.10 for now, as I did not find a way to preserve trailing whitespace in docstrings with recent versions. |
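The loss described here can be sketched without any formatter installed. In the snippet below, `strip_trailing_whitespace` is a hypothetical stand-in for the effect reported in this comment (trailing whitespace removed from each docstring line); it is not Black's code, and `count_line_break_markers` is likewise an illustrative helper.

```python
def strip_trailing_whitespace(docstring: str) -> str:
    """Hypothetical stand-in for a formatter that trims the end of each line."""
    return "\n".join(line.rstrip() for line in docstring.split("\n"))


def count_line_break_markers(docstring: str) -> int:
    """Count lines ending in two spaces, the Markdown line-break marker."""
    return sum(1 for line in docstring.split("\n") if line.endswith("  "))


# A docstring that relies on two trailing spaces to force line breaks.
doc = "First line  \nSecond line  \nThird line"

markers_before = count_line_break_markers(doc)
markers_after = count_line_break_markers(strip_trailing_whitespace(doc))
print(markers_before, markers_after)
```

Once the trailing spaces are gone, a Markdown renderer would join those lines into a single paragraph.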
Hey, is this still being considered? I would like to see an option to have black not change the AST at all, given that this can cause problems with automatic documentation and anything that uses docstrings. I'm of the opinion that there should be an extra safe AST mode. |
I agreed.
|
@onerandomusername Could you elaborate on the problems? #2168 fixed the only actual non-philosophical issue brought up in this discussion, so if you’re still facing issues, it’d be helpful if you could elaborate on what they are. |
Here is another: #144 (comment)
Black had a golden opportunity to be the model of interoperability in this space. "black transformations will not [1] modify your AST" is an unambiguous, well-defined contract, that would have allowed every other formatting, testing, or docs tool to operate independently alongside it with the confidence only clear expectations can provide. Bummer. [1] or even "will not, by default" or at the very least "will not, if you explicitly ask for AST-safety" |
As seen in #4276 linked above, I noticed that black breaks some ascii art I have in docstring. This is making me reluctant to adopt black. I wouldn't wanna read through all docstrings to find broken information. A configuration for this would be very valuable. |
Describe the bug
The README says:
But after #1740, docstrings beginning with a double quote have an extra space prepended to them. This produces a different AST than the file had before, but --safe doesn't complain about this. Is Black no longer promising to produce identical ASTs?
To Reproduce
Expected behavior:
I believe that black is never supposed to change the AST, and therefore rewriting the docstring to contain this extra space is not a valid transformation for black to perform.
Environment:
black, version 21.4b0
Does this bug also happen on master?
Yes
Additional context