Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Permit standalone form feed characters at the module level #4021

Merged
merged 6 commits into from
Nov 21, 2023

Conversation

tungol
Copy link
Contributor

@tungol tungol commented Nov 3, 2023

Description

This allows the use of form feeds in code formatted with Black. Addresses #1100

The short form of the motivation is that form feeds represent a page break, and
are used by some editors to organize code. Issue #1100 has a little more discussion.

This enforces a couple rules: a form feed is only allowed at the module level,
on a line by itself. Multiple form feeds in the same prefix are merged with each
other. Any form feed is moved to the last empty line of the prefix it's in,
which just seemed tidy to me.

Black already retained form feeds in two situations: internal to a comment,
or in a string. I didn't change either of those, so that this MR won't actually change any code
already formatted with black. I went back and forth a couple times on whether to
remove form feeds inside a comment: it'd probably be reasonable to do so?

A form feed literal inside a string is probably better off as '\f', but I figured that that's
out of line for what black wants to do. Let me know if the snark in the documentation
is too much.

The biggest change here if you don't care about form feeds is that I reworked comment
generation so that generate_comments() will actually consume the prefix of leaves passed to it.
Doing so simplifies normalize_prefix(), and I verified that there are no code paths to normalize_prefix()
that don't go through LineGenerator.visit_default() and generate_comments() first.

I pulled in the _splitlines_no_ff() function from the standard library Lib/ast.py. Diffs that involve a
form feed didn't work right when the splitlines() string function was used. For performance
reasons, it has a compiled regex outside of it's function body, and I wasn't sure whether to keep
that regex near the function, move it to the top of the file, or maybe the constants file? Or if we're not
that fussed about performance of diff generation, I can just move the regex into the function body.
I didn't create a test for that change; let me know if one is desired.

I did have to make a small change to blib2to3/pgen2/driver.py, which I didn't relish, but
it would eat form feeds after a dedent if I didn't. I didn't update the changelog in src/blib2to3/README
for that: should I?

I put this behind a preview flag. It doesn't change any code already formatted with black, though.

Finally, in EmptyLineTracker, I noticed that there's no situation in the test suite where current_line.leaves
is empty. I'm not sure what it would mean if we did end up with a line that had no leaves there.
But every other time direct access to the leaves was needed had a check for whether current_line.leaves
had any contents, so I checked too. I imagine that's probably for mypy's sake, but I wasn't totally sure.

Checklist - did you ...

  • Add an entry in CHANGES.md if necessary?
  • Add / update tests if necessary?
  • Add new / update outdated documentation?

Copy link

github-actions bot commented Nov 4, 2023

diff-shades reports zero changes comparing this PR (2dbf793) to main (c54c213).


What is this? | Workflow run | diff-shades documentation

Copy link
Collaborator

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution. Technically this looks good, but I'm not sure support for form feeds is worth the additional complexity. There hasn't been a lot of interest in the issue (only a few people commented over four years), and it was noted that several other languages either don't support formfeeds at all or use formatters that remove them. So I think Black might be doing the right thing in consistently removing a character that is rarely used and will most likely confuse most users if they encounter it. I'd welcome opinions from other contributors, though.

@tungol
Copy link
Contributor Author

tungol commented Nov 8, 2023

Well, I can't say I'm too surprised that you feel that way. It's a niche feature, certainly. For languages besides Python, I think it's mostly C and Lisp. There's an open ticket for rustfmt to optionally not strip form feeds, but it hasn't attracted much attention either.

I worked to keep the implementation as simple and unobtrusive as possible. Looking at it again now, I can simplify a little more, and further limit the number of places in the code where it's necessary to reference form feeds. I think it's an improvement, and I'll push it up. I also think that the complexity issue is not really your main concern, so I don't expect that to move the needle much.

I think it's worth noting that VSCode displays the character, so most developers these days would be able to see a form feed if they encountered one. I think that's less confusing than it being invisible, at least.

In the end, I think the change is pretty unobtrusive in regards to both the code and the style, and that out of the people who might be confused by it, only a small number are going to notice that anything's different. But I also understand if it doesn't fit into the Black vision.

All that being said, if it ends up that you don't accept the MR, I think there's a couple bits worth saving:

  • The use of _splitlines_no_ff() in generating the diffs is a bugfix, regardless of the rest of the MR. The diff generated to remove a form feed is incorrect without that change, and not compatible with other diff-related tools.

  • I think some of the refactoring around the normalize_prefix() function is pretty good, and would still be an improvement without the form feed stuff. But that's a matter of taste, so I'm not sure how you or the other maintainers would feel about that part.

@@ -996,8 +997,6 @@ def bracket_split_build_line(
result.inside_brackets = True
result.depth += 1
if leaves:
# Since body is a new indent level, remove spurious leading whitespace.
normalize_prefix(leaves[0], inside_brackets=True)
Copy link
Contributor Author

@tungol tungol Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record on this one: removing this didn't change anything in the test suite. I traced it back, and I got all the way to the initial commit before I found a version of black where removing this made a test fail. git bisect showed that it was changes in whitespace handling in commit da3c2f3 that made this superfluous.

@JelleZijlstra
Copy link
Collaborator

Thanks for the explanation!

@hauntsaninja any opinion on this one? I'm a little hesitant but the change is likely small enough that we can accept it.

(@tungol could you fix the merge conflict?)

@tungol
Copy link
Contributor Author

tungol commented Nov 8, 2023

@JelleZijlstra merge conflicts resolved

@tungol
Copy link
Contributor Author

tungol commented Nov 8, 2023

Looks like I broke something with the documentation somehow? Looking into it.

@JelleZijlstra
Copy link
Collaborator

I think you need to remove the normalize_prefix function from the "Developer reference" docs section.

@tungol
Copy link
Contributor Author

tungol commented Nov 8, 2023

That basically got reworked into comments.normalize_trailing_prefix, so I swapped it with that one, and added the new nodes.make_simple_prefix in there as well

move changelog entry up to the new unreleased section
Copy link
Collaborator

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it.

docs/the_black_code_style/future_style.md Outdated Show resolved Hide resolved
@JelleZijlstra JelleZijlstra merged commit 89e28ea into psf:main Nov 21, 2023
33 of 34 checks passed
@tungol tungol deleted the formfeed3 branch November 21, 2023 06:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants