Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using rustfmt or other formatter #410

Open
ariard opened this issue Dec 3, 2019 · 18 comments
Open

Using rustfmt or other formatter #410

ariard opened this issue Dec 3, 2019 · 18 comments

Comments

@ariard
Copy link

ariard commented Dec 3, 2019

With the same end-goal of refactoring some part of the code base, improving code readability would be great too

Per @valentinewallace idea.

@valentinewallace
Copy link
Contributor

awesome, thanks @ariard . @TheBlueMatt i'm sure you've considered this, any reason we're not using rustfmt or something similar?

@TheBlueMatt
Copy link
Collaborator

Personally I'm generally very much not a fan of formatters in that sometimes you want to do custom things for readability (eg if you want to manually align lines so you can easily see all the lines are the same modulo one or two chars) and they break those...is there a way to apply a formatter to just new code (ie format a commit) as a tool for those who don't want to format their code manually/want to format it their differently and get it auto-changed on checkin.

Still, if people feel really strongly here, I wont stand in the way toooo much.

@ariard
Copy link
Author

ariard commented Dec 4, 2019

...there a way to apply a formatter to just new code (ie format a commit) as a tool for those who don't want to format their code manually/want to format it their differently and get it auto-changed on checkin.

Good suggestion, I'm even on this but if people less familiar with codebase really like it I think it's worth to find a middle ground

@valentinewallace
Copy link
Contributor

is there a way to apply a formatter to just new code (ie format a commit)

I could see that leading to some slightly weird-looking situations, like one newly-committed line of code split into 80-character parts by the formatter, surrounded by 200-character lines on both sides 😛

Maybe a middle ground could be running the formatter every few months, so it's not too much of a burden to manually undo some of its changes (i.e. like this one, which as Matt points out, is not ideal)? Idk, it does improve readability a lot (to me) in most places.

@ariard
Copy link
Author

ariard commented Dec 6, 2019

Concept ACK, we can still stop to use formatter if it's too much a burden, or can't we exclude some lines of it, there is no per-line granularity?

@jkczyz
Copy link
Contributor

jkczyz commented Dec 7, 2019

Concept ACK, we can still stop to use formatter if it's too much a burden, or can't we exclude some lines of it, there is no per-line granularity?

I'm also in favor of using the rustfmt to enforce consistent conventions. It can be disabled in special cases using #[rustfmt::skip]. We should settle on how to configure it before applying it across the codebase, though.

@TheBlueMatt
Copy link
Collaborator

This is blocked on either rust-lang/rustfmt#4306 (maybe) or rust-lang/rustfmt#4146 (hopefully), as we found in 425 that rustfmt just can't generate readable code for lots of things.

@casey
Copy link
Contributor

casey commented Sep 29, 2020

My personal opinion is that it's probably not worth waiting for those issues to be fixed before integrating rustfmt. I've had issues with every code formatter that I've ever used, including rustfmt, but those issues don't outweigh the benefits, which include not having to format code manually, not needing to devote time to style issues in pull requests, enforcing a consistent style, and harmony with community best practices, since the vast majority of Rust projects use rustfmt.

@TheBlueMatt
Copy link
Collaborator

I don't know if its worth spending a ton of time debating this, but there is a tradeoff here between code readability and formatting overhead. Indeed, formatting overhead sucks, but, at least personally, I'll take code readability 100x over any formatting overhead. Of course good auto-formatting should also result in more legible code (since it is all very uniform), but in this case I think its hard to argue that rustfmt results in more readable code for many parts of our codebase (unless you have a 50-char-wide terminal, I suppose).

@MaxFangX
Copy link
Contributor

MaxFangX commented May 4, 2022

I strongly believe that rust-lightning needs hard rustfmt checks in the CI to improve readability by enforcing a (1) maximum line width and (2) breaking up complex control flows into more digestible chunks.

The formatting can be done incrementally, of course, but as a newer contributor I've personally found rust-lightning to be needlessly difficult to work with when it comes to code semantics and readability, and would like to tackle this issue early.

Here's what I've experienced:

Maximum line width

This is what peer_handler.rs::do_read_event looks like in my editor:

Screen Shot 2022-05-04 at 11 29 26 AM

I think there's several things wrong with this:

  • Almost every other line is wrapping all the way across to the left side of the pane. This might be acceptable if it only occurred with strings, but there is actual logic and function parameters that are overflowing to the left side, making it difficult to pair up the parameters with the functions or macros that they're being called on. In general, when there are multiple function calls with slightly differing parameters, we want to be able to tell what parameters are being passed into what function at a glance.
  • At just over 100 characters wide, this editor window isn't exactly narrow. The point of having a max width is so that contributors can have multiple parts of the codebase open in different panes, without lines frequently overflowing back onto the left side, and without wasting ~50% of the horizontal terminal space that is taken up by the up to 14(!!) indents on the left side. To be able to read this code without overflowing often requires more than 50% extra space over a 100 character-wide editor pane. Line 913 in peer_handler.rs for example is 211 characters wide due to the use of three levels of nested function calls + 8 levels of indentation all on one line.

let act_two = try_potential_handleerror!(peer.channel_encryptor.process_act_one_with_keys(&peer.pending_read_buffer[..], &self.our_node_secret, self.get_ephemeral_key())).to_vec();

Overly complex control flows.

My personal experience working with the codebase has been:

  • Spending more time understanding the control flow of the functions I encounter rather than the actual logic of what's happening
  • Being unable to visually pair where an opening { or ( ends, so that once again I am annoyingly trying to figure out what constitutes a complete object rather than reasoning about how that object is used

This codebase is severely lacking intermediate variables. For example, with line 913 above, instead of having a triply nested function/macro call m!(f(f())).f() nested in a match nested in an if nested in a while nested in a match nested in a variable assignment, the values of most of these blocks should be assigned to an intermediate variable, which is optimized away anyway at compile time but which allows each of these blocks to be engaged with independently, rather than burdening our cognition with a bunch of control flow complexity which really doesn't need to be there.

Another result of this extreme nesting is that closing parentheses ) and brackets } are so far away from their opening counterparts that the pair of them is not visible even on an editor pane that is 100 lines tall. I can understanding this being the case for e.g. a singular match { with dozens of branches, but across many part of the codebase, nesting to 10+ indents is entirely unnecessary and only makes reading the code more difficult.

For example, try visually finding the closing } for the variable assignment on this line by following the link and reading the code on the GitHub UI:

let pause_read = match peers.peers.get_mut(peer_descriptor) {

While opening and closing delimiters can be matched with the help of an IDE, this makes PR reviews via the GitHub UI more difficult.

Reducing formatting overhead with cargo watch

I disagree that rustfmt adds significant overhead (assuming I'm correctly understanding Matt's usage of the term "formatting overhead").

With cargo watch you can automatically check for code format changes every time you save your changes:

cargo watch -x "fmt -- --check"

And only when it spits out a diff do you actually run cargo fmt.

Cargo watch can be combined with automatic clippy checks, running tests, builds etc. I've been using this recently for example:

cargo watch -x "clippy --no-deps && cargo test --package lightning-net && cargo fmt -- --check && build --release"

Anyways, strict rustfmt checks in the CI are standard across the Rust ecosystem, so I don't think the overhead of it is a blocker.

Closing thoughts and suggestions

While my "too complex" feedback extends to beyond the scope of what rustfmt can accomplish, setting the rustfmt's max_width to a reasonable 100 enforces a limit on how complex the control flow can get before the programmer is forced to break it down into more bite-sized chunks. A "Cognitive Complexity" clippy lint is in the works, but while that is still pending, I believe that we should take a first step by enabling a hard rustfmt check in the CI for all new code, using #[rustfmt::skip] on select parts of the existing code, and adopting the principles outlined in the document the cognitive complexity lint is based on as part of code style guidelines.

I believe that the rustfmt deficiencies mentioned in rust-lang/rustfmt#4146 and rust-lang/rustfmt#4306 are significantly easier to tolerate than the issues I've outlined above.

Sorry about the length of the post. I just wanted to provide a concrete argument based on actual code, but that required a lot more words than I expected. Cheers!

@TheBlueMatt
Copy link
Collaborator

TheBlueMatt commented May 5, 2022

I'm totally with you on overly-complex control flow in a number of places in the codebase - indeed, its a known problem and something that is being (very slowly) cleaned up as files are changed and new things are added. Sadly, in your case, the PeerManager is one of the less-touched bits of the codebase, and thus tends to get a bit less love. That said, #1023 does reduce some of the massive control flow in that struct in a few places.

As you point out, complex control flow, too many arguments to functions, too few variables, etc are all issues that rustfmt does not help with. Indeed, utilizing the clippy lint is a better way to go about this.

However, the specific issues you point out regarding long control flow and things falling off the screen causing you to not be able to match the end of a function call or scope while reading is made worse by rustfmt, not better. Given long functions with complicated args, rustfmt makes an absolute mess of the code and makes these problems substantially worse by placing every argument and method call on its own line, which when run on PeerManager makes several scopes a thousand lines long instead of many hundreds.

On the flip side, if large scopes and long functions are cleaned up more aggressively, rustfmt starts to provide substantially less value - now things are readable on their own without rustfmt. Again, here, rustfmt has a tendency to make things worse, not better - if the control flow were more readable to begin with, then rustfmt just makes you lose context, which as you note leads to bugs and missing issues in review quite often.

@MaxFangX
Copy link
Contributor

MaxFangX commented May 5, 2022

Point taken regarding rustfmt's tendency to make the vertical length problem even worse - the conclusion I draw from that however is that it is a sign to de-nest the control flow even more aggressively, and to split out any easily isolated blocks of logic into their own functions so that those can be engaged with independently. ​If the control flow is almost completely linear, then long functions and scopes are no longer a problem, since keeping only a few if/match/loop scopes in your head at once does not pose much of a cognitive burden.

At the same time, I can understand how complicated functions involving a large number of tightly coupled parameters may not be amenable to de-nesting. If I have some extra time on a night or weekend, I'll take a stab at reorganizing the PeerManager code to see if I can demonstrate what I mean. Maybe you are right that it is not really possible to use rustfmt and increase the readability of the code in some places without also causing the function to expand to absurd lengths.

Overall, I think that if done correctly, a strict adherence to rustfmt does not cause code to lose context. I think that in addition to providing a relatively consistent code style across all the places where it is applied, rustfmt incentivizes the programmer to code in moderately sized, manageable chunks rather than trying to cram a bunch of logic into singular functions.

@TheBlueMatt
Copy link
Collaborator

the conclusion I draw from that however is that it is a sign to de-nest the control flow even more aggressively, and to split out any easily isolated blocks of logic into their own functions so that those can be engaged with independently.

In general, yes, totally agree. Of course there is a limit - you want to be able to see context, I've seen many critical issues get missed in review because someone was reviewing on GitHub and only had 3 lines of context, or even because someone didn't scroll to the top/bottom of a function and missed something. Just splitting things out into separate functions doesn't always ensure you have more context - function boundaries if well defined are great, but also have substantial loss of context, especially utility functions that are there to split the code up. As with everything, there is a balance to be had.

If I have some extra time on a night or weekend, I'll take a stab at reorganizing the PeerManager code to see if I can demonstrate what I mean.

Take a look at #1023 - parts of the code that you were explicitly pointing to above are already cleaner there, though the focus of that PR isn't just cleaning up the code its also to improve the parallelization. As always, PRs welcome.

Overall, I think that if done correctly, a strict adherence to rustfmt does not cause code to lose context. I think that in addition to providing a relatively consistent code style across all the places where it is applied, rustfmt incentivizes the programmer to code in moderately sized, manageable chunks rather than trying to cram a bunch of logic into singular functions.

Frankly I find that most average rust projects, even with ones that have relatively short functions with 3-4 arguments, rustfmt makes code unreadable. Nearly every time I touch a "normal" Rust project I find myself frustrated by how little context I can see on my screen, even with a vertical 4k monitor with upwards of 150-200 lines in the terminal. I find myself spending substantial amount of time scrolling up and down just to see the top of a really simple function, which makes bugs way more likely than not, in my experience.

@jkczyz
Copy link
Contributor

jkczyz commented May 6, 2022

FWIW, for new files and new code where possible, I've been trying to keep within the 100 character limit. For example:

https://github.com/lightningdevkit/rust-lightning/blob/main/lightning/src/routing/scoring.rs
https://github.com/lightningdevkit/rust-lightning/tree/main/lightning-block-sync/src (all files)
https://github.com/lightningdevkit/rust-lightning/blob/main/lightning-invoice/src/payment.rs

I loosely followed these conventions for blocks with long parameter lists: https://github.com/rust-dev-tools/fmt-rfcs/blob/master/guide/items.md

That said, I also share the same concern with @TheBlueMatt about vertical code particularly within functions, though I have a higher tolerance lol. I would be curious to see if rustfmt changes any of the above files much or could be configured to match the conventions to some degree. I believe they mostly fit everything within a 100 characters with a few exceptions like recently added logging. So any other formatting may be more opinionated.

@dunxen
Copy link
Contributor

dunxen commented May 11, 2022

That said, I also share the same concern with @TheBlueMatt about vertical code particularly within functions, though I have a higher tolerance lol.

Yeah, and I do agree that this is pretty ugly that rustfmt thought was pretty: #1474 (comment)

I can see the current state of "configurability" with rustfmt.toml today and see if we could closely match something that is recommended in the codebase today, and then we could look at only applying that to diffs so it's not super noisy? If we don't like it, we can just go abandon it for now until things become more configurable.

For example, "compressed" function arguments would match what we like and obey the 100 char limit.

@TheBlueMatt
Copy link
Collaborator

See-also rust-bitcoin/rust-bitcoin#959 (maybe we let them go first, assuming it ever happens?) and the two rustfmt issues linked towards the top of this issue that i think are still blocking this.

@dunxen
Copy link
Contributor

dunxen commented May 11, 2022

See-also rust-bitcoin/rust-bitcoin#959 (maybe we let them go first, assuming it ever happens?) and the two rustfmt issues linked towards the top of this issue that i think are still blocking this.

Thanks for pointing me to this. Will also see if we need to open up any rustfmt issues directly / put up some PRs. Just don't want to turn it into an xkcd.com/1319

@douglaz
Copy link
Contributor

douglaz commented Feb 23, 2023

I don't know if its worth spending a ton of time debating this, but there is a tradeoff here between code readability and formatting overhead. Indeed, formatting overhead sucks, but, at least personally, I'll take code readability 100x over any formatting overhead.

I was like that a long time ago, but after developing a severe case of Repetitive Strain Injury (RSI) now I will readily accept a code that is "just" 90% okay if the formatting overhead is near 0.

Just try once to code with voice and you'll understand how an automatic formatter is a godsend.

But I agree that this must be done carefully in an existing codebase. It's certainly possible that some code may become unreadable after applying rustfmt and needs to be rewritten/refactored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants