doc: add clarification on logical unit for each commit#285193
doc: add clarification on logical unit for each commit#285193DanielSidhion wants to merge 1 commit intoNixOS:masterfrom
Conversation
| - Create a commit for each logical unit at the Nixpkgs level. | ||
| This means that you should rarely break multiple changes in the same file into different logical units, because that's too fine-grained at the Nixpkgs level. | ||
| Some cases where you'd do so are when each change is linked to a different change in Nixpkgs as a whole. | ||
| For example, if you're refactoring multiple files due to the same change, this is considered a logical unit, and should be in the same commit. | ||
| However, fixing different packages due to unrelated issues should ideally be in different commits, even if they're included in the same PR. |
There was a problem hiding this comment.
This might just be me, but this wording blurs my understanding of a logical unit. Previously, I would've thought that a logical unit meant a single change with a single intent, and that change or intent would be representable as a single short sentence in the commit message.
This however, seems to claim that any extra changes to the package would also be considered part of one commit. So if you have a pull request with the following commits, currently split up into what I'd previously call logical units:
package: 1.0.0 -> 1.1.0
(also changes some dependencies that are a direct result
of the package being upgraded)
package: clean up expression
package: add <name> as maintainer
package: add version test
Then that should be reduced to just package: 1.0.0 -> 1.1.0, and would still be considered a logical unit?
In my opinion, this makes the git history much more unclear and hard to read, and almost defeats the purpose of having this convention in the first place. At least, I'd not name it a logical unit, but rather a "per-package commit" or something of that sort.
There was a problem hiding this comment.
That being said, I do realise that it can be unnecessarily painstaking for both the PR author and the reviewer to go forth and back about splitting commits and ensuring everything is ordered in what I'd previously call logical units, despite all the changes being otherwise mergeable.
There was a problem hiding this comment.
Thank you for starting this discussion, since I could definitely use the help making that clearer.
It might make more sense to reword that item with the outcome we're trying to create: not having superfluous commits in the PR, to avoid overhead when reviewing commits separately and/or having to squash all commits when merging the PR.
The same conventions later mention the following:
If you have commits pkg-name: oh, forgot to insert whitespace: squash commits in this case. Use git rebase -i.
Based on your example, I could see cases where package: clean up expression should remain as a single commit or all 3 bottom commits should also become a single commit.
There are too many combinations of what a committer/merger thinks should be merged with a squash commit or not, and what the PR author thinks is a logical unit, and what reviewers think is a logical unit and/or want to review.
After thinking more about this, I think it's impossible to clarify all of these cases directly, but it might still be possible to ask something like "think about whether you're creating reviewing overhead with too many commits, or whether a committer might be forced to squash your PR when merging, and try to avoid those cases".
There was a problem hiding this comment.
IMO mega-commits (except treewide) should be avoided. Large changes should be split into multiple commits in a way that first and foremost minimize the review effort. Each change can then also be reverted in isolation, and carry more specific and useful context in their commit message body. (The commit messages by nix-update --commit are a good example of why bump commits should be separate)
Example where the diff in the /files endpoint completely fails to show the true change compared to reviewing each commit: #271476
What we want to discourage are fixup commits like package: address review comments (unless the review is understood to take a long time and a squash will no matter what have to be a part of the final touch-ups anyway) or package: do x followed by a package: undo x and do y which got added after review. Those should be squashed
There was a problem hiding this comment.
There are too many combinations of what a committer/merger thinks should be merged with a squash commit or not
I think most effective way to reduce back-and-forth between reviewers and contributors would be to give contributors room for "ideally one, but up to 3 commits per package/module/unit per PR", before a reviewer have grounds to request a squash. This would allow a commit for both the intended change, a bump and a cleanup commit. "3" is subject to bikeshedding
There was a problem hiding this comment.
What we want to discourage are fixup commits like package: address review comments
Actually I'd like to contest even that. I've done PRs (e.g. #278805) with very carefully crafted commit histories, where each commit is a logical unit. After the review, it would be very tedious to amend each commit with its own changes, potentially requiring conflict resolution at every step. So just adding a commit that addresses the review altogether saves a lot of time and sanity.
It might be best for this section to not give any concrete guidelines, but rather just list good and bad examples.
There was a problem hiding this comment.
Actually I'd like to contest even that. I've done PRs (e.g. #278805) with very carefully crafted commit histories, where each commit is a logical unit. After the review, it would be very tedious to amend each commit with its own changes, potentially requiring conflict resolution at every step. So just adding a commit that addresses the review altogether saves a lot of time and sanity.
I see the following problems with that:
- Another reviewer comes in later and reviews the code commit-by-commit - they won't know about the changes at the end, which makes it incredibly hard.
- Same thing when tracking down the history of changes via git blame, it's incredibly useful to have those related changes not split up in different commits.
- Reverting a single commit is much harder than needed, because you need a different commit as well, but only a part of it. Annoying.
- Partial backports are becoming harder as well.
Yes, you'll need to rebase and resolve conflicts. But that's done once. Other people then have to deal with this forever, the git history is in stone.
TLDR: Reviewer/reader convenience trumps author/writer convenience.
There was a problem hiding this comment.
Fair! We can also encourage using e.g. https://github.com/tummychow/git-absorb to make rebasing easier.
| - Create a commit for each logical unit at the Nixpkgs level. | ||
| This means that you should rarely break multiple changes in the same file into different logical units, because that's too fine-grained at the Nixpkgs level. | ||
| Some cases where you'd do so are when each change is linked to a different change in Nixpkgs as a whole. | ||
| For example, if you're refactoring multiple files due to the same change, this is considered a logical unit, and should be in the same commit. | ||
| However, fixing different packages due to unrelated issues should ideally be in different commits, even if they're included in the same PR. |
There was a problem hiding this comment.
From my experience reviewing, commits are either correct or too big, most of the time. And when the commits are split up too much, it's much easier for everyone to say "please squash X and Y" instead of having them split up a commit, which is always annoying.
So imho, the "logical unit" explanation should focus on "the smallest possible logical unit" and on "making commits more reviewable".
With the changes written right now, it feels like it goes in the other direction - it encourages to "put everything in the same file into one commit".
Since it's probably the thing happening most often, I feel like we should make an example about "updating a package" and "cleaning up / refactoring / modernizing". Very often, those are put together into the same commit, which makes it much harder than needed to see which changes are actually required for the update and which are only supposed to be improvements on top etc. - also I think this kind of an example, most can relate to easily.
|
Ref #400934 |
Description of changes
The contributing guide doesn't really clarify what a "logical unit" is when it comes to how to separate commits. This is an attempt at doing so, although I could definitely use help being more specific.
The PR also adds a sentence to clarify that PR titles should follow the same titling conventions from commits.
Things done
nix.conf? (See Nix manual)sandbox = relaxedsandbox = truenix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/)Add a 👍 reaction to pull requests you find important.