Skip to content

doc: add clarification on logical unit for each commit#285193

Closed
DanielSidhion wants to merge 1 commit intoNixOS:masterfrom
DanielSidhion:clarify-conventions
Closed

doc: add clarification on logical unit for each commit#285193
DanielSidhion wants to merge 1 commit intoNixOS:masterfrom
DanielSidhion:clarify-conventions

Conversation

@DanielSidhion
Copy link
Member

Description of changes

The contributing guide doesn't really clarify what a "logical unit" is when it comes to how to separate commits. This is an attempt at doing so, although I could definitely use help being more specific.

The PR also adds a sentence to clarify that PR titles should follow the same titling conventions from commits.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. labels Jan 31, 2024
Comment on lines +513 to +517
- Create a commit for each logical unit at the Nixpkgs level.
This means that you should rarely break multiple changes in the same file into different logical units, because that's too fine-grained at the Nixpkgs level.
Some cases where you'd do so are when each change is linked to a different change in Nixpkgs as a whole.
For example, if you're refactoring multiple files due to the same change, this is considered a logical unit, and should be in the same commit.
However, fixing different packages due to unrelated issues should ideally be in different commits, even if they're included in the same PR.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might just be me, but this wording blurs my understanding of a logical unit. Previously, I would've thought that a logical unit meant a single change with a single intent, and that change or intent would be representable as a single short sentence in the commit message.

This however, seems to claim that any extra changes to the package would also be considered part of one commit. So if you have a pull request with the following commits, currently split up into what I'd previously call logical units:

package: 1.0.0 -> 1.1.0
         (also changes some dependencies that are a direct result
          of the package being upgraded)
package: clean up expression
package: add <name> as maintainer
package: add version test

Then that should be reduced to just package: 1.0.0 -> 1.1.0, and would still be considered a logical unit?

In my opinion, this makes the git history much more unclear and hard to read, and almost defeats the purpose of having this convention in the first place. At least, I'd not name it a logical unit, but rather a "per-package commit" or something of that sort.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said, I do realise that it can be unnecessarily painstaking for both the PR author and the reviewer to go forth and back about splitting commits and ensuring everything is ordered in what I'd previously call logical units, despite all the changes being otherwise mergeable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for starting this discussion, since I could definitely use the help making that clearer.

It might make more sense to reword that item with the outcome we're trying to create: not having superfluous commits in the PR, to avoid overhead when reviewing commits separately and/or having to squash all commits when merging the PR.

The same conventions later mention the following:

If you have commits pkg-name: oh, forgot to insert whitespace: squash commits in this case. Use git rebase -i.

Based on your example, I could see cases where package: clean up expression should remain as a single commit or all 3 bottom commits should also become a single commit.

There are too many combinations of what a committer/merger thinks should be merged with a squash commit or not, and what the PR author thinks is a logical unit, and what reviewers think is a logical unit and/or want to review.
After thinking more about this, I think it's impossible to clarify all of these cases directly, but it might still be possible to ask something like "think about whether you're creating reviewing overhead with too many commits, or whether a committer might be forced to squash your PR when merging, and try to avoid those cases".

Copy link
Member

@pbsds pbsds Jan 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO mega-commits (except treewide) should be avoided. Large changes should be split into multiple commits in a way that first and foremost minimize the review effort. Each change can then also be reverted in isolation, and carry more specific and useful context in their commit message body. (The commit messages by nix-update --commit are a good example of why bump commits should be separate)

Example where the diff in the /files endpoint completely fails to show the true change compared to reviewing each commit: #271476

What we want to discourage are fixup commits like package: address review comments (unless the review is understood to take a long time and a squash will no matter what have to be a part of the final touch-ups anyway) or package: do x followed by a package: undo x and do y which got added after review. Those should be squashed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are too many combinations of what a committer/merger thinks should be merged with a squash commit or not

I think most effective way to reduce back-and-forth between reviewers and contributors would be to give contributors room for "ideally one, but up to 3 commits per package/module/unit per PR", before a reviewer have grounds to request a squash. This would allow a commit for both the intended change, a bump and a cleanup commit. "3" is subject to bikeshedding

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we want to discourage are fixup commits like package: address review comments

Actually I'd like to contest even that. I've done PRs (e.g. #278805) with very carefully crafted commit histories, where each commit is a logical unit. After the review, it would be very tedious to amend each commit with its own changes, potentially requiring conflict resolution at every step. So just adding a commit that addresses the review altogether saves a lot of time and sanity.

It might be best for this section to not give any concrete guidelines, but rather just list good and bad examples.

Copy link
Contributor

@wolfgangwalther wolfgangwalther Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I'd like to contest even that. I've done PRs (e.g. #278805) with very carefully crafted commit histories, where each commit is a logical unit. After the review, it would be very tedious to amend each commit with its own changes, potentially requiring conflict resolution at every step. So just adding a commit that addresses the review altogether saves a lot of time and sanity.

I see the following problems with that:

  • Another reviewer comes in later and reviews the code commit-by-commit - they won't know about the changes at the end, which makes it incredibly hard.
  • Same thing when tracking down the history of changes via git blame, it's incredibly useful to have those related changes not split up in different commits.
  • Reverting a single commit is much harder than needed, because you need a different commit as well, but only a part of it. Annoying.
  • Partial backports are becoming harder as well.

Yes, you'll need to rebase and resolve conflicts. But that's done once. Other people then have to deal with this forever, the git history is in stone.

TLDR: Reviewer/reader convenience trumps author/writer convenience.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair! We can also encourage using e.g. https://github.com/tummychow/git-absorb to make rebasing easier.

@FliegendeWurst FliegendeWurst added 8.has: documentation This PR adds or changes documentation awaiting_changes labels Nov 30, 2024
@wegank wegank added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jan 2, 2025
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Apr 15, 2025
Comment on lines +513 to +517
- Create a commit for each logical unit at the Nixpkgs level.
This means that you should rarely break multiple changes in the same file into different logical units, because that's too fine-grained at the Nixpkgs level.
Some cases where you'd do so are when each change is linked to a different change in Nixpkgs as a whole.
For example, if you're refactoring multiple files due to the same change, this is considered a logical unit, and should be in the same commit.
However, fixing different packages due to unrelated issues should ideally be in different commits, even if they're included in the same PR.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my experience reviewing, commits are either correct or too big, most of the time. And when the commits are split up too much, it's much easier for everyone to say "please squash X and Y" instead of having them split up a commit, which is always annoying.

So imho, the "logical unit" explanation should focus on "the smallest possible logical unit" and on "making commits more reviewable".

With the changes written right now, it feels like it goes in the other direction - it encourages to "put everything in the same file into one commit".

Since it's probably the thing happening most often, I feel like we should make an example about "updating a package" and "cleaning up / refactoring / modernizing". Very often, those are put together into the same commit, which makes it much harder than needed to see which changes are actually required for the update and which are only supposed to be improvements on top etc. - also I think this kind of an example, most can relate to easily.

@h7x4
Copy link
Member

h7x4 commented Apr 23, 2025

Ref #400934

@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Jul 14, 2025
@nixpkgs-ci nixpkgs-ci bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Oct 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2.status: merge conflict This PR has merge conflicts with the target branch 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md 8.has: documentation This PR adds or changes documentation 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants