fetchFromGitHub: converge arguments that determines useFetchGit #456226

Merged: ShamrockLee merged 3 commits into NixOS:master from ShamrockLee:fetchgithub-fetchgit on Dec 24, 2025

@ShamrockLee (Contributor) commented Oct 27, 2025

Converge the specification of fetchgit-related fetchFromGitHub arguments.
Before this change, arguments like fetchSubmodules are listed in four places:

  • fetchFromGitHub's argument set pattern.
  • useFetchGit determination.
  • passthruFun's exclusion list.
  • Arguments to pass down to fetchgit.

These lists get out of sync over time. Before this change, specifying any of the following results in a "called with unexpected argument" error:

  • leaveDotGit = false
  • fetchLFS = false
  • rootDir = ""
  • sparseCheckout = [ ]

This PR converges the specification and handling of these arguments into the let-in block, and adds back the additional __functionArgs with a helper function, adjustFunctionArgs.
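
The idea of keeping a single specification can be modeled outside Nix. The following Python sketch is only an illustration, not the nixpkgs implementation: the table `FETCHGIT_ONLY_DEFAULTS`, the `COMMON_ARGS` subset, the function `fetch_from_github`, the backend label `"tarball"`, and the rule "use git only when a fetchgit-only argument differs from its default" are all assumptions made for this example. It shows how the accepted arguments, the useFetchGit decision, the exclusion list, and the forwarded arguments can all be derived from one table, so that passing a default such as leaveDotGit = false is accepted rather than rejected.

```python
# Hypothetical model, not the nixpkgs implementation: one table of
# fetchgit-only arguments and their defaults drives every derived list.
FETCHGIT_ONLY_DEFAULTS = {
    "fetchSubmodules": False,
    "leaveDotGit": False,
    "fetchLFS": False,
    "rootDir": "",
    "sparseCheckout": [],
}

# Arguments this toy fetcher always accepts (illustrative subset).
COMMON_ARGS = {"owner", "repo", "rev", "hash"}


def fetch_from_github(**args):
    """Return the backend the toy fetcher would pick and what it forwards."""
    # Accepted names are derived from the same table, so they cannot drift.
    unexpected = set(args) - COMMON_ARGS - set(FETCHGIT_ONLY_DEFAULTS)
    if unexpected:
        raise TypeError(f"called with unexpected argument: {sorted(unexpected)}")

    # Assumption: git is needed only when a fetchgit-only argument is set
    # to something other than its default.
    non_default = {
        name: value
        for name, value in args.items()
        if name in FETCHGIT_ONLY_DEFAULTS and value != FETCHGIT_ONLY_DEFAULTS[name]
    }
    use_fetch_git = bool(non_default)

    # The exclusion list and the pass-down set come from the same table too.
    common = {k: v for k, v in args.items() if k not in FETCHGIT_ONLY_DEFAULTS}
    forwarded = {**common, **non_default} if use_fetch_git else common
    return ("fetchgit" if use_fetch_git else "tarball", forwarded)


# Explicit defaults are accepted and do not switch the backend.
print(fetch_from_github(owner="NixOS", repo="nixpkgs", rev="master", leaveDotGit=False))
# A non-default fetchgit-only argument switches to the git backend.
print(fetch_from_github(owner="NixOS", repo="nixpkgs", rev="master", fetchSubmodules=True))
```

Because every derived set is computed from `FETCHGIT_ONLY_DEFAULTS`, adding a new fetchgit-only argument in this model means editing one table rather than four separate lists.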

This PR depends on PR #462032 for a cleaner implementation.

Things done

  • Built on platform:
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • Tested, as applicable:
  • Ran nixpkgs-review on this PR. See nixpkgs-review usage.
  • Tested basic functionality of all binary files, usually in ./result/bin/.
  • Nixpkgs Release Notes
    • Package update: when the change is major or breaking.
  • NixOS Release Notes
    • Module addition: when adding a new NixOS module.
    • Module update: when the change is significant.
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other READMEs.

Add a 👍 reaction to pull requests you find important.

nixpkgs-ci bot added the labels 10.rebuild-linux: 11-100, 10.rebuild-darwin: 11-100, and 6.topic: fetch on Oct 27, 2025
ShamrockLee changed the title from "fetchFromGitHub: converge fetchgit-specific arguments" to "fetchFromGitHub: converge arguments that determines useFetchGit" on Oct 28, 2025
ShamrockLee force-pushed the fetchgithub-fetchgit branch 3 times, most recently from 0f69bce to 8fe9d9e, on October 28, 2025 13:13
nixpkgs-ci bot added the labels 10.rebuild-darwin: 0 and 10.rebuild-linux: 0, and removed the labels 10.rebuild-linux: 11-100 and 10.rebuild-darwin: 11-100, on Oct 28, 2025

@ShamrockLee (Contributor, Author) commented:

This PR now achieves zero rebuilds, indicating that the implementation is likely correct.

ShamrockLee marked this pull request as ready for review on October 28, 2025 13:25
nix-owners bot requested a review from philiptaron on October 28, 2025 13:26
nixpkgs-ci bot added and removed the label 9.needs: reviewer on Oct 28, 2025

@philiptaron (Contributor) commented:

I think the same implementation is in fetchFromGitLab also. Could you check whether it's been copy-and-pasted into that fetcher, possibly with small mutations? It doesn't have to be fixed in the same PR.

@philiptaron (Contributor) left a comment:

Looking at the performance results, this is a 1% time regression due to the additional complexity in this extremely hot code path.

Is there a way to shave that down without losing the correctness gains?

@ShamrockLee (Contributor, Author) commented Oct 28, 2025

> Looking at the performance results, this is a 1% time regression due to the additional complexity in this extremely hot code path.

How do you get the "1% time regression" figure from the performance results? (Is it the CPU time difference?)

> Is there a way to shave that down without losing the correctness gains?

I prepared a more verbose version that clears useFetchGitArgs from the passthruAttrs exclusion list and places the other three lists side by side. I want to know how to inspect the performance before pushing it.

Update:
Is it $0.1437 / 24.9144 \approx 0.0058 \approx 1\%$?

Unchanged values

| metric | value |
|---|---|
| sizes.Attr [1] | 16 |
| sizes.Bindings [2] | 24 |
| sizes.Env [3] | 8 |
| sizes.Value [4] | 16 |

Updated values

| metric | mean_before | mean_after | mean_diff | mean_pct_change | p_value | t_stat |
|---|---|---|---|---|---|---|
| time.cpu [5] | 24.9144 | 25.0581 | 0.1437 | 0.5234 | 0.0766 | 1.8074 |
| time.gc [6] | 2.0047 | 1.9893 | -0.0153 | -0.8287 | 0.6015 | -0.5256 |
| time.gcFraction [7] | 0.0774 | 0.0761 | -0.0013 | -1.3549 | 0.2126 | -1.2621 |
| gc.cycles [8] | 8.4423 | 8.4615 | 0.0192 | 0.3059 | 0.6591 | 0.4437 |
| gc.heapSize [9] | 1746383163.0769 | 1761224546.4615 | 14841383.3846 | 0.5139 | 0.0298 | 2.2354 |
| gc.totalBytes [10] | 3335312170.4615 | 3344562552.9231 | 9250382.4615 | 0.2866 | - | 14.7469 |
| envs.bytes [11] | 588356853.5385 | 590655201.2308 | 2298347.6923 | 0.4072 | - | 14.7680 |
| list.bytes [12] | 83221370.0000 | 83875598.4615 | 654228.4615 | 0.8106 | - | 14.7723 |
| sets.bytes [13] | 1262040346.6154 | 1266150724.1538 | 4110377.5385 | 0.3418 | - | 14.7311 |
| symbols.bytes [14] | 1498794.0769 | 1499042.6923 | 248.6154 | 0.0176 | - | 383.1976 |
| values.bytes [15] | 797273941.2308 | 799022144.9231 | 1748203.6923 | 0.2328 | - | 14.7752 |
| envs.number [16] | 29550061.2885 | 29693621.4615 | 143560.1731 | 0.5071 | - | 14.7631 |
| nrAvoided [17] | 36011846.5577 | 36140048.2115 | 128201.6538 | 0.3649 | - | 14.7796 |
| nrExprs [18] | 1591866.6538 | 1591957.6538 | 91.0000 | 0.0060 | - | inf |
| nrFunctionCalls [19] | 26580852.0192 | 26710839.7500 | 129987.7308 | 0.5091 | - | 14.7593 |
| nrLookups [20] | 13024245.7115 | 13058710.5000 | 34464.7885 | 0.2832 | - | 14.7367 |
| nrOpUpdateValuesCopied [21] | 39877018.4038 | 39899766.5000 | 22748.0962 | 0.0601 | - | 14.6064 |
| nrOpUpdates [22] | 4555007.8846 | 4582691.7500 | 27683.8654 | 0.6139 | - | 14.7267 |
| nrPrimOpCalls [23] | 15487426.7692 | 15534848.7308 | 47421.9615 | 0.3109 | - | 14.7525 |
| nrThunks [24] | 37628193.5192 | 37730703.2115 | 102509.6923 | 0.2853 | - | 14.7762 |
| sets.number [25] | 6443515.0192 | 6485331.7885 | 41816.7692 | 0.6843 | - | 14.7215 |
| symbols.number [26] | 120905.2885 | 120920.0577 | 14.7692 | 0.0126 | - | 250.3374 |
| values.number [27] | 49829621.3269 | 49938884.0577 | 109262.7308 | 0.2328 | - | 14.7752 |
| envs.elements [28] | 43994545.4038 | 44138278.6923 | 143733.2885 | 0.3403 | - | 14.7729 |
| list.concats [29] | 2124602.0577 | 2131355.0192 | 6752.9615 | 0.3339 | - | 14.7599 |
| list.elements [30] | 10402671.2500 | 10484449.8077 | 81778.5577 | 0.8106 | - | 14.7723 |
| sets.elements [31] | 69212249.1346 | 69406422.5769 | 194173.4423 | 0.2951 | - | 14.7342 |

Footnotes

  1. Size in bytes of the Attr type.

  2. Size in bytes of the Bindings type.

  3. Size in bytes of the Env type.

  4. Size in bytes of the Value type.

  5. Number of seconds of CPU time accounted by the OS to the Nix evaluator process. On UNIX systems, this comes from getrusage(RUSAGE_SELF).

  6. Number of seconds of CPU time accounted by the Boehm garbage collector to performing GC.

  7. What fraction of the total CPU time is accounted towards performing GC.

  8. Number of times garbage collection has been performed.

  9. Size in bytes of the garbage collector heap.

  10. Size in bytes of all allocations in the garbage collector.

  11. Size in bytes of all Env objects allocated by the Nix evaluator. These are almost exclusively created by nix-env.

  12. Size in bytes of all lists allocated by the Nix evaluator.

  13. Size in bytes of all attrsets allocated by the Nix evaluator.

  14. Size in bytes of all items in the Nix evaluator symbol table.

  15. Size in bytes of all values allocated by the Nix evaluator.

  16. The count of all Env objects allocated.

  17. The number of thunks avoided being created.

  18. The number of expression objects ever created.

  19. The number of function calls ever made.

  20. The number of lookups into an attrset ever made.

  21. The number of attrset values copied in the process of merging attrsets.

  22. The number of attrsets merge operations (//) performed.

  23. The number of function calls to primops (Nix builtins) ever made.

  24. The number of thunks ever made. A thunk is a delayed computation, represented by an expression reference and a closure.

  25. The number of attrsets ever made.

  26. The number of symbols ever added to the symbol table.

  27. The number of values ever made.

  28. The number of values contained within an Env object.

  29. The number of list concatenation operations (++) performed.

  30. The number of values contained within a list.

  31. The number of values contained within an attrset.
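
The ratio asked about in the update above can be checked directly from the time.cpu row of this report. This is a small Python sketch that uses only the two means from that row; the variable names are illustrative. The report's own mean_pct_change column (0.5234) is not recomputed here, because how the CI derives it is not stated in this thread.

```python
# Values copied from the time.cpu row of the report above.
mean_before = 24.9144  # seconds of CPU time before the change
mean_after = 25.0581   # seconds of CPU time after the change

# Absolute difference and its size relative to the baseline.
mean_diff = mean_after - mean_before
relative_change = mean_diff / mean_before

print(f"mean_diff = {mean_diff:.4f} s")
print(f"relative change = {relative_change:.4%}")
```

The relative change comes out at roughly 0.58%, so the ratio of the two means is closer to half a percent than to a full 1%.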

ShamrockLee force-pushed the fetchgithub-fetchgit branch 3 times, most recently from be0add3 to 1eaf608, on October 28, 2025 19:35

@ShamrockLee (Contributor, Author) commented:

I remember that the CPU time was better at one point after the change in my last force-push, but it became even worse than the fully converged version after I pushed the formatting changes. Is the result significant?

Unchanged values

| metric | value |
|---|---|
| list.concats [1] | 4402174 |
| sizes.Attr [2] | 16 |
| sizes.Bindings [3] | 24 |
| sizes.Env [4] | 8 |
| sizes.Value [5] | 16 |

Updated values

| metric | mean_before | mean_after | mean_diff | mean_pct_change | p_value | t_stat |
|---|---|---|---|---|---|---|
| time.cpu [6] | 24.8909 | 25.0528 | 0.1619 | 0.7715 | 0.2074 | 1.2771 |
| time.gc [7] | 2.0013 | 1.9662 | -0.0350 | -0.9057 | 0.2118 | -1.2645 |
| time.gcFraction [8] | 0.0767 | 0.0752 | -0.0015 | -1.7244 | 0.0835 | -1.7655 |
| gc.cycles [9] | 8.4038 | 8.4038 | - | 0.0389 | 1.0000 | - |
| gc.heapSize [10] | 1751222744.6154 | 1753803854.7692 | 2581110.1538 | 0.0260 | 0.1319 | 1.5313 |
| gc.totalBytes [11] | 3335563133.8462 | 3338222480.3077 | 2659346.4615 | 0.0823 | - | 14.7472 |
| envs.bytes [12] | 588428991.0769 | 588809652.4615 | 380661.3846 | 0.0675 | - | 14.8061 |
| list.bytes [13] | 83228920.6154 | 83499041.0769 | 270120.4615 | 0.3347 | - | 14.7561 |
| sets.bytes [14] | 1262109149.2308 | 1263358087.3846 | 1248938.1538 | 0.1037 | - | 14.7483 |
| symbols.bytes [15] | 1498806.8269 | 1498860.8269 | 54.0000 | 0.0038 | - | inf |
| values.bytes [16] | 797349980.6154 | 797893392.3077 | 543411.6923 | 0.0724 | - | 14.7973 |
| envs.number [17] | 29553669.1923 | 29580789.5577 | 27120.3654 | 0.0958 | - | 14.7930 |
| nrAvoided [18] | 36016602.5192 | 36030189.9808 | 13587.4615 | 0.0387 | - | 14.8079 |
| nrExprs [19] | 1591878.1538 | 1591915.1538 | 37.0000 | 0.0025 | - | inf |
| nrFunctionCalls [20] | 26584120.9615 | 26604458.5577 | 20337.5962 | 0.0797 | - | 14.7934 |
| nrLookups [21] | 13025607.3077 | 13039214.9231 | 13607.6154 | 0.1119 | - | 14.8128 |
| nrOpUpdateValuesCopied [22] | 39877613.4423 | 39878971.8846 | 1358.4423 | 0.0035 | - | 12.6506 |
| nrOpUpdates [23] | 4555565.8077 | 4556013.8462 | 448.0385 | 0.0097 | - | 12.6075 |
| nrPrimOpCalls [24] | 15489757.8846 | 15510046.4038 | 20288.5192 | 0.1330 | - | 14.7684 |
| nrThunks [25] | 37632795.5000 | 37660005.7500 | 27210.2500 | 0.0757 | - | 14.8073 |
| sets.number [26] | 6444239.7308 | 6458242.8077 | 14003.0769 | 0.2291 | - | 14.7487 |
| symbols.number [27] | 120906.7115 | 120910.7115 | 4.0000 | 0.0034 | - | inf |
| values.number [28] | 49834373.7885 | 49868337.0192 | 33963.2308 | 0.0724 | - | 14.7973 |
| envs.elements [29] | 43999954.6923 | 44020417.0000 | 20462.3077 | 0.0485 | - | 14.8230 |
| list.elements [30] | 10403615.0769 | 10437380.1346 | 33765.0577 | 0.3347 | - | 14.7561 |
| sets.elements [31] | 69215462.2308 | 69272516.2500 | 57054.0192 | 0.0866 | - | 14.7479 |

Footnotes

  1. The number of list concatenation operations (++) performed.

  2. Size in bytes of the Attr type.

  3. Size in bytes of the Bindings type.

  4. Size in bytes of the Env type.

  5. Size in bytes of the Value type.

  6. Number of seconds of CPU time accounted by the OS to the Nix evaluator process. On UNIX systems, this comes from getrusage(RUSAGE_SELF).

  7. Number of seconds of CPU time accounted by the Boehm garbage collector to performing GC.

  8. What fraction of the total CPU time is accounted towards performing GC.

  9. Number of times garbage collection has been performed.

  10. Size in bytes of the garbage collector heap.

  11. Size in bytes of all allocations in the garbage collector.

  12. Size in bytes of all Env objects allocated by the Nix evaluator. These are almost exclusively created by nix-env.

  13. Size in bytes of all lists allocated by the Nix evaluator.

  14. Size in bytes of all attrsets allocated by the Nix evaluator.

  15. Size in bytes of all items in the Nix evaluator symbol table.

  16. Size in bytes of all values allocated by the Nix evaluator.

  17. The count of all Env objects allocated.

  18. The number of thunks avoided being created.

  19. The number of expression objects ever created.

  20. The number of function calls ever made.

  21. The number of lookups into an attrset ever made.

  22. The number of attrset values copied in the process of merging attrsets.

  23. The number of attrsets merge operations (//) performed.

  24. The number of function calls to primops (Nix builtins) ever made.

  25. The number of thunks ever made. A thunk is a delayed computation, represented by an expression reference and a closure.

  26. The number of attrsets ever made.

  27. The number of symbols ever added to the symbol table.

  28. The number of values ever made.

  29. The number of values contained within an Env object.

  30. The number of values contained within a list.

  31. The number of values contained within an attrset.
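
One way to approach the significance question is to compare the p_value column against a threshold. The Python sketch below does that for the rows of the two reports above that carry a numeric p-value. The 0.05 threshold and the helper name `significant_metrics` are assumptions for illustration, not something the CI defines in this thread.

```python
ALPHA = 0.05  # assumed conventional threshold, not taken from the CI

# (metric, p_value) pairs copied from the rows with a numeric p_value.
first_report = [
    ("time.cpu", 0.0766),
    ("time.gc", 0.6015),
    ("time.gcFraction", 0.2126),
    ("gc.cycles", 0.6591),
    ("gc.heapSize", 0.0298),
]
second_report = [
    ("time.cpu", 0.2074),
    ("time.gc", 0.2118),
    ("time.gcFraction", 0.0835),
    ("gc.cycles", 1.0000),
    ("gc.heapSize", 0.1319),
]


def significant_metrics(rows, alpha=ALPHA):
    """Return the metric names whose p-value is below alpha."""
    return [metric for metric, p_value in rows if p_value < alpha]


print(significant_metrics(first_report))
print(significant_metrics(second_report))
```

Under this assumed threshold, time.cpu does not pass in either report; only gc.heapSize in the first report does.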

ShamrockLee force-pushed the fetchgithub-fetchgit branch 4 times, most recently from ede6697 to fa5f955, on October 29, 2025 12:41

@ShamrockLee (Contributor, Author) commented:

@philiptaron I force-pushed twice with only variable-renaming changes, but got CPU time changes of +0.8450% and -1.0678% before and after the force-push, both reported with p < 0.01.

I don't think the CI report's CPU time change is significant or the p-value trustworthy.

nixpkgs-ci bot added the labels 10.rebuild-linux: 11-100 and 10.rebuild-darwin: 11-100 on Nov 16, 2025
ShamrockLee force-pushed the fetchgithub-fetchgit branch 2 times, most recently from d7b6de8 to 8e24202, on November 17, 2025 12:29
ShamrockLee force-pushed the fetchgithub-fetchgit branch 3 times, most recently from aaf0666 to 746e443, on December 4, 2025 07:35
ShamrockLee marked this pull request as ready for review on December 9, 2025 14:46
nixpkgs-ci bot added the labels 10.rebuild-darwin: 0 and 10.rebuild-linux: 0, and removed the labels 10.rebuild-linux: 11-100 and 10.rebuild-darwin: 11-100, on Dec 9, 2025

@philiptaron (Contributor) commented:

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review pr 456226
Commit: cd58ed0ab6e94624acd1757c5241d2c4cf4568ad

@philiptaron (Contributor) left a comment:

It's a bit complicated, but I think this works. The convergence is needed.

nixpkgs-ci bot added the label 12.approvals: 1 on Dec 9, 2025

@ShamrockLee (Contributor, Author) commented Dec 10, 2025

> It's a bit complicated,

@philiptaron I borrowed some code from fetchGitProvider to simplify the expression a little (in the last commit). Is it better now?

If it looks even more complex, I can drop the additional commit.

@philiptaron (Contributor) left a comment:

I did a little testing; I think this holds together. @ShamrockLee I'm looking for you to self-merge when you think it's ready. I don't have a harness that's good enough at detecting FOD breaks to merge it myself. But you do have the ✅ from me.

@ShamrockLee (Contributor, Author) commented:

Thank you!

> I don't have a harness that's good enough at detecting FOD breaks to merge it myself.

Regrettably, I currently rely on post-merge complaints to detect behavioral changes that fall outside test coverage. The good news is that this PR causes zero rebuilds, which makes it a lot safer to merge.

ShamrockLee added this pull request to the merge queue on Dec 24, 2025
Merged via the queue into NixOS:master with commit c16f38e on Dec 24, 2025
28 of 30 checks passed
ShamrockLee deleted the fetchgithub-fetchgit branch on December 24, 2025 23:44

Labels

  • 6.topic: fetch (Fetchers, e.g. fetchgit, fetchsvn, ...)
  • 10.rebuild-darwin: 0 (This PR does not cause any packages to rebuild on Darwin.)
  • 10.rebuild-linux: 0 (This PR does not cause any packages to rebuild on Linux.)
  • 12.approvals: 1 (This PR was reviewed and approved by one person.)

2 participants