Optionally allow git filters to be applied when reading blobs#13993
Optionally allow git filters to be applied when reading blobs#13993GrahamDennis wants to merge 8 commits intoNixOS:masterfrom
Conversation
|
Discussed in team meeting today:
|
|
For awareness, I have built a second patch on top of this patch to "smooth over" the breaking change that was made between nix < 2.20 and nix >= 2.20 with the handling of
This is intended to support a migration at Day Job where we can:
This additional patch is available here (for visibility): 2.31-maintenance...GrahamDennis:nix:gdennis/backcompat-2.31.2 If it is desired or considered more broadly useful, I can add this as a separate PR on top of this PR (assuming a version of this PR is accepted). My assumption is that this use case is sufficiently niche to not be worth merging into nix. |
|
@roberth, do you plan on reviewing this PR still or should I take over? |
roberth
left a comment
There was a problem hiding this comment.
@xokdvium I'd appreciate that, as I don't have the bandwidth right now for an in depth review.
My main remaining worry is that current or future libgit2 may apply smudge/clean filters as part of the current function call. Those are a significant risk in terms of fetcher reproducibility and therefore evaluation reproducibility; the reason why we only opt in to well-behaved filters like Git LFS and possibly others in the future.
This risk should be mitigated with another test case that sets up a custom smudge/clean filter and asserting that the filter did not run.
My apologies for the delay.
| rm -rf $TEST_HOME/.cache/nix | ||
| export GIT_CONFIG_GLOBAL="$TEST_ROOT/gitconfig" | ||
| git config --global core.autocrlf true | ||
| narhash=$(nix eval --raw --impure --expr "(builtins.fetchGit { url = \"$repo\"; ref = \"master\"; }).narHash") |
There was a problem hiding this comment.
This may succeed due to the fetcher cache.
Maybe run this as the very first test case
Then at least the error will be caught by the second run (without autocrlf), where it produces a hash mismatch.
This should be possible to mitigate with --tarball-ttl 0, but I haven't checked.
| "publicKeys", | ||
| "url", "ref", "rev", "shallow", "submodules", "lfs", "exportIgnore", | ||
| "lastModified", "revCount", "narHash", "allRefs", "name", "dirtyRev", "dirtyShortRev", | ||
| "verifyCommit", "keytype", "publicKey", "publicKeys", "applyFilters", |
There was a problem hiding this comment.
As an aside, it's strange that this doesn't need a change to .clang-format.
| git_blob_filter_options opts = GIT_BLOB_FILTER_OPTIONS_INIT; | ||
|
|
||
| opts.attr_commit_id = state->oid; | ||
| opts.flags = GIT_BLOB_FILTER_ATTRIBUTES_FROM_COMMIT; |
There was a problem hiding this comment.
| opts.flags = GIT_BLOB_FILTER_ATTRIBUTES_FROM_COMMIT; | |
| opts.flags = GIT_BLOB_FILTER_ATTRIBUTES_FROM_COMMIT | GIT_BLOB_FILTER_NO_SYSTEM_ATTRIBUTES; |
For reproducibility, we don't want the filtering to depend on the user's configuration.
| [[ "$narhash" = "sha256-k7u7RAaF+OvrbtT3KCCDQA8e9uOdflUo5zSgsosoLzA=" ]] | ||
|
|
||
|
|
||
| # Ensure that NAR hash doesn't depend on user configuration. |
There was a problem hiding this comment.
Surprised this succeeds given that GIT_BLOB_FILTER_NO_SYSTEM_ATTRIBUTES isn't currently passed.
|
|
||
| nix eval --impure --expr "let attrs = builtins.fetchGit $empty; in assert attrs.lastModified != 0; assert attrs.rev != \"0000000000000000000000000000000000000000\"; assert attrs.revCount == 1; true" | ||
|
|
||
| # Test a repo with `eol=crlf`. |
There was a problem hiding this comment.
One thing we need to test here is what happens if a file has the text attribute but no eol attribute, e.g.
*.txt text
According to the git man page, it will use the system default, i.e. convert to CR/LF on Windows. That's bad because it means the NAR hash of a flake input will depend on the system type.
However, it's not clear to me what libgit2 does. It does have
#ifdef GIT_WIN32
GIT_EOL_NATIVE = GIT_EOL_CRLF,
#else
GIT_EOL_NATIVE = GIT_EOL_LF,
#endif
GIT_EOL_DEFAULT = GIT_EOL_NATIVE,
but from what I can tell this might not have an effect it GIT_BLOB_FILTER_NO_SYSTEM_ATTRIBUTES is passed. But we should add a test that checks that such as repo produces the expected NAR hash.
| opts.attr_commit_id = state->oid; | ||
| opts.flags = GIT_BLOB_FILTER_ATTRIBUTES_FROM_COMMIT; | ||
|
|
||
| int error = git_blob_filter(&filtered, blob.get(), path.rel_c_str(), &opts); |
There was a problem hiding this comment.
One maybe theoretical concern is that this applies all the builtin libgit2 filters. So if a new filters is added to libgit2 in the future, a version of Nix linked against the new libgit2 might produce a different result than when it's linked against the old one.
It may be better to use git_filter_list_apply_to_blob because that allows specifying an explicit list of filters. However, I don't see a way to construct a git_filter_list containing only the expected filters...
| [[ "$narhash" = "sha256-BBhuj+vOnwCUnk5az22PwAnF32KE1aulWAVfCQlbW7U=" ]] | ||
| narhash=$(nix eval --raw --impure --expr "(builtins.fetchGit { url = \"$repo\"; ref = \"master\"; applyFilters = true; }).narHash") | ||
| [[ "$narhash" = "sha256-k7u7RAaF+OvrbtT3KCCDQA8e9uOdflUo5zSgsosoLzA=" ]] | ||
| unset GIT_CONFIG_GLOBAL |
There was a problem hiding this comment.
We should probably have a test here for the other builtin filter, ident, which replaces $Id$ in source files.
Taken from NixOS#13993. Co-authored-by: Josh Spence <jspence@anduril.com> Co-authored-by: Graham Dennis <gdennis@anduril.com>
Motivation
Nix 2.20 introduced a regression (a breaking change) in which git repositories do not properly have filters applied (specifically filters that are configured via
.gitattributes). This manifests as "NAR hash mismatch" errors when users are upgrading from earlier versions and they have a dependency that uses.gitattributesto manipulate line endings.curlis the most notable example I am aware of which, for example, specifies in.gitattributesthat*.batfiles should have CRLF line endings.This change differs from #13428 in that the nix 2.20 behavior is retained as the default, with the previous behavior now being opt-in via the optional
applyFiltersattribute to git flake inputs. As such, this PR does not introduce a breaking change.Context
Fixes #11428, alternative to #13428.
The current nix behaviour seems completely broken when it comes to handling git filters.
The output of the above script on my machine is as follows:
This shows that the way Nix is handling git repositories is inconsistent with git, irrespective of the user and/or system git configuration.
Another manifestation of this bug is that the NAR hash for a git repository can change depending on the order of evaluations. This can be demonstrated by the following script:
The output of this script is as follows:
Add 👍 to pull requests you find important.
The Nix maintainer team uses a GitHub project board to schedule and track reviews.