-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support relative paths to linked working trees #5898
Conversation
The GetAllWorkTrees() function in our "git" package was introduced as part of the implementation of the "git lfs prune" command in PR git-lfs#742. This function examines the contents of a repository's ".git/worktrees" folder and returns an array of Worktree structures for any linked working trees it finds, plus one Worktree structure for the main working tree, if one exists. The "git lfs prune" command then checks these working trees' current Git references and indexes for any Git LFS objects it should retain in the Git LFS object cache. In subsequent commits in this PR we expect to add an alternative implementation of the GetAllWorkTrees() function which uses the "git worktree list" command, as this will ensure we remain compatible with some forthcoming changes in Git that will alter the contents of the "gitdir" files in ".git/worktrees" hierarchy. We will need to retain the legacy implementation of the GetAllWorkTrees() function as well, as prior to Git version 2.36.0 the "git worktree list" command is either not available or does not support the full set of options we will require. Before introducing our new version of the GetAllWorkTrees() function, we first expand the set of tests and checks we perform against it and the "git lfs prune" command. This will allow us to verify that both implementations return identical data under the same conditions. In particular, in our existing TestWorkTrees() Go test function we add a linked working tree which references a detached HEAD in the form of local tag instead of a branch, and confirm that its SHA is also used as its name, and that its type is returned as RefTypeOther. We also make use of the NoError() assertion to check the returned error value from the GetAllWorkTrees() function. As well, we add a new TestWorktreesBareRepo() Go test function which checks that when GetAllWorkTrees() is run in a bare repository it does not return a Worktree structure at all, as there is no main working tree. To create the bare repository we add a NewBareRepo() wrapper function to our "util" test utility package. In our t/t-prune-worktree.sh shell test script, we slightly expand the set of checks performed by our "prune worktree" test, just to clarify which objects are pruned by our "git lfs prune" command when a linked working tree has been removed but the "git worktree prune" command has not yet been run. Finally, we add a "prune worktree (bare main)" test to our test script which confirms that when the "git lfs prune" command is run in a working tree linked to a bare repository, no Git LFS objects are pruned, even if they match no retention configuration settings. This is because the bare main repository has no Git remotes, so all Git LFS objects are treated as never having been pushed to a remote and are therefore all ineligible for pruning.
The GetAllWorkTrees() function in our "git" package was introduced as part of the implementation of the "git lfs prune" command in PR git-lfs#742, along with the Worktree structure and various related functions, such as the pruneTaskGetRetainedWorktree() function run by the "git lfs prune" command. In all instances, we use the terms "Worktree" and "worktree" rather than the camel-case "WorkTree" or "workTree", except in the name of the GetAllWorkTrees() function and its associated TestWorkTrees() test function. Before we add another implementation of the GetAllWorkTrees() function in a subsequent commit in this PR, we first revise the function's name to match that of the structures it returns, which slightly improves the consistency of our naming scheme.
The GetAllWorktrees() function in our "git" package was introduced as part of the implementation of the "git lfs prune" command in PR git-lfs#742. This function examines the contents of a repository's ".git/worktrees" folder and returns an array of Worktree structures for any linked working trees it finds, plus one Worktree structure for the main working tree, if one exists. The "git lfs prune" command then checks these working trees' current Git references and indexes for any Git LFS objects it should retain in the Git LFS object cache. In subsequent commits in this PR we expect to add an alternative implementation of the GetAllWorktrees() function which uses the "git worktree list" command, as this will ensure we remain compatible with some forthcoming changes in Git that will alter the contents of the "gitdir" files in ".git/worktrees" hierarchy. We will need to retain the legacy implementation of the GetAllWorktrees() function as well, as prior to Git version 2.36.0 the "git worktree list" command is either not available or does not support the full set of options we will require. The "git worktree list" command reports when a linked working tree no longer exists and so its data in the ".git/worktrees" hierarchy could be pruned with the "git worktree prune" command. Our new implementation of the GetAllWorktrees() function will therefore be able to return this prunable state as a flag in the Worktree structure. The pruneTaskGetRetainedIndex() function run by our "git lfs prune" command calls the ScanIndex() method of the GitScanner structure in our "lfs" package, whose internal functions then create two DiffIndexScanner structures and invoke their Scan() methods repeatedly. That structure's initialization function, NewDiffIndexScanner(), uses the DiffIndex() function in our "git" package to start a "git diff-index" command and pipe its output into a buffer, which is then consumed by the DiffIndexScanner's Scan() method. As we run these "git diff-index" commands in each linked working tree, if it no longer actually exists, the commands simply fail. We silently ignore these failures because the DiffIndex() function uses the gitBufferedStdout() function to start the command, and that discards the stderr stream, we never call the Wait() method of the underlying Cmd structure from the "os/exec" package, so we also discard the exit code from the "git diff-index" command. However, we can avoid running unnecessary "git diff-index" commands entirely if we detect that a linked working tree no longer exists. As this will be straightforward in our new implementation of the GetAllWorktrees() function, we first update our existing version of the function to perform a check similar to one made by Git, and return a true flag value in our Worktree structure if the working tree is missing. We add a Prunable flag to the Worktree structure, and check it in the pruneTaskGetRetainedWorktree() function so we only run the pruneTaskGetRetainedIndex() function if a linked working tree's Prunable flag is false. We then expand our existing TestWorktrees() test function to validate that when a linked working tree is removed, the Prunable flag is set as we expect. Note that our "prune worktree" test in the t/t-prune-worktree.sh test script already performs checks of our "git lfs prune" command after a linked working tree is removed. These checks pass without the changes in this commit because the "git diff-index" commands we try to run in the working tree simply fail, and we then ignore those failures. With the changes in this commit, the behaviour of the "git lfs prune" command remains the same, but it no longer tries to run "git diff-index" in non-extant working trees.
The GetAllWorktrees() function in our "git" package was introduced as part of the implementation of the "git lfs prune" command in PR git-lfs#742. This function examines the contents of a repository's ".git/worktrees" folder and returns an array of Worktree structures for any linked working trees it finds, plus one Worktree structure for the main working tree, if one exists. The "git lfs prune" command then checks these working trees' current Git references and indexes for any Git LFS objects it should retain in the Git LFS object cache. As noted in previous commits in this PR, to check for Git LFS objects which should be retained, we run the "git diff-index" command in each extant linked working tree. Specifically, the commands are started by the DiffIndex() function in our "git" package, which is run by the NewDiffIndexScanner() function called by the internal functions invoked by the ScanIndex() method of the GitScanner structure in our "lfs" package. That method is run by the pruneTaskGetRetainedIndex() function of our "git lfs prune" command. In order to execute the "git diff-index" command in each linked working tree, we require the full, absolute paths of the working trees. At the moment, our GetAllWorktrees() function expects that it can retrieve these full paths from the "gitdir" files in each working tree's subdirectory under the ".git/worktrees" directory. However, forthcoming changes in Git, likely to be released in Git version 2.48.0, will alter the contents of these "gitdir" files to contain relative paths. As a result, we will no longer be able to rely on them to determine a full path to each linked working tree. In particular, commit git/git@717af91 causes Git to now write relative paths into the "gitdir" files, while still supporting any legacy absolute paths found in existing "gitdir" files. We can avoid the need to try to parse these relative paths and determine the correct full path by using the "git worktree list" command to retrieve the list of linked working trees, plus the main working tree, if any. This command returns full paths along with the current Git references associated with each working tree, plus several additional attributes. Prior to Git version 2.36.0, though, the "git worktree list" command either did not exist, lacked support for the "--porcelain" option, or lacked support for the "-z" option, so we continue to use our existing GetAllWorktrees() function's implementation when the available Git version is lower than 2.36.0. We therefore rename the existing version of our GetAllWorktrees() function to getAllWorktreesFromGitDir(), and invoke it at the top of our new version of the GetAllWorktrees() function if the Git version is not 2.36.0 or above. Our new implementation of the GetAllWorktrees() function simply calls the "git worktree list --porcelain -z" command and parses the output, creating the same array of Worktree structures as if we had parsed the contents of the ".git/worktrees" hierarchy directly, except that we can depend on the Git command to resolve any relative paths in "gitdir" files into absolute ones for us. We do need to call the Clean() function of the "path/filepath" package to ensure the directory paths are reformatted to match those returned by the Dir() function of the same package, since that is used by our legacy GetAllWorktrees() function. On Windows in particular the Clean() function will adjust the directory path separators to align with the default for the OS. Our parsing logic requires that we find at least an initial "worktree" line and a subsequent "HEAD" line in the output from the "git worktree list" command. Assuming those are found and have non-empty values, we can return a Worktree structure containing them. However, if we also find a "bare" attribute in a subsequent line, we do not return a Worktree structure for this entry, as we want to ignore bare main repositories in the same manner as our existing GetAllWorktrees() function does. Conveniently, the "git worktree list" command checks if the paths it returns actually exist, and if they do not, it includes a "prunable" attribute in its output. If we detect this attribute we then set the Prunable flag that we introduced into our Worktree structure in a prior commit in this PR.
I should note that the updated tests in this PR, both the Go and shell tests, run successfully using Git v2.35.0. That verifies that our existing Our CI jobs currently run the tests against Git v2.0.0 (which doesn't support multiple working trees at all) and Git v2.47.0, plus a build of Git from the upstream development project. The jobs with Git v2.0.0 therefore skip all the worktree-related tests and code, while the other two jobs both test our new implementation which uses However, none of these runs verify that our old approach is still used and works for Git versions between 2.5.0 (when multiple working trees were introduced) and 2.35.0, which is the last version without support for the |
This PR adds an alternative implementation of the function used by the
git lfs prune
command to locate all linked working trees, in order to ensure that Git LFS continues to be compatible with future versions of Git. This change addresses the recent CI job failures seen in ourBuild with latest Git
jobs; for instance, in this job run for PR #5876.Since the introduction of support in Git for multiple linked working trees, in version 2.5.0, paths to these working trees have been stored in the
.git/worktrees
internal file structure as full, absolute paths. At present, Git LFS reads some of these internal files in order to retrieve the full paths to all linked working trees, which it needs when running thegit lfs prune
command. Thegitscanner.ScanIndex()
method invoked bygit lfs prune
results in twogit diff-index
commands being run in each linked working tree. One of those, in particular, scans the staged changes in the working tree's Git index for pointers to Git LFS objects, as any such references mean that the corresponding Git LFS objects should be retained and not pruned.The next release of Git (version 2.48.0) is expected to include changes to the representation of linked working tree file paths in the
.git/worktrees
internal file structure. As of commit git/git@717af91 the paths in the.git/worktrees/*/gitdir
files will be stored as relative paths rather than absolute ones, while still supporting oldergitdir
files containing absolute paths.While we could continue to read those
gitdir
files and try to detect relative paths, and then join those to the full path of the main working tree, we can avoid this complexity and the fragility it implies by using thegit worktree list
command instead. Since Git version 2.36.0 this command has offered both a--porcelain
mode and a-z
option, which ensures any file paths with unusual characters can be easily parsed.We therefore introduce an alternative implementation of our
GetAllWorktrees()
function—which we rename fromGetAllWorkTrees()
, for consistency with the rest of our codebase—that uses thegit worktree list
command to retrieve a list of all linked working trees, plus the main working tree, if it exists (i.e., if the main repository is not bare).Before adding our new implementation, we first expand our Go and shell test suites to check a number of additional cases involving linked working trees, such as trees with detached
HEAD
s, bare main repositories, and working trees that have been deleted but for which Git still has internal metadata.Because the
git worktree list
command provides a convenientprunable
attribute which indicates whether a linked working tree actually still exists, we also update our legacy implementation to perform a similar check, which means thegit lfs prune
command can now quickly skip trying to rungit diff-index
in nonexistent working trees.This PR should be most easily reviewed on a commit-by-commit basis.