
Commit 1a6b238

builtin/clone: allow remote helpers to detect repo (#4908)
In 18c9cb7 (builtin/clone: create the refdb with the correct object format, 2023-12-12), we have changed git-clone(1) so that it delays creation of the refdb until after it has learned about the remote's object format. This change was required for the reftable backend, which encodes the object format into the tables. So if we pre-initialized the refdb with the default object format, but the remote uses a different object format than that, then the resulting tables would have encoded the wrong object format.

This change unfortunately breaks remote helpers which try to access the repository that is about to be created. Because the refdb has not yet been initialized at the point where we spawn the remote helper, we also don't yet have "HEAD" or "refs/". Consequently, any Git commands ran by the remote helper which try to access the repository would fail because it cannot be discovered.

This is essentially a chicken-and-egg problem: we cannot initialize the refdb because we don't know about the object format. But we cannot learn about the object format because the remote helper may be unable to access the partially-initialized repository.

Ideally, we would address this issue via capabilities. But the remote helper protocol is not structured in a way that guarantees that the capability announcement happens before the remote helper tries to access the repository.

Instead, fix this issue by partially initializing the refdb up to the point where it becomes discoverable by Git commands.

-----

Cherry-picked the commit 199f44c to provide the backport-PR requested by @dscho in #4843 ... Commit message unchanged with the exception of the `Signed-off-by:`. The contents of the commit are anyway unchanged. Hope this works for you.
2 parents 8c06394 + 729fd15 commit 1a6b238

3 files changed

+59
-1
lines changed


builtin/clone.c

+46
@@ -926,6 +926,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	struct ref *mapped_refs = NULL;
 	const struct ref *ref;
 	struct strbuf key = STRBUF_INIT;
+	struct strbuf buf = STRBUF_INIT;
 	struct strbuf branch_top = STRBUF_INIT, reflog_msg = STRBUF_INIT;
 	struct transport *transport = NULL;
 	const char *src_ref_prefix = "refs/heads/";
@@ -1125,6 +1126,50 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 		git_dir = real_git_dir;
 	}

+	/*
+	 * We have a chicken-and-egg situation between initializing the refdb
+	 * and spawning transport helpers:
+	 *
+	 *   - Initializing the refdb requires us to know about the object
+	 *     format. We thus have to spawn the transport helper to learn
+	 *     about it.
+	 *
+	 *   - The transport helper may want to access the Git repository. But
+	 *     because the refdb has not been initialized, we don't have "HEAD"
+	 *     or "refs/". Thus, the helper cannot find the Git repository.
+	 *
+	 * Ideally, we would have structured the helper protocol such that it's
+	 * mandatory for the helper to first announce its capabilities without
+	 * yet assuming a fully initialized repository. Like that, we could
+	 * have added a "lazy-refdb-init" capability that announces whether the
+	 * helper is ready to handle not-yet-initialized refdbs. If any helper
+	 * didn't support them, we would have fully initialized the refdb with
+	 * the SHA1 object format, but later on bailed out if we found out that
+	 * the remote repository used a different object format.
+	 *
+	 * But we didn't, and thus we use the following workaround to partially
+	 * initialize the repository's refdb such that it can be discovered by
+	 * Git commands. To do so, we:
+	 *
+	 *   - Create an invalid HEAD ref pointing at "refs/heads/.invalid".
+	 *
+	 *   - Create the "refs/" directory.
+	 *
+	 *   - Set up the ref storage format and repository version as
+	 *     required.
+	 *
+	 * This is sufficient for Git commands to discover the Git directory.
+	 */
+	initialize_repository_version(GIT_HASH_UNKNOWN,
+				      the_repository->ref_storage_format, 1);
+
+	strbuf_addf(&buf, "%s/HEAD", git_dir);
+	write_file(buf.buf, "ref: refs/heads/.invalid");
+
+	strbuf_reset(&buf);
+	strbuf_addf(&buf, "%s/refs", git_dir);
+	safe_create_dir(buf.buf, 1);
+
 	/*
 	 * additional config can be injected with -c, make sure it's included
 	 * after init_db, which clears the entire config environment.
@@ -1453,6 +1498,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	free(remote_name);
 	strbuf_release(&reflog_msg);
 	strbuf_release(&branch_top);
+	strbuf_release(&buf);
 	strbuf_release(&key);
 	free_refs(mapped_refs);
 	free_refs(remote_head_points_at);
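
The workaround in this file can be illustrated outside of Git with a small standalone sketch. It is not Git's code: `partially_init_refdb()` and `looks_discoverable()` are hypothetical names, plain POSIX calls stand in for `write_file()` and `safe_create_dir()`, and the discoverability check is a deliberately simplified approximation that only looks at "HEAD" and "refs/". Git's own repository discovery checks more than that, for example the object directory, so this should be read as an illustration of the idea under those assumptions.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: write `content` followed by a newline to `path`. */
static int write_text_file(const char *path, const char *content)
{
	FILE *fp = fopen(path, "w");

	if (!fp)
		return -1;
	if (fprintf(fp, "%s\n", content) < 0) {
		fclose(fp);
		return -1;
	}
	return fclose(fp) == 0 ? 0 : -1;
}

/*
 * Hypothetical stand-in for the partial initialization: create
 * "<git_dir>/HEAD" as a symref to a deliberately invalid branch and
 * create the "<git_dir>/refs" directory. `git_dir` must already exist.
 * Returns 0 on success and -1 on failure.
 */
int partially_init_refdb(const char *git_dir)
{
	char path[4096];
	int n;

	assert(git_dir != NULL);

	n = snprintf(path, sizeof(path), "%s/HEAD", git_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (write_text_file(path, "ref: refs/heads/.invalid") < 0)
		return -1;

	n = snprintf(path, sizeof(path), "%s/refs", git_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (mkdir(path, 0777) < 0 && errno != EEXIST)
		return -1;
	return 0;
}

/*
 * Simplified approximation of a discoverability check (an assumption,
 * not Git's actual logic): HEAD must start with "ref: refs/" and
 * "refs" must be a directory. Returns 1 if both hold, 0 otherwise.
 */
int looks_discoverable(const char *git_dir)
{
	char path[4096], line[256];
	struct stat st;
	FILE *fp;
	int n, ok;

	assert(git_dir != NULL);

	n = snprintf(path, sizeof(path), "%s/HEAD", git_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;
	fp = fopen(path, "r");
	if (!fp)
		return 0;
	ok = fgets(line, sizeof(line), fp) != NULL &&
	     strncmp(line, "ref: refs/", 10) == 0;
	fclose(fp);
	if (!ok)
		return 0;

	n = snprintf(path, sizeof(path), "%s/refs", git_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;
	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

Pointing HEAD at "refs/heads/.invalid" keeps the file syntactically a symref while naming a branch that cannot exist, so nothing mistakes the placeholder for a real branch before git-clone(1) sets up the actual refdb.
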

setup.c

+8 -1
@@ -1898,6 +1898,13 @@ void initialize_repository_version(int hash_algo,
 	char repo_version_string[10];
 	int repo_version = GIT_REPO_VERSION;

+	/*
+	 * Note that we initialize the repository version to 1 when the ref
+	 * storage format is unknown. This is on purpose so that we can add the
+	 * correct object format to the config during git-clone(1). The format
+	 * version will get adjusted by git-clone(1) once it has learned about
+	 * the remote repository's format.
+	 */
 	if (hash_algo != GIT_HASH_SHA1 ||
 	    ref_storage_format != REF_STORAGE_FORMAT_FILES)
 		repo_version = GIT_REPO_VERSION_READ;
@@ -1907,7 +1914,7 @@ void initialize_repository_version(int hash_algo,
 		  "%d", repo_version);
 	git_config_set("core.repositoryformatversion", repo_version_string);

-	if (hash_algo != GIT_HASH_SHA1)
+	if (hash_algo != GIT_HASH_SHA1 && hash_algo != GIT_HASH_UNKNOWN)
 		git_config_set("extensions.objectformat",
 			       hash_algos[hash_algo].name);
 	else if (reinit)
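
The effect of the two conditions touched here can be sketched as a small pure function. This is a minimal model and not Git's implementation: the enum names, their numeric values, `struct repo_config` and `decide_repo_config()` are all hypothetical stand-ins, and the `reinit` branch as well as any ref storage extension handling are left out. It only shows which repository format version and which `extensions.objectformat` value would result for a given hash algorithm and ref storage format.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for Git's constants; the values are assumptions. */
enum hash_algo { HASH_UNKNOWN = 0, HASH_SHA1 = 1, HASH_SHA256 = 2 };
enum ref_format { REF_FORMAT_FILES = 1, REF_FORMAT_REFTABLE = 2 };

/* Hypothetical result type modelling the two config keys involved. */
struct repo_config {
	int repository_format_version; /* core.repositoryformatversion */
	const char *object_format;     /* extensions.objectformat, NULL if unset */
};

/*
 * Mirror the decision logic shown in the diff: anything other than
 * SHA-1 with the "files" backend needs repository format version 1,
 * and the object format is only written when the hash is known and
 * is not SHA-1.
 */
struct repo_config decide_repo_config(enum hash_algo hash, enum ref_format refs)
{
	static const char *const names[] = { NULL, "sha1", "sha256" };
	struct repo_config cfg = { 0, NULL };

	assert(hash == HASH_UNKNOWN || hash == HASH_SHA1 || hash == HASH_SHA256);

	if (hash != HASH_SHA1 || refs != REF_FORMAT_FILES)
		cfg.repository_format_version = 1;
	if (hash != HASH_SHA1 && hash != HASH_UNKNOWN)
		cfg.object_format = names[hash];
	return cfg;
}
```

With an unknown hash the model yields version 1 and no object format, which matches the intent stated in the added comment: the version is already high enough for git-clone(1) to add the correct object format later. Without the extra `HASH_UNKNOWN` check, the lookup would have used the unknown algorithm's table entry as the object format name.
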

t/t5801/git-remote-testgit

+5
@@ -12,6 +12,11 @@ url=$2

 dir="$GIT_DIR/testgit/$alias"

+if ! git rev-parse --is-inside-git-dir
+then
+	exit 1
+fi
+
 h_refspec="refs/heads/*:refs/testgit/$alias/heads/*"
 t_refspec="refs/tags/*:refs/testgit/$alias/tags/*"
