From 49edce4ff96d003e2b50202105107ef42e1f730e Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Fri, 20 Dec 2024 10:02:15 -0800 Subject: [PATCH 01/28] show-index: the short help should say the command reads from its input The short help text given by "git show-index -h" says $ git show-index -h usage: git show-index [--object-format=] --[no-]object-format specify the hash algorithm to use The command takes a pack .idx file from its standard input. The user has to _know_ this, as there is no indication from this output. Give a hint that the data to work on is fed from its standard input. Signed-off-by: Junio C Hamano --- Documentation/git-show-index.txt | 2 +- builtin/show-index.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/git-show-index.txt b/Documentation/git-show-index.txt index e49318a5a0aec7..7e574ea243e58a 100644 --- a/Documentation/git-show-index.txt +++ b/Documentation/git-show-index.txt @@ -9,7 +9,7 @@ git-show-index - Show packed archive index SYNOPSIS -------- [verse] -'git show-index' [--object-format=] +'git show-index' [--object-format=] < DESCRIPTION diff --git a/builtin/show-index.c b/builtin/show-index.c index f164c01bbea400..8678b741a47e67 100644 --- a/builtin/show-index.c +++ b/builtin/show-index.c @@ -7,7 +7,7 @@ #include "parse-options.h" static const char *const show_index_usage[] = { - "git show-index [--object-format=]", + "git show-index [--object-format=] < ", NULL }; From 0ad3d656521aa16a6496aa855bbde97160a2b2bc Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 30 Dec 2024 11:32:23 +0100 Subject: [PATCH 02/28] object-file: fix race in object collision check One of the tests in t5616 asserts that git-fetch(1) with `--refetch` triggers repository maintenance with the correct set of arguments. This test is flaky and causes us to fail sometimes: ++ git -c protocol.version=0 -c gc.autoPackLimit=0 -c maintenance.incremental-repack.auto=1234 -C pc1 fetch --refetch origin error: unable to open .git/objects/pack/pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack: No such file or directory fatal: unable to rename temporary file to '.git/objects/pack/pack-029d08823bd8a8eab510ad6ac75c823cfd3ed31e.pack' fatal: could not finish pack-objects to repack local links fatal: index-pack failed error: last command exited with $?=128 The error message is quite confusing as it talks about trying to rename a temporary packfile. A first hunch would thus be that this packfile gets written by git-fetch(1), but removed by git-maintenance(1) while it hasn't yet been finalized, which shouldn't ever happen. And indeed, when looking closer one notices that the file that is supposedly of temporary nature does not have the typical `tmp_pack_` prefix. As it turns out, the "unable to rename temporary file" fatal error is a red herring and the real error is "unable to open". That error is raised by `check_collision()`, which is called by `finalize_object_file()` when moving the new packfile into place. Because t5616 re-fetches objects, we end up with the exact same pack as we already have in the repository. So when the concurrent git-maintenance(1) process rewrites the preexisting pack and unlinks it exactly at the point in time where git-fetch(1) wants to check the old and new packfiles for equality we will see ENOENT and thus `check_collision()` returns an error, which gets bubbled up by `finalize_object_file()` and is then handled by `rename_tmp_packfile()`. That function does not know about the exact root cause of the error and instead just claims that the rename has failed. This race is thus caused by b1b8dfde69 (finalize_object_file(): implement collision check, 2024-09-26), where we have newly introduced the collision check. By definition, two files cannot collide with each other when one of them has been removed. We can thus trivially fix the issue by ignoring ENOENT when opening either of the files we're about to check for collision. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- object-file.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/object-file.c b/object-file.c index b1a3463852c451..0293b93bbc5ce9 100644 --- a/object-file.c +++ b/object-file.c @@ -1982,13 +1982,15 @@ static int check_collision(const char *filename_a, const char *filename_b) fd_a = open(filename_a, O_RDONLY); if (fd_a < 0) { - ret = error_errno(_("unable to open %s"), filename_a); + if (errno != ENOENT) + ret = error_errno(_("unable to open %s"), filename_a); goto out; } fd_b = open(filename_b, O_RDONLY); if (fd_b < 0) { - ret = error_errno(_("unable to open %s"), filename_b); + if (errno != ENOENT) + ret = error_errno(_("unable to open %s"), filename_b); goto out; } From c1acf1a31761d0cfddc3ea6d39c92a6528cd9c5c Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 6 Jan 2025 10:24:25 +0100 Subject: [PATCH 03/28] object-file: rename variables in `check_collision()` Rename variables used in `check_collision()` to clearly identify which file is the source and which is the destination. This will make the next step easier to reason about when we start to treat those files different from one another. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- object-file.c | 40 ++++++++++++++++++++-------------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/object-file.c b/object-file.c index 0293b93bbc5ce9..e2fa1be303cce3 100644 --- a/object-file.c +++ b/object-file.c @@ -1974,56 +1974,56 @@ static void write_object_file_prepare_literally(const struct git_hash_algo *algo hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen); } -static int check_collision(const char *filename_a, const char *filename_b) +static int check_collision(const char *source, const char *dest) { - char buf_a[4096], buf_b[4096]; - int fd_a = -1, fd_b = -1; + char buf_source[4096], buf_dest[4096]; + int fd_source = -1, fd_dest = -1; int ret = 0; - fd_a = open(filename_a, O_RDONLY); - if (fd_a < 0) { + fd_source = open(source, O_RDONLY); + if (fd_source < 0) { if (errno != ENOENT) - ret = error_errno(_("unable to open %s"), filename_a); + ret = error_errno(_("unable to open %s"), source); goto out; } - fd_b = open(filename_b, O_RDONLY); - if (fd_b < 0) { + fd_dest = open(dest, O_RDONLY); + if (fd_dest < 0) { if (errno != ENOENT) - ret = error_errno(_("unable to open %s"), filename_b); + ret = error_errno(_("unable to open %s"), dest); goto out; } while (1) { ssize_t sz_a, sz_b; - sz_a = read_in_full(fd_a, buf_a, sizeof(buf_a)); + sz_a = read_in_full(fd_source, buf_source, sizeof(buf_source)); if (sz_a < 0) { - ret = error_errno(_("unable to read %s"), filename_a); + ret = error_errno(_("unable to read %s"), source); goto out; } - sz_b = read_in_full(fd_b, buf_b, sizeof(buf_b)); + sz_b = read_in_full(fd_dest, buf_dest, sizeof(buf_dest)); if (sz_b < 0) { - ret = error_errno(_("unable to read %s"), filename_b); + ret = error_errno(_("unable to read %s"), dest); goto out; } - if (sz_a != sz_b || memcmp(buf_a, buf_b, sz_a)) { + if (sz_a != sz_b || memcmp(buf_source, buf_dest, sz_a)) { ret = error(_("files '%s' and '%s' differ in contents"), - filename_a, filename_b); + source, dest); goto out; } - if (sz_a < sizeof(buf_a)) + if (sz_a < sizeof(buf_source)) break; } out: - if (fd_a > -1) - close(fd_a); - if (fd_b > -1) - close(fd_b); + if (fd_source > -1) + close(fd_source); + if (fd_dest > -1) + close(fd_dest); return ret; } From cfae50e40eb72d6116ad56c616b3322474df4a75 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 6 Jan 2025 10:24:26 +0100 Subject: [PATCH 04/28] object-file: don't special-case missing source file in collision check In 0ad3d65652 (object-file: fix race in object collision check, 2024-12-30) we have started to ignore ENOENT when opening either the source or destination file of the collision check. This was done to handle races more gracefully in case either of the potentially-colliding disappears. The fix is overly broad though: while the destination file may indeed vanish racily, this shouldn't ever happen for the source file, which is a temporary object file (either loose or in packfile format) that we have just created. So if any concurrent process would have removed that temporary file it would indicate an actual issue. Stop treating ENOENT specially for the source file so that we always bubble up this error. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- object-file.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/object-file.c b/object-file.c index e2fa1be303cce3..c1bd746d9e238e 100644 --- a/object-file.c +++ b/object-file.c @@ -1982,8 +1982,7 @@ static int check_collision(const char *source, const char *dest) fd_source = open(source, O_RDONLY); if (fd_source < 0) { - if (errno != ENOENT) - ret = error_errno(_("unable to open %s"), source); + ret = error_errno(_("unable to open %s"), source); goto out; } From d7fcbe2c56468ac780c689b02c6a9e056ce39c12 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Mon, 6 Jan 2025 10:24:27 +0100 Subject: [PATCH 05/28] object-file: retry linking file into place when occluding file vanishes Prior to 0ad3d65652 (object-file: fix race in object collision check, 2024-12-30), callers could expect that a successful return from `finalize_object_file()` means that either the file was moved into place, or the identical bytes were already present. If neither of those happens, we'd return an error. Since that commit, if the destination file disappears between our link(3p) call and the collision check, we'd return success without actually checking the contents, and without retrying the link. This solves the common case that the files were indeed the same, but it means that we may corrupt the repository if they weren't (this implies a hash collision, but the whole point of this function is protecting against hash collisions). We can't be pessimistic and assume they're different; that hurts the common case that the mentioned commit was trying to fix. But after seeing that the destination file went away, we can retry linking again. Adapt the code to do so when we see that the destination file has racily vanished. This should generally succeed as we have just observed that the destination file does not exist anymore, except in the very unlikely event that it gets recreated by another concurrent process again. Helped-by: Jeff King Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- object-file.c | 25 +++++++++++++++++++++---- 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/object-file.c b/object-file.c index c1bd746d9e238e..008ddf59a56a6e 100644 --- a/object-file.c +++ b/object-file.c @@ -1974,6 +1974,8 @@ static void write_object_file_prepare_literally(const struct git_hash_algo *algo hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen); } +#define CHECK_COLLISION_DEST_VANISHED -2 + static int check_collision(const char *source, const char *dest) { char buf_source[4096], buf_dest[4096]; @@ -1990,6 +1992,8 @@ static int check_collision(const char *source, const char *dest) if (fd_dest < 0) { if (errno != ENOENT) ret = error_errno(_("unable to open %s"), dest); + else + ret = CHECK_COLLISION_DEST_VANISHED; goto out; } @@ -2037,8 +2041,11 @@ int finalize_object_file(const char *tmpfile, const char *filename) int finalize_object_file_flags(const char *tmpfile, const char *filename, enum finalize_object_file_flags flags) { - struct stat st; - int ret = 0; + unsigned retries = 0; + int ret; + +retry: + ret = 0; if (object_creation_mode == OBJECT_CREATION_USES_RENAMES) goto try_rename; @@ -2059,6 +2066,8 @@ int finalize_object_file_flags(const char *tmpfile, const char *filename, * left to unlink. */ if (ret && ret != EEXIST) { + struct stat st; + try_rename: if (!stat(filename, &st)) ret = EEXIST; @@ -2074,9 +2083,17 @@ int finalize_object_file_flags(const char *tmpfile, const char *filename, errno = saved_errno; return error_errno(_("unable to write file %s"), filename); } - if (!(flags & FOF_SKIP_COLLISION_CHECK) && - check_collision(tmpfile, filename)) + if (!(flags & FOF_SKIP_COLLISION_CHECK)) { + ret = check_collision(tmpfile, filename); + if (ret == CHECK_COLLISION_DEST_VANISHED) { + if (retries++ > 5) + return error(_("unable to write repeatedly vanishing file %s"), + filename); + goto retry; + } + else if (ret) return -1; + } unlink_or_warn(tmpfile); } From 8d24d56ce1da13caff82cfa8950413309e08da13 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 7 Jan 2025 02:05:01 -0500 Subject: [PATCH 06/28] test-lib: invert return value of check_test_results_san_file_empty We have a function to check whether LSan logged any leaks. It returns success for no leaks, and non-zero otherwise. This is the simplest thing for its callers, who want to say "if no leaks then return early". But because it's implemented as a shell pipeline, you end up with the awkward: ! find ... | xargs grep leaks | grep -v false-positives where the "!" is actually negating the final grep. Switch the return value (and name) to return success when there are leaks. This should make the code a little easier to read, and the negation in the callers still reads pretty naturally. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/test-lib-functions.sh | 2 +- t/test-lib.sh | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 78e054ab503a65..c25cee0ad8651a 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -927,7 +927,7 @@ test_expect_success () { test -n "$test_skip_test_preamble" || say >&3 "expecting success of $TEST_NUMBER.$test_count '$1': $test_body" if test_run_ "$test_body" && - check_test_results_san_file_empty_ + ! check_test_results_san_file_has_entries_ then test_ok_ "$1" else diff --git a/t/test-lib.sh b/t/test-lib.sh index d1f62adbf82931..be3553e40e9052 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1169,12 +1169,12 @@ test_atexit_handler () { teardown_malloc_check } -check_test_results_san_file_empty_ () { - test -z "$TEST_RESULTS_SAN_FILE" && return 0 +check_test_results_san_file_has_entries_ () { + test -z "$TEST_RESULTS_SAN_FILE" && return 1 # stderr piped to /dev/null because the directory may have # been "rmdir"'d already. - ! find "$TEST_RESULTS_SAN_DIR" \ + find "$TEST_RESULTS_SAN_DIR" \ -type f \ -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null | xargs grep ^DEDUP_TOKEN | @@ -1182,7 +1182,7 @@ check_test_results_san_file_empty_ () { } check_test_results_san_file_ () { - if check_test_results_san_file_empty_ + if ! check_test_results_san_file_has_entries_ then return fi && From b9a9df93a3f5580c7f7b8cc099aad1c204ced8a4 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 7 Jan 2025 02:07:52 -0500 Subject: [PATCH 07/28] test-lib: simplify lsan results check We want to know if there are any leaks logged by LSan in the results directory, so we run "find" on the containing directory and pipe it to xargs. We can accomplish the same thing by just globbing in the shell and passing the result to grep, which has a few advantages: - it's one fewer process to run - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we checked at the beginning of the function, and is the same glob used to show the logs in check_test_results_san_file_ - this correctly handles the case where TEST_OUTPUT_DIRECTORY has a space in it. For example doing: mkdir "/tmp/foo bar" TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test would yield a lot of: grep: /tmp/foo: No such file or directory grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory when there are leaks. We could do the same thing with "xargs --null", but that isn't portable. We are now subject to command-line length limits, but that is also true of the globbing cat used to show the logs themselves. This hasn't been a problem in practice. We do need to use "grep -s" for the case that the glob does not expand (i.e., there are not any log files at all). This option is in POSIX, and has been used in t7407 for several years without anybody complaining. This also also naturally handles the case where the surrounding directory has already been removed (in which case there are likewise no files!), dropping the need to comment about it. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/test-lib.sh | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/t/test-lib.sh b/t/test-lib.sh index be3553e40e9052..898c2267b8a400 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1172,12 +1172,7 @@ test_atexit_handler () { check_test_results_san_file_has_entries_ () { test -z "$TEST_RESULTS_SAN_FILE" && return 1 - # stderr piped to /dev/null because the directory may have - # been "rmdir"'d already. - find "$TEST_RESULTS_SAN_DIR" \ - -type f \ - -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null | - xargs grep ^DEDUP_TOKEN | + grep -s ^DEDUP_TOKEN "$TEST_RESULTS_SAN_FILE".* | grep -qv sanitizer::GetThreadStackTopAndBottom } From 164a2516eb622fdf032ce526ec97e79a53bf2893 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 7 Jan 2025 02:08:31 -0500 Subject: [PATCH 08/28] test-lib: add a few comments to LSan log checking Commit b119a687d4 (test-lib: ignore leaks in the sanitizer's thread code, 2025-01-01) added code to suppress a false positive in the leak checker. But if you're just reading the code, the obscure grep call is a bit of a head-scratcher. Let's add a brief comment explaining what's going on (and anybody digging further can find this commit or that one for all the details). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/test-lib.sh | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/t/test-lib.sh b/t/test-lib.sh index 898c2267b8a400..9f27a49995160e 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1172,6 +1172,11 @@ test_atexit_handler () { check_test_results_san_file_has_entries_ () { test -z "$TEST_RESULTS_SAN_FILE" && return 1 + # Lines marked with DEDUP_TOKEN show unique leaks. We only care that we + # found at least one. + # + # But also suppress any false positives caused by bugs or races in the + # sanitizer itself. grep -s ^DEDUP_TOKEN "$TEST_RESULTS_SAN_FILE".* | grep -qv sanitizer::GetThreadStackTopAndBottom } From 0b432748507a12b92677653104b18834d83cfb10 Mon Sep 17 00:00:00 2001 From: M Hickford Date: Thu, 9 Jan 2025 22:45:20 +0000 Subject: [PATCH 09/28] credential-cache: respect authtype capability Previously, credential-cache populated authtype regardless whether "get" request had authtype capability. As documented in git-credential.txt, authtype "should not be sent unless the appropriate capability ... is provided". Add test. Without this change, the test failed because "credential fill" printed an incomplete credential with only protocol and host attributes (the unexpected authtype attribute was discarded by credential.c). Signed-off-by: M Hickford Signed-off-by: Junio C Hamano --- builtin/credential-cache--daemon.c | 4 ++-- t/lib-credential.sh | 15 +++++++++++++++ 2 files changed, 17 insertions(+), 2 deletions(-) diff --git a/builtin/credential-cache--daemon.c b/builtin/credential-cache--daemon.c index 4952b225477227..9cdc3077ecfa02 100644 --- a/builtin/credential-cache--daemon.c +++ b/builtin/credential-cache--daemon.c @@ -141,9 +141,9 @@ static void serve_one_client(FILE *in, FILE *out) fprintf(out, "username=%s\n", e->item.username); if (e->item.password) fprintf(out, "password=%s\n", e->item.password); - if (credential_has_capability(&c.capa_authtype, CREDENTIAL_OP_HELPER) && e->item.authtype) + if (credential_has_capability(&c.capa_authtype, CREDENTIAL_OP_RESPONSE) && e->item.authtype) fprintf(out, "authtype=%s\n", e->item.authtype); - if (credential_has_capability(&c.capa_authtype, CREDENTIAL_OP_HELPER) && e->item.credential) + if (credential_has_capability(&c.capa_authtype, CREDENTIAL_OP_RESPONSE) && e->item.credential) fprintf(out, "credential=%s\n", e->item.credential); if (e->item.password_expiry_utc != TIME_MAX) fprintf(out, "password_expiry_utc=%"PRItime"\n", diff --git a/t/lib-credential.sh b/t/lib-credential.sh index 58b9c740605890..cc6bf9aa5f3717 100644 --- a/t/lib-credential.sh +++ b/t/lib-credential.sh @@ -566,6 +566,21 @@ helper_test_authtype() { EOF ' + test_expect_success "helper ($HELPER) gets authtype and credential only if request has authtype capability" ' + check fill $HELPER <<-\EOF + protocol=https + host=git.example.com + -- + protocol=https + host=git.example.com + username=askpass-username + password=askpass-password + -- + askpass: Username for '\''https://git.example.com'\'': + askpass: Password for '\''https://askpass-username@git.example.com'\'': + EOF + ' + test_expect_success "helper ($HELPER) stores authtype and credential with username" ' check approve $HELPER <<-\EOF capability[]=authtype From 447cdec2e9b2e45b14ea26f7fcaf054f7ded12bc Mon Sep 17 00:00:00 2001 From: Bence Ferdinandy Date: Sun, 12 Jan 2025 17:51:22 +0100 Subject: [PATCH 10/28] fetch set_head: fix non-mirror remotes in bare repositories In b1b713f722 (fetch set_head: handle mirrored bare repositories, 2024-11-22) it was implicitly assumed that all remotes will be mirrors in a bare repository, thus fetching a non-mirrored remote could lead to HEAD pointing to a non-existent reference. Make sure we only overwrite HEAD if we are in a bare repository and fetching from a mirror. Otherwise, proceed as normally, and create refs/remotes//HEAD instead. Signed-off-by: Bence Ferdinandy Reported-by: Christian Hesse Signed-off-by: Junio C Hamano --- builtin/fetch.c | 15 ++++++++------- t/t5505-remote.sh | 10 ++++++++++ t/t5510-fetch.sh | 13 +++++++++++++ 3 files changed, 31 insertions(+), 7 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index fe2b26c74aecab..625d45be8ba565 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -1618,9 +1618,9 @@ static void report_set_head(const char *remote, const char *head_name, } static int set_head(const struct ref *remote_refs, int follow_remote_head, - const char *no_warn_branch) + const char *no_warn_branch, int mirror) { - int result = 0, create_only, is_bare, was_detached; + int result = 0, create_only, baremirror, was_detached; struct strbuf b_head = STRBUF_INIT, b_remote_head = STRBUF_INIT, b_local_head = STRBUF_INIT; const char *remote = gtransport->remote->name; @@ -1655,9 +1655,9 @@ static int set_head(const struct ref *remote_refs, int follow_remote_head, if (!head_name) goto cleanup; - is_bare = is_bare_repository(); - create_only = follow_remote_head == FOLLOW_REMOTE_ALWAYS ? 0 : !is_bare; - if (is_bare) { + baremirror = is_bare_repository() && mirror; + create_only = follow_remote_head == FOLLOW_REMOTE_ALWAYS ? 0 : !baremirror; + if (baremirror) { strbuf_addstr(&b_head, "HEAD"); strbuf_addf(&b_remote_head, "refs/heads/%s", head_name); } else { @@ -1665,7 +1665,7 @@ static int set_head(const struct ref *remote_refs, int follow_remote_head, strbuf_addf(&b_remote_head, "refs/remotes/%s/%s", remote, head_name); } /* make sure it's valid */ - if (!is_bare && !refs_ref_exists(refs, b_remote_head.buf)) { + if (!baremirror && !refs_ref_exists(refs, b_remote_head.buf)) { result = 1; goto cleanup; } @@ -1925,7 +1925,8 @@ static int do_fetch(struct transport *transport, } } if (set_head(remote_refs, transport->remote->follow_remote_head, - transport->remote->no_warn_branch)) + transport->remote->no_warn_branch, + transport->remote->mirror)) ; /* * Way too many cases where this can go wrong diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh index 519f7973e31287..c75cfe968f30e1 100755 --- a/t/t5505-remote.sh +++ b/t/t5505-remote.sh @@ -589,6 +589,16 @@ test_expect_success 'add --mirror setting HEAD' ' ) ' +test_expect_success 'non-mirror fetch does not interfere with mirror' ' + mkdir headnotmain && + ( + cd headnotmain && + git init --bare -b notmain && + git remote add -f other ../two && + test "$(git symbolic-ref HEAD)" = "refs/heads/notmain" + ) +' + test_expect_success 'add --mirror=fetch' ' mkdir mirror-fetch && git init -b main mirror-fetch/parent && diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh index 2d9587059f5083..cfa63ae0867241 100755 --- a/t/t5510-fetch.sh +++ b/t/t5510-fetch.sh @@ -84,6 +84,19 @@ test_expect_success "fetch test remote HEAD" ' branch=$(git rev-parse refs/remotes/origin/main) && test "z$head" = "z$branch"' +test_expect_success "fetch test remote HEAD in bare repository" ' + cd "$D" && + git init --bare barerepo && + cd barerepo && + git remote add upstream ../two && + git fetch upstream && + git rev-parse --verify refs/remotes/upstream/HEAD && + git rev-parse --verify refs/remotes/upstream/main && + head=$(git rev-parse refs/remotes/upstream/HEAD) && + branch=$(git rev-parse refs/remotes/upstream/main) && + test "z$head" = "z$branch"' + + test_expect_success "fetch test remote HEAD change" ' cd "$D" && cd two && From 71e19a003197960cec38d30e71b49d182bcf8510 Mon Sep 17 00:00:00 2001 From: Elijah Newren Date: Mon, 13 Jan 2025 17:13:36 +0000 Subject: [PATCH 11/28] object-name: fix resolution of object names containing curly braces Given a branch name of 'foo{bar', commands like git cat-file -p foo{bar:README.md should succeed (assuming that branch had a README.md file, of course). However, the change in cce91a2caef9 (Change 'master@noon' syntax to 'master@{noon}'., 2006-05-19) presumed that curly braces would always come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md' to entirely miss the ':' and assume there's no object being referenced. In short, git would report: fatal: Not a valid object name foo{bar:README.md Change the parsing to only make the assumption of paired curly braces immediately after either a '@' or '^' character appears. Add tests for this, as well as for a few other test cases that initial versions of this patch broke: * 'foo@@{...}' * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}' Note that we'd prefer not duplicating the special logic for "@^" characters here, because if get_oid_basic() or interpret_nth_prior_checkout() or get_oid_basic() or similar gain extra methods of using curly braces, then the logic in get_oid_with_context_1() would need to be updated as well. But it's not clear how to refactor all of these to have a simple common callpoint with the specialized logic. Reported-by: Gabriel Amaral Helped-by: Michael Haggerty Signed-off-by: Elijah Newren Signed-off-by: Junio C Hamano --- object-name.c | 8 +++++--- t/t1006-cat-file.sh | 31 ++++++++++++++++++++++++++++++- 2 files changed, 35 insertions(+), 4 deletions(-) diff --git a/object-name.c b/object-name.c index 88d1313028cf0b..f43fdba24017ea 100644 --- a/object-name.c +++ b/object-name.c @@ -2052,12 +2052,14 @@ static enum get_oid_result get_oid_with_context_1(struct repository *repo, return -1; } for (cp = name, bracket_depth = 0; *cp; cp++) { - if (*cp == '{') + if (strchr("@^", *cp) && cp[1] == '{') { + cp++; bracket_depth++; - else if (bracket_depth && *cp == '}') + } else if (bracket_depth && *cp == '}') { bracket_depth--; - else if (!bracket_depth && *cp == ':') + } else if (!bracket_depth && *cp == ':') { break; + } } if (*cp == ':') { struct object_id tree_oid; diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index ff9bf213aa2c73..398865d6ebe9c6 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -240,7 +240,8 @@ test_expect_success "setup" ' git config extensions.objectformat $test_hash_algo && git config extensions.compatobjectformat $test_compat_hash_algo && echo_without_newline "$hello_content" > hello && - git update-index --add hello + git update-index --add hello && + git commit -m "add hello file" ' run_blob_tests () { @@ -602,6 +603,34 @@ test_expect_success FUNNYNAMES '--batch-check, -Z with newline in input' ' test_cmp expect actual ' +test_expect_success 'setup with curly braches in input' ' + git branch "foo{bar" HEAD && + git branch "foo@" HEAD +' + +test_expect_success 'object reference with curly brace' ' + git cat-file -p "foo{bar:hello" >actual && + git cat-file -p HEAD:hello >expect && + test_cmp expect actual +' + +test_expect_success 'object reference with at-sign' ' + git cat-file -p "foo@@{0}:hello" >actual && + git cat-file -p HEAD:hello >expect && + test_cmp expect actual +' + +test_expect_success 'setup with commit with colon' ' + git commit-tree -m "testing: just a bunch of junk" HEAD^{tree} >out && + git branch other $(cat out) +' + +test_expect_success 'object reference via commit text search' ' + git cat-file -p "other^{/testing:}:hello" >actual && + git cat-file -p HEAD:hello >expect && + test_cmp expect actual +' + test_expect_success 'setup blobs which are likely to delta' ' test-tool genrandom foo 10240 >foo && { cat foo && echo plus; } >foo-plus && From 191f0c8db22267cab55472961524c70a1d692025 Mon Sep 17 00:00:00 2001 From: Elijah Newren Date: Mon, 13 Jan 2025 17:13:37 +0000 Subject: [PATCH 12/28] object-name: be more strict in parsing describe-like output From Documentation/revisions.txt: '', e.g. 'v1.7.4.2-679-g3bee7fb':: Output from `git describe`; i.e. a closest tag, optionally followed by a dash and a number of commits, followed by a dash, a 'g', and an abbreviated object name. which means that output of the format ${REFNAME}-${INTEGER}-g${HASH} should parse to fully expanded ${HASH}. This is fine. However, we currently don't validate any of ${REFNAME}-${INTEGER}, we only parse -g${HASH} and assume the rest is valid. That is problematic, since it breaks things like git cat-file -p branchname:path/to/file/named/i-gaffed which, when commit (or tree or blob) affed exists, will not return us information about the file we are looking for but will instead erroneously tell us about object affed. A few additional notes: - This is a slight backward incompatibility break, because we used to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}. However, a backward incompatible break is necessary, because there is no other way for someone to be more specific and disambiguate that they want the blob master:path/to/who-gabbed instead of the object abbed. - There is a possibility that check_refname_format() rules change in the future. However, we can only realistically loosen the rules for what that function accepts rather than tighten. If we were to tighten the rules, some real world repositories may already have refnames that suddenly become unacceptable and we break those repositories. As such, any describe-like syntax of the form ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the changes in this commit will remain valid in the future. - The fact that check_refname_format() rules could loosen in the future is probably also an important reason to make this change. If the rules loosen, there might be additional cases within ${GARBAGE}-g${HASH} that become ambiguous in the future. While abbreviated hashes can be disambiguated by abbreviating less, it may well be that these alternative object names have no way of being disambiguated (much like pathnames cannot be). Accepting all random ${GARBAGE} thus makes it difficult for us to allow future extensions to object naming. So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are present in the string, and would be considered a valid ref and non-negative integer. Also, add a few tests for git describe using object names of the form ${REVISION_NAME}${MODIFIERS} since an early version of this patch failed on constructs like git describe v2.48.0-rc2-161-g6c2274cdbc^0 Reported-by: Gabriel Amaral Signed-off-by: Elijah Newren Signed-off-by: Junio C Hamano --- object-name.c | 55 ++++++++++++++++++++++++++++++++++++++++++++- t/t6120-describe.sh | 24 ++++++++++++++++++++ 2 files changed, 78 insertions(+), 1 deletion(-) diff --git a/object-name.c b/object-name.c index f43fdba24017ea..945d5bdef25217 100644 --- a/object-name.c +++ b/object-name.c @@ -1272,6 +1272,58 @@ static int peel_onion(struct repository *r, const char *name, int len, return 0; } +/* + * Documentation/revisions.txt says: + * '', e.g. 'v1.7.4.2-679-g3bee7fb':: + * Output from `git describe`; i.e. a closest tag, optionally + * followed by a dash and a number of commits, followed by a dash, a + * 'g', and an abbreviated object name. + * + * which means that the stuff before '-g${HASH}' needs to be a valid + * refname, a dash, and a non-negative integer. This function verifies + * that. + * + * In particular, we do not want to treat + * branchname:path/to/file/named/i-gaffed + * as a request for commit affed. + * + * More generally, we should probably not treat + * 'refs/heads/./../.../ ~^:/?*[////\\\&}/busted.lock-g050e0ef6ead' + * as a request for object 050e0ef6ead either. + * + * We are called with name[len] == '-' and name[len+1] == 'g', i.e. + * we are verifying ${REFNAME}-{INTEGER} part of the name. + */ +static int ref_and_count_parts_valid(const char *name, int len) +{ + struct strbuf sb; + const char *cp; + int flags = REFNAME_ALLOW_ONELEVEL; + int ret = 1; + + /* Ensure we have at least one digit */ + if (!isxdigit(name[len-1])) + return 0; + + /* Skip over digits backwards until we get to the dash */ + for (cp = name + len - 2; name < cp; cp--) { + if (*cp == '-') + break; + if (!isxdigit(*cp)) + return 0; + } + /* Ensure we found the leading dash */ + if (*cp != '-') + return 0; + + len = cp - name; + strbuf_init(&sb, len); + strbuf_add(&sb, name, len); + ret = !check_refname_format(sb.buf, flags); + strbuf_release(&sb); + return ret; +} + static int get_describe_name(struct repository *r, const char *name, int len, struct object_id *oid) @@ -1285,7 +1337,8 @@ static int get_describe_name(struct repository *r, /* We must be looking at g in "SOMETHING-g" * for it to be describe output. */ - if (ch == 'g' && cp[-1] == '-') { + if (ch == 'g' && cp[-1] == '-' && + ref_and_count_parts_valid(name, cp - 1 - name)) { cp++; len -= cp - name; return get_short_oid(r, diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index 3f6160d702bc20..76843a61691cb5 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -82,11 +82,13 @@ check_describe R-2-gHASH HEAD^^ check_describe A-3-gHASH HEAD^^2 check_describe B HEAD^^2^ check_describe R-1-gHASH HEAD^^^ +check_describe R-1-gHASH R-1-g$(git rev-parse --short HEAD^^)~1 check_describe c-7-gHASH --tags HEAD check_describe c-6-gHASH --tags HEAD^ check_describe e-1-gHASH --tags HEAD^^ check_describe c-2-gHASH --tags HEAD^^2 +check_describe c-2-gHASH --tags c-2-g$(git rev-parse --short HEAD^^2)^0 check_describe B --tags HEAD^^2^ check_describe e --tags HEAD^^^ check_describe e --tags --exact-match HEAD^^^ @@ -725,4 +727,26 @@ test_expect_success '--exact-match does not show --always fallback' ' test_must_fail git describe --exact-match --always ' +test_expect_success 'avoid being fooled by describe-like filename' ' + test_when_finished rm out && + + git rev-parse --short HEAD >out && + FILENAME=filename-g$(cat out) && + touch $FILENAME && + git add $FILENAME && + git commit -m "Add $FILENAME" && + + git cat-file -t HEAD:$FILENAME >actual && + + echo blob >expect && + test_cmp expect actual +' + +test_expect_success 'do not be fooled by invalid describe format ' ' + test_when_finished rm out && + + git rev-parse --short HEAD >out && + test_must_fail git cat-file -t "refs/tags/super-invalid/./../...../ ~^:/?*[////\\\\\\&}/busted.lock-42-g"$(cat out) +' + test_done From bc67b4ab5f8bc268ecd2d9bb7dc1b7bf26884a8e Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Wed, 15 Jan 2025 11:54:51 +0000 Subject: [PATCH 13/28] reftable: write correct max_update_index to header In 297c09eabb (refs: allow multiple reflog entries for the same refname, 2024-12-16), the reftable backend learned to handle multiple reflog entries within the same transaction. This was done modifying the `update_index` for reflogs with multiple indices. During writing the logs, the `max_update_index` of the writer was modified to ensure the limits were raised to the modified `update_index`s. However, since ref entries are written before the modification to the `max_update_index`, if there are multiple blocks to be written, the reftable backend writes the header with the old `max_update_index`. When all logs are finally written, the footer will be written with the new `min_update_index`. This causes a mismatch between the header and the footer and causes the reftable file to be corrupted. The existing tests only spawn a single block and since headers are lazily written with the first block, the tests didn't capture this bug. To fix the issue, the appropriate `max_update_index` limit must be set even before the first block is written. Add a `max_index` field to the transaction which holds the `max_index` within all its updates, then propagate this value to the reftable backend, wherein this is used to the set the `max_update_index` correctly. Add a test which creates a few thousand reference updates with multiple reflog entries, which should trigger the bug. Reported-by: brian m. carlson Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs.c | 7 +++++++ refs/refs-internal.h | 1 + refs/reftable-backend.c | 20 ++++++++++---------- t/t1460-refs-migrate.sh | 12 ++++++++++++ 4 files changed, 30 insertions(+), 10 deletions(-) diff --git a/refs.c b/refs.c index c55583986940d8..a5851a7de00878 100644 --- a/refs.c +++ b/refs.c @@ -1297,6 +1297,13 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction, update->flags &= ~REF_HAVE_OLD; update->index = index; + /* + * Reference backends may need to know the max index to optimize + * their writes. So we store the max_index on the transaction level. + */ + if (index > transaction->max_index) + transaction->max_index = index; + return 0; } diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 79b287c5ec5c7d..2aaff91ab4826a 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -203,6 +203,7 @@ struct ref_transaction { enum ref_transaction_state state; void *backend_data; unsigned int flags; + unsigned int max_index; }; /* diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index bec5962debea7b..68db2baa8f15c4 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -852,6 +852,7 @@ struct write_transaction_table_arg { size_t updates_nr; size_t updates_alloc; size_t updates_expected; + unsigned int max_index; }; struct reftable_transaction_data { @@ -1302,7 +1303,6 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data struct reftable_log_record *logs = NULL; struct ident_split committer_ident = {0}; size_t logs_nr = 0, logs_alloc = 0, i; - uint64_t max_update_index = ts; const char *committer_info; int ret = 0; @@ -1312,7 +1312,12 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data QSORT(arg->updates, arg->updates_nr, transaction_update_cmp); - reftable_writer_set_limits(writer, ts, ts); + /* + * During reflog migration, we add indexes for a single reflog with + * multiple entries. Each entry will contain a different update_index, + * so set the limits accordingly. + */ + reftable_writer_set_limits(writer, ts, ts + arg->max_index); for (i = 0; i < arg->updates_nr; i++) { struct reftable_transaction_update *tx_update = &arg->updates[i]; @@ -1414,12 +1419,6 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data */ log->update_index = ts + u->index; - /* - * Note the max update_index so the limit can be set later on. - */ - if (log->update_index > max_update_index) - max_update_index = log->update_index; - log->refname = xstrdup(u->refname); memcpy(log->value.update.new_hash, u->new_oid.hash, GIT_MAX_RAWSZ); @@ -1483,8 +1482,6 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data * and log blocks. */ if (logs) { - reftable_writer_set_limits(writer, ts, max_update_index); - ret = reftable_writer_add_logs(writer, logs, logs_nr); if (ret < 0) goto done; @@ -1505,6 +1502,9 @@ static int reftable_be_transaction_finish(struct ref_store *ref_store UNUSED, struct reftable_transaction_data *tx_data = transaction->backend_data; int ret = 0; + if (tx_data->args) + tx_data->args->max_index = transaction->max_index; + for (size_t i = 0; i < tx_data->args_nr; i++) { ret = reftable_addition_add(tx_data->args[i].addition, write_transaction_table, &tx_data->args[i]); diff --git a/t/t1460-refs-migrate.sh b/t/t1460-refs-migrate.sh index f59bc4860f19c4..307b2998efe6ac 100755 --- a/t/t1460-refs-migrate.sh +++ b/t/t1460-refs-migrate.sh @@ -227,6 +227,18 @@ do done done +test_expect_success 'multiple reftable blocks with multiple entries' ' + test_when_finished "rm -rf repo" && + git init --ref-format=files repo && + test_commit -C repo first && + printf "create refs/heads/ref-%d HEAD\n" $(test_seq 5000) >stdin && + git -C repo update-ref --stdin stdin && + git -C repo update-ref --stdin Date: Sat, 18 Jan 2025 18:11:34 +0100 Subject: [PATCH 14/28] ref-filter: move ahead-behind bases into used_atom MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit verify_ref_format() parses a ref-filter format string and stores recognized items in the static array "used_atom". For "ahead-behind:" it stores the committish part in a string_list member "bases" of struct ref_format. ref_sorting_options() also parses bare ref-filter format items and stores stores recognized ones in "used_atom" as well. The committish parts go to a dummy struct ref_format in parse_sorting_atom(), though, and are leaked and forgotten. If verify_ref_format() is called before ref_sorting_options(), like in git for-each-ref, then all works well if the sort key is included in the format string. If it isn't then sorting cannot work as the committishes are missing. If ref_sorting_options() is called first, like in git branch, then we have the additional issue that if the sort key is included in the format string then filter_ahead_behind() can't see its committish, will not generate any results for it and thus it will be expanded to an empty string. Fix those issues by replacing the string_list with a field in used_atom for storing the committish. This way it can be shared for handling both ref-filter format strings and sorting options in the same command. Reported-by: Ross Goldberg Helped-by: Jeff King Signed-off-by: René Scharfe Signed-off-by: Junio C Hamano --- builtin/branch.c | 2 +- ref-filter.c | 50 ++++++++++++++++++++++++---------------- ref-filter.h | 5 ---- t/t3203-branch-output.sh | 28 ++++++++++++++++++++++ 4 files changed, 59 insertions(+), 26 deletions(-) diff --git a/builtin/branch.c b/builtin/branch.c index 6e7b0cfddbd607..fbb9536282dcaa 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -473,7 +473,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin if (verify_ref_format(format)) die(_("unable to parse format string")); - filter_ahead_behind(the_repository, format, &array); + filter_ahead_behind(the_repository, &array); ref_array_sort(sorting, &array); if (column_active(colopts)) { diff --git a/ref-filter.c b/ref-filter.c index 23054694c2c960..250e412c47181a 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -235,6 +235,9 @@ static struct used_atom { enum { S_BARE, S_GRADE, S_SIGNER, S_KEY, S_FINGERPRINT, S_PRI_KEY_FP, S_TRUST_LEVEL } option; } signature; + struct { + struct commit *commit; + } base; struct strvec describe_args; struct refname_atom refname; char *head; @@ -891,18 +894,15 @@ static int rest_atom_parser(struct ref_format *format UNUSED, return 0; } -static int ahead_behind_atom_parser(struct ref_format *format, - struct used_atom *atom UNUSED, +static int ahead_behind_atom_parser(struct ref_format *format UNUSED, + struct used_atom *atom, const char *arg, struct strbuf *err) { - struct string_list_item *item; - if (!arg) return strbuf_addf_ret(err, -1, _("expected format: %%(ahead-behind:)")); - item = string_list_append(&format->bases, arg); - item->util = lookup_commit_reference_by_name(arg); - if (!item->util) + atom->u.base.commit = lookup_commit_reference_by_name(arg); + if (!atom->u.base.commit) die("failed to find '%s'", arg); return 0; @@ -3084,22 +3084,30 @@ static void reach_filter(struct ref_array *array, } void filter_ahead_behind(struct repository *r, - struct ref_format *format, struct ref_array *array) { struct commit **commits; - size_t commits_nr = format->bases.nr + array->nr; + size_t bases_nr, commits_nr; - if (!format->bases.nr || !array->nr) + if (!array->nr) return; - ALLOC_ARRAY(commits, commits_nr); - for (size_t i = 0; i < format->bases.nr; i++) - commits[i] = format->bases.items[i].util; + for (size_t i = bases_nr = 0; i < used_atom_cnt; i++) { + if (used_atom[i].atom_type == ATOM_AHEADBEHIND) + bases_nr++; + } + if (!bases_nr) + return; - ALLOC_ARRAY(array->counts, st_mult(format->bases.nr, array->nr)); + ALLOC_ARRAY(commits, st_add(bases_nr, array->nr)); + for (size_t i = 0, j = 0; i < used_atom_cnt; i++) { + if (used_atom[i].atom_type == ATOM_AHEADBEHIND) + commits[j++] = used_atom[i].u.base.commit; + } - commits_nr = format->bases.nr; + ALLOC_ARRAY(array->counts, st_mult(bases_nr, array->nr)); + + commits_nr = bases_nr; array->counts_nr = 0; for (size_t i = 0; i < array->nr; i++) { const char *name = array->items[i]->refname; @@ -3108,8 +3116,8 @@ void filter_ahead_behind(struct repository *r, if (!commits[commits_nr]) continue; - CALLOC_ARRAY(array->items[i]->counts, format->bases.nr); - for (size_t j = 0; j < format->bases.nr; j++) { + CALLOC_ARRAY(array->items[i]->counts, bases_nr); + for (size_t j = 0; j < bases_nr; j++) { struct ahead_behind_count *count; count = &array->counts[array->counts_nr++]; count->tip_index = commits_nr; @@ -3277,9 +3285,12 @@ static inline int can_do_iterative_format(struct ref_filter *filter, * - filtering on reachability * - including ahead-behind information in the formatted output */ + for (size_t i = 0; i < used_atom_cnt; i++) { + if (used_atom[i].atom_type == ATOM_AHEADBEHIND) + return 0; + } return !(filter->reachable_from || filter->unreachable_from || - format->bases.nr || format->is_base_tips.nr); } @@ -3303,7 +3314,7 @@ void filter_and_format_refs(struct ref_filter *filter, unsigned int type, } else { struct ref_array array = { 0 }; filter_refs(&array, filter, type); - filter_ahead_behind(the_repository, format, &array); + filter_ahead_behind(the_repository, &array); filter_is_base(the_repository, format, &array); ref_array_sort(sorting, &array); print_formatted_ref_array(&array, format); @@ -3647,7 +3658,6 @@ void ref_format_init(struct ref_format *format) void ref_format_clear(struct ref_format *format) { - string_list_clear(&format->bases, 0); string_list_clear(&format->is_base_tips, 0); ref_format_init(format); } diff --git a/ref-filter.h b/ref-filter.h index 754038ab078669..5f3dd6c9318915 100644 --- a/ref-filter.h +++ b/ref-filter.h @@ -99,9 +99,6 @@ struct ref_format { /* Internal state to ref-filter */ int need_color_reset_at_eol; - /* List of bases for ahead-behind counts. */ - struct string_list bases; - /* List of bases for is-base indicators. */ struct string_list is_base_tips; @@ -117,7 +114,6 @@ struct ref_format { } #define REF_FORMAT_INIT { \ .use_color = -1, \ - .bases = STRING_LIST_INIT_DUP, \ .is_base_tips = STRING_LIST_INIT_DUP, \ } @@ -205,7 +201,6 @@ struct ref_array_item *ref_array_push(struct ref_array *array, * If this is not called, then any ahead-behind atoms will be blank. */ void filter_ahead_behind(struct repository *r, - struct ref_format *format, struct ref_array *array); /* diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh index 500c9d0e727a29..a6bd88a58d0a8c 100755 --- a/t/t3203-branch-output.sh +++ b/t/t3203-branch-output.sh @@ -368,6 +368,34 @@ test_expect_success 'git branch --format with ahead-behind' ' test_cmp expect actual ' +test_expect_success 'git branch `--sort=[-]ahead-behind` option' ' + cat >expect <<-\EOF && + (HEAD detached from fromtag) 0 0 + refs/heads/ambiguous 0 0 + refs/heads/branch-two 0 0 + refs/heads/branch-one 1 0 + refs/heads/main 1 0 + refs/heads/ref-to-branch 1 0 + refs/heads/ref-to-remote 1 0 + EOF + git branch --format="%(refname) %(ahead-behind:HEAD)" \ + --sort=refname --sort=ahead-behind:HEAD >actual && + test_cmp expect actual && + + cat >expect <<-\EOF && + (HEAD detached from fromtag) 0 0 + refs/heads/branch-one 1 0 + refs/heads/main 1 0 + refs/heads/ref-to-branch 1 0 + refs/heads/ref-to-remote 1 0 + refs/heads/ambiguous 0 0 + refs/heads/branch-two 0 0 + EOF + git branch --format="%(refname) %(ahead-behind:HEAD)" \ + --sort=refname --sort=-ahead-behind:HEAD >actual && + test_cmp expect actual +' + test_expect_success 'git branch with --format=%(rest) must fail' ' test_must_fail git branch --format="%(rest)" >actual ' From 7ee4fd18ace71d187ee3ea5ba745a6a3493e0e0a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= Date: Sat, 18 Jan 2025 18:11:51 +0100 Subject: [PATCH 15/28] ref-filter: move is-base tip to used_atom MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The string_list "is_base_tips" in struct ref_format stores the committish part of "is-base:". It has the same problems that its sibling string_list "bases" had. Fix them the same way as the previous commit did for the latter, by replacing the string_list with fields in "used_atom". Helped-by: Jeff King Signed-off-by: René Scharfe Signed-off-by: Junio C Hamano --- ref-filter.c | 56 +++++++++++++++++++++++++------------------ ref-filter.h | 5 ---- t/t6600-test-reach.sh | 29 ++++++++++++++++++++++ 3 files changed, 62 insertions(+), 28 deletions(-) diff --git a/ref-filter.c b/ref-filter.c index 250e412c47181a..1c474224a04240 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -236,6 +236,7 @@ static struct used_atom { S_FINGERPRINT, S_PRI_KEY_FP, S_TRUST_LEVEL } option; } signature; struct { + char *name; struct commit *commit; } base; struct strvec describe_args; @@ -908,18 +909,16 @@ static int ahead_behind_atom_parser(struct ref_format *format UNUSED, return 0; } -static int is_base_atom_parser(struct ref_format *format, - struct used_atom *atom UNUSED, +static int is_base_atom_parser(struct ref_format *format UNUSED, + struct used_atom *atom, const char *arg, struct strbuf *err) { - struct string_list_item *item; - if (!arg) return strbuf_addf_ret(err, -1, _("expected format: %%(is-base:)")); - item = string_list_append(&format->is_base_tips, arg); - item->util = lookup_commit_reference_by_name(arg); - if (!item->util) + atom->u.base.name = xstrdup(arg); + atom->u.base.commit = lookup_commit_reference_by_name(arg); + if (!atom->u.base.commit) die("failed to find '%s'", arg); return 0; @@ -3009,6 +3008,8 @@ void ref_array_clear(struct ref_array *array) free(atom->u.head); else if (atom->atom_type == ATOM_DESCRIBE) strvec_clear(&atom->u.describe_args); + else if (atom->atom_type == ATOM_ISBASE) + free(atom->u.base.name); else if (atom->atom_type == ATOM_TRAILERS || (atom->atom_type == ATOM_CONTENTS && atom->u.contents.option == C_TRAILERS)) { @@ -3133,14 +3134,20 @@ void filter_ahead_behind(struct repository *r, } void filter_is_base(struct repository *r, - struct ref_format *format, struct ref_array *array) { struct commit **bases; - size_t bases_nr = 0; + size_t bases_nr = 0, is_base_nr; struct ref_array_item **back_index; - if (!format->is_base_tips.nr || !array->nr) + if (!array->nr) + return; + + for (size_t i = is_base_nr = 0; i < used_atom_cnt; i++) { + if (used_atom[i].atom_type == ATOM_ISBASE) + is_base_nr++; + } + if (!is_base_nr) return; CALLOC_ARRAY(back_index, array->nr); @@ -3150,7 +3157,7 @@ void filter_is_base(struct repository *r, const char *name = array->items[i]->refname; struct commit *c = lookup_commit_reference_by_name_gently(name, 1); - CALLOC_ARRAY(array->items[i]->is_base, format->is_base_tips.nr); + CALLOC_ARRAY(array->items[i]->is_base, is_base_nr); if (!c) continue; @@ -3160,15 +3167,20 @@ void filter_is_base(struct repository *r, bases_nr++; } - for (size_t i = 0; i < format->is_base_tips.nr; i++) { - struct commit *tip = format->is_base_tips.items[i].util; - int base_index = get_branch_base_for_tip(r, tip, bases, bases_nr); + for (size_t i = 0, j = 0; i < used_atom_cnt; i++) { + struct commit *tip; + int base_index; + + if (used_atom[i].atom_type != ATOM_ISBASE) + continue; + tip = used_atom[i].u.base.commit; + base_index = get_branch_base_for_tip(r, tip, bases, bases_nr); if (base_index < 0) continue; /* Store the string for use in output later. */ - back_index[base_index]->is_base[i] = xstrdup(format->is_base_tips.items[i].string); + back_index[base_index]->is_base[j++] = xstrdup(used_atom[i].u.base.name); } free(back_index); @@ -3260,8 +3272,7 @@ struct ref_sorting { }; static inline int can_do_iterative_format(struct ref_filter *filter, - struct ref_sorting *sorting, - struct ref_format *format) + struct ref_sorting *sorting) { /* * Reference backends sort patterns lexicographically by refname, so if @@ -3288,17 +3299,17 @@ static inline int can_do_iterative_format(struct ref_filter *filter, for (size_t i = 0; i < used_atom_cnt; i++) { if (used_atom[i].atom_type == ATOM_AHEADBEHIND) return 0; + if (used_atom[i].atom_type == ATOM_ISBASE) + return 0; } - return !(filter->reachable_from || - filter->unreachable_from || - format->is_base_tips.nr); + return !(filter->reachable_from || filter->unreachable_from); } void filter_and_format_refs(struct ref_filter *filter, unsigned int type, struct ref_sorting *sorting, struct ref_format *format) { - if (can_do_iterative_format(filter, sorting, format)) { + if (can_do_iterative_format(filter, sorting)) { int save_commit_buffer_orig; struct ref_filter_and_format_cbdata ref_cbdata = { .filter = filter, @@ -3315,7 +3326,7 @@ void filter_and_format_refs(struct ref_filter *filter, unsigned int type, struct ref_array array = { 0 }; filter_refs(&array, filter, type); filter_ahead_behind(the_repository, &array); - filter_is_base(the_repository, format, &array); + filter_is_base(the_repository, &array); ref_array_sort(sorting, &array); print_formatted_ref_array(&array, format); ref_array_clear(&array); @@ -3658,6 +3669,5 @@ void ref_format_init(struct ref_format *format) void ref_format_clear(struct ref_format *format) { - string_list_clear(&format->is_base_tips, 0); ref_format_init(format); } diff --git a/ref-filter.h b/ref-filter.h index 5f3dd6c9318915..0ba94df65198a2 100644 --- a/ref-filter.h +++ b/ref-filter.h @@ -99,9 +99,6 @@ struct ref_format { /* Internal state to ref-filter */ int need_color_reset_at_eol; - /* List of bases for is-base indicators. */ - struct string_list is_base_tips; - struct { int max_count; int omit_empty; @@ -114,7 +111,6 @@ struct ref_format { } #define REF_FORMAT_INIT { \ .use_color = -1, \ - .is_base_tips = STRING_LIST_INIT_DUP, \ } /* Macros for checking --merged and --no-merged options */ @@ -210,7 +206,6 @@ void filter_ahead_behind(struct repository *r, * If this is not called, then any is-base atoms will be blank. */ void filter_is_base(struct repository *r, - struct ref_format *format, struct ref_array *array); void ref_filter_init(struct ref_filter *filter); diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh index 2591f8b8b39bf4..6638d1aa1dcebe 100755 --- a/t/t6600-test-reach.sh +++ b/t/t6600-test-reach.sh @@ -733,4 +733,33 @@ test_expect_success 'for-each-ref is-base:multiple' ' --format="%(refname)[%(is-base:commit-2-3)-%(is-base:commit-6-5)]" --stdin ' +test_expect_success 'for-each-ref is-base: --sort' ' + cat >input <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-4-2 + refs/heads/commit-4-4 + refs/heads/commit-8-4 + EOF + + cat >expect <<-\EOF && + refs/heads/commit-1-1 + refs/heads/commit-4-4 + refs/heads/commit-8-4 + refs/heads/commit-4-2 + EOF + run_all_modes git for-each-ref \ + --format="%(refname)" --stdin \ + --sort=refname --sort=is-base:commit-2-3 && + + cat >expect <<-\EOF && + refs/heads/commit-4-2 + refs/heads/commit-1-1 + refs/heads/commit-4-4 + refs/heads/commit-8-4 + EOF + run_all_modes git for-each-ref \ + --format="%(refname)" --stdin \ + --sort=refname --sort=-is-base:commit-2-3 +' + test_done From c5490ce9d1b625516b17253d1d2d0352730b7b84 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= Date: Sat, 18 Jan 2025 18:11:59 +0100 Subject: [PATCH 16/28] ref-filter: remove ref_format_clear() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Now that ref_format_clear() no longer releases any memory we don't need it anymore. Remove it and its counterpart, ref_format_init(). Signed-off-by: René Scharfe Signed-off-by: Junio C Hamano --- builtin/branch.c | 1 - builtin/for-each-ref.c | 1 - builtin/tag.c | 1 - builtin/verify-tag.c | 1 - ref-filter.c | 11 ----------- ref-filter.h | 3 --- 6 files changed, 18 deletions(-) diff --git a/builtin/branch.c b/builtin/branch.c index fbb9536282dcaa..9a29de5bf1dd41 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -884,7 +884,6 @@ int cmd_branch(int argc, string_list_clear(&output, 0); ref_sorting_release(sorting); ref_filter_clear(&filter); - ref_format_clear(&format); ret = 0; goto out; diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c index 715745a262aa35..8085ebd8fe97b5 100644 --- a/builtin/for-each-ref.c +++ b/builtin/for-each-ref.c @@ -108,7 +108,6 @@ int cmd_for_each_ref(int argc, filter_and_format_refs(&filter, flags, sorting, &format); ref_filter_clear(&filter); - ref_format_clear(&format); ref_sorting_release(sorting); strvec_clear(&vec); return 0; diff --git a/builtin/tag.c b/builtin/tag.c index c4bd1458318900..e8a344b9264b9c 100644 --- a/builtin/tag.c +++ b/builtin/tag.c @@ -698,7 +698,6 @@ int cmd_tag(int argc, cleanup: ref_sorting_release(sorting); ref_filter_clear(&filter); - ref_format_clear(&format); strbuf_release(&buf); strbuf_release(&ref); strbuf_release(&reflog_msg); diff --git a/builtin/verify-tag.c b/builtin/verify-tag.c index a7f20618ffd528..f6b97048a57d37 100644 --- a/builtin/verify-tag.c +++ b/builtin/verify-tag.c @@ -69,6 +69,5 @@ int cmd_verify_tag(int argc, if (format.format) pretty_print_ref(name, &oid, &format); } - ref_format_clear(&format); return had_error; } diff --git a/ref-filter.c b/ref-filter.c index 1c474224a04240..2ec72d66bbb256 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -3660,14 +3660,3 @@ void ref_filter_clear(struct ref_filter *filter) free_commit_list(filter->unreachable_from); ref_filter_init(filter); } - -void ref_format_init(struct ref_format *format) -{ - struct ref_format blank = REF_FORMAT_INIT; - memcpy(format, &blank, sizeof(blank)); -} - -void ref_format_clear(struct ref_format *format) -{ - ref_format_init(format); -} diff --git a/ref-filter.h b/ref-filter.h index 0ba94df65198a2..013d4cfa64b310 100644 --- a/ref-filter.h +++ b/ref-filter.h @@ -211,7 +211,4 @@ void filter_is_base(struct repository *r, void ref_filter_init(struct ref_filter *filter); void ref_filter_clear(struct ref_filter *filter); -void ref_format_init(struct ref_format *format); -void ref_format_clear(struct ref_format *format); - #endif /* REF_FILTER_H */ From 75bc40de27966ec02fc0fb91b875a47fff022f9b Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Sun, 19 Jan 2025 08:23:08 -0500 Subject: [PATCH 17/28] bswap.h: squelch potential sparse -Wcast-truncate warnings In put_be32(), we right-shift a uint32_t value various amounts and then assign the low 8-bits to individual "unsigned char" bytes, throwing away the high bits. For shifts smaller than 24 bits, those thrown away bits will be arbitrary bits from the original uint32_t. This works exactly as we want, but if you feed a constant, then sparse complains. For example if we write this (which we plan to do in a future patch): put_be32(hdr, PACK_SIGNATURE); then "make sparse" produces: compat/bswap.h:175:22: error: cast truncates bits from constant value (5041 becomes 41) compat/bswap.h:176:22: error: cast truncates bits from constant value (504143 becomes 43) compat/bswap.h:177:22: error: cast truncates bits from constant value (5041434b becomes 4b) And the same issue exists in the other put_be*() functions, when used with a constant. We can silence this warning by explicitly masking off the truncated bits. The compiler is smart enough to know the result is the same, and the asm generated by gcc (with both -O0 and -O2) is identical. Curiously this line already exists: put_be32(&hdr_version, INDEX_EXTENSION_VERSION2); in the fsmonitor.c file, but it does not get flagged because the CPP macro expands to a small integer (2). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano Signed-off-by: Johannes Schindelin --- compat/bswap.h | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/compat/bswap.h b/compat/bswap.h index 512f6f4b9937c8..b34054f2bd7284 100644 --- a/compat/bswap.h +++ b/compat/bswap.h @@ -171,23 +171,23 @@ static inline uint64_t get_be64(const void *ptr) static inline void put_be32(void *ptr, uint32_t value) { unsigned char *p = ptr; - p[0] = value >> 24; - p[1] = value >> 16; - p[2] = value >> 8; - p[3] = value >> 0; + p[0] = (value >> 24) & 0xff; + p[1] = (value >> 16) & 0xff; + p[2] = (value >> 8) & 0xff; + p[3] = (value >> 0) & 0xff; } static inline void put_be64(void *ptr, uint64_t value) { unsigned char *p = ptr; - p[0] = value >> 56; - p[1] = value >> 48; - p[2] = value >> 40; - p[3] = value >> 32; - p[4] = value >> 24; - p[5] = value >> 16; - p[6] = value >> 8; - p[7] = value >> 0; + p[0] = (value >> 56) & 0xff; + p[1] = (value >> 48) & 0xff; + p[2] = (value >> 40) & 0xff; + p[3] = (value >> 32) & 0xff; + p[4] = (value >> 24) & 0xff; + p[5] = (value >> 16) & 0xff; + p[6] = (value >> 8) & 0xff; + p[7] = (value >> 0) & 0xff; } #endif /* COMPAT_BSWAP_H */ From b3c9b6138373ccc8ca3568593b3ae0fe72c25ad0 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Sun, 19 Jan 2025 08:23:37 -0500 Subject: [PATCH 18/28] packfile: factor out --pack_header argument parsing Both index-pack and unpack-objects accept a --pack_header argument. This is an undocumented internal argument used by receive-pack and fetch to pass along information about the header of the pack, which they've already read from the incoming stream. In preparation for a bugfix, let's factor the duplicated code into a common helper. The callers are still responsible for identifying the option. While this could likewise be factored out, it is more flexible this way (e.g., if they ever started using parse-options and wanted to handle both the stuck and unstuck forms). Likewise, the callers are responsible for reporting errors, though they both just call die(). I've tweaked unpack-objects to match index-pack in marking the error for translation. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano Signed-off-by: Johannes Schindelin --- builtin/index-pack.c | 14 +++----------- builtin/unpack-objects.c | 16 ++++------------ packfile.c | 17 +++++++++++++++++ packfile.h | 6 ++++++ 4 files changed, 30 insertions(+), 23 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 763b01372aade4..bab42dfc2a491a 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1801,18 +1801,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) nr_threads = 1; } } else if (starts_with(arg, "--pack_header=")) { - struct pack_header *hdr; - char *c; - - hdr = (struct pack_header *)input_buffer; - hdr->hdr_signature = htonl(PACK_SIGNATURE); - hdr->hdr_version = htonl(strtoul(arg + 14, &c, 10)); - if (*c != ',') - die(_("bad %s"), arg); - hdr->hdr_entries = htonl(strtoul(c + 1, &c, 10)); - if (*c) + if (parse_pack_header_option(arg + 14, + input_buffer, + &input_len) < 0) die(_("bad %s"), arg); - input_len = sizeof(*hdr); } else if (!strcmp(arg, "-v")) { verbose = 1; } else if (!strcmp(arg, "--progress-title")) { diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index 08fa2a7a743dc9..31614472749bf9 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -15,6 +15,7 @@ #include "progress.h" #include "decorate.h" #include "fsck.h" +#include "packfile.h" static int dry_run, quiet, recover, has_errors, strict; static const char unpack_usage[] = "git unpack-objects [-n] [-q] [-r] [--strict]"; @@ -639,18 +640,9 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix UNUSED) continue; } if (starts_with(arg, "--pack_header=")) { - struct pack_header *hdr; - char *c; - - hdr = (struct pack_header *)buffer; - hdr->hdr_signature = htonl(PACK_SIGNATURE); - hdr->hdr_version = htonl(strtoul(arg + 14, &c, 10)); - if (*c != ',') - die("bad %s", arg); - hdr->hdr_entries = htonl(strtoul(c + 1, &c, 10)); - if (*c) - die("bad %s", arg); - len = sizeof(*hdr); + if (parse_pack_header_option(arg + 14, + buffer, &len) < 0) + die(_("bad %s"), arg); continue; } if (skip_prefix(arg, "--max-input-size=", &arg)) { diff --git a/packfile.c b/packfile.c index 813584646f762a..e2bdadc7cbd44b 100644 --- a/packfile.c +++ b/packfile.c @@ -2294,3 +2294,20 @@ int is_promisor_object(const struct object_id *oid) } return oidset_contains(&promisor_objects, oid); } + +int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *len) +{ + struct pack_header *hdr; + char *c; + + hdr = (struct pack_header *)out; + hdr->hdr_signature = htonl(PACK_SIGNATURE); + hdr->hdr_version = htonl(strtoul(in, &c, 10)); + if (*c != ',') + return -1; + hdr->hdr_entries = htonl(strtoul(c + 1, &c, 10)); + if (*c) + return -1; + *len = sizeof(*hdr); + return 0; +} diff --git a/packfile.h b/packfile.h index eb18ec15dbf3bc..41f38b4832bf91 100644 --- a/packfile.h +++ b/packfile.h @@ -210,4 +210,10 @@ int is_promisor_object(const struct object_id *oid); int load_idx(const char *path, const unsigned int hashsz, void *idx_map, size_t idx_size, struct packed_git *p); +/* + * Parse a --pack_header option as accepted by index-pack and unpack-objects, + * turning it into the matching bytes we'd find in a pack. + */ +int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *len); + #endif From 56c5e82ca8e351bb2093a804e6684a72dd392125 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Sun, 19 Jan 2025 08:23:44 -0500 Subject: [PATCH 19/28] parse_pack_header_option(): avoid unaligned memory writes In order to recreate a pack header in our in-memory buffer, we cast the buffer to a "struct pack_header" and assign the individual fields. This is reported to cause SIGBUS on sparc64 due to alignment issues. We can work around this by using put_be32() which will write individual bytes into the buffer. Reported-by: Koakuma Signed-off-by: Jeff King Signed-off-by: Junio C Hamano Signed-off-by: Johannes Schindelin --- packfile.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/packfile.c b/packfile.c index e2bdadc7cbd44b..93f771ad95c949 100644 --- a/packfile.c +++ b/packfile.c @@ -2297,17 +2297,20 @@ int is_promisor_object(const struct object_id *oid) int parse_pack_header_option(const char *in, unsigned char *out, unsigned int *len) { - struct pack_header *hdr; + unsigned char *hdr; char *c; - hdr = (struct pack_header *)out; - hdr->hdr_signature = htonl(PACK_SIGNATURE); - hdr->hdr_version = htonl(strtoul(in, &c, 10)); + hdr = out; + put_be32(hdr, PACK_SIGNATURE); + hdr += 4; + put_be32(hdr, strtoul(in, &c, 10)); + hdr += 4; if (*c != ',') return -1; - hdr->hdr_entries = htonl(strtoul(c + 1, &c, 10)); + put_be32(hdr, strtoul(c + 1, &c, 10)); + hdr += 4; if (*c) return -1; - *len = sizeof(*hdr); + *len = hdr - out; return 0; } From 7215d586d4139a7f30039a4d106f3965ae8f6071 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Sun, 19 Jan 2025 08:25:47 -0500 Subject: [PATCH 20/28] index-pack, unpack-objects: use get_be32() for reading pack header Both of these commands read the incoming pack into a static unsigned char buffer in BSS, and then parse it by casting the start of the buffer to a struct pack_header. This can result in SIGBUS on some platforms if the compiler doesn't place the buffer in a position that is properly aligned for 4-byte integers. This reportedly happens with unpack-objects (but not index-pack) on sparc64 when compiled with clang (but not gcc). But we are definitely in the wrong in both spots; since the buffer's type is unsigned char, we can't depend on larger alignment. When it works it is only because we are lucky. We'll fix this by switching to get_be32() to read the headers (just like the last few commits similarly switched us to put_be32() for writing into the same buffer). It would be nice to factor this out into a common helper function, but the interface ends up quite awkward. Either the caller needs to hardcode how many bytes we'll need, or it needs to pass us its fill()/use() functions as pointers. So I've just fixed both spots in the same way; this is not code that is likely to be repeated a third time (most of the pack reading code uses an mmap'd buffer, which should be properly aligned). I did make one tweak to the shared code: our pack_version_ok() macro expects us to pass the big-endian value we'd get by casting. We can introduce a "native" variant which uses the host integer ordering. Reported-by: Koakuma Signed-off-by: Jeff King Signed-off-by: Junio C Hamano Signed-off-by: Johannes Schindelin --- builtin/index-pack.c | 12 +++++++----- builtin/unpack-objects.c | 13 +++++++------ pack.h | 3 ++- 3 files changed, 16 insertions(+), 12 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index bab42dfc2a491a..5f0ff1ce04a22f 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -363,16 +363,18 @@ static const char *open_pack_file(const char *pack_name) static void parse_pack_header(void) { - struct pack_header *hdr = fill(sizeof(struct pack_header)); + unsigned char *hdr = fill(sizeof(struct pack_header)); /* Header consistency check */ - if (hdr->hdr_signature != htonl(PACK_SIGNATURE)) + if (get_be32(hdr) != PACK_SIGNATURE) die(_("pack signature mismatch")); - if (!pack_version_ok(hdr->hdr_version)) + hdr += 4; + if (!pack_version_ok_native(get_be32(hdr))) die(_("pack version %"PRIu32" unsupported"), - ntohl(hdr->hdr_version)); + get_be32(hdr)); + hdr += 4; - nr_objects = ntohl(hdr->hdr_entries); + nr_objects = get_be32(hdr); use(sizeof(struct pack_header)); } diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index 31614472749bf9..fc3de6dac8913e 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -576,15 +576,16 @@ static void unpack_one(unsigned nr) static void unpack_all(void) { int i; - struct pack_header *hdr = fill(sizeof(struct pack_header)); + unsigned char *hdr = fill(sizeof(struct pack_header)); - nr_objects = ntohl(hdr->hdr_entries); - - if (ntohl(hdr->hdr_signature) != PACK_SIGNATURE) + if (get_be32(hdr) != PACK_SIGNATURE) die("bad pack file"); - if (!pack_version_ok(hdr->hdr_version)) + hdr += 4; + if (!pack_version_ok_native(get_be32(hdr))) die("unknown pack file version %"PRIu32, - ntohl(hdr->hdr_version)); + get_be32(hdr)); + hdr += 4; + nr_objects = get_be32(hdr); use(sizeof(struct pack_header)); if (!quiet) diff --git a/pack.h b/pack.h index 3ab9e3f60c0b03..33b42fdad597dc 100644 --- a/pack.h +++ b/pack.h @@ -13,7 +13,8 @@ struct repository; */ #define PACK_SIGNATURE 0x5041434b /* "PACK" */ #define PACK_VERSION 2 -#define pack_version_ok(v) ((v) == htonl(2) || (v) == htonl(3)) +#define pack_version_ok(v) pack_version_ok_native(ntohl(v)) +#define pack_version_ok_native(v) ((v) == 2 || (v) == 3) struct pack_header { uint32_t hdr_signature; uint32_t hdr_version; From f2d9cf95f7f9f84d54e1a9f96f867e818fcb6a71 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Sun, 19 Jan 2025 08:25:53 -0500 Subject: [PATCH 21/28] index-pack, unpack-objects: use skip_prefix to avoid magic number When parsing --pack_header=, we manually skip 14 bytes to the data. Let's use skip_prefix() to do this automatically. Note that we overwrite our pointer to the front of the string, so we have to add more context to the error message. We could avoid this by declaring an extra pointer to hold the value, but I think the modified message is actually preferable; it should give translators a bit more context. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano Signed-off-by: Johannes Schindelin --- builtin/index-pack.c | 6 +++--- builtin/unpack-objects.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 5f0ff1ce04a22f..d80e05458121ea 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1802,11 +1802,11 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) warning(_("no threads support, ignoring %s"), arg); nr_threads = 1; } - } else if (starts_with(arg, "--pack_header=")) { - if (parse_pack_header_option(arg + 14, + } else if (skip_prefix(arg, "--pack_header=", &arg)) { + if (parse_pack_header_option(arg, input_buffer, &input_len) < 0) - die(_("bad %s"), arg); + die(_("bad --pack_header: %s"), arg); } else if (!strcmp(arg, "-v")) { verbose = 1; } else if (!strcmp(arg, "--progress-title")) { diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index fc3de6dac8913e..028cfe175e035e 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -640,10 +640,10 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix UNUSED) fsck_set_msg_types(&fsck_options, arg); continue; } - if (starts_with(arg, "--pack_header=")) { - if (parse_pack_header_option(arg + 14, + if (skip_prefix(arg, "--pack_header=", &arg)) { + if (parse_pack_header_option(arg, buffer, &len) < 0) - die(_("bad %s"), arg); + die(_("bad --pack_header: %s"), arg); continue; } if (skip_prefix(arg, "--max-input-size=", &arg)) { From 66387793677c660bbafeb8bbf1ec2ac914e6a1d0 Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Tue, 21 Jan 2025 04:34:10 +0100 Subject: [PATCH 22/28] refs: mark `ref_transaction_update_reflog()` as static The `ref_transaction_update_reflog()` function is only used within 'refs.c', so mark it as static. Reported-by: Junio C Hamano Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs.c | 22 +++++++++++++++------- refs.h | 14 -------------- 2 files changed, 15 insertions(+), 21 deletions(-) diff --git a/refs.c b/refs.c index a5851a7de00878..4c67c95fbed0c9 100644 --- a/refs.c +++ b/refs.c @@ -1270,13 +1270,21 @@ int ref_transaction_update(struct ref_transaction *transaction, return 0; } -int ref_transaction_update_reflog(struct ref_transaction *transaction, - const char *refname, - const struct object_id *new_oid, - const struct object_id *old_oid, - const char *committer_info, unsigned int flags, - const char *msg, unsigned int index, - struct strbuf *err) +/* + * Similar to`ref_transaction_update`, but this function is only for adding + * a reflog update. Supports providing custom committer information. The index + * field can be utiltized to order updates as desired. When not used, the + * updates default to being ordered by refname. + */ +static int ref_transaction_update_reflog(struct ref_transaction *transaction, + const char *refname, + const struct object_id *new_oid, + const struct object_id *old_oid, + const char *committer_info, + unsigned int flags, + const char *msg, + unsigned int index, + struct strbuf *err) { struct ref_update *update; diff --git a/refs.h b/refs.h index b0dfc65ed2e59c..a5bedf48cf6de9 100644 --- a/refs.h +++ b/refs.h @@ -727,20 +727,6 @@ int ref_transaction_update(struct ref_transaction *transaction, unsigned int flags, const char *msg, struct strbuf *err); -/* - * Similar to`ref_transaction_update`, but this function is only for adding - * a reflog update. Supports providing custom committer information. The index - * field can be utiltized to order updates as desired. When not used, the - * updates default to being ordered by refname. - */ -int ref_transaction_update_reflog(struct ref_transaction *transaction, - const char *refname, - const struct object_id *new_oid, - const struct object_id *old_oid, - const char *committer_info, unsigned int flags, - const char *msg, unsigned int index, - struct strbuf *err); - /* * Add a reference creation to transaction. new_oid is the value that * the reference should have after the update; it must not be From a89e12dc164f43f9e728ac096db02bd7e403bdbc Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Tue, 21 Jan 2025 04:34:11 +0100 Subject: [PATCH 23/28] refs: use 'uint64_t' for 'ref_update.index' The 'ref_update.index' variable is used to store an index for a given reference update. This index is used to order the updates in a predetermined order, while the default ordering is alphabetical as per the refname. For large repositories with millions of references, it should be safer to use 'uint64_t'. Let's do that. This also is applied for all other code sections where we store 'index' and pass it around. Reported-by: brian m. carlson Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs.c | 4 ++-- refs/refs-internal.h | 4 ++-- refs/reftable-backend.c | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/refs.c b/refs.c index 4c67c95fbed0c9..c65a554a7368ed 100644 --- a/refs.c +++ b/refs.c @@ -1283,7 +1283,7 @@ static int ref_transaction_update_reflog(struct ref_transaction *transaction, const char *committer_info, unsigned int flags, const char *msg, - unsigned int index, + uint64_t index, struct strbuf *err) { struct ref_update *update; @@ -2731,7 +2731,7 @@ static int migrate_one_ref(const char *refname, const char *referent UNUSED, con } struct reflog_migration_data { - unsigned int index; + uint64_t index; const char *refname; struct ref_store *old_refs; struct ref_transaction *transaction; diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 2aaff91ab4826a..ecb25f6d2062f2 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -120,7 +120,7 @@ struct ref_update { * when migrating reflogs and we want to ensure we carry over the * same order. */ - unsigned int index; + uint64_t index; /* * If this ref_update was split off of a symref update via @@ -203,7 +203,7 @@ struct ref_transaction { enum ref_transaction_state state; void *backend_data; unsigned int flags; - unsigned int max_index; + uint64_t max_index; }; /* diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 68db2baa8f15c4..628bfa0c5be922 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -852,7 +852,7 @@ struct write_transaction_table_arg { size_t updates_nr; size_t updates_alloc; size_t updates_expected; - unsigned int max_index; + uint64_t max_index; }; struct reftable_transaction_data { From 148560f1e3bbdb9cf27533bed43cdf116bf61d01 Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Tue, 21 Jan 2025 04:34:12 +0100 Subject: [PATCH 24/28] reftable: prevent 'update_index' changes after adding records The function `reftable_writer_set_limits()` allows updating the 'min_update_index' and 'max_update_index' of a reftable writer. These values are written to both the writer's header and footer. Since the header is written during the first block write, any subsequent changes to the update index would create a mismatch between the header and footer values. The footer would contain the newer values while the header retained the original ones. To fix this bug, prevent callers from updating these values after any record is written. To do this, modify the function to return an error whenever the limits are modified after any record adds. Check for record adds within `reftable_writer_set_limits()` by checking the `last_key` variable, which is set whenever a new record is added. Modify all callers of the function to anticipate a return type and handle it accordingly. Helped-by: Patrick Steinhardt Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs/reftable-backend.c | 20 +++++++++++++++----- reftable/reftable-error.h | 1 + reftable/reftable-writer.h | 24 ++++++++++++++---------- reftable/stack.c | 6 ++++-- reftable/writer.c | 13 +++++++++++-- t/unit-tests/t-reftable-stack.c | 8 +++++--- 6 files changed, 50 insertions(+), 22 deletions(-) diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 628bfa0c5be922..b94a6abae0e4bb 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -1317,7 +1317,9 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data * multiple entries. Each entry will contain a different update_index, * so set the limits accordingly. */ - reftable_writer_set_limits(writer, ts, ts + arg->max_index); + ret = reftable_writer_set_limits(writer, ts, ts + arg->max_index); + if (ret < 0) + goto done; for (i = 0; i < arg->updates_nr; i++) { struct reftable_transaction_update *tx_update = &arg->updates[i]; @@ -1640,7 +1642,9 @@ static int write_copy_table(struct reftable_writer *writer, void *cb_data) deletion_ts = creation_ts = reftable_stack_next_update_index(arg->stack); if (arg->delete_old) creation_ts++; - reftable_writer_set_limits(writer, deletion_ts, creation_ts); + ret = reftable_writer_set_limits(writer, deletion_ts, creation_ts); + if (ret < 0) + goto done; /* * Add the new reference. If this is a rename then we also delete the @@ -2160,7 +2164,9 @@ static int write_reflog_existence_table(struct reftable_writer *writer, if (ret <= 0) goto done; - reftable_writer_set_limits(writer, ts, ts); + ret = reftable_writer_set_limits(writer, ts, ts); + if (ret < 0) + goto done; /* * The existence entry has both old and new object ID set to the @@ -2219,7 +2225,9 @@ static int write_reflog_delete_table(struct reftable_writer *writer, void *cb_da uint64_t ts = reftable_stack_next_update_index(arg->stack); int ret; - reftable_writer_set_limits(writer, ts, ts); + ret = reftable_writer_set_limits(writer, ts, ts); + if (ret < 0) + goto out; ret = reftable_stack_init_log_iterator(arg->stack, &it); if (ret < 0) @@ -2295,7 +2303,9 @@ static int write_reflog_expiry_table(struct reftable_writer *writer, void *cb_da if (arg->records[i].value_type == REFTABLE_LOG_UPDATE) live_records++; - reftable_writer_set_limits(writer, ts, ts); + ret = reftable_writer_set_limits(writer, ts, ts); + if (ret < 0) + return ret; if (!is_null_oid(&arg->update_oid)) { struct reftable_ref_record ref = {0}; diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h index f4048265629fe4..a7e33d964d0cfe 100644 --- a/reftable/reftable-error.h +++ b/reftable/reftable-error.h @@ -30,6 +30,7 @@ enum reftable_error { /* Misuse of the API: * - on writing a record with NULL refname. + * - on writing a record before setting the writer limits. * - on writing a reftable_ref_record outside the table limits * - on writing a ref or log record before the stack's * next_update_inde*x diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h index e4fc95378835ff..46776578ef5573 100644 --- a/reftable/reftable-writer.h +++ b/reftable/reftable-writer.h @@ -109,17 +109,21 @@ int reftable_writer_new(struct reftable_writer **out, int (*flush_func)(void *), void *writer_arg, const struct reftable_write_options *opts); -/* Set the range of update indices for the records we will add. When writing a - table into a stack, the min should be at least - reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned. - - For transactional updates to a stack, typically min==max, and the - update_index can be obtained by inspeciting the stack. When converting an - existing ref database into a single reftable, this would be a range of - update-index timestamps. +/* + * Set the range of update indices for the records we will add. When writing a + * table into a stack, the min should be at least + * reftable_stack_next_update_index(), or REFTABLE_API_ERROR is returned. + * + * For transactional updates to a stack, typically min==max, and the + * update_index can be obtained by inspeciting the stack. When converting an + * existing ref database into a single reftable, this would be a range of + * update-index timestamps. + * + * The function should be called before adding any records to the writer. If not + * it will fail with REFTABLE_API_ERROR. */ -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, - uint64_t max); +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, + uint64_t max); /* Add a reftable_ref_record. The record should have names that come after diff --git a/reftable/stack.c b/reftable/stack.c index c33979536efa3a..95938ec1ef5af2 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -1045,8 +1045,10 @@ static int stack_write_compact(struct reftable_stack *st, for (size_t i = first; i <= last; i++) st->stats.bytes += st->readers[i]->size; - reftable_writer_set_limits(wr, st->readers[first]->min_update_index, - st->readers[last]->max_update_index); + err = reftable_writer_set_limits(wr, st->readers[first]->min_update_index, + st->readers[last]->max_update_index); + if (err < 0) + goto done; err = reftable_merged_table_new(&mt, st->readers + first, subtabs_len, st->opts.hash_id); diff --git a/reftable/writer.c b/reftable/writer.c index 0b88e5583b2993..0bd834090c0302 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -165,11 +165,20 @@ int reftable_writer_new(struct reftable_writer **out, return 0; } -void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, - uint64_t max) +int reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, + uint64_t max) { + /* + * The limits should be set before any records are added to the writer. + * Check if any records were added by checking if `last_key` was set. + */ + if (w->last_key.len) + return REFTABLE_API_ERROR; + w->min_update_index = min; w->max_update_index = max; + + return 0; } static void writer_release(struct reftable_writer *w) diff --git a/t/unit-tests/t-reftable-stack.c b/t/unit-tests/t-reftable-stack.c index 52b81475c36aa9..2738f449f0b252 100644 --- a/t/unit-tests/t-reftable-stack.c +++ b/t/unit-tests/t-reftable-stack.c @@ -98,7 +98,8 @@ static void t_read_file(void) static int write_test_ref(struct reftable_writer *wr, void *arg) { struct reftable_ref_record *ref = arg; - reftable_writer_set_limits(wr, ref->update_index, ref->update_index); + check(!reftable_writer_set_limits(wr, ref->update_index, + ref->update_index)); return reftable_writer_add_ref(wr, ref); } @@ -138,7 +139,8 @@ static int write_test_log(struct reftable_writer *wr, void *arg) { struct write_log_arg *wla = arg; - reftable_writer_set_limits(wr, wla->update_index, wla->update_index); + check(!reftable_writer_set_limits(wr, wla->update_index, + wla->update_index)); return reftable_writer_add_log(wr, wla->log); } @@ -956,7 +958,7 @@ static void t_reflog_expire(void) static int write_nothing(struct reftable_writer *wr, void *arg UNUSED) { - reftable_writer_set_limits(wr, 1, 1); + check(!reftable_writer_set_limits(wr, 1, 1)); return 0; } From 1edca76e805c22daeba4d7ea4193bcfe3df98ef6 Mon Sep 17 00:00:00 2001 From: Adam Murray Date: Fri, 10 Jan 2025 07:28:20 +0000 Subject: [PATCH 25/28] trace2: prevent segfault on config collection where no value specified When TRACE2 analytics is enabled, a git config option that has no value causes a segfault. Steps to Reproduce GIT_TRACE2=true GIT_TRACE2_CONFIG_PARAMS=status.* git -c status.relativePaths version Expected Result git version 2.46.0 Actual Result zsh: segmentation fault GIT_TRACE2=true This adds checks to prevent the segfault and instead return an empty value. Signed-off-by: Adam Murray Signed-off-by: Johannes Schindelin --- t/t0210-trace2-normal.sh | 8 ++++++++ trace2.c | 2 +- trace2/tr2_tgt_event.c | 3 ++- trace2/tr2_tgt_normal.c | 5 +++-- trace2/tr2_tgt_perf.c | 5 +++-- 5 files changed, 17 insertions(+), 6 deletions(-) diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh index eff9a59dbd0d5d..b95f0f8a9dc3b0 100755 --- a/t/t0210-trace2-normal.sh +++ b/t/t0210-trace2-normal.sh @@ -243,6 +243,14 @@ test_expect_success 'bug messages followed by BUG() are written to trace2' ' test_cmp expect actual ' +test_expect_success 'empty configuration values are handled' ' + test_when_finished "rm trace2.normal actual expect" && + echo >expect && + GIT_TRACE2="$(pwd)/trace2.normal" GIT_TRACE2_CONFIG_PARAMS=foo.empty \ + git -c foo.empty config foo.empty >actual && + test_cmp expect actual +' + sane_unset GIT_TRACE2_BRIEF # Now test without environment variables and get all Trace2 settings diff --git a/trace2.c b/trace2.c index 82d16e2783d986..a21b1e91dbca20 100644 --- a/trace2.c +++ b/trace2.c @@ -764,7 +764,7 @@ void trace2_def_param_fl(const char *file, int line, const char *param, if (!trace2_enabled) return; - redacted = redact_arg(value); + redacted = value ? redact_arg(value): NULL; for_each_wanted_builtin (j, tgt_j) if (tgt_j->pfn_param_fl) diff --git a/trace2/tr2_tgt_event.c b/trace2/tr2_tgt_event.c index 69ee40449fa4a7..5a0381791f7eb4 100644 --- a/trace2/tr2_tgt_event.c +++ b/trace2/tr2_tgt_event.c @@ -493,7 +493,8 @@ static void fn_param_fl(const char *file, int line, const char *param, event_fmt_prepare(event_name, file, line, NULL, &jw); jw_object_string(&jw, "scope", scope_name); jw_object_string(&jw, "param", param); - jw_object_string(&jw, "value", value); + if (value) + jw_object_string(&jw, "value", value); jw_end(&jw); tr2_dst_write_line(&tr2dst_event, &jw.json); diff --git a/trace2/tr2_tgt_normal.c b/trace2/tr2_tgt_normal.c index baef48aa6989ce..924736ab36093b 100644 --- a/trace2/tr2_tgt_normal.c +++ b/trace2/tr2_tgt_normal.c @@ -307,8 +307,9 @@ static void fn_param_fl(const char *file, int line, const char *param, enum config_scope scope = kvi->scope; const char *scope_name = config_scope_name(scope); - strbuf_addf(&buf_payload, "def_param scope:%s %s=%s", scope_name, param, - value); + strbuf_addf(&buf_payload, "def_param scope:%s %s", scope_name, param); + if (value) + strbuf_addf(&buf_payload, "=%s", value); normal_io_write_fl(file, line, &buf_payload); strbuf_release(&buf_payload); } diff --git a/trace2/tr2_tgt_perf.c b/trace2/tr2_tgt_perf.c index 298ae27f9d7f24..4eb9289f950505 100644 --- a/trace2/tr2_tgt_perf.c +++ b/trace2/tr2_tgt_perf.c @@ -448,8 +448,9 @@ static void fn_param_fl(const char *file, int line, const char *param, struct strbuf scope_payload = STRBUF_INIT; enum config_scope scope = kvi->scope; const char *scope_name = config_scope_name(scope); - - strbuf_addf(&buf_payload, "%s:%s", param, value); + strbuf_addstr(&buf_payload, param); + if (value) + strbuf_addf(&buf_payload, ":%s", value); strbuf_addf(&scope_payload, "%s:%s", "scope", scope_name); perf_io_write_fl(file, line, event_name, NULL, NULL, NULL, From b7a9905f756b42aa4d2bafc07c489fa69e4fffca Mon Sep 17 00:00:00 2001 From: Jeff King Date: Mon, 13 Jan 2025 01:26:01 -0500 Subject: [PATCH 26/28] grep: prevent `^$` false match at end of file In some implementations, `regexec_buf()` assumes that it is fed lines; Without `REG_NOTEOL` it thinks the end of the buffer is the end of a line. Which makes sense, but trips up this case because we are not feeding lines, but rather a whole buffer. So the final newline is not the start of an empty line, but the true end of the buffer. This causes an interesting bug: $ echo content >file.txt $ git grep --no-index -n '^$' file.txt file.txt:2: This bug is fixed by making the end of the buffer consistently the end of the final line. The patch was applied from https://lore.kernel.org/git/20250113062601.GD767856@coredump.intra.peff.net/ Reported-by: Olly Betts Signed-off-by: Johannes Schindelin --- grep.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/grep.c b/grep.c index 4e155ee9e66367..c4bb9f10814024 100644 --- a/grep.c +++ b/grep.c @@ -1646,6 +1646,8 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle bol = gs->buf; left = gs->size; + if (left && gs->buf[left-1] == '\n') + left--; while (left) { const char *eol; int hit; From 86d0c304264527fd26d7f1d29bab83c6e566d947 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 21 Jan 2025 16:52:35 -0500 Subject: [PATCH 27/28] update-ref: do set reflog's `old_oid` In git 2.48.1, the `git update-ref` subcommand no longer correctly updates the reflog in some cases. Specifically, it appears that the `old_oid` field will not be updated when modifying a branch referenced by another symbolic ref (e.g. HEAD). This doesn't break the `git reflog` subcommand, but does break references like `HEAD@{1}`, which appear to read the `old_oid` field: git init -b main git commit --allow-empty -m "A" git commit --allow-empty -m "B" git update-ref -m "reason" refs/heads/main HEAD~ HEAD The `old_oid` field is now empty (all zeroes). This is only the case in derived reflogs (in this case .git/logs/HEAD). The reflog for `refs/heads/main` appears to be updated correctly. This was broken in 297c09eabb (refs: allow multiple reflog entries for the same refname, 2024-12-16). The reason for that was that there was assumed the flow of `lock_ref_for_update()` for reflog only updates was to capture the lock only. But this is wrong since this misses the `old_oid` population. As such this patch is the correct fix. Reported-by: Nika Layzell Acked-by: Karthik Nayak Signed-off-by: Johannes Schindelin --- refs/files-backend.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 5cfb8b7ca8678e..29f08dced40418 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2615,9 +2615,6 @@ static int lock_ref_for_update(struct files_ref_store *refs, update->backend_data = lock; - if (update->flags & REF_LOG_ONLY) - goto out; - if (update->type & REF_ISSYMREF) { if (update->flags & REF_NO_DEREF) { /* From 290ad15c95ffece76bebd729edc5a929259a2a74 Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Wed, 15 Jan 2025 11:54:51 +0000 Subject: [PATCH 28/28] fixup! reftable: write correct max_update_index to header The original commit was missing some initializations. This lead to the somewhat intuitive (and not reliably reproducible, until the trick was found to use `sanitize=address,undefined`) symptom that t1400.249 and/or t2400.171 failed with: Assertion failed: (ret != REFTABLE_API_ERROR), function reftable_be_transaction_finish, file reftable-backend.c, line 1648. or Assertion failed: (ret != REFTABLE_API_ERROR), function write_transaction_table, file reftable-backend.c, line 1619. Signed-off-by: Karthik Nayak Signed-off-by: Johannes Schindelin --- refs/reftable-backend.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 9cfb0cb26721a9..d26b5bf85c9b2c 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -1020,6 +1020,7 @@ static int prepare_transaction_update(struct write_transaction_table_arg **out, arg->updates_nr = 0; arg->updates_alloc = 0; arg->updates_expected = 0; + arg->max_index = 0; } arg->updates_expected++; @@ -1634,6 +1635,8 @@ static int reftable_be_transaction_finish(struct ref_store *ref_store UNUSED, tx_data->args->max_index = transaction->max_index; for (size_t i = 0; i < tx_data->args_nr; i++) { + tx_data->args[i].max_index = transaction->max_index; + ret = reftable_addition_add(tx_data->args[i].addition, write_transaction_table, &tx_data->args[i]); if (ret < 0)