backfill: add --batch-size=<n> option
Users may want to specify a minimum batch size for their needs. This is only
a minimum: the path-walk API provides a list of OIDs that correspond to the
same path, and thus it is optimal to allow delta compression across those
objects in a single server request.

We could consider limiting the request to have a maximum batch size in the
future.

Signed-off-by: Derrick Stolee <[email protected]>
derrickstolee authored and dscho committed Jan 11, 2025
1 parent ebd1692 commit 6bbc831
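
The "minimum, not maximum" behavior described in the commit message can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions, not the code in builtin/backfill.c: the function name `simulate_batches` and its parameters are hypothetical, and it assumes that all missing objects for one path are queued together and that a request is sent only between paths, once the pending count has reached the minimum.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch of minimum-batch-size behavior.
 *
 * path_counts[i] is the number of missing objects found at path i.
 * The size of each simulated request is written to batches[], which
 * must have room for npaths entries. Returns the number of requests.
 */
size_t simulate_batches(const size_t *path_counts, size_t npaths,
			size_t min_batch, size_t *batches)
{
	size_t pending = 0, nr = 0;

	assert(min_batch > 0);

	for (size_t i = 0; i < npaths; i++) {
		/* All objects at one path are queued together (assumption). */
		pending += path_counts[i];

		/* Flush only between paths, once the minimum is reached. */
		if (pending >= min_batch) {
			batches[nr++] = pending;
			pending = 0;
		}
	}

	/* Request whatever is left once the walk is finished. */
	if (pending)
		batches[nr++] = pending;

	return nr;
}
```

With twelve paths of four objects each and a minimum of 20, this produces requests of 20, 20, and 8 objects, the same counts the new test checks for; the four-per-path layout is an assumption made here, not something stated in the test. With path counts of 15, 15, and 3, the first request holds 30 objects, which shows how the last set of objects at a path can push a batch past the minimum.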
Showing 3 changed files with 30 additions and 2 deletions.
10 changes: 9 additions & 1 deletion Documentation/git-backfill.txt
@@ -9,7 +9,7 @@ git-backfill - Download missing objects in a partial clone
SYNOPSIS
--------
[verse]
'git backfill' [<options>]
'git backfill' [--batch-size=<n>]

DESCRIPTION
-----------
@@ -38,6 +38,14 @@ delta compression in the packfile sent by the server.
By default, `git backfill` downloads all blobs reachable from the `HEAD`
commit. This set can be restricted or expanded using various options.

OPTIONS
-------

--batch-size=<n>::
Specify a minimum size for a batch of missing objects to request
from the server. This size may be exceeded by the last set of
blobs seen at a given path. Default batch size is 50,000.

SEE ALSO
--------
linkgit:git-clone[1].
4 changes: 3 additions & 1 deletion builtin/backfill.c
@@ -21,7 +21,7 @@
#include "path-walk.h"

static const char * const builtin_backfill_usage[] = {
N_("git backfill [<options>]"),
N_("git backfill [--batch-size=<n>]"),
NULL
};

@@ -113,6 +113,8 @@ int cmd_backfill(int argc, const char **argv, const char *prefix, struct reposit
.batch_size = 50000,
};
struct option options[] = {
OPT_INTEGER(0, "batch-size", &ctx.batch_size,
N_("Minimum number of objects to request at a time")),
OPT_END(),
};

18 changes: 18 additions & 0 deletions t/t5620-backfill.sh
@@ -59,6 +59,24 @@ test_expect_success 'do partial clone 1, backfill gets all objects' '
test_line_count = 0 revs2
'

test_expect_success 'do partial clone 2, backfill batch size' '
git clone --no-checkout --filter=blob:none \
--single-branch --branch=main \
"file://$(pwd)/srv.bare" backfill2 &&
GIT_TRACE2_EVENT="$(pwd)/batch-trace" git \
-C backfill2 backfill --batch-size=20 &&
# Batches were used
test_trace2_data promisor fetch_count 20 <batch-trace >matches &&
test_line_count = 2 matches &&
test_trace2_data promisor fetch_count 8 <batch-trace &&
# No more missing objects!
git -C backfill2 rev-list --quiet --objects --missing=print HEAD >revs2 &&
test_line_count = 0 revs2
'

. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
