Skip to content

fix git remote performance on dolt push#10538

Merged
coffeegoddd merged 5 commits intomainfrom
db/perf
Feb 20, 2026
Merged

fix git remote performance on dolt push#10538
coffeegoddd merged 5 commits intomainfrom
db/perf

Conversation

@coffeegoddd
Copy link
Copy Markdown
Contributor

Prior to this change pushes that would take ~6s to DoltHub took ~70s to a Git remote. This pr removes fetch for non-manifest keys that are already in the cache on the read path which brings a push to a git remote down to around ~17s. One the write path, tablefiles are now only written to the remote at the same time the manifest is written, in a single commit and push call. This change brings the performance down to about ~8s in my testing. This optimization doesn't change the Blobstore interface, but does introduce a risk for a caller-side bug, by incorrectly calling Put(tf1), CheckAndPut(manifest_referencing_tf1_and_tf2), Put(tf2). This changes the visibility contract that a Put is already on the remote.

Also, fixes git fetch failures caused by stale remote-tracking refs: add --refmap="" so fetch only processes our explicit refspec and ignores the remote's default fetch refspecs, which can fail with ref directory/file conflicts unrelated to our ref.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes GitBlobstore performance when pushing to a Git remote by avoiding unnecessary fetches for cached non-manifest keys, batching non-manifest writes to push alongside the manifest in a single commit/push, and making git fetch ignore the remote’s default refspecs to avoid failures from stale tracking refs.

Changes:

  • Add --refmap= to git fetch to prevent processing the remote’s configured fetch refspecs and avoid ref conflicts.
  • Defer non-manifest Put/Concatenate remote writes and flush them during CheckAndPut("manifest") as a single commit+push.
  • Update tests to account for deferred non-manifest writes by explicitly flushing via CheckAndPut("manifest").

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
go/store/blobstore/internal/git/impl.go Adjusts FetchRef to use --refmap= so fetch only applies the explicit refspec.
go/store/blobstore/git_blobstore.go Implements deferred non-manifest writes, cache-first read behavior for non-manifest keys, and manifest-flush batching.
go/store/blobstore/git_blobstore_test.go Updates remote-managed tests to flush deferred writes via manifest CheckAndPut.
go/store/blobstore/git_blobstore_chunked_put_test.go Updates chunked put tests for deferred write behavior and remote verification after flush.
go/store/blobstore/git_blobstore_helpers_test.go Removes helper tests for functions deleted/refactored by the read-path changes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@coffeegoddd
Copy link
Copy Markdown
Contributor Author

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
d718e9d ok 5937471
version total_tests
d718e9d 5937471
correctness_percentage
100.0

@coffeegoddd
Copy link
Copy Markdown
Contributor Author

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
56e9b56 ok 5937471
version total_tests
56e9b56 5937471
correctness_percentage
100.0

Copy link
Copy Markdown
Contributor

@macneale4 macneale4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, and a question about removed tests. If the tests were moved because they are no longer relevant, go ahead and ship. It seems like they are still useful though, unless I'm not understanding something.

return f.pushRefWithLease(ctx, remote, srcRef, dstRef, expectedDstOID)
}

func TestGitBlobstoreHelpers_resolveCommitForGet(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this no longer required? Seems like you'd still want to know when these resources aren't found.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these were for some helper methods that were removed, we have coverage elsewhere

require.Equal(t, []byte("abcdefghij"), got)

// Flush deferred writes and verify remote state.
_, err = bs.CheckAndPut(ctx, "", "manifest", 3, bytes.NewReader([]byte("m1\n")))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit. the use of m1\n makes is look like this is a magic string, when I think it's just like saying foobar\n - so it's clear to people that it's nonsense data input.

@coffeegoddd
Copy link
Copy Markdown
Contributor Author

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
93af7ac ok 5937471
version total_tests
93af7ac 5937471
correctness_percentage
100.0

@coffeegoddd
Copy link
Copy Markdown
Contributor Author

@coffeegoddd DOLT

test_name from_latency_p95 to_latency_p95 percent_change
tpcc-scale-factor-1 61.08 61.08 0.0
test_name from_server_name from_server_version from_tps to_server_name to_server_version to_tps percent_change
tpcc-scale-factor-1 dolt 72f81a6 37.78 dolt 93af7ac 37.68 -0.26

@coffeegoddd coffeegoddd merged commit a6b115f into main Feb 20, 2026
25 checks passed
@coffeegoddd coffeegoddd deleted the db/perf branch February 20, 2026 23:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants