forked from git/git
Hydrate missing loose objects in check_and_freshen()
Hydrate missing loose objects in check_and_freshen() when running
virtualized. Add test cases to verify read-object hook works when
running virtualized.

This hook is called in check_and_freshen() rather than
check_and_freshen_local() to make the hook work also with alternates.

Helped-by: Kevin Willford <[email protected]>
Signed-off-by: Ben Peart <[email protected]>
Showing 5 changed files with 485 additions and 16 deletions.
@@ -0,0 +1,102 @@
Read Object Process
^^^^^^^^^^^^^^^^^^^

The read-object process enables Git to read all missing blobs with a
single process invocation for the entire life of a single Git command.
This is achieved by using a packet format (pkt-line, see
technical/protocol-common.txt) based protocol over standard input and
standard output as follows. All packets, except for the "*CONTENT"
packets and the "0000" flush packet, are considered text and therefore
are terminated by a LF.

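The framing described above can be sketched in a few lines of Python.
This is only an illustration of the pkt-line layout (four hex digits
giving the total length including the four-byte header, followed by the
payload); the helper names `encode_pkt`, `encode_text` and `read_pkt`
are made up for this sketch and are not part of Git.

```python
import io

FLUSH = b"0000"  # flush packet: a bare length header of zero


def encode_pkt(payload: bytes) -> bytes:
    """Frame a payload: 4 hex digits (payload length + 4), then the payload."""
    return b"%04x" % (len(payload) + 4) + payload


def encode_text(line: str) -> bytes:
    """Text packets carry a trailing LF inside the payload."""
    return encode_pkt(line.encode() + b"\n")


def read_pkt(stream):
    """Return the payload of the next packet, or None for a flush packet."""
    header = stream.read(4)
    if len(header) == 0:
        raise EOFError("peer closed the pipe")
    if len(header) != 4:
        raise ValueError("truncated packet header")
    size = int(header, 16)
    if size == 0:
        return None
    if size <= 4:
        raise ValueError("invalid packet size")
    payload = stream.read(size - 4)
    if len(payload) != size - 4:
        raise ValueError("truncated packet payload")
    return payload


stream = io.BytesIO(encode_text("command=get") + FLUSH)
print(read_pkt(stream))  # b'command=get\n'
print(read_pkt(stream))  # None (flush packet)
```

For example, "version=1" plus its LF is 10 bytes, so the packet on the
wire is `000eversion=1` followed by the LF.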

Git starts the process when it encounters the first missing object that
needs to be retrieved. After the process is started, Git sends a welcome
message ("git-read-object-client"), a list of supported protocol version
numbers, and a flush packet. Git expects to read a welcome response
message ("git-read-object-server"), exactly one protocol version number
from the previously sent list, and a flush packet. All further
communication will be based on the selected version.

The remaining protocol description below documents "version=1". Please
note that "version=42" in the example below does not exist and is only
there to illustrate how the protocol would look with more than one
version.

After the version negotiation Git sends a list of all capabilities that
it supports and a flush packet. Git expects to read a list of desired
capabilities, which must be a subset of the supported capabilities list,
and a flush packet as response:
------------------------
packet: git> git-read-object-client
packet: git> version=1
packet: git> version=42
packet: git> 0000
packet: git< git-read-object-server
packet: git< version=1
packet: git< 0000
packet: git> capability=get
packet: git> capability=have
packet: git> capability=put
packet: git> capability=not-yet-invented
packet: git> 0000
packet: git< capability=get
packet: git< 0000
------------------------
The only supported capability in version 1 is "get".

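Seen from the process side, the negotiation above might look roughly
like the following Python sketch. It assumes the process only speaks
version 1 and only wants the "get" capability; `read_list`,
`write_list` and `handshake` are illustrative names, not Git APIs.

```python
import io


def read_list(stream):
    """Read text packets up to the next flush packet; return them without LF."""
    items = []
    while True:
        header = stream.read(4)
        if len(header) != 4:
            raise ValueError("truncated packet header")
        size = int(header, 16)
        if size == 0:  # "0000" flush packet ends the list
            return items
        if size <= 4:
            raise ValueError("invalid packet size")
        payload = stream.read(size - 4)
        if len(payload) != size - 4 or not payload.endswith(b"\n"):
            raise ValueError("bad text packet")
        items.append(payload[:-1].decode())


def write_list(stream, lines):
    """Write each line as a text packet, then a flush packet."""
    for line in lines:
        data = line.encode() + b"\n"
        stream.write(b"%04x" % (len(data) + 4) + data)
    stream.write(b"0000")
    stream.flush()


def handshake(inp, out):
    """Answer the welcome message, pick version 1 and the "get" capability."""
    hello = read_list(inp)
    if hello[:1] != ["git-read-object-client"] or "version=1" not in hello[1:]:
        raise ValueError("unexpected welcome message")
    # Reply with exactly one version taken from the list Git sent.
    write_list(out, ["git-read-object-server", "version=1"])

    offered = read_list(inp)
    if "capability=get" not in offered:
        raise ValueError("client does not offer capability=get")
    # The reply must be a subset of what Git offered.
    write_list(out, ["capability=get"])
```

Unlike a fixed sequence of expected packets, this version tolerates
extra entries such as "version=42" or "capability=not-yet-invented"
and simply ignores them.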

Afterwards Git sends a list of "key=value" pairs terminated with a flush
packet. The list will contain at least the command (based on the
supported capabilities) and the sha1 of the object to retrieve. Note
that the process must not send any response before it has received the
final flush packet.

When the process receives the "get" command, it should make the requested
object available in the Git object store and then return success. Git will
then check the object store again, find the object this time, and proceed.
------------------------
packet: git> command=get
packet: git> sha1=0a214a649e1b3d5011e14a3dc227753f2bd2be05
packet: git> 0000
------------------------

The process is expected to respond with a list of "key=value" pairs
terminated with a flush packet. If the process does not experience
problems then the list must contain a "success" status.
------------------------
packet: git< status=success
packet: git< 0000
------------------------

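A request/response exchange like the one above could be handled along
these lines. This is a hedged Python sketch: `fetch_object` is a
hypothetical callback that makes the object available in the object
store and returns a true value on success, and replying with
"status=error" to a malformed request is a choice made for this
sketch, not something the protocol description mandates.

```python
import re


def handle_request(pairs, fetch_object):
    """Turn a decoded key=value list into the list of reply lines."""
    # partition() tolerates entries without "=", which then map to "".
    request = dict(p.partition("=")[::2] for p in pairs)
    sha1 = request.get("sha1", "")
    if request.get("command") != "get":
        return ["status=error"]
    if not re.fullmatch(r"[0-9a-f]{40}", sha1):
        return ["status=error"]
    return ["status=success" if fetch_object(sha1) else "status=error"]


def frame(lines):
    """Encode reply lines as text packets followed by a flush packet."""
    out = b""
    for line in lines:
        data = line.encode() + b"\n"
        out += b"%04x" % (len(data) + 4) + data
    return out + b"0000"


reply = handle_request(
    ["command=get", "sha1=0a214a649e1b3d5011e14a3dc227753f2bd2be05"],
    lambda sha1: True,  # stand-in for a real object fetch
)
print(frame(reply))  # b'0013status=success\n0000'
```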

In case the process cannot or does not want to retrieve the object, it
is expected to respond with an "error" status.
------------------------
packet: git< status=error
packet: git< 0000
------------------------

In case the process cannot or does not want to retrieve the object or
any future object for the lifetime of the Git process, then it is
expected to respond with an "abort" status at any point in the
protocol.
------------------------
packet: git< status=abort
packet: git< 0000
------------------------

Git neither stops nor restarts the process when it reports an "error"
or "abort" status.

If the process dies during the communication or does not adhere to the
protocol then Git will stop the process and restart it with the next
object that needs to be processed.

After the read-object process has processed an object it is expected to
wait for the next "key=value" list containing a command. Git will close
the command pipe on exit. The process is expected to detect EOF and exit
gracefully on its own. Git will wait until the process has stopped.

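The wait-for-next-command loop and the EOF handling described above
might be sketched as follows in Python. The names `read_request`,
`pkt_line` and `serve` are illustrative, `fetch_object` is a
hypothetical callback that hydrates one object, and the handshake is
assumed to have completed already.

```python
import io


def pkt_line(text):
    """Encode one text packet (payload plus LF, prefixed by its length)."""
    data = text.encode() + b"\n"
    return b"%04x" % (len(data) + 4) + data


def read_request(inp):
    """Return the next key=value list, or None once Git has closed the pipe."""
    pairs = []
    while True:
        header = inp.read(4)
        if not header:
            if pairs:
                raise ValueError("EOF in the middle of a request")
            return None  # clean EOF between requests
        if len(header) != 4:
            raise ValueError("truncated packet header")
        size = int(header, 16)
        if size == 0:
            return pairs  # flush packet: request complete
        if size <= 4:
            raise ValueError("invalid packet size")
        payload = inp.read(size - 4)
        if len(payload) != size - 4 or not payload.endswith(b"\n"):
            raise ValueError("bad text packet")
        pairs.append(payload[:-1].decode())


def serve(inp, out, fetch_object):
    """Answer requests until EOF; return how many were answered."""
    served = 0
    while True:
        pairs = read_request(inp)
        if pairs is None:
            return served  # exit gracefully, as Git expects
        request = dict(p.partition("=")[::2] for p in pairs)
        ok = request.get("command") == "get" and fetch_object(request.get("sha1", ""))
        out.write(pkt_line("status=success" if ok else "status=error") + b"0000")
        out.flush()
        served += 1
```

A real process would call something like
`serve(sys.stdin.buffer, sys.stdout.buffer, fetch_object)` after the
handshake, so that the loop ends when Git closes the command pipe.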

A long running read-object process demo implementation can be found in
`contrib/long-running-read-object/example.pl` located in the Git core
repository. If you develop your own long running process then the
`GIT_TRACE_PACKET` environment variable can be very helpful for
debugging (see linkgit:git[1]).
@@ -0,0 +1,114 @@
#!/usr/bin/perl
#
# Example implementation for the Git read-object protocol version 1
# See Documentation/technical/read-object-protocol.txt
#
# Allows you to test the ability for blobs to be pulled from a host git repo
# "on demand." Called when git needs a blob it couldn't find locally due to
# a lazy clone that only cloned the commits and trees.
#
# A lazy clone can be simulated via the following commands from the host repo
# you wish to create a lazy clone of:
#
# cd /host_repo
# git rev-parse HEAD
# git init /guest_repo
# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
#     cut -d' ' -f1 | git pack-objects /guest_repo/.git/objects/pack/noblobs
# cd /guest_repo
# git config core.virtualizeobjects true
# git reset --hard <sha from rev-parse call above>
#
# Please note, this sample is a minimal skeleton. No proper error handling
# was implemented.
#

use strict;
use warnings;
use IO::Handle;    # provides STDOUT->flush() on older perls

#
# Point $DIR to the folder where your host git repo is located so we can pull
# missing objects from it
#
my $DIR = "/host_repo/.git/";


# Read one raw packet. Returns ( 1, "" ) for a flush packet and
# ( 0, $payload ) otherwise.
sub packet_bin_read {
    my $buffer;
    my $bytes_read = read STDIN, $buffer, 4;
    if ( $bytes_read == 0 ) {
        # EOF - Git stopped talking to us!
        exit();
    }
    elsif ( $bytes_read != 4 ) {
        die "invalid packet: '$buffer'";
    }
    my $pkt_size = hex($buffer);
    if ( $pkt_size == 0 ) {
        return ( 1, "" );
    }
    elsif ( $pkt_size > 4 ) {
        my $content_size = $pkt_size - 4;
        $bytes_read = read STDIN, $buffer, $content_size;
        if ( $bytes_read != $content_size ) {
            die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
        }
        return ( 0, $buffer );
    }
    else {
        die "invalid packet size: $pkt_size";
    }
}

# Read one text packet and strip its mandatory trailing LF.
sub packet_txt_read {
    my ( $res, $buf ) = packet_bin_read();
    unless ( $buf =~ s/\n$// ) {
        die "A non-binary line MUST be terminated by an LF.";
    }
    return ( $res, $buf );
}

sub packet_bin_write {
    my $buf = shift;
    print STDOUT sprintf( "%04x", length($buf) + 4 );
    print STDOUT $buf;
    STDOUT->flush();
}

sub packet_txt_write {
    packet_bin_write( $_[0] . "\n" );
}

sub packet_flush {
    print STDOUT sprintf( "%04x", 0 );
    STDOUT->flush();
}


# Read one text packet and die with $msg unless it is exactly $want.
# The flush flag and the content are checked explicitly; comparing the
# returned lists with "eq" would only compare their last elements.
sub expect_txt {
    my ( $want, $msg ) = @_;
    my ( $res, $buf ) = packet_txt_read();
    ( $res == 0 && $buf eq $want ) || die $msg;
}

# Read one packet and die with $msg unless it is a flush packet.
sub expect_flush {
    my ($msg) = @_;
    my ( $res, $buf ) = packet_bin_read();
    ( $res == 1 && $buf eq "" ) || die $msg;
}

expect_txt( "git-read-object-client", "bad initialize" );
expect_txt( "version=1", "bad version" );
expect_flush("bad version end");

packet_txt_write("git-read-object-server");
packet_txt_write("version=1");
packet_flush();

expect_txt( "capability=get", "bad capability" );
expect_flush("bad capability end");

packet_txt_write("capability=get");
packet_flush();

while (1) {
    my ( $res, $buf ) = packet_txt_read();
    my ($command) = $buf =~ /^command=([^=]+)$/;
    defined($command) or die "bad command line '$buf'";

    if ( $command eq "get" ) {
        ( $res, $buf ) = packet_txt_read();
        my ($sha1) = $buf =~ /^sha1=([0-9a-f]{40})$/;
        defined($sha1) or die "bad sha1 line '$buf'";
        expect_flush("bad request end");

        # Copy the blob from the host repo into the local object store.
        system ('git --git-dir="' . $DIR . '" cat-file blob ' . $sha1 . ' | git -c core.virtualizeobjects=false hash-object -w --stdin >/dev/null 2>&1');
        packet_txt_write(($?) ? "status=error" : "status=success");
        packet_flush();
    } else {
        die "bad command '$command'";
    }
}