Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protobuf: add vtproto as a supplemental marshaler #5356

Merged
merged 1 commit into from
Oct 7, 2024

Conversation

jsternberg
Copy link
Collaborator

vtproto is an extra protobuf compiler that generates special methods
suffixed with VT that create typed and unrolled marshal and unmarshal
functions similar to gogo that can be used for performance sensitive
code. These extensions are optional for code to use but buildkit uses
them.

A codec is also included to utilize vtproto for grpc code. If the
package github.com/moby/buildkit/util/encoding/proto is imported then
vtproto will be used if it exists and otherwise it will use the standard
marshaling and unmarshaling methods.

@jsternberg
Copy link
Collaborator Author

I did some profiling of go proto versus vtproto as its used in this PR.

While vtproto offers some definite improvements in speed, one of the bigger improvements is through being able to reuse the byte slice properly. I found in an investigation of the main protobuf code that, even if you use merge, the implementation of unmarshal won't ever reuse that slice so you'll always have a memory allocation. In contrast, vtproto will reuse the slice if it's there and you can generate a reset method that does that using the pool feature.

I've included the current PR with the updated grpc encoding with both a cpu profile and a memory profile to show the difference. It was tested on a clean build where the build context was ~1.1 GB so there's heavy usage of the data transfer over the session.

pprof.buildkitd.cpu.goproto.pb.gz
pprof.buildkitd.cpu.vtproto.pb.gz
pprof.buildkitd.mem.goproto.pb.gz
pprof.buildkitd.mem.vtproto.pb.gz

@@ -17,7 +17,7 @@ func BenchmarkMarshalVertex(b *testing.B) {
v := sampleVertex()
for i := 0; i < b.N; i++ {
var err error
Buf, err = proto.Marshal(v)
Buf, err = v.MarshalVT()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than replace these with a specific implementation, it might be good to have a centralized Marshal function that can be switched out to use different implementations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A potential issue with this is it could mask which implementation of marshal is being used. The type switch on the message type to determine if it supports MarshalVT would also incur an extra memory allocation due to the creation of an interface type.

We have some areas in the code where we explicitly use the original protobuf implementation because we need the deterministic marshaling of the original library.

@jsternberg
Copy link
Collaborator Author

I posted the profiles with a bit of detail but forgot to expand on that in the issue itself.

The profiles can be loaded with go tool pprof and it's easiest to look at them with -http=: and use the flamegraph and look at the alloc_space sampling. The difference between the two is that the one with the standard Go library allocates a total of 4.34 GB over the course of the build while the vtproto implementation uses 1.46 GB total. This results in the garbage collector running less frequently and the cpu profile shows a total of 29 seconds of CPU use going to 25 seconds of CPU use. This change likely means that the extra time is being spent in IO since both cpu profiles were for 30 seconds.

While there's definitely a benefit to the generated code, a lot of the improvement from this particular case is caused by being able to reuse the byte slice which the standard Go version doesn't allow. That's the link I pasted earlier and there's really nothing you can do other than modify the upstream library to alleviate that problem.

vtproto is an extra protobuf compiler that generates special methods
suffixed with `VT` that create typed and unrolled marshal and unmarshal
functions similar to gogo that can be used for performance sensitive
code. These extensions are optional for code to use but buildkit uses
them.

A codec is also included to utilize vtproto for grpc code. If the
package `github.com/moby/buildkit/util/grpcutil/encoding/proto` is
imported then vtproto will be used if it exists and otherwise it will
use the standard marshaling and unmarshaling methods.

This codec has an important difference from the default codec. The
default codec will always reset messages before unmarshaling. In most
cases, this is unnecessary and is only relevant for `RecvMsg` on
streams. In most cases, if we are passing in an existing message to this
method, we want to reuse the buffers. This codec will always merge the
message when unmarshaling instead of resetting the input message.

Signed-off-by: Jonathan A. Sternberg <[email protected]>
@jsternberg jsternberg marked this pull request as ready for review October 4, 2024 20:24
@tonistiigi tonistiigi merged commit 605c469 into moby:master Oct 7, 2024
93 checks passed
@jsternberg jsternberg deleted the vtproto branch October 7, 2024 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants