prototype(protobuf): two-pass size estimator (+6.28x encode, -54% memory) #3
Closed
intech wants to merge 1 commit into feat/add-benchmark-suite from
Conversation
Adds toBinaryFast() — opt-in fast path using two-pass size estimation and
a pre-allocated buffer for write. Ports the pattern from
open-telemetry/opentelemetry-js#6390 (ProtobufLogsSerializer) to the
protobuf-es reflective encode.

Motivation
----------
The existing toBinary uses BinaryWriter with fork/join per length-
delimited field — every nested message and every packed repeated scalar
pushes chunk/buf state onto a stack, serializes into its own chunk list,
then re-emits a varint length prefix and concatenates. On OTel-shaped
workloads (ResourceSpans -> ScopeSpans -> Span -> KeyValue) that produces
many small Uint8Array/number[] allocations and a final double-copy in
finish().

The two-pass variant walks the descriptor once to compute the exact
encoded size, allocates a single Uint8Array of that size, then writes
bytes into it at fixed offsets. Length prefixes computed in pass 1 are
cached per submessage object and reused in pass 2, so pass 2 is a
straight-line write loop.

Results (Node 25.8.1, x86_64, tinybench, OTel-like 100-span payload)
--------------------------------------------------------------------
create() + toBinary() combined workload:

  create + toBinary        353 ops/s   baseline
  create + toBinaryFast   1758 ops/s   +397% (4.98x)

toBinary() on pre-built message:

  toBinary                 385 ops/s   baseline
  toBinaryFast            2417 ops/s   +528% (6.28x)

Cross-library (vs protobufjs pbjs static-module):

  protobuf-es toBinary     pre-built    428 ops/s
  protobuf-es toBinaryFast pre-built   3868 ops/s
  protobufjs  encode       pre-built   3259 ops/s

-> toBinaryFast beats protobufjs by +19% on encode path.

Memory (1000 iters, forced GC, heapUsed delta):

  protobuf-es create + toBinary       10,211 B/op
  protobuf-es create + toBinaryFast    4,670 B/op   -54%
  protobufjs  create + encode          7,450 B/op

-> toBinaryFast now uses less heap than protobufjs.

Scope (MVP)
-----------
Supported: all 15 scalar types, enum, repeated scalar (packed and
unpacked), nested messages, repeated messages.

Correctness verified with semantic round-trip (decode(toBinaryFast)
structurally equal to decode(toBinary)) on the OTel ExportTraceRequest
fixture and on SimpleMessage; both fixtures in fact produce
byte-identical output in the current code path.

Fallback: schemas using maps, oneofs, extensions, or delimited/group
encoding fall back to toBinary. The decision is cached per DescMessage
in a WeakMap, so the support check does not dominate the hot path after
the first call.

Unknown fields are dropped by the fast path. Callers that must
round-trip unknown fields should continue to use toBinary.

Testing
-------
- Existing protobuf-test suite: 2823/2823 passing.
- Correctness verification: benchmarks/src/verify-correctness.ts
  exercises ExportTraceRequest and SimpleMessage fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
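For readers unfamiliar with the pattern, here is a minimal sketch of the two-pass idea over a hypothetical two-level schema (Outer containing repeated Inner). It is not the PR's implementation (the real code walks protobuf-es descriptors reflectively and covers all supported field kinds), but it shows the size pass, the per-submessage size cache, and the straight-line write pass into one pre-allocated buffer.

```ts
// Minimal two-pass sketch over a hand-rolled schema; values are assumed
// to be non-negative 32-bit ints.
interface Inner { value: number }   // field 1, varint
interface Outer { inner: Inner[] }  // field 1, length-delimited

function varintSize(n: number): number {
  let size = 1;
  while (n >= 0x80) { n >>>= 7; size++; }
  return size;
}

// Pass 1: compute exact encoded sizes, caching per submessage object so
// pass 2 can emit length prefixes without re-measuring.
function sizeInner(msg: Inner, cache: Map<object, number>): number {
  const size = 1 /* tag */ + varintSize(msg.value);
  cache.set(msg, size);
  return size;
}

function sizeOuter(msg: Outer, cache: Map<object, number>): number {
  let size = 0;
  for (const inner of msg.inner) {
    const innerSize = sizeInner(inner, cache);
    size += 1 /* tag */ + varintSize(innerSize) + innerSize;
  }
  cache.set(msg, size);
  return size;
}

// Pass 2: straight-line writes into one pre-allocated buffer.
function writeVarint(buf: Uint8Array, pos: number, n: number): number {
  while (n >= 0x80) { buf[pos++] = (n & 0x7f) | 0x80; n >>>= 7; }
  buf[pos++] = n;
  return pos;
}

function encodeOuter(msg: Outer): Uint8Array {
  const cache = new Map<object, number>();
  const buf = new Uint8Array(sizeOuter(msg, cache));
  let pos = 0;
  for (const inner of msg.inner) {
    pos = writeVarint(buf, pos, (1 << 3) | 2);       // Outer.inner tag, wire type 2
    pos = writeVarint(buf, pos, cache.get(inner)!);  // cached length prefix
    pos = writeVarint(buf, pos, (1 << 3) | 0);       // Inner.value tag, varint
    pos = writeVarint(buf, pos, inner.value);
  }
  return buf;
}
```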
Summary

Opt-in fast path toBinaryFast using two-pass size estimation and a
pre-allocated buffer. Ports opentelemetry-js#6390 (ProtobufLogsSerializer)
to the protobuf-es reflective encode.
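A usage sketch, assuming toBinaryFast mirrors the toBinary(schema, message) signature and is exported next to it; the schema name and import paths below are placeholders, not names from this PR.

```ts
import { create, toBinary } from "@bufbuild/protobuf";
import { toBinaryFast } from "@bufbuild/protobuf";            // assumed export location
import { ExportTraceRequestSchema } from "./gen/fixtures_pb.js"; // placeholder

const msg = create(ExportTraceRequestSchema, {
  /* ...100-span OTel-like payload... */
});

// Existing path, unchanged:
const bytes = toBinary(ExportTraceRequestSchema, msg);

// Opt-in fast path; unsupported schemas fall back to toBinary internally:
const fastBytes = toBinaryFast(ExportTraceRequestSchema, msg);
```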
Hypothesis

Existing reflective toBinary uses BinaryWriter with fork/join for every
length-delimited field — each nested message and every packed repeated
scalar stacks its own chunks: Uint8Array[] and buf: number[], then
re-emits a varint length prefix and concatenates. On OTel-shaped
workloads (ResourceSpans -> ScopeSpans -> Span -> KeyValue) that produces
many small allocations and a final double-copy in finish(). The two-pass
variant (size estimate -> pre-alloc -> single straight-line write) should
eliminate this.
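A simplified illustration of the fork/join buffering pattern described above; this is not protobuf-es's actual BinaryWriter, just the shape of the cost: each nested length-delimited field buffers its bytes in a fresh chunk list so the length prefix can be emitted before them, and finish() concatenates everything in a final copy.

```ts
// Simplified fork/join writer: every nested field fork() allocates fresh
// chunk/buf arrays, and join() copies the child bytes back behind a varint
// length prefix.
class ForkJoinWriter {
  private chunks: Uint8Array[] = [];
  private buf: number[] = [];
  private stack: { chunks: Uint8Array[]; buf: number[] }[] = [];

  varint(n: number): this {
    while (n >= 0x80) { this.buf.push((n & 0x7f) | 0x80); n >>>= 7; }
    this.buf.push(n);
    return this;
  }

  // Start a nested length-delimited field: save parent state, begin fresh buffers.
  fork(): this {
    this.stack.push({ chunks: this.chunks, buf: this.buf });
    this.chunks = [];
    this.buf = [];
    return this;
  }

  // Finish the nested field: measure the child bytes, restore the parent,
  // then emit length prefix + child bytes into the parent's chunk list.
  join(): this {
    const child = this.finish();
    const parent = this.stack.pop()!;
    this.chunks = parent.chunks;
    this.buf = parent.buf;
    this.varint(child.length);
    this.chunks.push(new Uint8Array(this.buf), child);
    this.buf = [];
    return this;
  }

  // Concatenate all chunks into one Uint8Array (the "final double-copy").
  finish(): Uint8Array {
    if (this.buf.length) {
      this.chunks.push(new Uint8Array(this.buf));
      this.buf = [];
    }
    let len = 0;
    for (const c of this.chunks) len += c.length;
    const out = new Uint8Array(len);
    let pos = 0;
    for (const c of this.chunks) { out.set(c, pos); pos += c.length; }
    this.chunks = [];
    return out;
  }
}
```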
Results
Node 25.8.1, x86_64, tinybench, 100-span OTel-like payload.
Throughput (create + encode combined):

  toBinary        353 ops/s   baseline
  toBinaryFast   1758 ops/s   +397% (4.98x)

Throughput (encode-only, pre-built message):

  toBinary        385 ops/s   baseline
  toBinaryFast   2417 ops/s   +528% (6.28x)

Cross-library (encode-only, pre-built):

  protobuf-es toBinary       428 ops/s
  protobufjs  encode        3259 ops/s
  protobuf-es toBinaryFast  3868 ops/s

Fast path now beats protobufjs by +19% on the encode-only path for
this workload.
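For reference, a sketch of how such a throughput comparison can be set up with tinybench; the actual harness lives in benchmarks/, and the toBinaryFast export location and fixture helpers (buildOtelFixture, ExportTraceRequestSchema) are assumptions here.

```ts
import { Bench } from "tinybench";
import { toBinary } from "@bufbuild/protobuf";
import { toBinaryFast } from "@bufbuild/protobuf";                // assumed export
import { buildOtelFixture, ExportTraceRequestSchema } from "./fixtures.js"; // placeholder

const msg = buildOtelFixture(100); // pre-built 100-span OTel-like payload

const bench = new Bench({ time: 1000 });
bench
  .add("toBinary (pre-built)", () => {
    toBinary(ExportTraceRequestSchema, msg);
  })
  .add("toBinaryFast (pre-built)", () => {
    toBinaryFast(ExportTraceRequestSchema, msg);
  });

await bench.run();
console.table(bench.table());
```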
Memory (1000 iters, forced GC, heapUsed delta):
  create + toBinary (protobuf-es)       10,211 B/op
  create + toBinaryFast (protobuf-es)    4,670 B/op   (-54%)
  create + encode (protobufjs)           7,450 B/op

Fast path uses less heap than protobufjs on the same workload.
Scope (MVP)
Supported: all 15 scalar types, enum, repeated scalar (packed and
unpacked), nested messages, repeated messages.

Fallback to toBinary when the schema uses: maps, oneofs, extensions, or
delimited/group encoding. The support decision is cached per DescMessage
in a WeakMap, so the fallback check does not dominate the hot path after
the first call per schema.
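A sketch of the WeakMap-cached support check under an assumed helper name (isFastEncodable); the descriptor walk below only shows the map/oneof cases, does not recurse into nested message fields, and uses protobuf-es v2 descriptor properties (desc.fields, fieldKind, oneof) on the assumption that they match this fork.

```ts
import type { DescMessage } from "@bufbuild/protobuf";

// Cached once per schema, so repeated encodes pay only a WeakMap lookup.
const supportCache = new WeakMap<DescMessage, boolean>();

function isFastEncodable(desc: DescMessage): boolean {
  const cached = supportCache.get(desc);
  if (cached !== undefined) {
    return cached;
  }
  let supported = true;
  for (const field of desc.fields) {
    // Real check also covers extensions and delimited/group encoding,
    // and recurses into nested message schemas.
    if (field.fieldKind === "map" || field.oneof !== undefined) {
      supported = false;
      break;
    }
  }
  supportCache.set(desc, supported);
  return supported;
}
```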
Unknown fields are dropped by the fast path. Callers that must round-trip
unknowns should continue to use toBinary.

Correctness
benchmarks/src/verify-correctness.ts exercises the OTel ExportTraceRequest
fixture and the SimpleMessage fixture. Both produce byte-identical output
compared to toBinary (stricter than the claimed semantic-identity guarantee).

@bufbuild/protobuf-test suite: 2,823 / 2,823 passing.
decode(toBinaryFast(msg)) structurally equal to decode(toBinary(msg)) on
OTel fixtures.
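A sketch of the round-trip check in the spirit of verify-correctness.ts, using public protobuf-es v2 APIs (toBinary, fromBinary, equals); the toBinaryFast export location and the fixture module are assumptions.

```ts
import { equals, fromBinary, toBinary } from "@bufbuild/protobuf";
import { toBinaryFast } from "@bufbuild/protobuf";              // assumed export
import { buildFixture, ExportTraceRequestSchema } from "./fixtures.js"; // placeholder

const msg = buildFixture();

const slowBytes = toBinary(ExportTraceRequestSchema, msg);
const fastBytes = toBinaryFast(ExportTraceRequestSchema, msg);

// Semantic round-trip: both encodings decode to structurally equal messages.
const slowDecoded = fromBinary(ExportTraceRequestSchema, slowBytes);
const fastDecoded = fromBinary(ExportTraceRequestSchema, fastBytes);
if (!equals(ExportTraceRequestSchema, slowDecoded, fastDecoded)) {
  throw new Error("fast path diverges from toBinary");
}

// Stricter property that currently also holds on these fixtures: byte identity.
if (Buffer.compare(Buffer.from(slowBytes), Buffer.from(fastBytes)) !== 0) {
  console.warn("outputs are semantically equal but not byte-identical");
}
```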
Trade-offs
toBinary is untouched — no behaviour change for existing callers.

On flat messages the combined create + toBinaryFast path is only ~1.4x
faster than baseline (vs ~5x on the nested OTel payload) because the
estimator walk is a larger share of the budget when there is little to
serialize.

Schemas with many unsupported fields will not benefit.
Scope of this PR
Internal PR within the Connectum-Framework/protobuf-es fork. Upstream
submission to bufbuild/protobuf-es is gated on further review; the point
of this stacked PR is to measure the pattern and decide whether to fold
it into the default toBinary path or keep it as a separate export.
Test plan
Existing test suite (@bufbuild/protobuf-test): 2,823 / 2,823 passing.
decode(toBinaryFast(x)) == decode(toBinary(x)) on OTel and SimpleMessage
fixtures.
Base: feat/add-benchmark-suite (stacked on the benchmark infrastructure PR).

Generated with Claude Code