prototype(protobuf): extend toBinaryFast with map + oneof (+ real OTel fixture) #4

Closed
intech wants to merge 2 commits into feat/prototype-size-estimator from feat/prototype-estimator-map-oneof


intech commented Apr 19, 2026

Summary

  • Extends toBinaryFast to support map fields (all legal key and value types) and oneof groups (scalar/message/enum cases).
  • Only proto2 delimited/group encoding still falls back to reflective toBinary.
  • Updates the benchmark fixture to the real OTel shape: AnyValue oneof under KeyValue, map<string,string> on Resource.labels.
  • Adds 16 focused parity tests asserting byte-identical output against toBinary for every new feature surface, including the tricky zero-valued oneof case.

Supported field types after this PR

Scalars (all 15 types), enums, nested messages, repeated scalar (packed + unpacked), repeated message, map (any legal K/V), oneof groups. Only delimited (group) encoding falls through to toBinary.

Correctness

  • Byte-identical round-trip verified on extended fixture (32,926-byte OTel payload) via verify-correctness.ts.
  • 2,823 existing tests pass; 16 new tests (to-binary-fast.test.ts) cover map<string,*>, map<int32,*>, map<int64,*>, map<bool,string>, map<*,message>, map<*,enum>, empty maps, oneof scalar/message/enum cases, zero-valued oneof, and a full-scalar regression.

Results on full-shape fixture (Node 25.8, 100 spans, OTel-like)

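| Variant                        | ops/s | bytes/op |
|--------------------------------|-------|----------|
| create+toBinary (reflective)   |  436  |  21,465  |
| create+toBinaryFast            |  455  |  19,501  |
| protobufjs create+encode       | 2,570 |  47,457  |

| Pre-built encode              | ops/s |
|-------------------------------|-------|
| toBinary                      |  488  |
| toBinaryFast                  |  494  |
| protobufjs                    | 2,689 |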

The full-shape fixture exposes per-entry map dispatch and oneof walk overhead that the simpler pre-H2 fixture didn't. The fast path still beats the reflective path on throughput (455 vs 436 ops/s with create, 494 vs 488 ops/s pre-built) and on allocation (19,501 vs 21,465 bytes/op). The remaining ~5x gap vs protobufjs is the next hypothesis to test (per-schema codegen vs two-pass reflective walk).

Test plan

  • byte-identical round-trip on extended fixture (AnyValue + map)
  • 2,823 existing tests pass, 16 new tests pass
  • benchmarks runnable and produce stable numbers
  • lint clean (biome --error-on-warnings)
  • typecheck clean (packages/protobuf + benchmarks)

Scope

Internal PR within the Connectum-Framework fork, stacked on #3 (toBinaryFast H2 prototype). Upstream submission to bufbuild/protobuf-es is gated on user approval.

Removes map fields and oneof groups from the fast-path fallback blacklist
and encodes both directly using the same two-pass size-estimate-then-write
pattern as the rest of the fast path. The only remaining fallback to the
reflective toBinary is proto2 delimited (group) encoding.
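
To make the two-pass size-estimate-then-write pattern concrete, here is a
minimal sketch. It is not the PR's code: the `TreeNode` shape, its field
numbers, and every function name are assumptions chosen for illustration,
and it assumes proto3-style implicit presence (a zero `id` is omitted).

```typescript
// Hypothetical message: field 1 is a uint32 varint, field 2 is a repeated
// nested message (length-delimited). Not a type from the PR.
interface TreeNode {
  id: number;
  children: TreeNode[];
}

// Bytes needed to encode a non-negative integer below 2^32 as a varint.
function varintSize(value: number): number {
  let size = 1;
  while (value >= 0x80) {
    value = Math.floor(value / 128);
    size++;
  }
  return size;
}

// Pass 1: compute the body size of every message and cache it by identity.
function computeSize(node: TreeNode, sizes: Map<TreeNode, number>): number {
  let size = 0;
  if (node.id !== 0) {
    size += 1 + varintSize(node.id); // tag byte + varint value
  }
  for (const child of node.children) {
    const childSize = computeSize(child, sizes);
    size += 1 + varintSize(childSize) + childSize; // tag + length + body
  }
  sizes.set(node, size);
  return size;
}

function writeVarint(buf: Uint8Array, pos: number, value: number): number {
  while (value >= 0x80) {
    buf[pos++] = (value % 128) | 0x80;
    value = Math.floor(value / 128);
  }
  buf[pos++] = value;
  return pos;
}

// Pass 2: write into the preallocated buffer, reading length prefixes
// from the cache instead of recomputing them.
function writeNode(
  node: TreeNode,
  sizes: Map<TreeNode, number>,
  buf: Uint8Array,
  pos: number,
): number {
  if (node.id !== 0) {
    buf[pos++] = 0x08; // field 1, wire type 0 (varint)
    pos = writeVarint(buf, pos, node.id);
  }
  for (const child of node.children) {
    buf[pos++] = 0x12; // field 2, wire type 2 (length-delimited)
    pos = writeVarint(buf, pos, sizes.get(child)!);
    pos = writeNode(child, sizes, buf, pos);
  }
  return pos;
}

function encodeTwoPass(root: TreeNode): Uint8Array {
  const sizes = new Map<TreeNode, number>();
  const buf = new Uint8Array(computeSize(root, sizes)); // exact size, one allocation
  writeNode(root, sizes, buf, 0);
  return buf;
}
```

Because every nested length is known before the first byte is written,
the writer never has to back-patch length prefixes or grow the buffer.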

Map fields iterate Object.keys on the runtime plain-object representation
and parse integer/bool string keys back to their typed value before
running the scalar-size / scalar-write helpers. Entry bodies are not
cached: recomputing key+value size per entry is cheap, because scalar
sizes are trivial to recompute and the submessage branch already has a
cached size in the sizes map.
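
The key round-trip described above can be sketched as follows. This is an
illustration under assumptions, not the PR's helper: `MapKeyKind`,
`parseMapKey`, and `typedEntries` are hypothetical names, and the four
kinds stand in for the full set of legal map key types.

```typescript
// Hypothetical key kinds; a real descriptor distinguishes more scalar types.
type MapKeyKind = "string" | "bool" | "int32" | "int64";
type TypedMapKey = string | boolean | number | bigint;

// Plain-object maps only have string keys, so non-string key types must be
// parsed back to their typed value before sizing or writing.
function parseMapKey(key: string, kind: MapKeyKind): TypedMapKey {
  if (kind === "bool") return key === "true";
  if (kind === "int32") return Number.parseInt(key, 10);
  if (kind === "int64") return BigInt(key); // string parse keeps full 64-bit precision
  return key;
}

// Walk the map with Object.keys and pair each typed key with its value.
function typedEntries<V>(
  map: Record<string, V>,
  kind: MapKeyKind,
): Array<[TypedMapKey, V]> {
  return Object.keys(map).map(
    (key): [TypedMapKey, V] => [parseMapKey(key, kind), map[key] as V],
  );
}
```

For example, `typedEntries({ "7": "a", "42": "b" }, "int32")` yields the
pairs `[7, "a"]` and `[42, "b"]`, which can then be passed to whatever
scalar-size and scalar-write helpers the encoder uses.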

Oneof groups are dispatched via desc.oneofs after the regular-field loop.
The ADT shape (message[oneof.localName] = { case, value }) is read
directly; fields with `field.oneof !== undefined` are skipped in the
regular loop so they can't be encoded twice. Crucially, zero-valued
oneof cases are always emitted because presence is carried by the
discriminator, not by the value (new test covers this).
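
A small sketch of why the zero-valued case matters, assuming a two-member
oneof. The `ValueOneof` type, its field numbers (1 and 2), and
`encodeValueOneof` are hypothetical and exist only for illustration; the
sketch also limits strings to single-byte length prefixes to stay short.

```typescript
// ADT shape for a hypothetical oneof with a string member (field 1) and a
// bool member (field 2). `case: undefined` means no member is set.
type ValueOneof =
  | { case: "stringValue"; value: string }
  | { case: "boolValue"; value: boolean }
  | { case: undefined; value?: undefined };

function encodeValueOneof(oneof: ValueOneof): Uint8Array {
  switch (oneof.case) {
    case "stringValue": {
      const utf8 = new TextEncoder().encode(oneof.value);
      if (utf8.length > 127) {
        throw new Error("sketch only handles single-byte length prefixes");
      }
      const out = new Uint8Array(2 + utf8.length);
      out[0] = 0x0a; // field 1, wire type 2 (length-delimited)
      out[1] = utf8.length; // emitted even when the string is empty
      out.set(utf8, 2);
      return out;
    }
    case "boolValue":
      // No "skip if false" check: the discriminator says the member is set.
      return Uint8Array.of(0x10, oneof.value ? 1 : 0); // field 2, wire type 0
    default:
      return new Uint8Array(0); // nothing selected, nothing written
  }
}
```

Here `{ case: "boolValue", value: false }` encodes to the two bytes
`10 00`, whereas a regular implicit-presence bool field set to `false`
would be omitted entirely. Dropping those bytes would make the decoder
see an unset oneof.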

Benchmark fixture updated to the full OTel shape so the measurements
reflect real workload:
  - KeyValue.value is now an AnyValue oneof (string / bool / int / bytes /
    double), matching opentelemetry.proto.common.v1.AnyValue
  - Resource.labels is a map<string,string>, exercising the new map path
  - fixture AnyValue distribution: mostly string, some int, some bool,
    matching what a real OTLP exporter batches

Measurements (Node 25.8, OTel 100-span full-shape fixture):

| Variant                        | ops/s | bytes/op |
|--------------------------------|-------|----------|
| create+toBinary (reflective)   |  436  |  21,465  |
| create+toBinaryFast            |  455  |  19,501  |
| protobufjs create+encode       | 2,570 |  47,457  |

| Pre-built encode              | ops/s |
|-------------------------------|-------|
| toBinary                      |  488  |
| toBinaryFast                  |  494  |
| protobufjs                    | 2,689 |

Correctness: byte-identical output verified against toBinary on the full
OTel fixture (32,926 bytes). 2,823 existing tests pass plus 16 new
tests covering every legal map K/V combination and every oneof-member
kind (scalar, message, enum) including the zero-valued case.

The throughput gap vs protobufjs on this shape (~5x) is larger than on
the simpler pre-H2 fixture. The richer shape exposes per-entry map
overhead and oneof dispatch that protobufjs amortizes in codegen. Next
hypothesis: codegen an encoder per schema so the field walk disappears
from the hot path. Tracked separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The package compiles with target ES2017; BigInt literal syntax (`0n`,
`-9007199254740993n`) requires ES2020 and triggers TS2737. Materialize
the bigint zero once at module load with a /*@__PURE__*/ annotation so
tree-shakers can drop it, and construct the 64-bit test literal via
`BigInt("...")` string parse.
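
The workaround can be sketched like this. The constant and function names
are assumptions for illustration, not the identifiers used in the commit.

```typescript
// Built once at module load via the BigInt() function, so no `0n` literal
// is needed. The PURE annotation marks the call as side-effect free, which
// lets bundlers drop the constant when nothing references it.
const BIGINT_ZERO = /*@__PURE__*/ BigInt(0);

// A value below Number.MIN_SAFE_INTEGER. Parsing from a string avoids both
// the literal syntax and the precision loss a number argument would cause.
const BELOW_MIN_SAFE = BigInt("-9007199254740993");

// Hypothetical helper: true when a 64-bit field holds its zero value.
function isZero64(value: bigint): boolean {
  return value === BIGINT_ZERO;
}
```

Passing a string matters for the test value: as a JavaScript number,
-9007199254740993 is not exactly representable, so it would be rounded
before `BigInt` ever saw it.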

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

intech commented Apr 19, 2026

Superseded by L0 (#8) + L1+L2 (#10). Kept branch as experimental reference.

@intech intech closed this Apr 19, 2026
@intech intech deleted the feat/prototype-estimator-map-oneof branch April 21, 2026 11:11