
Add L1 schema plan and L2 specialized writer #10

Merged
intech merged 2 commits into feat/l0-contiguous-writer from
feat/l1-l2-schema-plans
Apr 19, 2026

Conversation


@intech intech commented Apr 19, 2026

Summary

Replaces reflective encoding with a compiled-plan interpreter on the
toBinaryFast entry point. Each DescMessage is compiled once into a
flat Int32Array opcode stream plus side tables (field names,
pre-encoded tag bytes, sub-plans, map-entry plans, oneof case tables);
a single dense switch in executeSchemaPlan walks the stream with
every BinaryWriter call inlined so V8 keeps monomorphic receivers
on the hot path.

Implementation follows the 20 pinned decisions in the L1/L2 design
spec (analysis/p1-t4-l1-l2-design-spec.md, P1-P20).
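For orientation, here is a minimal sketch of what a compiled plan and its cache could look like. The `SchemaPlan`, `executeSchemaPlan`, and `WeakMap<DescMessage, SchemaPlan | null>` names come from this PR; the exact field layout and the `compileSchemaPlan` / `getSchemaPlan` helpers are illustrative, not the merged code.

```ts
import type { DescMessage } from "@bufbuild/protobuf";

// Illustrative plan shape: a flat opcode stream plus side tables
// indexed by slot, matching the structure described above.
interface SchemaPlan {
  ops: Int32Array;                 // opcode stream, variable stride (2 or 3)
  names: string[];                 // field names, indexed by slot
  tags: Uint8Array[];              // pre-encoded tag bytes, indexed by slot
  subPlans: (SchemaPlan | null)[]; // sub-plans for message-typed fields
}

// Hypothetical compiler entry point; `null` marks messages that must
// fall back to the reflective toBinary (groups, unknown fields, ...).
declare function compileSchemaPlan(desc: DescMessage): SchemaPlan | null;

// Compiled once per DescMessage, then reused for every encode.
const planCache = new WeakMap<DescMessage, SchemaPlan | null>();

function getSchemaPlan(desc: DescMessage): SchemaPlan | null {
  let plan = planCache.get(desc);
  if (plan === undefined) {
    // Cycle safety is two-phase in the real compiler: a placeholder is
    // registered before sub-plans compile, so recursive message types
    // terminate instead of looping.
    plan = compileSchemaPlan(desc);
    planCache.set(desc, plan);
  }
  return plan;
}
```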

Approach

  • L1: flat Int32Array opcodes per schema, variable stride (2 for
    singular scalars, 3 for message/list/map/oneof), side-tables indexed
    by slot. Compiled once, cached in WeakMap<DescMessage, SchemaPlan | null>. Cycle-safe two-phase compile.
  • L2: specialized writers inlined into the interpreter switch.
    ASCII fast path and int64 tri-dispatch inherited from L0 (no
    duplication, P11/P12). Pre-encoded tag bytes emitted via
    writer.raw(tags[slot]) (P13). Packed repeated scalars use a tight
    inline loop inside a single fork/join (P14). Element dispatch
    factored through a small writeScalarByOp helper so V8 sees one
    call site per writer method (P19). A sketch of the interpreter
    loop follows this list.
  • Fallback: proto2 groups, delimited-encoded messages inside
    lists, and messages carrying unknown fields with
    writeUnknownFields: true all transparently delegate to the
    reflective toBinary (P5, P10).

Measurements

Node.js 25.8.1, 2-second windows, warmup + 50-iter outer loop.

| Workload | Reflective | toBinaryFast (L1+L2) | Delta |
| --- | --- | --- | --- |
| PerfMessage (nested, 16,132 B) | 8,961 ops/s | 15,861 ops/s | +77.0% |
| ScalarValuesMessage (97 B) | 453,597 ops/s | 505,119 ops/s | +11.4% |
| MessageFieldMessage (9 B) | 737,761 ops/s | 725,993 ops/s | -1.6% (noise) |
| PerfMessage heapUsed/op | 664 B | 119 B | -82% |

Byte-parity verified against reflective on all three fixtures and on
the 14-case parity test suite.
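Each parity case boils down to comparing the bytes from both entry points. A minimal sketch of such a check, assuming the `toBinaryFast` export this PR adds and a hypothetical generated `PerfMessageSchema` fixture import:

```ts
import { create, toBinary, toBinaryFast } from "@bufbuild/protobuf";
// Hypothetical fixture path; the real suite lives in
// packages/protobuf-test/src/wire/schema-plan.test.ts.
import { PerfMessageSchema } from "./gen/perf_pb.js";

const msg = create(PerfMessageSchema, {
  // ...populate scalars, lists, maps, oneofs, nested messages
});

const fast = toBinaryFast(PerfMessageSchema, msg);
const reflective = toBinary(PerfMessageSchema, msg);

// Byte-for-byte parity: identical length and identical content.
console.assert(
  fast.length === reflective.length &&
    fast.every((b, i) => b === reflective[i]),
  "toBinaryFast must match reflective toBinary",
);
```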

Note: the external benchmarks/ folder used for the L0 OTel
100-span measurements is not merged into the L0 branch. The
PerfMessage fixture (a 100-message nested payload bundling lists,
maps, oneofs, and strings) is the closest representative workload
available in-tree.

Gates

  • Byte-parity on all fixtures (14 new parity tests + PerfMessage round-trip)
  • 2,889 tests pass (2,875 pre-existing + 14 new)
  • Throughput improvement on realistic workload (+77%)
  • Memory improvement on realistic workload (-82%)
  • No regression on SimpleMessage-style workload (within noise)
  • Diff under 1,500 LOC (971 LOC in src/, 192 LOC tests)

Stacked on

PR #8 (L0 contiguous-buffer writer). Cannot merge until L0 merges;
rebases automatically once L0 lands on main.

🤖 Generated with Claude Code

@intech intech changed the title feat(protobuf): L1 schema plans + L2 specialized writers feat(protobuf): Add L1 schema plan and L2 specialized writer Apr 19, 2026
@intech intech changed the title feat(protobuf): Add L1 schema plan and L2 specialized writer Add L1 schema plan and L2 specialized writer Apr 19, 2026
intech and others added 2 commits April 20, 2026 01:39
Introduce a compiled-plan fast path for protobuf encoding. Each
`DescMessage` compiles to a flat `Int32Array` opcode stream plus a
handful of side tables (field names, pre-encoded tag bytes, sub-plans,
map-entry plans, oneof case tables). A single dense switch in
`executeSchemaPlan` interprets the stream with every `BinaryWriter`
call inlined into the hot loop, eliminating the reflective dispatch
that dominated the previous encoder.

Implementation follows the 20 pinned decisions in the L1/L2 design
spec:
  - P1-P10 (L1): flat Int32Array opcodes with variable stride, side
    tables indexed by slot, `WeakMap` plan cache, cycle-safe two-phase
    compile, reflective fallback for groups / messages carrying
    unknown fields, `toBinaryFast` as the sole public entry point.
  - P11-P20 (L2): inherits the ASCII fast path and int64 tri-dispatch
    from the L0 `BinaryWriter`, emits pre-encoded tag bytes via
    `writer.raw`, packs repeated scalars inline inside `fork/join`,
    keeps element writes monomorphic via a small `writeScalarByOp`
    dispatcher, skips a separate field-writers module so V8 sees one
    call site per writer method.

Correctness:
  - 2,889 tests green (2,875 pre-existing + 14 new schema-plan parity
    tests covering scalars, strings ASCII / UTF-8, packed scalars,
    singular and repeated messages, string- and int-keyed maps, oneof
    arms, and unset oneofs).
  - Byte-parity verified against the reflective `toBinary` on the
    16,132-byte `PerfMessage` fixture (100-message payload: scalars,
    lists, maps, oneofs, nested messages).

Measurements (Node.js 25.8.1, 2s windows):
  - PerfMessage throughput: 8,961 -> 15,861 ops/s (+77%)
  - ScalarValuesMessage:    453,597 -> 505,119 ops/s (+11%)
  - MessageFieldMessage:    737,761 -> 725,993 ops/s (-1.6%, noise)
  - PerfMessage heapUsed/op: 664 B -> 119 B (-82%)

Files:
  - packages/protobuf/src/wire/schema-plan.ts      (compiler + interpreter)
  - packages/protobuf/src/to-binary-fast.ts        (public entry + fallback)
  - packages/protobuf/src/index.ts                 (export toBinaryFast)
  - packages/protobuf-test/src/wire/schema-plan.test.ts (parity suite)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Remove the biome-ignore for noDoubleEquals in schema-plan.ts: `v` is
  typed as `unknown` at the comparison site, so the rule never fires
  and the suppression triggers `suppressions/unused`.
- Re-run `biome format` on the two files touched by the L1/L2 commit
  so CI's gh-diffcheck stays green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@intech intech force-pushed the feat/l1-l2-schema-plans branch from 6e276e6 to 85f4dd7 on April 19, 2026 21:42
@intech intech self-assigned this Apr 19, 2026
@intech intech merged commit 57c12eb into feat/l0-contiguous-writer Apr 19, 2026
1 check passed
intech added a commit that referenced this pull request Apr 19, 2026
Regenerated after merge of #6 (benchmark matrix), #8 (L0 contiguous
writer), #10 (L1+L2 schema plans + specialized writers), #11
(correctness tests).

Key results (Node 25.8, log-scale chart):
- OTel 100 spans:    525 -> 2,501 ops/s (+376%), 0.80x pbjs (3,110)
- OTel Metrics 50:   891 -> 4,773 ops/s (+435%)
- OTel Logs 100:     880 -> 3,772 ops/s (+329%)
- K8sPodList 20:     712 -> 3,510 ops/s (+393%)
- Stress d=8 w=200:  2,568 -> 14,378 ops/s (+460%)
- SimpleMessage:     1.39M -> 1.81M ops/s (+30%)

Memory allocations per encode reduced proportionally via L0 contiguous
buffer + L1 schema-plan opcode interpreter + L2 specialized field writers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@intech intech deleted the feat/l1-l2-schema-plans branch April 21, 2026 11:10