Drop toBinaryFast from main, archive L1+L2, add upstream baseline by intech · Pull Request #22 · Connectum-Framework/protobuf-es

intech · 2026-04-20T21:58:35Z

Summary

Reverts the two-encoder API split on main. toBinary stays (L0 contiguous writer from #8, byte-identical to upstream on every fixture in the correctness matrix); toBinaryFast is removed from the public surface. The L1+L2 schema-plan work is preserved on branch archive/l1-l2-schema-plans-experimental for future iteration.

Also adds upstream-protobuf-es (an npm alias for @bufbuild/protobuf@latest) as a third baseline in the benchmark matrix so the chart honestly shows fork's toBinary vs original upstream vs protobufjs, not fork-internal self-comparison. This re-applies the baseline infrastructure from the closed #20.

Why

Merging L1+L2 directly into public toBinary (attempted in Merge L1+L2 schema-plan into public toBinary (drop toBinaryFast API) #21) broke 3 extension-related unit tests and 6 conformance cases. Root cause: runtime extension data on messages triggers the plan walk but extension encoding is not covered by the plan. Not quick to fix.
Keeping two exports (toBinary + toBinaryFast) forces consumers to change call sites to see the L1+L2 gains — defeats the point of landing it behind the existing API.
Decision: ship what is safe on main (L0 only), park L1+L2 on a branch that can be resumed from once the extension-path gap is closed.

Changes

Removed from packages/protobuf/: src/to-binary-fast.ts (980 lines), export from src/index.ts, and the matching packages/protobuf-test/src/to-binary-fast.test.ts.
Removed from benchmarks/: all toBinaryFast wiring across bench-toBinary.ts, bench-create-toBinary.ts, bench-comparison-protobufjs.ts, bench-memory.ts, bench-streaming.ts, heap-prof-driver.ts, and scripts/run-heap-prof.sh. Deleted src/verify-correctness.ts (was purely a toBinaryFast round-trip check).
Added: upstream-protobuf-es devDependency in benchmarks/package.json, new upstream-protobuf-es bar in src/report.ts via toBinary as upstreamToBinary, three-way encoder list in src/report-helpers.ts.
Regenerated: chart.svg and chart-delta.svg with the new three-way layout (upstream red, fork blue, protobufjs sage). Delta chart now reads "fork toBinary speedup vs upstream" and "vs protobufjs".
Updated: benchmarks/README.md — new encoder legend, added "Current state" section linking to archive/l1-l2-schema-plans-experimental, corrected bench table from 4 columns to 3, adjusted streaming/heap-prof docs.
Unchanged: toBinary output bytes — identical to upstream on every fixture the correctness matrix covers.

Benchmark (local, Node v25.8.1, unpinned host)

OTel-shaped export, 100 spans:

Encoder	ops/s	vs upstream
upstream `toBinary`	331	baseline
fork `toBinary` (L0)	1,012	+206%
protobufjs	1,680	+408%

Full 10-fixture matrix lives in benchmarks/README.md; CI-pinned numbers (taskset -c 0, median-of-5) will shift these ratios by roughly ±15% but preserve the ordering. Regenerate with npm run bench:report -w @bufbuild/protobuf-benchmarks.

Test plan

packages/protobuf — build + typecheck green.
packages/protobuf-test — 2,929 tests pass.
benchmarks — typecheck + biome lint clean.
bench:report generates three-way chart.svg / chart-delta.svg and regenerates the README table.
bench:streaming runs with the 2-encoder matrix (toBinary + protobufjs).
CI-pinned bench:matrix:ci rerun in CI — expected to reflect published fork-vs-upstream delta with tighter margins.

Scope

Internal fork PR, draft — do not merge without review. The archive branch
(archive/l1-l2-schema-plans-experimental) preserves the full L1+L2 implementation verbatim for whoever resumes the extension-path fix.

Closes #21. Supersedes #20 (re-applied the upstream-baseline infrastructure in this branch).

Generated with Claude Code

github-actions · 2026-04-20T22:08:31Z

Benchmark: no regressions

Thresholds: throughput regression >5%, memory regression >10%. Runner pinned to CPU 0 via taskset; baseline and current are benchmarked on the same runner within one workflow invocation. Current run on linux/x64, Node v22.22.2, captured 2026-04-21T10:28:34.098Z.
Baseline captured 2026-04-21T10:19:52.048Z on linux/x64, Node v22.22.2.

Summary: 0 regressed, 1 improved, 0 new, 19 unchanged.

Fixture	Baseline ops/s	PR ops/s	Δ ops	Baseline B/op	PR B/op	Δ mem	Status
SimpleMessage :: toBinary (pre-built, 19 B)	868,276	888,982	+2.4%	–	–	–	ok
ExportTraceRequest (100 spans) :: toBinary (pre-built, 32926 B)	1,244	1,247	+0.2%	–	–	–	ok
ExportMetricsRequest (50 series) :: toBinary (pre-built, 17696 B)	2,204	2,182	-1.0%	–	–	–	ok
ExportLogsRequest (100 records) :: toBinary (pre-built, 21319 B)	2,167	2,195	+1.3%	–	–	–	ok
K8sPodList (20 pods) :: toBinary (pre-built, 28900 B)	2,371	2,375	+0.1%	–	–	–	ok
GraphQLRequest :: toBinary (pre-built, 624 B)	176,163	177,286	+0.6%	–	–	–	ok
GraphQLResponse :: toBinary (pre-built, 1366 B)	244,354	248,769	+1.8%	–	–	–	ok
RpcRequest :: toBinary (pre-built, 501 B)	301,157	302,639	+0.5%	–	–	–	ok
RpcResponse :: toBinary (pre-built, 602 B)	437,196	444,173	+1.6%	–	–	–	ok
StressMessage (depth=8, width=200) :: toBinary (pre-built, 12868 B)	8,007	8,103	+1.2%	–	–	–	ok
SimpleMessage :: fromBinary (19 B)	1,020,703	1,018,764	-0.2%	–	–	–	ok
ExportTraceRequest (100 spans) :: fromBinary (32926 B)	612.4	622.2	+1.6%	–	–	–	ok
ExportMetricsRequest (50 series) :: fromBinary (17696 B)	1,169	1,179	+0.8%	–	–	–	ok
ExportLogsRequest (100 records) :: fromBinary (21319 B)	1,094	1,109	+1.3%	–	–	–	ok
K8sPodList (20 pods) :: fromBinary (28900 B)	1,396	1,415	+1.4%	–	–	–	ok
GraphQLRequest :: fromBinary (624 B)	300,140	303,177	+1.0%	–	–	–	ok
GraphQLResponse :: fromBinary (1366 B)	269,958	273,381	+1.3%	–	–	–	ok
RpcRequest :: fromBinary (501 B)	271,242	273,447	+0.8%	–	–	–	ok
RpcResponse :: fromBinary (602 B)	378,923	379,316	+0.1%	–	–	–	ok
StressMessage (depth=8, width=200) :: fromBinary (12868 B)	3,986	4,207	+5.5%	–	–	–	improved

Produced by benchmarks/scripts/compare-results.ts. Artifacts: bench-results-<pr> (current), bench-baseline-main (baseline).

…seline Previous state on main: toBinary (L0 contiguous writer, PR #8) + toBinaryFast (L1+L2 schema plans, opt-in). Merging L1+L2 into toBinary broke extension conformance (PR #21). Keeping two encoders required users to change their call sites to see L1+L2 gains — also unacceptable. This change: - Removes toBinaryFast export and L1+L2 source files from main (packages/protobuf/src/to-binary-fast.ts, its unit test, and all benchmark wiring that referenced it). - Preserves the full L1+L2 implementation on branch archive/l1-l2-schema-plans-experimental for future iteration, not discarded. Closes PR #21. - Adds upstream-protobuf-es (npm alias for @bufbuild/protobuf@latest) as a third baseline column in benchmarks (re-applying the infrastructure from the closed PR #20), so chart.svg and chart-delta.svg honestly show "this fork's toBinary vs original upstream vs protobufjs" instead of fork-internal self-comparison. - Regenerates chart.svg and chart-delta.svg with the three-way layout and rewrites the README narrative around the new encoders and the "Current state" archival note. - Updates correctness-matrix.test.ts comments to point at the archive branch so contributors know where L1+L2 lives. Local run (Node v25.8.1, unpinned host — CI pinned numbers will differ): - upstream toBinary (OTel 100 spans): 331 ops/s (baseline) - fork toBinary (OTel 100 spans): 1,012 ops/s (+206% vs upstream) - protobufjs (OTel 100 spans): 1,680 ops/s L0 alone gets the fork past 3x original upstream on the real OTel workload that drove this whole investigation, with byte-identical wire output.

Single-run benchmark output from report.ts was vulnerable to host jitter on small/fast fixtures — a SimpleMessage measurement spread 2-8x across back-to-back runs on the same machine. A committed chart snapshot could therefore show fork's toBinary 19% slower than upstream in one run and 15-20% faster in the next four. Wraps the benchmark loop inside report.ts with a BENCH_REPORT_RUNS counter (default 5) and takes the per-fixture per-encoder median. Override via the env var for faster iteration or longer sweeps. Fresh median-of-5 numbers (local unpinned host, taskset -c 0): - SimpleMessage toBinary: upstream 1.27M, fork 1.35M (+6.3%) - OTel 100 spans toBinary: upstream 536, fork 1634 (+205%) - K8sPodList toBinary: upstream 837, fork 3140 (+275%) - StressMessage toBinary: upstream 2873, fork 10476 (+265%) - RpcResponse toBinary: fork 630K vs protobufjs 532K (fork ahead on this shape) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files deleted: - packages/protobuf-test/src/schema-plan-adaptive.test.ts — L3 test file, imports toBinaryFast + @bufbuild/protobuf/wire/schema-plan-adaptive both of which were already removed from main - packages/protobuf/src/wire/schema-plan-adaptive.ts — L3 implementation, orphaned after its only consumer was removed - benchmarks/src/bench-multishape.ts — L3-only benchmark, imports toBinaryFast These files survived the initial cleanup because they were only exercised by L3/L1+L2 code paths that were themselves removed. The full L1+L2 + L3 implementation remains on archive/l1-l2-schema-plans-experimental. Full test suite now passes: 2909 / 0 fail. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Removed stale L1+L2 descriptions and the toBinaryFast 0.80x claim from 'Reading the results'. Replaced with L0 description, median-of-5 note, and fresh numbers (OTel 100 spans 0.88x protobufjs, cumulative vs upstream +6% to +275%). Dropped bench-multishape reference from 'Future work' (file deleted in 2dfe2fa). Added 'Archived work' section pointing at archive/l1-l2-schema-plans-experimental for L1+L2 and L3 prototypes.

The subpath export + typesVersions alias survived the L3 cleanup even though wire/schema-plan-adaptive.ts was deleted in 2442fd7. attw was reporting 'Resolution failed' on the orphaned entry, failing PR #22 CI.

Previously the bench-matrix job downloaded the baseline JSON from the most recent push-to-main workflow artifact. That baseline was captured on whatever ubuntu-latest runner GitHub happened to assign at that time; the PR run then happened on a different physical host. Even with taskset -c 0 + median-of-5 on both sides, cross-host variance (different P/E-core topologies, SMT neighbours, thermal state) remained 5-7%, producing chronic false-positive regressions on PRs that did not touch the encode/decode hot path at all. Switch the PR job to benchmark origin/main and the PR merge commit in sequence on the SAME runner within the SAME workflow invocation. Every factor except the code under test is now held constant. The 'bench-baseline-main' artifact upload on push-to-main is preserved for external trend consumers but is no longer read by PR comparison. Mechanics: - Checkout PR as usual. - Record current + origin/main SHAs. - git checkout origin/main, install, build, generate, run matrix into baseline-results.json. - git checkout back to PR head, rerun install/build/generate (PR may have changed package-lock), run matrix into bench-results.json. - compare-results.ts diffs the two local JSONs, posts the sticky comment, flags regressions. Doubled bench work pushes the job from ~9 min to ~18 min; timeout raised 25 -> 40 min with buffer for slow runners. Also: compare-results.ts header text now notes 'baseline and current are benchmarked on the same runner' so the PR comment reflects the new guarantee, and .gitignore covers baseline-results.json / bench-report.md to keep the PR working tree clean after a local bench pass.

intech self-assigned this Apr 21, 2026

intech force-pushed the chore/drop-toBinaryFast-keep-l0-only branch from a7dd65a to 2dfe2fa Compare April 21, 2026 08:14

intech and others added 5 commits April 21, 2026 12:39

Drop stale wire/schema-plan-adaptive export from package.json

e11a6a3

The subpath export + typesVersions alias survived the L3 cleanup even though wire/schema-plan-adaptive.ts was deleted in 2442fd7. attw was reporting 'Resolution failed' on the orphaned entry, failing PR #22 CI.

intech marked this pull request as ready for review April 21, 2026 10:31

intech merged commit 242d5f6 into main Apr 21, 2026
28 checks passed

intech deleted the chore/drop-toBinaryFast-keep-l0-only branch April 21, 2026 10:31

This was referenced Apr 21, 2026

CI: auto-refresh benchmarks README table + charts after main merges #23

Merged

L1+L2 schema-plan optimization merged into public toBinary (known edge-case gaps) #25

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Drop toBinaryFast from main, archive L1+L2, add upstream baseline#22

Drop toBinaryFast from main, archive L1+L2, add upstream baseline#22
intech merged 6 commits intomainfrom
chore/drop-toBinaryFast-keep-l0-only

intech commented Apr 20, 2026

Uh oh!

github-actions Bot commented Apr 20, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

intech commented Apr 20, 2026

Summary

Why

Changes

Benchmark (local, Node v25.8.1, unpinned host)

Test plan

Scope

Uh oh!

github-actions Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark: no regressions

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented Apr 20, 2026 •

edited

Loading