Add matrix regression detection CI workflow by intech · Pull Request #9 · Connectum-Framework/protobuf-es

intech · 2026-04-19T18:29:49Z

Summary

Adds a GitHub Actions workflow that runs the bench-matrix.ts suite on
every PR against main, diffs the results against the latest main
baseline, and posts a sticky comment flagging >5% throughput regressions
and >10% memory regressions. Push-to-main runs refresh the authoritative
baseline artifact. Supporting scripts keep the behaviour reproducible
locally via npm run bench:matrix:ci + npm run bench:matrix:compare.
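
The diff-and-flag step described above can be sketched roughly as follows. This is a minimal illustration, not the code in compare-results.ts: the `BenchRow` shape, the `opsPerSec` field, the `ok` status, and the rule that a gain above the same 5% threshold counts as `improved` are all assumptions. Only the 5% / 10% thresholds and the REGRESSION / improved / new labels come from this PR.

```typescript
// Hypothetical result shape; the real bench-matrix.ts payload may differ.
interface BenchRow {
  name: string;
  opsPerSec: number;
  bytesPerOp?: number; // optional: the runner may not emit memory data
}

type Status = "REGRESSION" | "improved" | "ok" | "new";

interface Delta {
  name: string;
  throughputPct: number | null; // null when the baseline has no matching row
  memoryPct: number | null;
  status: Status;
}

// Thresholds quoted in the PR description.
const THROUGHPUT_THRESHOLD_PCT = 5;
const MEMORY_THRESHOLD_PCT = 10;

// Relative change in percent; assumes baseline is non-zero.
function pctChange(current: number, baseline: number): number {
  return ((current - baseline) / baseline) * 100;
}

function compare(baseline: BenchRow[], current: BenchRow[]): Delta[] {
  const byName = new Map<string, BenchRow>();
  for (const row of baseline) byName.set(row.name, row);

  return current.map((row): Delta => {
    const base = byName.get(row.name);
    if (base === undefined) {
      return { name: row.name, throughputPct: null, memoryPct: null, status: "new" };
    }
    const throughputPct = pctChange(row.opsPerSec, base.opsPerSec);
    const memoryPct =
      row.bytesPerOp !== undefined && base.bytesPerOp !== undefined && base.bytesPerOp > 0
        ? pctChange(row.bytesPerOp, base.bytesPerOp)
        : null;
    // A row regresses if throughput drops past its threshold or memory grows past its own.
    const regressed =
      throughputPct < -THROUGHPUT_THRESHOLD_PCT ||
      (memoryPct !== null && memoryPct > MEMORY_THRESHOLD_PCT);
    const status: Status = regressed
      ? "REGRESSION"
      : throughputPct > THROUGHPUT_THRESHOLD_PCT
        ? "improved"
        : "ok";
    return { name: row.name, throughputPct, memoryPct, status };
  });
}
```

Rows that exist only in the current run are reported as `new` rather than compared, which matches the smoke-test behaviour listed in the test plan.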

This is additive — no existing benchmark code is touched beyond
honouring two new env vars (BENCH_MATRIX_TIME, BENCH_MATRIX_WARMUP)
in bench-matrix.ts, which the CI wrapper uses to get tighter RME than
the dev-optimised defaults.
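
Honouring the two env vars could look something like the sketch below. The variable names come from this PR; the helper names, the default values, and the assumption that both values are whole milliseconds are illustrative only. The environment is passed in as a plain record so the function stays easy to test; in the runner it would presumably be `process.env`.

```typescript
interface BenchBudget {
  timeMs: number;
  warmupMs: number;
}

// Assumed dev-side defaults; the real values live in bench-matrix.ts.
const DEV_DEFAULTS: BenchBudget = { timeMs: 500, warmupMs: 100 };

// Accept only positive integers; anything else falls back to the default.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isInteger(value) && value > 0 ? value : fallback;
}

function readBudget(
  env: Record<string, string | undefined>,
  defaults: BenchBudget = DEV_DEFAULTS,
): BenchBudget {
  return {
    timeMs: parsePositiveInt(env["BENCH_MATRIX_TIME"], defaults.timeMs),
    warmupMs: parsePositiveInt(env["BENCH_MATRIX_WARMUP"], defaults.warmupMs),
  };
}
```

With the CI values quoted in this PR, `readBudget({ BENCH_MATRIX_TIME: "3000", BENCH_MATRIX_WARMUP: "1000" })` yields a 3000 ms time budget and a 1000 ms warmup, while an unset or malformed variable leaves the dev default in place.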

Targets feat/benchmark-matrix (PR #7) so the workflow rides in with
the matrix when that PR lands on main.

What's in the box

- .github/workflows/benchmark.yaml: PR + push + manual triggers,
  Node 22 (matches .nvmrc consumer requirement), 25-minute timeout,
  concurrency group scoped per ref.
- benchmarks/scripts/run-matrix-ci.sh: host profile logging +
  throwaway JIT warmup run + real measurement pass at 3000 ms time /
  1000 ms warmup, extracts the matrix JSON payload from stdout into
  bench-results.json.
- benchmarks/scripts/compare-results.ts: delta computation against
  the baseline, configurable thresholds, emits a sticky-comment-ready
  markdown table. Exits 0 even on regression: the workflow surfaces
  the flag via ::warning:: annotation plus the PR comment so intentional
  throughput trades are not hard-blocked. Tech lead can promote to
  hard-fail later.
- benchmarks/baselines/README.md: documents the two-tier storage
  decision (Actions artifact bench-baseline-main is source of truth
  with 365-day retention; committed baselines/main.json is the
  zero-network fallback and local-dev quick reference; refreshed by a
  follow-up chore PR after material main-branch moves).
- bench:matrix:ci + bench:matrix:compare npm scripts: same
  pipeline, runnable locally.
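
The "extract the JSON payload from stdout" step can be illustrated in TypeScript as below. The real script is bash and its delimiter convention is not shown in this PR, so this sketch assumes the payload is a single JSON object or array that starts on its own line and is the last thing written to stdout. The function name is hypothetical.

```typescript
// Returns the parsed JSON payload that ends the benchmark output.
// Assumption: everything from the payload's first line to the end of stdout
// is valid JSON; log lines before it are ignored.
function extractJsonPayload(stdout: string): unknown {
  const lines = stdout.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const trimmed = (lines[i] ?? "").trimStart();
    // Only lines opening an object or array can start the payload.
    if (!trimmed.startsWith("{") && !trimmed.startsWith("[")) continue;
    try {
      return JSON.parse(lines.slice(i).join("\n"));
    } catch {
      // e.g. a log line like "[warmup] ..." that merely looks like JSON; keep scanning
    }
  }
  throw new Error("no JSON payload found in benchmark output");
}
```

Trying each candidate start line in order keeps bracket-prefixed log lines from being mistaken for the payload, at the cost of re-parsing the tail a few times.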

Baseline storage decision

Two-tier. Artifacts hold the authoritative baseline because they give
trend history for free and survive repo churn. A committed
baselines/main.json quick-reference de-risks the artifact dependency
and lets developers run the comparison offline. Refresh of the
in-repo file is manual via a chore(benchmarks): refresh main baseline
PR until a follow-up automates it. Rationale in
benchmarks/baselines/README.md.
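
One way to picture the two-tier lookup is the sketch below: prefer the downloaded artifact, fall back to the committed file, and report "none" when neither exists (the case that produces the informational "No baseline available" comment). Both default paths are assumptions; the PR does not state where the workflow downloads the artifact or the full path of main.json. The existence check is injected so the logic can be exercised without touching the filesystem.

```typescript
type BaselineSource = "artifact" | "committed" | "none";

interface ResolvedBaseline {
  source: BaselineSource;
  path: string | null;
}

// Hypothetical default paths, for illustration only.
const ARTIFACT_PATH = "artifacts/bench-baseline-main/bench-results.json";
const COMMITTED_PATH = "benchmarks/baselines/main.json";

function resolveBaseline(
  fileExists: (path: string) => boolean,
  artifactPath: string = ARTIFACT_PATH,
  committedPath: string = COMMITTED_PATH,
): ResolvedBaseline {
  // Tier 1: the Actions artifact is the source of truth.
  if (fileExists(artifactPath)) return { source: "artifact", path: artifactPath };
  // Tier 2: the committed file is the zero-network fallback.
  if (fileExists(committedPath)) return { source: "committed", path: committedPath };
  return { source: "none", path: null };
}
```

In a Node script the check would typically be `existsSync` from `node:fs`, passed in as `resolveBaseline(existsSync)`.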

Test plan

- Workflow YAML parses (python3 -c 'import yaml; yaml.safe_load(open(".github/workflows/benchmark.yaml"))' locally, passes).
- bash -n benchmarks/scripts/run-matrix-ci.sh passes.
- compare-results.ts smoke test on synthetic baseline + current
  JSON correctly flags a -10% row as REGRESSION, +9.1% as improved,
  and a new-only row as new (verified locally).
- First CI run on this PR produces a bench-results-<n> artifact
  and posts a sticky comment. Since no bench-baseline-main artifact
  exists yet on the fork, the first comment will be informational
  ("No baseline available").
- After this PR rides into main via PR #7 (Add benchmark matrix with
  realistic fixtures), the push-to-main run uploads the first
  bench-baseline-main artifact. Subsequent PRs get real deltas.

Follow-ups

- Trend dashboard: small JSON-aggregation script that pulls the last
  N bench-baseline-main artifacts and renders a sparkline per fixture
  into the benchmarks README. Blocked on having N>1 baselines.
- Self-hosted single-core runner to cut RME further. GitHub-hosted
  runners give ±2–5% RME on most fixtures; a pinned bare-metal runner
  would bring that to ±0.5% and let us drop the regression threshold
  to 3%.
- Automate the in-repo main.json refresh by having the push-to-main
  job open a PR with the updated file instead of relying on manual
  chore PRs.
- Wire bytesPerOp into bench-matrix.ts (currently only the compare
  script has the plumbing; the fixture runner does not yet emit heap
  delta per op).
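
For the trend-dashboard follow-up, the sparkline rendering itself is small. The sketch below is one possible shape for it, assuming the aggregation script has already collected one throughput number per baseline for a fixture. The function name and the choice of eight Unicode block characters are illustrative, not something this PR specifies.

```typescript
// Eight block characters, lowest to highest.
const BARS = "▁▂▃▄▅▆▇█";

// Maps each value onto one of the eight bars, scaled between the series min and max.
function sparkline(values: number[]): string {
  if (values.length === 0) return "";
  const min = Math.min(...values);
  const max = Math.max(...values);
  const span = max - min;
  return values
    .map((v) => {
      // A flat series has no range to scale against; render it as the lowest bar.
      const idx = span === 0 ? 0 : Math.round(((v - min) / span) * (BARS.length - 1));
      return BARS.charAt(idx);
    })
    .join("");
}
```

Because the scale is relative to each series, the sparkline shows the direction of a trend per fixture but is not comparable across fixtures.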

Adds a GitHub Actions workflow that runs the benchmark matrix on every PR against main and flags >5% throughput regressions / >10% memory regressions. Push-to-main runs refresh the authoritative baseline artifact.

- .github/workflows/benchmark.yaml: PR + push + manual triggers on Node 22, 25-minute timeout, concurrency group scoped per ref.
- benchmarks/scripts/run-matrix-ci.sh: host profile + discarded JIT warmup run + measurement pass with 3000 ms time / 1000 ms warmup, extracts the matrix JSON payload from stdout into bench-results.json.
- benchmarks/scripts/compare-results.ts: delta computation with configurable thresholds, emits a sticky-comment-ready markdown table, exits 0 on regression (workflow surfaces via ::warning:: annotation).
- benchmarks/src/bench-matrix.ts: honour BENCH_MATRIX_TIME and BENCH_MATRIX_WARMUP env vars so the wrapper can tune budgets without editing the runner.
- benchmarks/baselines/README.md: documents the two-tier storage (Actions artifact is source of truth, in-repo main.json is the zero-network fallback and local-dev quick reference).
- bench:matrix:ci + bench:matrix:compare npm scripts: local equivalent of the CI flow for reproducible dev-side regression checks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

intech changed the title ~~ci(benchmarks): matrix regression detection workflow~~ ci(benchmarks): Add matrix regression detection workflow Apr 19, 2026

intech changed the title ~~ci(benchmarks): Add matrix regression detection workflow~~ Add matrix regression detection CI workflow Apr 19, 2026

intech force-pushed the feat/benchmark-matrix branch from 0852010 to 63b86ef Compare April 19, 2026 21:35

intech force-pushed the ci/benchmark-workflow branch from 11018a3 to c54d41a Compare April 19, 2026 21:35

intech self-assigned this Apr 19, 2026

intech force-pushed the ci/benchmark-workflow branch from c54d41a to c69ff8b Compare April 19, 2026 22:20

intech changed the base branch from feat/benchmark-matrix to main April 19, 2026 22:20

intech merged commit f764e81 into main Apr 19, 2026
2 checks passed

intech deleted the ci/benchmark-workflow branch April 19, 2026 22:22

intech added a commit that referenced this pull request Apr 20, 2026

Apply biome format to compare-results.ts

e9e7b27

Formatting drift from PR #9 surfaced by turbo format check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

intech mentioned this pull request Apr 20, 2026

Actualize benchmarks README and fix chart-delta layout #19

Merged

intech merged 1 commit into main from ci/benchmark-workflow