perf(daemon): add CI workflow for sync bench comparison #398

Open
andreabadesso wants to merge 1 commit into feat/daemon-bench-comparator from feat/daemon-bench-ci
Conversation

@andreabadesso
Collaborator

Motivation

Third and final piece of the sync-benchmarking infrastructure (follows #396 and #397). On every PR that touches packages/daemon, this workflow runs the bench against both master and the PR branch in parallel, then posts a sticky PR comment with the comparator output.

Informational only. No exit gating, no red X. CI runner variance is too high for a hard threshold to be useful at the run counts we can afford.

Stacked on #397 — targets feat/daemon-bench-comparator. When the stack merges, the final result is the full comparator + workflow + harness landing on master together.

Architecture

                    ┌─────────────────────┐
   PR trigger ──┬──►│ bench (baseline)    ├─► artifact: bench-baseline.json
                │   └─────────────────────┘                              │
                │                                                         ▼
                │   ┌─────────────────────┐                     ┌──────────────┐
                └──►│ bench (candidate)   ├─► artifact ────────►│   compare    │
                    └─────────────────────┘     bench-candidate │  (comparator │
                                                                │   + comment) │
                                                                └──────────────┘
  • Matrix strategy: baseline (master) and candidate (PR branch) run in parallel, each with its own MySQL + simulator containers — no cross-run state bleed
  • PR's bench scripts are overlaid onto the baseline checkout via refs/pull/N/head (works for same-repo and fork PRs), so the measurement tool stays constant and baseline works before the harness has landed on master
  • Report is also emitted to the workflow job summary, so fork PRs (where GITHUB_TOKEN is read-only and gh pr comment would 403) still surface results
  • Sticky PR comment, deduped by <!-- daemon-bench-report --> marker
  • concurrency.cancel-in-progress: true — pushing new commits cancels in-flight bench runs
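The sticky-comment dedup described above can be sketched as a small decision function. This is only an illustration of the marker technique, not the workflow's actual code: `IssueComment`, `CommentAction`, and `planStickyComment` are hypothetical names, and the real workflow may do the lookup in shell with `gh` instead.

```typescript
// Hypothetical shapes for illustration; not taken from the workflow file.
interface IssueComment {
  id: number;
  body: string;
}

type CommentAction =
  | { kind: "create"; body: string }
  | { kind: "update"; id: number; body: string };

// Marker named in the PR description. As an HTML comment it is invisible
// in the rendered markdown but still present in the raw comment body.
const MARKER = "<!-- daemon-bench-report -->";

// Decide whether to create a new comment or overwrite the existing sticky one.
function planStickyComment(existing: IssueComment[], report: string): CommentAction {
  const body = `${MARKER}\n${report}`;
  const previous = existing.find((c) => c.body.includes(MARKER));
  return previous
    ? { kind: "update", id: previous.id, body }
    : { kind: "create", body };
}
```

Because the marker is prepended to every report, the second and later runs on the same PR always resolve to `update`, which is what keeps the thread at a single bench comment.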

Acceptance Criteria

  • On any PR touching packages/daemon/**, the daemon-bench workflow runs
  • Workflow runs master's production code + PR's production code in separate matrix jobs
  • PR comment is posted (or updated) with the markdown comparison
  • Workflow never fails the PR, regardless of bench results
  • First run against this very PR (once merged in the stack) validates the full pipeline

Known constraints

  • 66-event scenario ceiling (same as #396 and #397): the first real runs will mostly show ⚪ noise. Dial up scenario size before expecting actionable signal.
  • Bench-script decoupling: added a warning comment at the top of bench-sync.ts noting that any symbol it references must also exist on master. If a future PR renames a span or removes a service export, the workflow needs a matching update.
  • Starts at 5 runs × 1 warmup per side (matches the local self-test). Can be dialed up once the scenario grows.
  • First run is the validation — no way to unit-test a GitHub Actions workflow. Watch the first PR-triggered run and expect to iterate.
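For readers unfamiliar with the "5 runs × 1 warmup" convention, here is a minimal sketch of what it means. `measure` is a hypothetical helper, not the harness from #396, and it is synchronous for brevity where a real harness would await the scenario.

```typescript
// Hypothetical helper: execute `scenario` (warmup + runs) times and keep
// only the timings of the measured runs. The clock is injectable so the
// function can be exercised deterministically.
function measure(
  scenario: () => void,
  runs = 5,
  warmup = 1,
  now: () => number = Date.now,
): number[] {
  const timings: number[] = [];
  for (let i = 0; i < warmup + runs; i++) {
    const start = now();
    scenario();
    const elapsed = now() - start;
    // Warmup iterations still execute but their timings are discarded.
    if (i >= warmup) timings.push(elapsed);
  }
  return timings;
}
```

With the defaults, each side executes the scenario six times and reports five samples, so dialing up `runs` is the only change needed once the scenario grows.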

Checklist

  • If you are requesting a merge into master, confirm this code is production-ready and can be included in future releases as soon as it gets merged
  • Make sure either the unit tests and/or the QA tests are capable of testing the new features (N/A — workflow is self-validating on first run)
  • Make sure you do not include new dependencies in the project unless strictly necessary and do not include dev-dependencies as production ones. (no new deps)

🤖 Generated with Claude Code

Third piece of the benchmarking infrastructure. On PRs that touch
packages/daemon, runs the bench against both master and the PR branch
in parallel, then posts (or updates) a sticky PR comment with the
comparator output.

Key design choices (rationale in #396 review thread):
- Matrix strategy — two runners in parallel, each with its own MySQL +
  simulator containers, no cross-run state bleed.
- Bench scripts from PR head are overlaid onto the baseline checkout
  (via refs/pull/N/head so fork PRs work too) so the measurement tool
  stays constant across the comparison and baseline works even before
  the harness has landed on master.
- No exit gating. continue-on-error on every bench step — CI runner
  variance is too high for a hard threshold to mean anything at the run
  counts we can afford.
- Report also emitted to the job summary, so fork PRs (where
  GITHUB_TOKEN is read-only) still surface results.
- concurrency.cancel-in-progress: true to avoid stacking stale runs
  when a PR is pushed to repeatedly.
- Starts at 5 runs × 1 warmup per side; dial up once the scenario
  grows past its current 66-event ceiling.

Also adds a warning comment at the top of bench-sync.ts flagging the
overlay constraint: any symbol this script references must also exist
on master.

Depends on #396 (harness) and #397 (comparator). Targets #397 so the
three PRs can be reviewed as a stack.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 17, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@andreabadesso andreabadesso self-assigned this Apr 17, 2026
@andreabadesso andreabadesso added the enhancement label Apr 17, 2026
@andreabadesso andreabadesso moved this from Todo to In Progress (WIP) in Hathor Network Apr 17, 2026
@github-actions

➤ YN0000: · Yarn 4.7.0
➤ YN0000: ┌ Resolution step
::group::Resolution step
➤ YN0085: │ + ts-node@npm:10.9.2, @cspotcode/source-map-support@npm:0.8.1, @jridgewell/resolve-uri@npm:3.1.2, @jridgewell/sourcemap-codec@npm:1.5.5, and 13 more.
::endgroup::
➤ YN0000: └ Completed
➤ YN0000: ┌ Post-resolution validation
::group::Post-resolution validation
➤ YN0002: │ root-workspace-0b6124@workspace:. doesn't provide @types/node (p3fc06), requested by ts-node.
➤ YN0002: │ root-workspace-0b6124@workspace:. doesn't provide typescript (p98e8b), requested by ts-node.
➤ YN0086: │ Some peer dependencies are incorrectly met by your project; run yarn explain peer-requirements <hash> for details, where <hash> is the six-letter p-prefixed code.
::endgroup::
➤ YN0000: └ Completed
➤ YN0000: ┌ Fetch step
::group::Fetch step
➤ YN0013: │ 6 packages were added to the project (+ 410.36 KiB).
::endgroup::
➤ YN0000: └ Completed in 0s 256ms
➤ YN0000: ┌ Link step
::group::Link step
::endgroup::
➤ YN0000: └ Completed in 1s 941ms
➤ YN0000: · Done with warnings in 2s 401ms

Sync benchmark comparison

Scenario: VOIDED_TOKEN_AUTHORITY (66 events)
Runs: baseline=5 (label: baseline), candidate=5 (label: candidate), warmup=1/1
Bootstrap samples: 10000, seed: 42
Verdict: 🟢 8 improvements · 🔴 0 regressions · ⚪ 11 noise · ⚠️ 0 skipped

| metric | baseline p50 (ms) | candidate p50 (ms) | Δ | 95% CI | verdict |
|---|---|---|---|---|---|
| totalMs | 876.3 | 776.8 | -11.4% | [-18.2%, -2.1%] | 🟢 |
| addOrUpdateTx | 9.391 | 8.690 | -7.5% | [-17.2%, -2.1%] | 🟢 |
| addUtxos | 9.203 | 7.735 | -16.0% | [-20.8%, -9.7%] | 🟢 |
| clearTxProposalForVoidedTx | 0.368 | 0.328 | -10.9% | [-26.9%, -2.8%] | 🟢 |
| getAddressWalletInfo | 9.688 | 8.465 | -12.6% | [-18.9%, -3.8%] | 🟢 |
| getTransactionById | 22.64 | 21.96 | -3.0% | [-12.2%, +1.1%] | ⚪ |
| getTxOutputsFromTx | 1.598 | 1.604 | +0.4% | [-27.6%, +21.2%] | ⚪ |
| handleTxFirstBlock | 11.91 | 12.47 | +4.7% | [-15.8%, +19.8%] | ⚪ |
| handleVertexAccepted | 196.2 | 187.0 | -4.7% | [-11.5%, +4.1%] | ⚪ |
| handleVoidedTx | 35.51 | 34.23 | -3.6% | [-34.4%, +5.0%] | ⚪ |
| markUtxosAsVoided | 1.469 | 1.302 | -11.4% | [-28.2%, +4.0%] | ⚪ |
| metadataDiff | 58.02 | 56.70 | -2.3% | [-15.0%, +8.3%] | ⚪ |
| unspendInputs | 0.769 | 0.749 | -2.6% | [-18.9%, +9.8%] | ⚪ |
| updateAddressTablesWithTx | 30.66 | 26.95 | -12.1% | [-19.5%, -6.2%] | 🟢 |
| updateTxOutputSpentBy | 1.359 | 1.210 | -11.0% | [-19.0%, -0.9%] | 🟢 |
| updateWalletTablesWithTx | 0.278 | 0.194 | -30.2% | [-35.4%, -20.1%] | 🟢 |
| voidAddressTransaction | 9.944 | 9.072 | -8.8% | [-21.9%, +4.3%] | ⚪ |
| voidTransaction | 1.412 | 1.381 | -2.2% | [-19.1%, +11.3%] | ⚪ |
| voidWalletTransaction | 1.420 | 1.323 | -6.8% | [-17.1%, +1.6%] | ⚪ |

🟢/🔴 mean the 95% CI is fully on one side of 0. ⚪ means the CI crosses 0 — the difference is indistinguishable from noise at this run count. This report is informational only; CI runner variance makes hard gates unreliable at the run counts we can afford in CI.
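To make the legend concrete, here is a rough sketch of how a Δ, a bootstrap 95% CI, and a verdict of this kind could be computed. It assumes a plain percentile bootstrap over the p50 and a mulberry32 PRNG for seeding; the comparator in #397 may use a different resampling scheme or generator, and every function name below is hypothetical.

```typescript
type Verdict = "improvement" | "regression" | "noise";

// Median (p50) of a non-empty sample.
function median(xs: number[]): number {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Relative change of the candidate p50 versus the baseline p50, in percent.
function percentDelta(baseline: number[], candidate: number[]): number {
  const b = median(baseline);
  return ((median(candidate) - b) / b) * 100;
}

// Small seeded PRNG (mulberry32), an assumption chosen for reproducibility.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draw a same-size sample with replacement.
function resample(xs: number[], rand: () => number): number[] {
  return xs.map(() => xs[Math.floor(rand() * xs.length)]);
}

// Percentile bootstrap: 2.5th and 97.5th percentiles of the resampled deltas.
function bootstrapCi(
  baseline: number[],
  candidate: number[],
  samples = 10000,
  seed = 42,
): [number, number] {
  const rand = mulberry32(seed);
  const deltas: number[] = [];
  for (let i = 0; i < samples; i++) {
    deltas.push(percentDelta(resample(baseline, rand), resample(candidate, rand)));
  }
  deltas.sort((a, b) => a - b);
  const lo = deltas[Math.floor(0.025 * (samples - 1))];
  const hi = deltas[Math.ceil(0.975 * (samples - 1))];
  return [lo, hi];
}

// Timings are lower-is-better: a CI entirely below 0 is an improvement,
// entirely above 0 is a regression, and anything crossing 0 is noise.
function classify(ciLow: number, ciHigh: number): Verdict {
  if (ciHigh < 0) return "improvement";
  if (ciLow > 0) return "regression";
  return "noise";
}
```

Applied to the report above, `classify(-18.2, -2.1)` yields "improvement" (the totalMs row) and `classify(-12.2, 1.1)` yields "noise" (the getTransactionById row). With only five samples per side the resampled medians take few distinct values, which is one reason the intervals are this wide.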
