perf(daemon): add CI workflow for sync bench comparison by andreabadesso · Pull Request #398 · HathorNetwork/hathor-wallet-service

andreabadesso · 2026-04-17T16:45:42Z

Motivation

Third and final piece of the sync-benchmarking infrastructure (follows #396 and #397). On every PR that touches packages/daemon, this workflow runs the bench against both master and the PR branch in parallel, then posts a sticky PR comment with the comparator output.

Informational only. No exit gating, no red X. CI runner variance is too high for a hard threshold to be useful at the run counts we can afford.

Stacked on #397 — targets feat/daemon-bench-comparator. When the stack merges, the final result is the full comparator + workflow + harness landing on master together.

Architecture

                    ┌─────────────────────┐
   PR trigger ──┬──►│ bench (baseline)    ├─► artifact: bench-baseline.json
                │   └─────────────────────┘                              │
                │                                                         ▼
                │   ┌─────────────────────┐                     ┌──────────────┐
                └──►│ bench (candidate)   ├─► artifact ────────►│   compare    │
                    └─────────────────────┘     bench-candidate │  (comparator │
                                                                │   + comment) │
                                                                └──────────────┘

Matrix strategy: baseline (master) and candidate (PR branch) run in parallel, each with its own MySQL + simulator containers — no cross-run state bleed
PR's bench scripts are overlaid onto the baseline checkout via refs/pull/N/head (works for same-repo and fork PRs), so the measurement tool stays constant and baseline works before the harness has landed on master
Report is also emitted to the workflow job summary, so fork PRs (where GITHUB_TOKEN is read-only and gh pr comment would 403) still surface results
Sticky PR comment, deduped by a marker in the comment body
concurrency.cancel-in-progress: true — pushing new commits cancels in-flight bench runs
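
The sticky-comment behaviour described above can be sketched as a pure decision function. This is a minimal sketch, not the workflow's actual implementation: the marker string, the `IssueComment` shape, and `planStickyComment` are all hypothetical names introduced for illustration, and the real marker used by the workflow is not shown in this PR description.

```typescript
// Hypothetical shape of a PR comment as returned by an issue-comments listing.
interface IssueComment {
  id: number;
  body: string;
}

type StickyAction =
  | { kind: 'create'; body: string }
  | { kind: 'update'; commentId: number; body: string };

// Hypothetical marker; the workflow's real marker is an assumption here.
const STICKY_MARKER = '<!-- daemon-bench-report -->';

// Decide whether to create a new comment or update the previous report.
// The marker is prepended to the body so the next run can find the comment.
function planStickyComment(
  existing: IssueComment[],
  report: string,
  marker: string = STICKY_MARKER,
): StickyAction {
  const body = `${marker}\n${report}`;
  const previous = existing.find((comment) => comment.body.includes(marker));
  if (previous) {
    return { kind: 'update', commentId: previous.id, body };
  }
  return { kind: 'create', body };
}
```

Keeping the decision separate from the API call means the dedup logic can be exercised locally, while the actual create or update request stays in the workflow step.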

Acceptance Criteria

On any PR touching packages/daemon/**, the daemon-bench workflow runs
Workflow runs master's production code + PR's production code in separate matrix jobs
PR comment is posted (or updated) with the markdown comparison
Workflow never fails the PR, regardless of bench results
First run against this very PR (once merged in the stack) validates the full pipeline

Known constraints

66-event scenario ceiling (same as #396 and #397): the first real runs will mostly show ⚪ noise. Dial up scenario size before expecting actionable signal.
Bench-script decoupling: added a warning comment at the top of bench-sync.ts noting that any symbol it references must also exist on master. If a future PR renames a span or removes a service export, the workflow needs a matching update.
Starts at 5 runs × 1 warmup per side (matches the local self-test). Can be dialed up once the scenario grows.
First run is the validation — no way to unit-test a GitHub Actions workflow. Watch the first PR-triggered run and expect to iterate.
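
To make the "5 runs × 1 warmup per side" setting concrete, here is a minimal sketch of how warmup runs can be discarded before the measured runs are summarised as a p50. The names `collectSamples` and `median` are hypothetical and are not taken from bench-sync.ts; `runOnce` stands in for whatever executes one full scenario and returns its duration in milliseconds.

```typescript
// Run `warmup` unmeasured iterations, then keep `runs` measured durations.
// `runOnce` is a hypothetical callback returning one run's duration in ms.
function collectSamples(runOnce: () => number, runs: number, warmup: number): number[] {
  for (let i = 0; i < warmup; i++) {
    runOnce(); // result intentionally discarded
  }
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    samples.push(runOnce());
  }
  return samples;
}

// Median (p50) of a non-empty sample.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error('median of an empty sample is undefined');
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  if (sorted.length % 2 === 1) {
    return sorted[mid] as number;
  }
  return ((sorted[mid - 1] as number) + (sorted[mid] as number)) / 2;
}
```

With 5 measured runs the p50 is simply the third-smallest duration, which is one reason small run counts give wide confidence intervals.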

Checklist

If you are requesting a merge into master, confirm this code is production-ready and can be included in future releases as soon as it gets merged
Make sure either the unit tests and/or the QA tests are capable of testing the new features (N/A — workflow is self-validating on first run)
Make sure you do not include new dependencies in the project unless strictly necessary and do not include dev-dependencies as production ones. (no new deps)

🤖 Generated with Claude Code

Third piece of the benchmarking infrastructure. On PRs that touch packages/daemon, runs the bench against both master and the PR branch in parallel, then posts (or updates) a sticky PR comment with the comparator output.

Key design choices (rationale in #396 review thread):

- Matrix strategy: two runners in parallel, each with its own MySQL + simulator containers, no cross-run state bleed.
- Bench scripts from PR head are overlaid onto the baseline checkout (via refs/pull/N/head so fork PRs work too) so the measurement tool stays constant across the comparison and baseline works even before the harness has landed on master.
- No exit gating. continue-on-error on every bench step: CI runner variance is too high for a hard threshold to mean anything at the run counts we can afford.
- Report also emitted to the job summary, so fork PRs (where GITHUB_TOKEN is read-only) still surface results.
- concurrency.cancel-in-progress: true to avoid stacking stale runs when a PR is pushed to repeatedly.
- Starts at 5 runs × 1 warmup per side; dial up once the scenario grows past its current 66-event ceiling.

Also adds a warning comment at the top of bench-sync.ts flagging the overlay constraint: any symbol this script references must also exist on master.

Depends on #396 (harness) and #397 (comparator). Targets #397 so the three PRs can be reviewed as a stack.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-17T16:45:50Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5bdb7932-b600-46c1-bfff-9342819b9468


github-actions · 2026-04-17T16:54:21Z

➤ YN0000: · Yarn 4.7.0
➤ YN0000: ┌ Resolution step
::group::Resolution step
➤ YN0085: │ + ts-node@npm:10.9.2, @cspotcode/source-map-support@npm:0.8.1, @jridgewell/resolve-uri@npm:3.1.2, @jridgewell/sourcemap-codec@npm:1.5.5, and 13 more.
::endgroup::
➤ YN0000: └ Completed
➤ YN0000: ┌ Post-resolution validation
::group::Post-resolution validation
➤ YN0002: │ root-workspace-0b6124@workspace:. doesn't provide @types/node (p3fc06), requested by ts-node.
➤ YN0002: │ root-workspace-0b6124@workspace:. doesn't provide typescript (p98e8b), requested by ts-node.
➤ YN0086: │ Some peer dependencies are incorrectly met by your project; run yarn explain peer-requirements <hash> for details, where <hash> is the six-letter p-prefixed code.
::endgroup::
➤ YN0000: └ Completed
➤ YN0000: ┌ Fetch step
::group::Fetch step
➤ YN0013: │ 6 packages were added to the project (+ 410.36 KiB).
::endgroup::
➤ YN0000: └ Completed in 0s 256ms
➤ YN0000: ┌ Link step
::group::Link step
::endgroup::
➤ YN0000: └ Completed in 1s 941ms
➤ YN0000: · Done with warnings in 2s 401ms

Sync benchmark comparison

Scenario: VOIDED_TOKEN_AUTHORITY (66 events)
Runs: baseline=5 (label: baseline), candidate=5 (label: candidate), warmup=1/1
Bootstrap samples: 10000, seed: 42
Verdict: 🟢 8 improvements · 🔴 0 regressions · ⚪ 11 noise · ⚠️ 0 skipped

| metric | baseline p50 (ms) | candidate p50 (ms) | Δ | 95% CI | verdict |
| --- | --- | --- | --- | --- | --- |
| totalMs | 876.3 | 776.8 | -11.4% | [-18.2%, -2.1%] | 🟢 |
| addOrUpdateTx | 9.391 | 8.690 | -7.5% | [-17.2%, -2.1%] | 🟢 |
| addUtxos | 9.203 | 7.735 | -16.0% | [-20.8%, -9.7%] | 🟢 |
| clearTxProposalForVoidedTx | 0.368 | 0.328 | -10.9% | [-26.9%, -2.8%] | 🟢 |
| getAddressWalletInfo | 9.688 | 8.465 | -12.6% | [-18.9%, -3.8%] | 🟢 |
| getTransactionById | 22.64 | 21.96 | -3.0% | [-12.2%, +1.1%] | ⚪ |
| getTxOutputsFromTx | 1.598 | 1.604 | +0.4% | [-27.6%, +21.2%] | ⚪ |
| handleTxFirstBlock | 11.91 | 12.47 | +4.7% | [-15.8%, +19.8%] | ⚪ |
| handleVertexAccepted | 196.2 | 187.0 | -4.7% | [-11.5%, +4.1%] | ⚪ |
| handleVoidedTx | 35.51 | 34.23 | -3.6% | [-34.4%, +5.0%] | ⚪ |
| markUtxosAsVoided | 1.469 | 1.302 | -11.4% | [-28.2%, +4.0%] | ⚪ |
| metadataDiff | 58.02 | 56.70 | -2.3% | [-15.0%, +8.3%] | ⚪ |
| unspendInputs | 0.769 | 0.749 | -2.6% | [-18.9%, +9.8%] | ⚪ |
| updateAddressTablesWithTx | 30.66 | 26.95 | -12.1% | [-19.5%, -6.2%] | 🟢 |
| updateTxOutputSpentBy | 1.359 | 1.210 | -11.0% | [-19.0%, -0.9%] | 🟢 |
| updateWalletTablesWithTx | 0.278 | 0.194 | -30.2% | [-35.4%, -20.1%] | 🟢 |
| voidAddressTransaction | 9.944 | 9.072 | -8.8% | [-21.9%, +4.3%] | ⚪ |
| voidTransaction | 1.412 | 1.381 | -2.2% | [-19.1%, +11.3%] | ⚪ |
| voidWalletTransaction | 1.420 | 1.323 | -6.8% | [-17.1%, +1.6%] | ⚪ |

🟢/🔴 mean the 95% CI is fully on one side of 0. ⚪ means the CI crosses 0 — the difference is indistinguishable from noise at this run count. This report is informational only; CI runner variance makes hard gates unreliable at the run counts we can afford in CI.
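
The verdict rule in the legend can be illustrated with a small sketch. The report states only that it uses 10000 bootstrap samples with seed 42 and compares p50 values; the percentile-bootstrap method, the `mulberry32` generator, and every function name below are assumptions chosen for illustration, not the comparator's actual code.

```typescript
type Verdict = 'improvement' | 'regression' | 'noise';

// Small seeded PRNG (mulberry32) so repeated runs give the same interval.
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Median of a non-empty sample.
function p50(values: number[]): number {
  const sorted = [...values].sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? (sorted[mid] as number)
    : ((sorted[mid - 1] as number) + (sorted[mid] as number)) / 2;
}

// Draw a same-size sample with replacement.
function resample(values: number[], rand: () => number): number[] {
  return values.map(() => values[Math.floor(rand() * values.length)] as number);
}

// Percentile-bootstrap 95% CI for the relative change in p50, in percent.
// Assumes positive timings, so the baseline median is never zero.
function bootstrapDeltaCi(
  baseline: number[],
  candidate: number[],
  resamples = 10000,
  seed = 42,
): { lowPct: number; highPct: number } {
  const rand = mulberry32(seed);
  const deltas: number[] = [];
  for (let i = 0; i < resamples; i++) {
    const b = p50(resample(baseline, rand));
    const c = p50(resample(candidate, rand));
    deltas.push(((c - b) / b) * 100);
  }
  deltas.sort((x, y) => x - y);
  const at = (q: number): number => deltas[Math.floor(q * (deltas.length - 1))] as number;
  return { lowPct: at(0.025), highPct: at(0.975) };
}

// 🟢 if the whole CI is below 0 (faster), 🔴 if above 0, ⚪ if it crosses 0.
function classify(lowPct: number, highPct: number): Verdict {
  if (highPct < 0) return 'improvement';
  if (lowPct > 0) return 'regression';
  return 'noise';
}
```

Applied to the table above, `classify(-18.2, -2.1)` corresponds to the 🟢 on totalMs, while `classify(-12.2, 1.1)` corresponds to the ⚪ on getTransactionById.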

andreabadesso self-assigned this Apr 17, 2026

andreabadesso added the enhancement New feature or request label Apr 17, 2026

andreabadesso added this to Hathor Network Apr 17, 2026

github-project-automation Bot moved this to Todo in Hathor Network Apr 17, 2026

andreabadesso moved this from Todo to In Progress (WIP) in Hathor Network Apr 17, 2026

andreabadesso mentioned this pull request Apr 17, 2026

perf(daemon): release pooled connections instead of destroying them #399

Merged

3 tasks

andreabadesso mentioned this pull request Apr 20, 2026

perf(daemon): per-event sync optimizations identified from Tempo traces #395

Open

andreabadesso wants to merge 1 commit into feat/daemon-bench-comparator from feat/daemon-bench-ci
