perf(stages): avoid v2 history count pre-pass #23295

Closed
yongkangc wants to merge 4 commits into main from codex/history-repro-20260331-035555

@yongkangc yongkangc commented Mar 31, 2026

storage_v2 history indexing was paying a near-fixed per-cycle cost even when only a small live tail needed processing. Both history stages called account_changeset_count() / storage_changeset_count() only to format progress logs, and those helpers scanned .csoff metadata across finalized static files.

This removes that pre-pass from the live-sync path by logging progress against the requested block range instead. It also makes the count helpers themselves cheaper by reading only the last committed .csoff record per static file, so remaining callers no longer materialize every offset record.
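To illustrate the helper change, here is a minimal sketch of counting from the last record instead of scanning every record. It is not the reth implementation: the 16-byte record layout (`offset`, `num_changes`, both little-endian `u64`), the function names, and the rule that a trailing partial record counts as uncommitted are all assumptions made for the example.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Hypothetical record layout: `offset: u64` then `num_changes: u64`, little-endian.
const RECORD_SIZE: usize = 16;

/// Decodes a little-endian u64 from the first 8 bytes of `bytes`.
fn read_u64_le(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(buf)
}

/// Helper for the example: serializes `(offset, num_changes)` pairs.
fn encode_records(records: &[(u64, u64)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(records.len() * RECORD_SIZE);
    for (offset, num_changes) in records {
        out.extend_from_slice(&offset.to_le_bytes());
        out.extend_from_slice(&num_changes.to_le_bytes());
    }
    out
}

/// Slow path: loads the whole file and sums `num_changes` over every record.
fn count_by_full_scan(path: &Path) -> io::Result<u64> {
    let mut bytes = Vec::new();
    File::open(path)?.read_to_end(&mut bytes)?;
    Ok(bytes
        .chunks_exact(RECORD_SIZE)
        .map(|rec| read_u64_le(&rec[8..]))
        .sum())
}

/// Fast path: seeks to the last complete record and returns
/// `offset + num_changes`, which equals the running total when
/// offsets are cumulative (an assumption of this sketch).
fn count_from_last_record(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let complete = file.metadata()?.len() / RECORD_SIZE as u64;
    if complete == 0 {
        return Ok(0);
    }
    // Ignore any trailing partial record (treated here as not committed).
    file.seek(SeekFrom::Start((complete - 1) * RECORD_SIZE as u64))?;
    let mut rec = [0u8; RECORD_SIZE];
    file.read_exact(&mut rec)?;
    Ok(read_u64_le(&rec[..8]) + read_u64_le(&rec[8..]))
}
```

With records `(0, 3)`, `(3, 0)`, `(3, 5)` both functions return 8, but the fast path reads 16 bytes per file regardless of how many blocks the file covers.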

Repro gist: https://gist.github.com/yongkangc/06d1da54a32b4a1818c9432483fef7a6

Repro setup:

  • cargo test -p reth-stages repro_collect_bottleneck_breakdown -- --ignored --nocapture
  • 200k historical blocks
  • 1k blocks per static file
  • 36-block live tail

| Metric | Before | After stage fix | After combined fix |
| --- | --- | --- | --- |
| Account stage | 35.2ms | 1.5ms | 1.6ms |
| Storage stage | 35.6ms | 2.7ms | 2.4ms |
| Account count probe | 33.4ms | 32.9ms | 2.7ms |
| Storage count probe | 35.8ms | 32.7ms | 2.6ms |
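
For the stage-side change, progress measured against the requested block range could look roughly like the sketch below. The function names and the percentage format are hypothetical and are not taken from the stage code; the point is only that nothing here needs a changeset total.

```rust
use std::ops::RangeInclusive;

/// Percent of `range` covered once `current_block` has been processed.
/// Needs only block numbers, so no changeset count pre-pass is required.
fn range_progress_percent(range: &RangeInclusive<u64>, current_block: u64) -> f64 {
    let (start, end) = (*range.start(), *range.end());
    if end < start {
        // Empty range: nothing to do, report it as complete.
        return 100.0;
    }
    if current_block < start {
        return 0.0;
    }
    // Work in f64 so a full-width u64 range cannot overflow.
    let total = (end - start) as f64 + 1.0;
    let done = (current_block.min(end) - start) as f64 + 1.0;
    done / total * 100.0
}

/// Formats the progress value for a log line, e.g. "50.00%".
fn format_progress(range: &RangeInclusive<u64>, current_block: u64) -> String {
    format!("{:.2}%", range_progress_percent(range, current_block))
}
```

For a 36-block tail such as `200_000..=200_035`, processing through block `200_017` gives `50.00%`.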

Prompted by: yongkangc

Co-authored-by: YK <46377366+yongkangc@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d420d-4530-756b-856b-e3c3b0d8522b

github-actions Bot commented Mar 31, 2026

⚠️ Changelog not found.

A changelog entry is required before merging. We've generated a suggested changelog based on your changes:

Preview
---
reth-stages: patch
reth-provider: patch
---

Fixed changeset count calculation to use last offset entry instead of summing all offsets, and improved history index progress reporting to track blocks in the active range rather than total changesets. Added tests for account and storage changeset count correctness.

Add changelog to commit this to your branch.

YK and others added 3 commits March 31, 2026 04:29
Co-authored-by: YK <46377366+yongkangc@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d420d-4530-756b-856b-e3c3b0d8522b
@yongkangc
Contributor Author

Closing this in favor of #23296, which addresses the same bottleneck with a smaller change. It makes changeset_count() O(files) instead of O(blocks), preserves the existing progress accounting, and fixes the hot path without changing the stage collection logic.

@yongkangc yongkangc closed this Mar 31, 2026
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Reth Tracker Mar 31, 2026
@emmajam emmajam deleted the codex/history-repro-20260331-035555 branch May 1, 2026 10:07