pruner: fall back to disk snapshot root when journal is missing #304

Merged
curryxbo merged 1 commit into fix/snapshot-zk-mpt-compat from fix/pruner-fallback-missing-journal
Mar 20, 2026

@curryxbo curryxbo commented Mar 20, 2026

Background

When node data is copied from a running node, or the node shuts down uncleanly,
the snapshot journal is not written and the in-memory diff layers are lost.
`snapshot.New(headBlock.Root())` then fails because the on-disk snapshot root
lags behind the chain head by up to 128 blocks. As a result, `prune-state`
errors out and the operator must run the node before pruning.

Fix

When `snapshot.New(headBlock.Root())` fails in `NewPruner`, fall back to
`rawdb.ReadSnapshotRoot()` and retry. The pruning target becomes the disk
snapshot root (at most 128 blocks behind the chain head). The node will
re-execute those blocks automatically on the next restart.
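
The retry logic described above can be sketched roughly as follows. This is a dependency-free illustration, not the code from this PR: `Hash`, `snapshotOpener`, `errMissingJournal`, `diskOnly`, and `openWithFallback` are hypothetical stand-ins, with `snapshotOpener` playing the role of `snapshot.New` and the `diskRoot` argument playing the role of the value returned by `rawdb.ReadSnapshotRoot()`.

```go
package main

import (
	"errors"
	"fmt"
)

// Hash is a hypothetical stand-in for a state root hash.
type Hash string

// errMissingJournal is a hypothetical error for "the requested root cannot be loaded".
var errMissingJournal = errors.New("snapshot journal missing or root not found")

// snapshotOpener is a hypothetical stand-in for snapshot.New: it returns the
// root it opened, or an error if no layer matches the requested root.
type snapshotOpener func(root Hash) (Hash, error)

// openWithFallback tries the chain head root first and, if that fails,
// retries once with the persisted disk snapshot root.
func openWithFallback(open snapshotOpener, headRoot, diskRoot Hash) (Hash, error) {
	target, err := open(headRoot)
	if err == nil {
		return target, nil
	}
	if diskRoot == "" {
		return "", err // no disk root persisted, so surface the original error
	}
	target, retryErr := open(diskRoot)
	if retryErr != nil {
		return "", fmt.Errorf("fallback to disk root failed: %w (head root error: %v)", retryErr, err)
	}
	return target, nil
}

// diskOnly simulates a database copied without a journal: only the disk
// layer root can be opened, every other root fails.
func diskOnly(diskRoot Hash) snapshotOpener {
	return func(root Hash) (Hash, error) {
		if root != diskRoot {
			return "", errMissingJournal
		}
		return root, nil
	}
}

func main() {
	target, err := openWithFallback(diskOnly("0xdisk"), "0xhead", "0xdisk")
	fmt.Println(target, err)
}
```

In this sketch the pruning target silently becomes the disk root when the head root cannot be opened, which mirrors the behaviour described in the Fix section. A real implementation would probably also log that the fallback was taken.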

Trade-off

  • Node data can be pruned immediately after copying; there is no need to run the node first
  • Pruning target is at most 128 blocks behind the chain head; the impact is negligible and is recovered automatically on restart

Relation to #303

This PR is an optional enhancement on top of #303 (`fix/snapshot-zk-mpt-compat`),
kept as a separate PR for independent review and merge consideration.

🤖 Generated with Claude Code

When data is copied from a running node or after an unclean shutdown,
the snapshot journal is not written and in-memory diff layers are lost.
snapshot.New(headBlock.Root()) fails because the on-disk snapshot root
lags behind the chain head by up to 128 blocks.

Retry with the persisted disk snapshot root so prune-state can proceed
without requiring the operator to run the node first. The pruning target
will be at most 128 blocks behind the chain head; the node will
re-execute those blocks on the next restart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@curryxbo curryxbo requested a review from a team as a code owner March 20, 2026 02:03
@curryxbo curryxbo requested review from r3aker86 and removed request for a team March 20, 2026 02:03
@curryxbo curryxbo merged commit 7511887 into fix/snapshot-zk-mpt-compat Mar 20, 2026
1 check passed
@curryxbo curryxbo deleted the fix/pruner-fallback-missing-journal branch March 20, 2026 14:17
curryxbo pushed a commit that referenced this pull request Mar 24, 2026
After #304 merged, NewPruner can succeed via the disk-root fallback and
create a bloom filter. If prune-state is then interrupted, RecoverPruning
is triggered on the next run. Three issues needed fixing:

1. snapshot.New fallback: same as NewPruner — when the journal is missing,
   retry with ReadSnapshotRoot() so recovery can initialise the snapshot tree.

2. Snapshots() translation: headBlock.Root() may be a zkStateRoot; translate
   to mptStateRoot before the lookup so diff layers (keyed by mptRoot) can
   be found.

3. DiskRoot check: when the original prune used DiskRoot() as target (our
   fallback path), stateBloomRoot equals the disk layer root. Snapshots()
   with nodisk=true excludes the disk layer, so check snaptree.DiskRoot()
   explicitly to avoid a spurious "non-existent target state" error.
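
Points 2 and 3 above can be illustrated with a small, dependency-free sketch. It is not the code from the referenced commit: `Hash`, `translateRoot`, and `targetExists` are hypothetical names, the `zkToMpt` map stands in for whatever mechanism translates a zkStateRoot to an mptStateRoot, and `diffRoots` stands in for the roots of the layers returned by `Snapshots()` when the disk layer is excluded.

```go
package main

import "fmt"

// Hash is a hypothetical stand-in for a state root hash.
type Hash string

// translateRoot maps a zk state root to its MPT counterpart when a mapping
// is known, and otherwise returns the root unchanged.
func translateRoot(root Hash, zkToMpt map[Hash]Hash) Hash {
	if mpt, ok := zkToMpt[root]; ok {
		return mpt
	}
	return root
}

// targetExists reports whether bloomRoot is a usable recovery target.
// diffRoots excludes the disk layer, so the disk root is checked explicitly;
// without that check a prune that targeted the disk root would look missing.
func targetExists(bloomRoot Hash, diffRoots []Hash, diskRoot Hash) bool {
	for _, r := range diffRoots {
		if r == bloomRoot {
			return true
		}
	}
	return bloomRoot == diskRoot
}

func main() {
	zkToMpt := map[Hash]Hash{"0xzkHead": "0xmptHead"}
	head := translateRoot("0xzkHead", zkToMpt)
	fmt.Println("lookup root:", head)

	// The interrupted prune used the disk root as its target (fallback path),
	// and no diff layers survived.
	fmt.Println("target found:", targetExists("0xdisk", nil, "0xdisk"))
}
```

The explicit comparison against the disk root is the part that corresponds to point 3; the loop over `diffRoots` alone would report the fallback target as non-existent.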

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>