
snapshot: resolve zkStateRoot to mptStateRoot for generation and loading#302

Closed
curryxbo wants to merge 9 commits into main from fix/snapshot-generate-disk-state-root

Conversation

Contributor

@curryxbo curryxbo commented Mar 18, 2026

Summary

  • generate.go: Resolve zkStateRoot → mptStateRoot via ReadDiskStateRoot at the entry of generateSnapshot, so WriteSnapshotRoot and trie.New both use the correct on-disk MPT root
  • journal.go: In loadSnapshot, accept ZK/MPT root mismatch when ReadDiskStateRoot(zkRoot) == snapshotRoot, avoiding unnecessary rebuilds
  • pruner.go: In the journal-missing fallback path, translate the persisted SnapshotRoot from zkStateRoot to mptStateRoot and persist the correction

Background

MPT nodes syncing ZK-era blocks store trie data under the locally-computed mptStateRoot (Keccak256), but the block header carries a zkStateRoot (Poseidon hash). The state system translates between these via ReadDiskStateRoot/WriteDiskStateRoot in OpenTrie and trie.Database.Commit.

However, the snapshot generator calls trie.New(root, triedb) directly — bypassing the OpenTrie translation layer. When root is a zkStateRoot, the trie lookup fails immediately with MissingNodeError. The generator goroutine then blocks forever on <-genAbort, and waitBuild() hangs the caller (pruner or blockchain startup).

This means:

  1. Snapshot flat data was never generated on MPT nodes running ZK-era blocks (accounts=0)
  2. The pruner hangs indefinitely when attempting to prune state
  3. After pruning (with fix), third-party nodes loading the snapshot would trigger an unnecessary rebuild

Test plan

  • Build passes: go build ./core/state/pruner/ ./core/state/snapshot/
  • Run geth snapshot prune-state on an MPT node with ZK-era blocks — verify snapshot generates from mptStateRoot (no "Trie missing" error)
  • Verify pruner completes successfully and produces a valid snapshot
  • Verify a fresh MPT node can load the pruned snapshot without triggering a rebuild (journal.go fix)
  • Verify no regression on post-Jade-fork blocks (where zkStateRoot == mptStateRoot, ReadDiskStateRoot returns error, root unchanged)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Fixed persisted vs on-disk snapshot root mismatches to prevent "missing trie node" errors, persist corrected roots, and accept validated snapshots without unnecessary rebuilds; recovery now logs discontinuities and proceeds.
    • Updated warning/help text for snapshot prune guidance.
  • Improvements

    • Startup recovery avoids unnecessary chain rewinds by matching on-disk state roots and applies smarter truncation rules when trimming ancients.
  • Tests

    • Added end-to-end tests covering root-translation, snapshot generate/load/rebuild, and lifecycle scenarios.

… and loading

MPT nodes syncing ZK-era blocks store trie data under the locally-computed
mptStateRoot (Keccak256), but the block header and snapshot system use the
zkStateRoot (Poseidon hash). The snapshot generator calls trie.New(root, triedb)
directly — bypassing the OpenTrie/ReadDiskStateRoot translation layer — so it
can never find MPT trie nodes when root is a zkStateRoot. This causes the
generator goroutine to block forever on genAbort, hanging the caller.

Fix three sites:

1. generate.go (generateSnapshot): resolve zkStateRoot → mptStateRoot via
   ReadDiskStateRoot before writing SnapshotRoot and starting the generator,
   so trie.New uses the correct on-disk root.

2. journal.go (loadSnapshot): when the on-disk snapshot root (mptStateRoot)
   differs from the requested root (zkStateRoot), check ReadDiskStateRoot
   to verify they correspond — accept the snapshot without a costly rebuild.

3. pruner.go (NewPruner): in the journal-missing fallback path, translate
   the persisted SnapshotRoot from zkStateRoot to mptStateRoot and update
   the DB so that subsequent snapshot.New calls load correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@curryxbo curryxbo requested a review from a team as a code owner March 18, 2026 14:13
@curryxbo curryxbo requested review from secmgt and removed request for a team March 18, 2026 14:13

coderabbitai Bot commented Mar 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Resolve persisted zkStateRoots to on-disk MPT roots across pruner, snapshot generation/load/rebuild, and genesis extraction; persist corrected snapshot roots; accept snapshots when MPT translation matches in-memory head; adjust startup head-recovery/truncation to compare mapped disk roots.

Changes

  • Pruner & genesis extraction (core/state/pruner/pruner.go): Read the persisted snapshot root and translate zkStateRoot→mptStateRoot via rawdb.ReadDiskStateRoot; persist the corrected root; resolve the genesis root before creating the secure trie; update warning text.
  • Snapshot generation & disk layer (core/state/snapshot/generate.go, core/state/snapshot/snapshot.go): Translate zkStateRoot→mptStateRoot before initializing the disk layer; ensure generated/rebuilt snapshots and internal layers use the translated on-disk root as map keys.
  • Snapshot load & journal validation (core/state/snapshot/journal.go, core/state/snapshot/snapshot.go): When the on-disk root differs from the requested root, accept the snapshot if the MPT translation matches the in-memory head; otherwise error (unless in recovery mode, then warn). Avoid unnecessary rebuilds by recognizing translated-root equivalence.
  • Snapshot compatibility tests (core/state/snapshot/zk_mpt_compat_test.go): New tests exercising zk↔mpt translation for generate, load, rebuild, layer map-key behavior, and recovery scenarios.
  • Blockchain head recovery & truncation (core/blockchain.go): Startup repair walks blocks, mapping each block Root() via rawdb.ReadDiskStateRoot before comparing to the snapshot disk root; updates full-block head markers without rewinding the header/fast chain; refines ancient truncation logic considering fast/full positions.
  • Manifest (go.mod): Minor manifest edits.

Sequence Diagram(s)

sequenceDiagram
    participant Pruner
    participant Generator
    participant RawDB
    participant InMemHead as InMem
    participant Blockchain

    Pruner->>RawDB: Read persisted snapDiskRoot
    RawDB-->>Pruner: snapDiskRoot (zk or mpt)
    Pruner->>RawDB: ReadDiskStateRoot(snapDiskRoot)
    alt mapping exists
        RawDB-->>Pruner: mptStateRoot
        Pruner->>RawDB: WriteSnapshotRoot(mptStateRoot)
    end
    Pruner->>Generator: init snapshot with translated root
    Generator->>RawDB: open disk layer at mptStateRoot
    Generator->>InMem: compare disk root ↔ in-memory head
    alt translation matches head
        InMem-->>Generator: accept snapshot (no rebuild)
    else mismatch & recovery==true
        Generator-->>InMem: warn and proceed (rebuild)
    else mismatch & recovery==false
        Generator-->>InMem: reject snapshot (error)
    end
    Blockchain->>RawDB: ReadDiskStateRoot(block.Root()) during recovery walk
    RawDB-->>Blockchain: mapped mpt root
    Blockchain->>Blockchain: compare mapped root ↔ snapshot disk root to find match

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Suggested labels

bug

Suggested reviewers

  • panos-xyz
  • twcctop

Poem

🐇 I chased a root beneath the loam,
Mapped zk to mpt and brought it home,
I nudged the layers, fixed the map,
Watched snapshots hum and closed my lap,
Now state sleeps safe beneath my dome 🥕

🚥 Pre-merge checks | ✅ 3 passed

  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The PR title clearly and specifically describes the main change: resolving zkStateRoot to mptStateRoot in snapshot generation and loading, which directly addresses the core problem in the changeset.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 92.86%, above the required 80.00% threshold.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d784990b-a845-4682-83d1-62b8a0d49c20

📥 Commits

Reviewing files that changed from the base of the PR and between b3c5552 and d15a37c.

📒 Files selected for processing (3)
  • core/state/pruner/pruner.go
  • core/state/snapshot/generate.go
  • core/state/snapshot/journal.go

Comment on lines +114 to +121
if mptRoot, err2 := rawdb.ReadDiskStateRoot(db, snapDiskRoot); err2 == nil {
	log.Info("Pruner: resolved snapshot ZK root to MPT root",
		"zkRoot", snapDiskRoot, "mptRoot", mptRoot)
	snapDiskRoot = mptRoot
	// Persist the corrected root so that subsequent snapshot.New
	// calls (inside loadSnapshot) can locate the disk layer.
	rawdb.WriteSnapshotRoot(db, snapDiskRoot)
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== ReadDiskStateRoot implementation =="
fd 'accessors_state.go$' --exec sed -n '95,125p' {}

echo
echo "== NewPruner fallback segment =="
fd 'pruner.go$' --exec sed -n '100,135p' {}

echo
echo "== All ReadDiskStateRoot callsites =="
rg -n -C2 '\bReadDiskStateRoot\s*\(' --type=go

Repository: morph-l2/go-ethereum

Length of output: 15914


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Search for ReadTrieNode implementation =="
rg -n 'func\s+ReadTrieNode' --type=go

echo
echo "== Check snapshot.New signature and early validation =="
fd 'snapshot.go$' -path '*/state/snapshot/*' --exec grep -n "func New" {}

echo
echo "== Check if snapshot.New validates root =="
fd 'snapshot.go$' -path '*/state/snapshot/*' --exec sed -n '1,100p' {} | head -60

Repository: morph-l2/go-ethereum

Length of output: 454


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== ReadTrieNode implementation =="
sed -n '79,95p' core/rawdb/accessors_state.go

echo
echo "== ReadDiskStateRoot full implementation =="
sed -n '107,115p' core/rawdb/accessors_state.go

echo
echo "== Snapshot.New signature and validation =="
fd 'snapshot.go' -path '*/state/snapshot/*' | head -1 | xargs sed -n '1,150p' | grep -A 20 "^func New"

echo
echo "== Check if snapshot.New validates root early =="
fd 'snapshot.go' -path '*/state/snapshot/*' | head -1 | xargs sed -n '1,300p' | tail -200

Repository: morph-l2/go-ethereum

Length of output: 1379


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Find snapshot.New function =="
rg -n 'func New\(' core/state/snapshot/ --type=go -A 10

echo
echo "== Check pruner fallback and snapshot loading context =="
sed -n '100,145p' core/state/pruner/pruner.go

Repository: morph-l2/go-ethereum

Length of output: 3162


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check if pruner code validates mptRoot before WriteSnapshotRoot =="
sed -n '110,130p' core/state/pruner/pruner.go

echo
echo "== Check other ReadDiskStateRoot callsites for validation patterns =="
rg -B2 -A5 'WriteSnapshotRoot|snapshot.New' core/state/snapshot/generate.go

echo
echo "== Check if mptRoot validation exists elsewhere in codebase =="
rg -n 'mptRoot.*Hash\{\}' --type=go

echo
echo "== Check snapshot journal handling =="
sed -n '160,180p' core/state/snapshot/journal.go

Repository: morph-l2/go-ethereum

Length of output: 2194


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check snapshot.New full implementation for root validation =="
sed -n '186,250p' core/state/snapshot/snapshot.go

echo
echo "== Check if snapshot.New or its callees validate root existence =="
rg -n 'func.*loadDiskLayer|func.*load' core/state/snapshot/snapshot.go | head -5

echo
echo "== Check snapshot/generate.go more context around WriteSnapshotRoot =="
sed -n '155,175p' core/state/snapshot/generate.go

echo
echo "== Look for any tests that trigger this pruner path =="
fd 'pruner.*test' --type=f

Repository: morph-l2/go-ethereum

Length of output: 2019


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Find loadSnapshot implementation =="
rg -n 'func loadSnapshot' core/state/snapshot/ --type=go -A 20 | head -40

echo
echo "== Check if loadSnapshot validates root =="
rg -n 'func.*loadDiskLayer' core/state/snapshot/ --type=go -A 15 | head -40

echo
echo "== Summary: Check all places where mptRoot is used after ReadDiskStateRoot =="
rg -B3 -A8 'ReadDiskStateRoot.*err.*== nil' core/state/pruner/pruner.go

Repository: morph-l2/go-ethereum

Length of output: 1779


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check WriteSnapshotRoot implementation =="
rg -n 'func WriteSnapshotRoot' core/rawdb/ --type=go -A 5

echo
echo "== Check if snapshot.New/loadSnapshot handles bad roots gracefully =="
sed -n '130,170p' core/state/snapshot/journal.go

echo
echo "== Examine what happens after snapshot.New fails in pruner =="
sed -n '125,145p' core/state/pruner/pruner.go

echo
echo "== Check if there's recovery logic for corrupted roots =="
rg -n 'WriteSnapshotRoot' core/state/snapshot/ --type=go -B 3 -A 3

Repository: morph-l2/go-ethereum

Length of output: 7832


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Full context of snapshot/generate.go root handling =="
sed -n '155,180p' core/state/snapshot/generate.go

echo
echo "== Verify if generate.go also has the same unvalidated persist issue =="
sed -n '162,175p' core/state/snapshot/generate.go

echo
echo "== Check if both pruner and generate paths have the same problem =="
echo "Pruner WriteSnapshotRoot location:"
grep -n 'WriteSnapshotRoot' core/state/pruner/pruner.go

echo
echo "Generate WriteSnapshotRoot location:"
grep -n 'WriteSnapshotRoot' core/state/snapshot/generate.go

Repository: morph-l2/go-ethereum

Length of output: 2068


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Confirm the exact line numbers in pruner.go =="
sed -n '107,125p' core/state/pruner/pruner.go | cat -n

echo
echo "== Confirm the exact pattern in generate.go also needs fixing =="
sed -n '157,180p' core/state/snapshot/generate.go | cat -n

Repository: morph-l2/go-ethereum

Length of output: 2543


Validate translated root before persisting to prevent corrupted snapshot state.

At lines 114–120, ReadDiskStateRoot returns the translated root without validating it. If the mapping is malformed or stale, line 120 persists an invalid snapshot root that causes all subsequent snapshot.New calls to fail on restart. The same vulnerability exists in core/state/snapshot/generate.go:162–173.

Add validation to confirm the translated root points to an actual trie node before persisting:

Suggested fix
if mptRoot, err2 := rawdb.ReadDiskStateRoot(db, snapDiskRoot); err2 == nil {
+	if mptRoot == (common.Hash{}) || len(rawdb.ReadTrieNode(db, mptRoot)) == 0 {
+		log.Warn("Pruner: translated MPT root is not usable, keeping persisted snapshot root",
+			"zkRoot", snapDiskRoot, "mptRoot", mptRoot)
+	} else {
		log.Info("Pruner: resolved snapshot ZK root to MPT root",
			"zkRoot", snapDiskRoot, "mptRoot", mptRoot)
		snapDiskRoot = mptRoot
		rawdb.WriteSnapshotRoot(db, snapDiskRoot)
+	}
}

corey and others added 2 commits March 18, 2026 22:39
generateSnapshot may translate zkStateRoot to mptStateRoot internally
(for MPT nodes syncing ZK-era blocks). Rebuild was still using the
original root argument as the layers map key, creating a mismatch:
key=zkRoot but diskLayer.root=mptRoot.

This caused diffToDisk's stale-layer cleanup to miss the disk layer's
children (children map is keyed by parent.Root()=mptRoot, but the
stale disk layer was keyed by zkRoot), leaving orphaned diffLayers
with a stale origin — triggering ErrSnapshotStale on subsequent reads.

Fix: capture the returned base and use base.root as the map key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test generateSnapshot root translation, loadSnapshot ZK/MPT mismatch
tolerance, and Rebuild map-key consistency to ensure prune-state
succeeds on MPT nodes syncing ZK-era blocks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/state/snapshot/zk_mpt_compat_test.go`:
- Around line 80-102: The test currently manually shuts down the snapshot
generator with stop := make(chan *generatorStats); snap.genAbort <- stop; <-stop
at the end, which is skipped on t.Fatal and leaks goroutines; instead register a
t.Cleanup handler immediately after creating snap (or after generateSnapshot)
that sends a stop channel into snap.genAbort and waits for the reply (i.e.
t.Cleanup(func(){ stop := make(chan *generatorStats); snap.genAbort <- stop;
<-stop })), and remove the manual shutdown code at the end of the test; apply
the same change to the other occurrences referenced (lines around the other
tests) so the generator is always stopped even on failures.
- Around line 40-56: The test helper buildMPTTrie currently ignores errors from
functions that do return them; update it to check and fail the test on errors
from trie.NewSecure, rlp.EncodeToBytes, accTrie.Commit and triedb.Commit (use
t.Fatalf or t.Fatal with the wrapped error) while leaving accTrie.Update
unchecked since it does not return an error. Specifically, after calling
trie.NewSecure(common.Hash{}, triedb) verify the returned error before
proceeding, after rlp.EncodeToBytes(acc) check its error and abort on failure,
check the error returned by accTrie.Commit(nil) and handle it, and check/handle
any error returned by triedb.Commit(mptRoot, false, nil).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a447f6da-6291-4d5d-9f6f-586476870a4b

📥 Commits

Reviewing files that changed from the base of the PR and between d0ec38b and fed227c.

📒 Files selected for processing (1)
  • core/state/snapshot/zk_mpt_compat_test.go

Comment on lines +40 to +56
func buildMPTTrie(t *testing.T, diskdb *memorydb.Database) (common.Hash, *trie.Database) {
t.Helper()
triedb := trie.NewDatabase(diskdb)
accTrie, _ := trie.NewSecure(common.Hash{}, triedb)
for _, key := range []string{"acc-1", "acc-2", "acc-3"} {
acc := &Account{
Balance: big.NewInt(1),
Root: emptyRoot.Bytes(),
KeccakCodeHash: emptyKeccakCode.Bytes(),
PoseidonCodeHash: emptyPoseidonCode.Bytes(),
}
val, _ := rlp.EncodeToBytes(acc)
accTrie.Update([]byte(key), val)
}
mptRoot, _, _ := accTrie.Commit(nil)
triedb.Commit(mptRoot, false, nil)
return mptRoot, triedb

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify trie/RLP APIs used here actually return errors that should be handled.
rg -nP --type go 'func\s+NewSecure\(' -C2
rg -nP --type go 'func\s+\(.*\)\s+Update\([^)]*\)\s+error' -C2
rg -nP --type go 'func\s+\(.*\)\s+Commit\([^)]*\)\s*\([^)]*error' -C2
rg -nP --type go 'func\s+EncodeToBytes\(' -C2

Repository: morph-l2/go-ethereum

Length of output: 4538


🏁 Script executed:

# Find SecureTrie.Update method definition
rg -nP --type go 'func\s+\(\w+\s+\*SecureTrie\)\s+Update' -C3

Repository: morph-l2/go-ethereum

Length of output: 485


🏁 Script executed:

# Find trie.Database.Commit method definition
rg -nP --type go 'func\s+\(\w+\s+\*Database\)\s+Commit' -C3

Repository: morph-l2/go-ethereum

Length of output: 665


🏁 Script executed:

# Read the specific lines of the test file to confirm context
sed -n '40,70p' core/state/snapshot/zk_mpt_compat_test.go

Repository: morph-l2/go-ethereum

Length of output: 1265


Fix error handling in test fixture builders.

The helpers currently discard returned errors from setup operations, which can hide initialization failures and produce misleading test failures.

The accTrie.Update method does not return an error (it logs internally), so error-checking logic should not be added there. The following error checks should be added:

🔧 Proposed fix
 func buildMPTTrie(t *testing.T, diskdb *memorydb.Database) (common.Hash, *trie.Database) {
 	t.Helper()
 	triedb := trie.NewDatabase(diskdb)
-	accTrie, _ := trie.NewSecure(common.Hash{}, triedb)
+	accTrie, err := trie.NewSecure(common.Hash{}, triedb)
+	if err != nil {
+		t.Fatalf("failed to create secure trie: %v", err)
+	}
 	for _, key := range []string{"acc-1", "acc-2", "acc-3"} {
 		acc := &Account{
 			Balance:          big.NewInt(1),
 			Root:             emptyRoot.Bytes(),
 			KeccakCodeHash:   emptyKeccakCode.Bytes(),
 			PoseidonCodeHash: emptyPoseidonCode.Bytes(),
 		}
-		val, _ := rlp.EncodeToBytes(acc)
+		val, err := rlp.EncodeToBytes(acc)
+		if err != nil {
+			t.Fatalf("failed to encode account: %v", err)
+		}
 		accTrie.Update([]byte(key), val)
 	}
-	mptRoot, _, _ := accTrie.Commit(nil)
-	triedb.Commit(mptRoot, false, nil)
+	mptRoot, _, err := accTrie.Commit(nil)
+	if err != nil {
+		t.Fatalf("failed to commit trie: %v", err)
+	}
+	if err := triedb.Commit(mptRoot, false, nil); err != nil {
+		t.Fatalf("failed to commit trie db: %v", err)
+	}
 	return mptRoot, triedb
 }

-func writeDoneGenerator(db ethdb.KeyValueWriter) {
-	blob, _ := rlp.EncodeToBytes(&journalGenerator{Done: true})
+func writeDoneGenerator(t *testing.T, db ethdb.KeyValueWriter) {
+	t.Helper()
+	blob, err := rlp.EncodeToBytes(&journalGenerator{Done: true})
+	if err != nil {
+		t.Fatalf("failed to encode generator journal: %v", err)
+	}
 	rawdb.WriteSnapshotGenerator(db, blob)
 }

Comment on lines +80 to +102
snap := generateSnapshot(diskdb, triedb, 16, zkRoot)

// diskLayer.root must be mptRoot, not zkRoot.
if snap.root != mptRoot {
t.Fatalf("diskLayer.root: got %x, want mptRoot %x", snap.root, mptRoot)
}
// SnapshotRoot persisted to DB must be mptRoot.
if stored := rawdb.ReadSnapshotRoot(diskdb); stored != mptRoot {
t.Fatalf("SnapshotRoot in DB: got %x, want mptRoot %x", stored, mptRoot)
}
// Generation must complete successfully.
select {
case <-snap.genPending:
case <-time.After(3 * time.Second):
t.Fatal("snapshot generation timed out — trie walk likely used zkRoot instead of mptRoot")
}
// The generated snapshot data must reproduce mptRoot exactly.
checkSnapRoot(t, snap, mptRoot)

stop := make(chan *generatorStats)
snap.genAbort <- stop
<-stop
}

⚠️ Potential issue | 🟡 Minor

Use t.Cleanup for generator shutdown to prevent leaked goroutines on failing paths.

Manual shutdown at the end of each test is skipped if an earlier t.Fatal fires, leaving background generator goroutines alive and making the suite flaky.

🔧 Proposed fix
+func stopGeneratorOnCleanup(t *testing.T, dl *diskLayer) {
+	t.Helper()
+	t.Cleanup(func() {
+		stop := make(chan *generatorStats)
+		dl.genAbort <- stop
+		<-stop
+	})
+}

 func TestGenerateSnapshotTranslatesZkRoot(t *testing.T) {
 	...
 	snap := generateSnapshot(diskdb, triedb, 16, zkRoot)
+	stopGeneratorOnCleanup(t, snap)
 	...
-	stop := make(chan *generatorStats)
-	snap.genAbort <- stop
-	<-stop
 }

 func TestGenerateSnapshotNoTranslation(t *testing.T) {
 	...
 	snap := generateSnapshot(diskdb, triedb, 16, mptRoot)
+	stopGeneratorOnCleanup(t, snap)
 	...
-	stop := make(chan *generatorStats)
-	snap.genAbort <- stop
-	<-stop
 }

 func TestRebuildZkRootMapKeyConsistency(t *testing.T) {
 	...
 	snaps.Rebuild(zkRoot)
+	stopGeneratorOnCleanup(t, snaps.disklayer())
 	...
-	stop := make(chan *generatorStats)
-	snaps.disklayer().genAbort <- stop
-	<-stop
 }

 func TestRebuildWithoutTranslation(t *testing.T) {
 	...
 	snaps.Rebuild(mptRoot)
+	stopGeneratorOnCleanup(t, snaps.disklayer())
 	...
-	stop := make(chan *generatorStats)
-	snaps.disklayer().genAbort <- stop
-	<-stop
 }

Also applies to: 112-130, 201-244, 260-278


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
core/blockchain.go (1)

318-347: Consider adding a log or sanity check when backward walk falls back to genesis.

If the loop cannot find any block whose on-disk root matches diskRoot and whose state is accessible, it falls back to genesis (lines 336-338). This could indicate a data corruption scenario where the snapshot's disk root doesn't correspond to any available state. Consider adding a warning log or assertion when the final newHead is genesis but diskRoot doesn't match genesis state, to aid debugging.

🔧 Suggested improvement
 			newHead = parent
 		}
-		log.Warn("Rewound full block to snapshot state",
-			"from", head.NumberU64(), "to", newHead.NumberU64(), "snaproot", diskRoot)
+		if newHead.NumberU64() == 0 && diskRoot != newHead.Root() {
+			log.Error("Backward walk fell back to genesis without matching snapshot root",
+				"snaproot", diskRoot, "genesisRoot", newHead.Root())
+		}
+		log.Warn("Rewound full block to snapshot state",
+			"from", head.NumberU64(), "to", newHead.NumberU64(), "snaproot", diskRoot)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/blockchain.go` around lines 318 - 347, The backward-walk in the block
rewind (loop using newHead, head, rawdb.ReadDiskStateRoot and state.New) can
fall back to bc.genesisBlock without verifying the snapshot diskRoot matches
genesis state; add a sanity check and warning when newHead == bc.genesisBlock
and the diskRoot was not matched earlier: attempt to resolve the genesis state's
on-disk root (using rawdb.ReadDiskStateRoot and state.New on
bc.genesisBlock.Root()), and if it still doesn't equal diskRoot or state.New
fails, emit a clear log.Warn/Error including diskRoot, genesis hash and any
error details to aid debugging and consider returning an error or metric
increment to highlight possible data corruption.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dcc2aa06-814a-4bce-bf16-0e14e9670b6c

📥 Commits

Reviewing files that changed from the base of the PR and between fed227c and 507d82f.

📒 Files selected for processing (2)
  • core/blockchain.go
  • core/state/pruner/pruner.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/state/pruner/pruner.go

…rd walk

ReadSnapshotRoot may return a zkStateRoot written by an older code path.
Translate it via ReadDiskStateRoot before comparing with block roots in the
post-prune repair walk, so the walk can find the correct block instead of
falling through to genesis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/blockchain.go`:
- Around line 330-354: The loop rewinds newHead until it finds a block whose
translated root equals diskRoot but then persists newHead.NumberU64(), which can
be lower if that block lacked trie state; instead, capture and keep the matched
snapshot-layer height when you detect blockRoot == diskRoot (before the
state.New check) into a separate variable (e.g., matchedSnapNumber), continue
the existing rewind logic to set bc.currentBlock and head hash as now, and call
rawdb.WriteSnapshotRecoveryNumber(bc.db, matchedSnapNumber) so the snapshot
recovery number persisted reflects the actual snapshot layer rather than the
possibly lower rewound full-block head; reference symbols: diskRoot, newHead,
head, state.New, rawdb.WriteSnapshotRecoveryNumber, rawdb.WriteHeadBlockHash,
bc.currentBlock.Store.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7ff1f568-07cf-4ce9-8498-ac1841955178

📥 Commits

Reviewing files that changed from the base of the PR and between 507d82f and 18b9b3b.

📒 Files selected for processing (1)
  • core/blockchain.go

Comment thread on core/blockchain.go (outdated)
corey and others added 4 commits March 19, 2026 18:39
…rsisted root

The old fallback used ReadSnapshotRoot which returns the root from the
last restart's generateSnapshot call. If the node synced N more blocks
since that restart, the snapshot and prune target would be N blocks
behind the current head, causing a post-prune rollback of N blocks on
restart.

Always resolve the current headBlock root (translating zkStateRoot →
mptStateRoot via ReadDiskStateRoot) and use that as both the snapshot
generation and pruning target, so the node restarts at the head height.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When snapshot.New(headRoot) succeeds via the normal path (e.g. journal.go
accepts the mptRoot translation), the fallback block was skipped and
snapDiskRoot stayed zero. Prune() then tried to find 128 diff layers
which never exist for the pruner CLI, causing "snapshot not old enough"
error.

Always capture the disk root from the successfully loaded snapshot tree
so that Prune() uses the disk-layer root as the pruning target.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Revert the custom backward walk in NewBlockChain back to the standard
setHeadBeyondRoot call, which already handles ZK→MPT root translation
since commit 507d82f. The custom walk only rewound currentBlock
without touching currentFastBlock/currentHeader, creating an
inconsistent chain state. Also restore the original freezer truncation
conditions.

In the pruner, only fall back to the snapshot disk-layer root when
fewer than 128 diff layers are available, preserving the standard
HEAD-127 pruning target when sufficient layers exist.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… shutdown

BlockChain.Stop() calls snaps.Journal(currentBlock.Root()) to persist the
snapshot tree, but currentBlock.Root() is a zkStateRoot while the snapshot
disk layer is keyed by mptStateRoot. When no diff layers exist (e.g. right
after prune-state or fresh snapshot generation), Snapshot(zkRoot) returns
nil and the journal write silently fails. On next startup the journal is
missing (diffs=missing), which cascades into pruner failures.

Resolve zkRoot to mptRoot via ReadDiskStateRoot before calling Journal()
so the disk layer can always be found in the layers map.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@curryxbo curryxbo closed this Mar 20, 2026