snapshot: resolve zkStateRoot to mptStateRoot for generation and loading#302
Conversation
… and loading

MPT nodes syncing ZK-era blocks store trie data under the locally-computed `mptStateRoot` (Keccak256), but the block header and snapshot system use the `zkStateRoot` (Poseidon hash). The snapshot generator calls `trie.New(root, triedb)` directly, bypassing the `OpenTrie`/`ReadDiskStateRoot` translation layer, so it can never find MPT trie nodes when `root` is a `zkStateRoot`. This causes the generator goroutine to block forever on `genAbort`, hanging the caller.

Fix three sites:

1. `generate.go` (`generateSnapshot`): resolve `zkStateRoot` → `mptStateRoot` via `ReadDiskStateRoot` before writing `SnapshotRoot` and starting the generator, so `trie.New` uses the correct on-disk root.
2. `journal.go` (`loadSnapshot`): when the on-disk snapshot root (`mptStateRoot`) differs from the requested root (`zkStateRoot`), check `ReadDiskStateRoot` to verify they correspond, and accept the snapshot without a costly rebuild.
3. `pruner.go` (`NewPruner`): in the journal-missing fallback path, translate the persisted `SnapshotRoot` from `zkStateRoot` to `mptStateRoot` and update the DB so that subsequent `snapshot.New` calls load correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
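The resolution step in fix (1) amounts to a single pre-translation before the generator starts. Below is a minimal Go sketch of that idea, with a plain map standing in for the `ReadDiskStateRoot` lookup; all names here are illustrative, not the real geth API.

```go
package main

import "fmt"

// diskRootIndex stands in for the rawdb zk→mpt mapping; in geth this lookup
// is backed by the key-value store, not an in-memory map.
type diskRootIndex map[string]string

// resolveSnapshotRoot returns the root under which trie nodes are actually
// stored on disk. If no mapping exists (the real lookup returns an error),
// the input root is already the on-disk MPT root and passes through unchanged.
func resolveSnapshotRoot(idx diskRootIndex, root string) string {
	if mpt, ok := idx[root]; ok {
		return mpt // ZK-era block: nodes live under the Keccak-derived root
	}
	return root
}

func main() {
	idx := diskRootIndex{"zk-poseidon-root": "mpt-keccak-root"}
	// Translated: the generator should open the trie at the MPT root.
	fmt.Println(resolveSnapshotRoot(idx, "zk-poseidon-root"))
	// No mapping recorded: root passes through untouched.
	fmt.Println(resolveSnapshotRoot(idx, "plain-mpt-root"))
}
```

Calling this once at the top of `generateSnapshot` makes both `WriteSnapshotRoot` and `trie.New` agree on the same on-disk root.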
> **Note: Reviews paused.** It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the review settings.
📝 Walkthrough

Resolve persisted zkStateRoots to on-disk MPT roots across the pruner, snapshot generation/load/rebuild, and genesis extraction; persist corrected snapshot roots; accept snapshots when the MPT translation matches the in-memory head; adjust startup head-recovery/truncation to compare mapped disk roots.
Sequence diagram:

```mermaid
sequenceDiagram
    participant Pruner
    participant Generator
    participant RawDB
    participant InMem as In-memory head
    participant Blockchain
    Pruner->>RawDB: Read persisted snapDiskRoot
    RawDB-->>Pruner: snapDiskRoot (zk or mpt)
    Pruner->>RawDB: ReadDiskStateRoot(snapDiskRoot)
    alt mapping exists
        RawDB-->>Pruner: mptStateRoot
        Pruner->>RawDB: WriteSnapshotRoot(mptStateRoot)
    end
    Pruner->>Generator: init snapshot with translated root
    Generator->>RawDB: open disk layer at mptStateRoot
    Generator->>InMem: compare disk root with in-memory head
    alt translation matches head
        InMem-->>Generator: accept snapshot (no rebuild)
    else mismatch & recovery==true
        Generator-->>InMem: warn and proceed (rebuild)
    else mismatch & recovery==false
        Generator-->>InMem: reject snapshot (error)
    end
    Blockchain->>RawDB: ReadDiskStateRoot(block.Root()) during recovery walk
    RawDB-->>Blockchain: mapped mpt root
    Blockchain->>Blockchain: compare mapped root with snapshot disk root to find match
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d784990b-a845-4682-83d1-62b8a0d49c20
📒 Files selected for processing (3)
- core/state/pruner/pruner.go
- core/state/snapshot/generate.go
- core/state/snapshot/journal.go
```go
if mptRoot, err2 := rawdb.ReadDiskStateRoot(db, snapDiskRoot); err2 == nil {
	log.Info("Pruner: resolved snapshot ZK root to MPT root",
		"zkRoot", snapDiskRoot, "mptRoot", mptRoot)
	snapDiskRoot = mptRoot
	// Persist the corrected root so that subsequent snapshot.New
	// calls (inside loadSnapshot) can locate the disk layer.
	rawdb.WriteSnapshotRoot(db, snapDiskRoot)
}
```
Validate translated root before persisting to prevent corrupted snapshot state.
At lines 114–120, ReadDiskStateRoot returns the translated root without validating it. If the mapping is malformed or stale, line 120 persists an invalid snapshot root that causes all subsequent snapshot.New calls to fail on restart. The same vulnerability exists in core/state/snapshot/generate.go:162–173.
Add validation to confirm the translated root points to an actual trie node before persisting:
Suggested fix

```diff
 if mptRoot, err2 := rawdb.ReadDiskStateRoot(db, snapDiskRoot); err2 == nil {
+	if mptRoot == (common.Hash{}) || len(rawdb.ReadTrieNode(db, mptRoot)) == 0 {
+		log.Warn("Pruner: translated MPT root is not usable, keeping persisted snapshot root",
+			"zkRoot", snapDiskRoot, "mptRoot", mptRoot)
+	} else {
 		log.Info("Pruner: resolved snapshot ZK root to MPT root",
 			"zkRoot", snapDiskRoot, "mptRoot", mptRoot)
 		snapDiskRoot = mptRoot
 		rawdb.WriteSnapshotRoot(db, snapDiskRoot)
+	}
 }
```

`generateSnapshot` may translate zkStateRoot to mptStateRoot internally (for MPT nodes syncing ZK-era blocks). `Rebuild` was still using the original root argument as the layers map key, creating a mismatch: key=zkRoot but `diskLayer.root`=mptRoot. This caused `diffToDisk`'s stale-layer cleanup to miss the disk layer's children (the children map is keyed by `parent.Root()`, i.e. mptRoot, but the stale disk layer was keyed by zkRoot), leaving orphaned diffLayers with a stale origin and triggering `ErrSnapshotStale` on subsequent reads. Fix: capture the returned base and use `base.root` as the map key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
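The map-key mismatch in `Rebuild` can be illustrated with a toy layers map. This is a sketch of the failure mode and the fix, not the real `Tree.Rebuild`; `generateDisk` is a stand-in for `generateSnapshot`'s internal translation.

```go
package main

import "fmt"

type diskLayer struct{ root string }

// rebuildFixed keys the layers map by the root the disk layer actually
// carries (base.root), so later lookups by parent.Root() find it even when
// generateDisk translated the caller-supplied header root.
func rebuildFixed(layers map[string]*diskLayer, headerRoot string,
	generateDisk func(string) *diskLayer) *diskLayer {
	base := generateDisk(headerRoot)
	layers[base.root] = base
	return base
}

func main() {
	// generateDisk translates the zk header root to the on-disk mpt root.
	generateDisk := func(r string) *diskLayer {
		if r == "zkRoot" {
			return &diskLayer{root: "mptRoot"}
		}
		return &diskLayer{root: r}
	}
	layers := map[string]*diskLayer{}
	base := rebuildFixed(layers, "zkRoot", generateDisk)
	// A child keyed by parent.Root() == "mptRoot" can now find its parent;
	// keying by the original "zkRoot" would leave this lookup empty.
	fmt.Println(layers[base.root] == base)
}
```

With the buggy version (`layers[headerRoot] = base`), the stale-layer cleanup looks up children under `base.root` and never finds the entry stored under the zk key.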
Test generateSnapshot root translation, loadSnapshot ZK/MPT mismatch tolerance, and Rebuild map-key consistency to ensure prune-state succeeds on MPT nodes syncing ZK-era blocks. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a447f6da-6291-4d5d-9f6f-586476870a4b
📒 Files selected for processing (1)
core/state/snapshot/zk_mpt_compat_test.go
```go
func buildMPTTrie(t *testing.T, diskdb *memorydb.Database) (common.Hash, *trie.Database) {
	t.Helper()
	triedb := trie.NewDatabase(diskdb)
	accTrie, _ := trie.NewSecure(common.Hash{}, triedb)
	for _, key := range []string{"acc-1", "acc-2", "acc-3"} {
		acc := &Account{
			Balance:          big.NewInt(1),
			Root:             emptyRoot.Bytes(),
			KeccakCodeHash:   emptyKeccakCode.Bytes(),
			PoseidonCodeHash: emptyPoseidonCode.Bytes(),
		}
		val, _ := rlp.EncodeToBytes(acc)
		accTrie.Update([]byte(key), val)
	}
	mptRoot, _, _ := accTrie.Commit(nil)
	triedb.Commit(mptRoot, false, nil)
	return mptRoot, triedb
}
```
Fix error handling in test fixture builders.
The helpers currently discard returned errors from setup operations, which can hide initialization failures and produce misleading test failures.
The accTrie.Update method does not return an error (it logs internally), so error-checking logic should not be added there. The following error checks should be added:
🔧 Proposed fix

```diff
 func buildMPTTrie(t *testing.T, diskdb *memorydb.Database) (common.Hash, *trie.Database) {
 	t.Helper()
 	triedb := trie.NewDatabase(diskdb)
-	accTrie, _ := trie.NewSecure(common.Hash{}, triedb)
+	accTrie, err := trie.NewSecure(common.Hash{}, triedb)
+	if err != nil {
+		t.Fatalf("failed to create secure trie: %v", err)
+	}
 	for _, key := range []string{"acc-1", "acc-2", "acc-3"} {
 		acc := &Account{
 			Balance:          big.NewInt(1),
 			Root:             emptyRoot.Bytes(),
 			KeccakCodeHash:   emptyKeccakCode.Bytes(),
 			PoseidonCodeHash: emptyPoseidonCode.Bytes(),
 		}
-		val, _ := rlp.EncodeToBytes(acc)
+		val, err := rlp.EncodeToBytes(acc)
+		if err != nil {
+			t.Fatalf("failed to encode account: %v", err)
+		}
 		accTrie.Update([]byte(key), val)
 	}
-	mptRoot, _, _ := accTrie.Commit(nil)
-	triedb.Commit(mptRoot, false, nil)
+	mptRoot, _, err := accTrie.Commit(nil)
+	if err != nil {
+		t.Fatalf("failed to commit trie: %v", err)
+	}
+	if err := triedb.Commit(mptRoot, false, nil); err != nil {
+		t.Fatalf("failed to commit trie db: %v", err)
+	}
 	return mptRoot, triedb
 }

-func writeDoneGenerator(db ethdb.KeyValueWriter) {
-	blob, _ := rlp.EncodeToBytes(&journalGenerator{Done: true})
+func writeDoneGenerator(t *testing.T, db ethdb.KeyValueWriter) {
+	t.Helper()
+	blob, err := rlp.EncodeToBytes(&journalGenerator{Done: true})
+	if err != nil {
+		t.Fatalf("failed to encode generator journal: %v", err)
+	}
 	rawdb.WriteSnapshotGenerator(db, blob)
 }
```
```go
	snap := generateSnapshot(diskdb, triedb, 16, zkRoot)

	// diskLayer.root must be mptRoot, not zkRoot.
	if snap.root != mptRoot {
		t.Fatalf("diskLayer.root: got %x, want mptRoot %x", snap.root, mptRoot)
	}
	// SnapshotRoot persisted to DB must be mptRoot.
	if stored := rawdb.ReadSnapshotRoot(diskdb); stored != mptRoot {
		t.Fatalf("SnapshotRoot in DB: got %x, want mptRoot %x", stored, mptRoot)
	}
	// Generation must complete successfully.
	select {
	case <-snap.genPending:
	case <-time.After(3 * time.Second):
		t.Fatal("snapshot generation timed out — trie walk likely used zkRoot instead of mptRoot")
	}
	// The generated snapshot data must reproduce mptRoot exactly.
	checkSnapRoot(t, snap, mptRoot)

	stop := make(chan *generatorStats)
	snap.genAbort <- stop
	<-stop
}
```
Use t.Cleanup for generator shutdown to prevent leaked goroutines on failing paths.
Manual shutdown at the end of each test is skipped if an earlier t.Fatal fires, leaving background generator goroutines alive and making the suite flaky.
🔧 Proposed fix

```diff
+func stopGeneratorOnCleanup(t *testing.T, dl *diskLayer) {
+	t.Helper()
+	t.Cleanup(func() {
+		stop := make(chan *generatorStats)
+		dl.genAbort <- stop
+		<-stop
+	})
+}

 func TestGenerateSnapshotTranslatesZkRoot(t *testing.T) {
 	...
 	snap := generateSnapshot(diskdb, triedb, 16, zkRoot)
+	stopGeneratorOnCleanup(t, snap)
 	...
-	stop := make(chan *generatorStats)
-	snap.genAbort <- stop
-	<-stop
 }

 func TestGenerateSnapshotNoTranslation(t *testing.T) {
 	...
 	snap := generateSnapshot(diskdb, triedb, 16, mptRoot)
+	stopGeneratorOnCleanup(t, snap)
 	...
-	stop := make(chan *generatorStats)
-	snap.genAbort <- stop
-	<-stop
 }

 func TestRebuildZkRootMapKeyConsistency(t *testing.T) {
 	...
 	snaps.Rebuild(zkRoot)
+	stopGeneratorOnCleanup(t, snaps.disklayer())
 	...
-	stop := make(chan *generatorStats)
-	snaps.disklayer().genAbort <- stop
-	<-stop
 }

 func TestRebuildWithoutTranslation(t *testing.T) {
 	...
 	snaps.Rebuild(mptRoot)
+	stopGeneratorOnCleanup(t, snaps.disklayer())
 	...
-	stop := make(chan *generatorStats)
-	snaps.disklayer().genAbort <- stop
-	<-stop
 }
```

Also applies to: 112-130, 201-244, 260-278
🧹 Nitpick comments (1)
core/blockchain.go (1)

**318-347**: Consider adding a log or sanity check when the backward walk falls back to genesis. If the loop cannot find any block whose on-disk root matches `diskRoot` and whose state is accessible, it falls back to genesis (lines 336-338). This could indicate a data-corruption scenario where the snapshot's disk root doesn't correspond to any available state. Consider adding a warning log or assertion when the final `newHead` is genesis but `diskRoot` doesn't match genesis state, to aid debugging.

🔧 Suggested improvement

```diff
 	newHead = parent
 }
-log.Warn("Rewound full block to snapshot state",
-	"from", head.NumberU64(), "to", newHead.NumberU64(), "snaproot", diskRoot)
+if newHead.NumberU64() == 0 && diskRoot != newHead.Root() {
+	log.Error("Backward walk fell back to genesis without matching snapshot root",
+		"snaproot", diskRoot, "genesisRoot", newHead.Root())
+}
+log.Warn("Rewound full block to snapshot state",
+	"from", head.NumberU64(), "to", newHead.NumberU64(), "snaproot", diskRoot)
```
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: dcc2aa06-814a-4bce-bf16-0e14e9670b6c
📒 Files selected for processing (2)
- core/blockchain.go
- core/state/pruner/pruner.go
🚧 Files skipped from review as they are similar to previous changes (1)
- core/state/pruner/pruner.go
…rd walk

`ReadSnapshotRoot` may return a zkStateRoot written by an older code path. Translate it via `ReadDiskStateRoot` before comparing with block roots in the post-prune repair walk, so the walk can find the correct block instead of falling through to genesis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
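The translated comparison in the repair walk can be sketched as follows. Here `block` and the map-based index are toy simplifications of the real chain and of `rawdb.ReadDiskStateRoot`; names are illustrative.

```go
package main

import "fmt"

type block struct {
	number int
	root   string // header root (zkStateRoot on ZK-era blocks)
	parent *block
}

// findSnapshotHead walks backward from head, translating each header root
// through the zk→mpt index before comparing it to the snapshot disk root.
func findSnapshotHead(head *block, diskRoot string, idx map[string]string) *block {
	for b := head; b != nil; b = b.parent {
		r := b.root
		if mpt, ok := idx[r]; ok {
			r = mpt // compare the mapped on-disk root, not the header root
		}
		if r == diskRoot {
			return b
		}
	}
	return nil // caller falls back to genesis
}

func main() {
	genesis := &block{number: 0, root: "root-0"}
	b1 := &block{number: 1, root: "zk-1", parent: genesis}
	b2 := &block{number: 2, root: "zk-2", parent: b1}
	idx := map[string]string{"zk-1": "mpt-1", "zk-2": "mpt-2"}
	// Without the translation, no header root ever equals "mpt-1" and the
	// walk would fall through to genesis.
	if m := findSnapshotHead(b2, "mpt-1", idx); m != nil {
		fmt.Println(m.number)
	}
}
```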
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/blockchain.go`:
- Around line 330-354: The loop rewinds newHead until it finds a block whose
translated root equals diskRoot but then persists newHead.NumberU64(), which can
be lower if that block lacked trie state; instead, capture and keep the matched
snapshot-layer height when you detect blockRoot == diskRoot (before the
state.New check) into a separate variable (e.g., matchedSnapNumber), continue
the existing rewind logic to set bc.currentBlock and head hash as now, and call
rawdb.WriteSnapshotRecoveryNumber(bc.db, matchedSnapNumber) so the snapshot
recovery number persisted reflects the actual snapshot layer rather than the
possibly lower rewound full-block head; reference symbols: diskRoot, newHead,
head, state.New, rawdb.WriteSnapshotRecoveryNumber, rawdb.WriteHeadBlockHash,
bc.currentBlock.Store.
…rsisted root

The old fallback used `ReadSnapshotRoot`, which returns the root from the last restart's `generateSnapshot` call. If the node synced N more blocks since that restart, the snapshot and prune target would be N blocks behind the current head, causing a post-prune rollback of N blocks on restart. Always resolve the current headBlock root (translating zkStateRoot → mptStateRoot via `ReadDiskStateRoot`) and use that as both the snapshot generation and pruning target, so the node restarts at the head height.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When `snapshot.New(headRoot)` succeeds via the normal path (e.g. journal.go accepts the mptRoot translation), the fallback block was skipped and `snapDiskRoot` stayed zero. `Prune()` then tried to find 128 diff layers, which never exist for the pruner CLI, causing a "snapshot not old enough" error. Always capture the disk root from the successfully loaded snapshot tree so that `Prune()` uses the disk-layer root as the pruning target.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Revert the custom backward walk in NewBlockChain back to the standard setHeadBeyondRoot call, which already handles ZK→MPT root translation since commit 507d82f. The custom walk only rewound currentBlock without touching currentFastBlock/currentHeader, creating an inconsistent chain state. Also restore the original freezer truncation conditions. In the pruner, only fall back to the snapshot disk-layer root when fewer than 128 diff layers are available, preserving the standard HEAD-127 pruning target when sufficient layers exist. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… shutdown

`BlockChain.Stop()` calls `snaps.Journal(currentBlock.Root())` to persist the snapshot tree, but `currentBlock.Root()` is a zkStateRoot while the snapshot disk layer is keyed by mptStateRoot. When no diff layers exist (e.g. right after prune-state or fresh snapshot generation), `Snapshot(zkRoot)` returns nil and the journal write silently fails. On the next startup the journal is missing (diffs=missing), which cascades into pruner failures. Resolve zkRoot to mptRoot via `ReadDiskStateRoot` before calling `Journal()` so the disk layer can always be found in the layers map.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
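The silent journal failure comes down to a keyed-map miss. A toy reproduction of the lookup, with the translation applied before journaling; the mock `tree` and string roots stand in for `snapshot.Tree` and `common.Hash`.

```go
package main

import "fmt"

type layer struct{ root string }

// tree mimics snapshot.Tree: layers are keyed by the disk layer's mptRoot.
type tree struct{ layers map[string]*layer }

func (t *tree) snapshot(root string) *layer { return t.layers[root] }

// journalRoot resolves a zk header root to the mpt key before the lookup,
// so the disk layer is found even when no diff layers exist.
func journalRoot(idx map[string]string, headRoot string) string {
	if mpt, ok := idx[headRoot]; ok {
		return mpt
	}
	return headRoot
}

func main() {
	tr := &tree{layers: map[string]*layer{"mptRoot": {root: "mptRoot"}}}
	idx := map[string]string{"zkRoot": "mptRoot"}
	// The silent miss: looking up the header's zkRoot finds nothing.
	fmt.Println(tr.snapshot("zkRoot") != nil)
	// After translation, the disk layer is found and can be journaled.
	fmt.Println(tr.snapshot(journalRoot(idx, "zkRoot")) != nil)
}
```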
Summary

- Translate `zkStateRoot` → `mptStateRoot` via `ReadDiskStateRoot` at the entry of `generateSnapshot`, so `WriteSnapshotRoot` and `trie.New` both use the correct on-disk MPT root
- In `loadSnapshot`, accept a ZK/MPT root mismatch when `ReadDiskStateRoot(zkRoot) == snapshotRoot`, avoiding unnecessary rebuilds
- In the pruner, translate the persisted `SnapshotRoot` from zkStateRoot to mptStateRoot and persist the correction

Background

MPT nodes syncing ZK-era blocks store trie data under the locally-computed `mptStateRoot` (Keccak256), but the block header carries a `zkStateRoot` (Poseidon hash). The state system translates between these via `ReadDiskStateRoot`/`WriteDiskStateRoot` in `OpenTrie` and `trie.Database.Commit`.

However, the snapshot generator calls `trie.New(root, triedb)` directly, bypassing the `OpenTrie` translation layer. When `root` is a `zkStateRoot`, the trie lookup fails immediately with `MissingNodeError`. The generator goroutine then blocks forever on `<-genAbort`, and `waitBuild()` hangs the caller (pruner or blockchain startup).

This means the generator makes no progress (stuck at `accounts=0`) and the caller hangs.

Test plan

- `go build ./core/state/pruner/ ./core/state/snapshot/`
- Run `geth snapshot prune-state` on an MPT node with ZK-era blocks and verify the snapshot generates from `mptStateRoot` (no "Trie missing" error)
- Verify there is no behavior change when `zkStateRoot == mptStateRoot`: `ReadDiskStateRoot` returns an error and the root is left unchanged

🤖 Generated with Claude Code