perf(l1): refactor storage download to use StorageTrieTracker #6171
fedacking wants to merge 21 commits into
🤖 Kimi Code Review
Review Summary
This PR refactors the snap sync storage download logic by introducing a new
Issues Found
1. Potential Panic in
🤖 Claude Code Review
Based on my review of PR #6171, here's my code review:
Code Review - PR #6171: Refactor storage download to use StorageTrieTracker
Overall Assessment
This is a substantial refactoring that improves code organization and eliminates clones. The changes move from index-based referencing to a more structured approach using enums and dedicated tracker types. The refactoring is well-conceived, but there are several issues that need attention.
Critical Issues
1. Potential Panic in
Greptile Overview
Greptile Summary
This PR refactors the storage download subsystem to use a cleaner
Key improvements:
Minor issue found:
The refactor maintains equivalent functionality while improving code clarity and reducing unnecessary data copies. Tests pass and the design aligns well with the documented plan in issue #6170.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| crates/networking/p2p/snap/client.rs | Major refactor of storage download logic using new StorageTask/StorageTaskResult enums and JoinSet-based concurrency; replaced clone-heavy channel communication with structured data flow |
| crates/networking/p2p/sync.rs | Introduced new StorageTrieTracker with SmallTrie/BigTrie structures to replace index-based AccountStorageRoots; added interval computation helper for big tries |
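The "interval computation helper for big tries" mentioned in the table can be pictured with a small sketch. This is not the PR's `BigTrie::compute_intervals`. The `Interval` type, the free function signature, and the use of `u64` keys in place of 256-bit hashes are all assumptions made to keep the example short.

```rust
/// Hypothetical stand-in for a slice of the storage key space.
/// The real code works on 256-bit hashes; u64 keeps the sketch short.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Interval {
    pub start: u64, // inclusive
    pub end: u64,   // inclusive
}

/// Split the inclusive range [start, end] into at most `chunks` contiguous,
/// non-overlapping intervals that together cover the whole range.
pub fn compute_intervals(start: u64, end: u64, chunks: u64) -> Vec<Interval> {
    assert!(start <= end && chunks > 0);
    // Work in u128 so that `end - start + 1` cannot overflow for the full range.
    let total = (end - start) as u128 + 1;
    let chunks = (chunks as u128).min(total);
    let base = total / chunks;
    let remainder = total % chunks;

    let mut intervals = Vec::with_capacity(chunks as usize);
    let mut cursor = start as u128;
    for i in 0..chunks {
        // The first `remainder` intervals take one extra key each.
        let len = base + u128::from(i < remainder);
        let last = cursor + len - 1;
        intervals.push(Interval {
            start: cursor as u64,
            end: last as u64,
        });
        cursor = last + 1;
    }
    intervals
}
```

Each interval can then become one independent download task, which is what lets a single large trie be fetched by several workers at once.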
Sequence Diagram
sequenceDiagram
participant Main as request_storage_ranges
participant Tracker as StorageTrieTracker
participant Queue as Task Queue
participant Worker as Worker Tasks
participant Peer as Peer Network
Main->>Tracker: take_small_batch(STORAGE_BATCH_SIZE)
Tracker-->>Main: SmallBatch tasks
Main->>Queue: Queue SmallBatch tasks
loop For each BigTrie
Main->>Tracker: Get intervals from big_tries
Main->>Queue: Queue BigInterval tasks
end
loop Until all tasks complete
Main->>Worker: Spawn worker with task
Worker->>Peer: GetStorageRanges request
Peer-->>Worker: StorageRanges response
alt SmallBatch completed
Worker-->>Main: SmallComplete
Main->>Main: Write to disk buffer
Main->>Queue: Re-queue remaining tries
else SmallBatch failed
Worker-->>Main: SmallFailed
Main->>Queue: Re-queue all tries
else Small promoted to big
Worker-->>Main: SmallPromotedToBig
Main->>Tracker: promote_to_big()
Main->>Main: Compute intervals
Main->>Queue: Queue BigInterval tasks
else BigInterval result
Worker-->>Main: BigIntervalResult
Main->>Main: Append slots to buffer
Main->>Queue: Re-queue remaining interval
end
end
Main->>Main: Flush disk buffers
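The result handling in the diagram can be sketched as a pair of enums plus a function that decides what to re-queue. The variant names follow the diagram, but every field, the `Root` alias, and the `requeue` helper are assumptions for illustration. Interval splitting on promotion is omitted, so a promoted trie is queued as a single interval over the whole (hypothetical `u64`) key space.

```rust
/// Hypothetical 32-byte storage root; the real code uses a hash type.
pub type Root = [u8; 32];

/// Work handed to a worker. Trie data is moved in rather than cloned.
#[derive(Debug, PartialEq)]
pub enum StorageTask {
    SmallBatch { tries: Vec<Root> },
    BigInterval { root: Root, start: u64, end: u64 },
}

/// What a worker hands back to the main loop.
#[derive(Debug, PartialEq)]
pub enum StorageTaskResult {
    SmallComplete { completed: Vec<Root>, remaining: Vec<Root> },
    SmallFailed { tries: Vec<Root> },
    SmallPromotedToBig { root: Root, remaining: Vec<Root> },
    BigIntervalResult { root: Root, next_start: Option<u64>, end: u64 },
}

fn small_batch(tries: Vec<Root>) -> Vec<StorageTask> {
    if tries.is_empty() {
        Vec::new()
    } else {
        vec![StorageTask::SmallBatch { tries }]
    }
}

/// Turn one result into the follow-up tasks to put back on the queue.
pub fn requeue(result: StorageTaskResult) -> Vec<StorageTask> {
    match result {
        // Only the tries the peer did not answer go back on the queue.
        StorageTaskResult::SmallComplete { remaining, .. } => small_batch(remaining),
        // Nothing was downloaded, so the whole batch is retried.
        StorageTaskResult::SmallFailed { tries } => small_batch(tries),
        // The promoted trie becomes interval work; the rest stays small.
        StorageTaskResult::SmallPromotedToBig { root, remaining } => {
            let mut tasks = vec![StorageTask::BigInterval { root, start: 0, end: u64::MAX }];
            tasks.extend(small_batch(remaining));
            tasks
        }
        // A partially served interval is re-queued from where it stopped.
        StorageTaskResult::BigIntervalResult { root, next_start, end } => match next_start {
            Some(start) => vec![StorageTask::BigInterval { root, start, end }],
            None => Vec::new(),
        },
    }
}
```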
tx.send(empty_task_result).await.ok();
return Ok(());
if (slots.is_empty() && proof.is_empty()) || slots.is_empty() || slots.len() > tries.len() {
redundant condition: slots.is_empty() is checked twice
- if (slots.is_empty() && proof.is_empty()) || slots.is_empty() || slots.len() > tries.len() {
+ if slots.is_empty() || proof.is_empty() || slots.len() > tries.len() {
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/networking/p2p/snap/client.rs
Line: 1166:1166
Comment:
redundant condition: `slots.is_empty()` is checked twice
```suggestion
if slots.is_empty() || proof.is_empty() || slots.len() > tries.len() {
```
How can I resolve this? If you propose a fix, please make it concise.
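One point worth checking before applying the suggestion: by the absorption law, `(a && b) || a` reduces to `a`, so the behaviour-preserving simplification drops the `proof.is_empty()` term entirely. The suggested condition instead also rejects responses with a non-empty `slots` and an empty `proof`. The sketch below compares the three forms over all inputs. The function names and the boolean parameters are stand-ins, not code from the PR.

```rust
/// The condition as written in the diff.
pub fn original(slots_empty: bool, proof_empty: bool, too_many: bool) -> bool {
    (slots_empty && proof_empty) || slots_empty || too_many
}

/// Behaviour-preserving simplification: `(a && b) || a` is just `a`.
pub fn simplified(slots_empty: bool, _proof_empty: bool, too_many: bool) -> bool {
    slots_empty || too_many
}

/// The replacement proposed in the review suggestion.
pub fn suggested(slots_empty: bool, proof_empty: bool, too_many: bool) -> bool {
    slots_empty || proof_empty || too_many
}

type Cond = fn(bool, bool, bool) -> bool;

/// Return the first input (in false-before-true order) where `f` and `g` disagree.
pub fn first_difference(f: Cond, g: Cond) -> Option<(bool, bool, bool)> {
    for a in [false, true] {
        for b in [false, true] {
            for c in [false, true] {
                if f(a, b, c) != g(a, b, c) {
                    return Some((a, b, c));
                }
            }
        }
    }
    None
}
```

Whether an empty proof should be rejected is a protocol question this sketch does not settle. It only shows that the suggestion is a behaviour change rather than a pure cleanup.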
🤖 Codex Code Review
Findings
If you want, I can suggest concrete patches for the fixes above or add a small unit test for
Automated review by OpenAI Codex · custom prompt
One thing I think would be very useful is adding monitoring to request_storage_ranges so we can observe what's happening during the storage download phase. Right now it's hard to tell the state of progress and the mix of work being done. Concretely, I'd like to see periodic debug! logs (or at the very least before/after each request_storage_ranges call) that include:
This would give us good visibility into whether storage download is making progress, whether tries are getting promoted from small to big, and how the interval-based download is evolving over successive attempts. The existing metrics infrastructure (METRICS) seems like the right place to wire these into.
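A rough sketch of what such a progress snapshot might look like follows. The struct, its field names, and the log line format are all hypothetical, chosen only to match the three questions raised above (overall progress, small-to-big promotions, interval evolution). They are not taken from the PR or from the existing METRICS code.

```rust
use std::fmt;

/// Hypothetical snapshot of storage download state, suitable for a periodic log line.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct StorageDownloadProgress {
    pub small_tries_pending: usize,
    pub big_tries: usize,
    pub big_intervals_pending: usize,
    pub promotions: usize,
    pub slots_downloaded: u64,
}

impl StorageDownloadProgress {
    /// Record that one small trie was promoted and split into `new_intervals`.
    pub fn record_promotion(&mut self, new_intervals: usize) {
        self.small_tries_pending = self.small_tries_pending.saturating_sub(1);
        self.big_tries += 1;
        self.big_intervals_pending += new_intervals;
        self.promotions += 1;
    }
}

impl fmt::Display for StorageDownloadProgress {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "storage download: small_pending={} big_tries={} big_intervals_pending={} promotions={} slots={}",
            self.small_tries_pending,
            self.big_tries,
            self.big_intervals_pending,
            self.promotions,
            self.slots_downloaded
        )
    }
}
```

The formatted value could be handed to a `debug!` call before and after each `request_storage_ranges` invocation, or emitted on a timer.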
…ndle_healed_account
promote_to_big was trying to get accounts from small_tries, but they had already been taken out by take_small_batch. The big trie in the tracker ended up with zero accounts, causing BigInterval tasks to have empty account lists.
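The failure mode described in this commit message can be reproduced with a toy tracker. Everything here is a simplified assumption: the `Tracker` layout, the `u64` aliases, and both `promote_to_big` variants. The "fixed" variant shows one plausible remedy (the caller passes along the accounts it already took), which may differ from what the commit actually does.

```rust
use std::collections::HashMap;

type Root = u64; // hypothetical stand-in for a storage root hash
type Account = u64; // hypothetical stand-in for an account hash

#[derive(Default)]
pub struct Tracker {
    small_tries: HashMap<Root, Vec<Account>>,
    big_tries: HashMap<Root, Vec<Account>>,
}

impl Tracker {
    pub fn add_small(&mut self, root: Root, accounts: Vec<Account>) {
        self.small_tries.insert(root, accounts);
    }

    /// Move up to `n` small tries out of the tracker and into a batch.
    pub fn take_small_batch(&mut self, n: usize) -> Vec<(Root, Vec<Account>)> {
        let roots: Vec<Root> = self.small_tries.keys().copied().take(n).collect();
        roots
            .into_iter()
            .filter_map(|root| self.small_tries.remove(&root).map(|accounts| (root, accounts)))
            .collect()
    }

    /// Buggy variant: the accounts were already moved out by `take_small_batch`,
    /// so this lookup finds nothing and the big trie gets an empty account list.
    pub fn promote_to_big_buggy(&mut self, root: Root) {
        let accounts = self.small_tries.remove(&root).unwrap_or_default();
        self.big_tries.insert(root, accounts);
    }

    /// One possible fix: the caller hands over the accounts it took earlier.
    pub fn promote_to_big(&mut self, root: Root, accounts: Vec<Account>) {
        self.big_tries.insert(root, accounts);
    }

    pub fn big_account_count(&self, root: Root) -> usize {
        self.big_tries.get(&root).map_or(0, Vec::len)
    }
}
```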
current_account_storages.insert(
    root,
    AccountsWithStorage {
        accounts: trie.accounts,
        storages,
    },
);
🟡 flush_completed_tries uses insert, overwriting previously accumulated big-trie storage data
flush_completed_tries at crates/networking/p2p/snap/client.rs:559 uses BTreeMap::insert to write completed small-trie data into current_account_storages. This overwrites any existing entry for the same storage root. Meanwhile, the BigIntervalResult handler at crates/networking/p2p/snap/client.rs:670-677 uses .entry().or_insert_with().storages.extend() to append slots incrementally.
Scenario where data is lost
Although the StorageTrieTracker keeps small and big tries in separate maps keyed by root, current_account_storages is a shared buffer that accumulates data from both code paths. If a big trie interval result writes slots for root X via extend, and then a later SmallComplete or SmallPromotedToBig result calls flush_completed_tries with a completed trie that happens to share root X (e.g. due to a race between healing adding a new small trie with the same root and an in-flight big interval completing), the insert call will silently discard all previously accumulated big-trie slots for that root.
Even if this race is unlikely today, using insert instead of entry().or_insert_with().extend() is inconsistent with the BigIntervalResult path and fragile against future changes. The fix is to use entry + extend (or at minimum or_insert) in flush_completed_tries to preserve any previously accumulated data.
Impact: Potential silent loss of downloaded storage slots for accounts sharing a storage root, requiring re-download or healing.
- current_account_storages.insert(
-     root,
-     AccountsWithStorage {
-         accounts: trie.accounts,
-         storages,
-     },
- );
+ current_account_storages
+     .entry(root)
+     .and_modify(|existing| {
+         existing.storages.extend(storages.iter().cloned());
+     })
+     .or_insert_with(|| AccountsWithStorage {
+         accounts: trie.accounts,
+         storages,
+     });
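The difference between the two approaches can be checked in isolation with a standard-library `BTreeMap`. The `AccountsWithStorage` struct below is a reduced stand-in with assumed field types, and the two `flush_*` helpers are hypothetical wrappers around the old and suggested code paths.

```rust
use std::collections::BTreeMap;

type Root = u64; // hypothetical stand-in for a storage root hash
type Slot = (u64, u64); // hypothetical (key, value) pair

#[derive(Debug, Default, PartialEq)]
pub struct AccountsWithStorage {
    pub accounts: Vec<u64>,
    pub storages: Vec<Slot>,
}

/// Old behaviour: `insert` replaces whatever was buffered for `root`.
pub fn flush_with_insert(
    buf: &mut BTreeMap<Root, AccountsWithStorage>,
    root: Root,
    accounts: Vec<u64>,
    storages: Vec<Slot>,
) {
    buf.insert(root, AccountsWithStorage { accounts, storages });
}

/// Suggested behaviour: append to an existing entry, create one otherwise.
pub fn flush_with_entry(
    buf: &mut BTreeMap<Root, AccountsWithStorage>,
    root: Root,
    accounts: Vec<u64>,
    storages: Vec<Slot>,
) {
    buf.entry(root)
        // The closure only borrows `storages`, so it can still be moved below.
        .and_modify(|existing| existing.storages.extend(storages.iter().cloned()))
        .or_insert_with(|| AccountsWithStorage { accounts, storages });
}
```

With an entry already buffered for a root, the first helper discards the earlier slots and the second keeps them, which is the data-loss concern raised in the comment.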
Re-reviewed: all comments addressed in updated commits.
Without a cap, the number of in-flight storage range workers was bounded only by available peers × allowed requests per peer, which could reach thousands and consume tens of GB of memory. Add a MAX_STORAGE_RANGE_WORKERS constant (1000, ~2 MB each ≈ 2 GB) and block-wait for a worker to finish before spawning new ones when at capacity.
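The capping idea can be sketched without an async runtime. The example below swaps in plain `std::thread` workers and an `mpsc` completion channel in place of the PR's tokio `JoinSet`, so it illustrates only the "block until a worker finishes when at capacity" policy. `run_capped` and its parameters are hypothetical, and the sleep stands in for a peer request.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Run `tasks` jobs with at most `max_workers` in flight.
/// Returns the highest number of workers observed running at once.
pub fn run_capped(tasks: usize, max_workers: usize) -> usize {
    assert!(max_workers > 0);
    let (done_tx, done_rx) = mpsc::channel::<()>();
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let mut in_flight = 0usize;

    for _ in 0..tasks {
        // At capacity: block until some worker reports completion.
        if in_flight == max_workers {
            done_rx.recv().expect("an in-flight worker holds a sender");
            in_flight -= 1;
        }
        let (done_tx, running, peak) = (done_tx.clone(), running.clone(), peak.clone());
        thread::spawn(move || {
            let now = running.fetch_add(1, Ordering::SeqCst) + 1;
            peak.fetch_max(now, Ordering::SeqCst);
            thread::sleep(Duration::from_millis(2)); // stand-in for a peer request
            running.fetch_sub(1, Ordering::SeqCst);
            let _ = done_tx.send(());
        });
        in_flight += 1;
    }

    // Wait for the workers that are still running.
    for _ in 0..in_flight {
        done_rx.recv().expect("an in-flight worker holds a sender");
    }
    peak.load(Ordering::SeqCst)
}
```

With a per-worker memory estimate, the cap translates directly into a memory bound: the commit's figures of 1000 workers at roughly 2 MB each give about 2 GB.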
- Add §1.18 observability tooling (PR #6470)
- Add §1.19 pivot update reliability (PR #6475, issue #6474)
- Add §1.20 big-account within-trie parallelization (issue #6477)
- Add §1.21 small-account batching (issue #6476)
- Add §1.22 decoded TrieLayerCache (PR #6348)
- Add §1.23 bloom filter for non-existent storage (PR #6288)
- Add §1.24 adaptive request sizing + bisection (PR #6181)
- Add §1.25 concurrent bytecode + storage (PR #6205)
- Add §1.26 phase completion markers (PR #6189)
- Add §2.18 StorageTrieTracker refactor (PR #6171)
- Update current-state bottleneck table with small-account and pivot-update findings
- Reprioritize timeline: pivot-update crash fix is now priority 0
- Add two risks (pivot crash masks perf work, DB corruption on every crash)
- Bump doc version to 1.3
Summary
- Replace `AccountStorageRoots` with `StorageTrieTracker` throughout snap sync, eliminating index-based referencing and the `accounts_by_root_hash` intermediate structure
- Introduce `StorageTask`/`StorageTaskResult` enums that move trie data into tasks and back in results, removing clones and simplifying the download loop
- Use `JoinSet` + `try_join_next` instead of channels for worker communication in `request_storage_ranges`
- Extract a `BigTrie::compute_intervals` helper from the inline chunking logic
- The strict comparison (`<`) in `can_try_more_requests` meant a score ratio of 0.0 resulted in `requests < 0.0`, which is always false, effectively blacklisting the peer. Changing to `<=` ensures every connected peer can always handle at least 1 concurrent request.

The full plan for this PR is documented in #6170
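The peer-scoring fix in the last item comes down to a single comparison operator. A minimal sketch follows, assuming a hypothetical signature in which the allowance is `score_ratio * max_requests`. The real `can_try_more_requests` may compute its threshold differently.

```rust
/// Old behaviour (hypothetical signature): strict comparison.
/// With a score ratio of 0.0 the threshold is 0.0, and `requests < 0.0`
/// can never hold, so the peer is never selected.
pub fn can_try_more_requests_strict(requests: u32, score_ratio: f64, max_requests: u32) -> bool {
    f64::from(requests) < score_ratio * f64::from(max_requests)
}

/// New behaviour: inclusive comparison.
/// A peer with zero in-flight requests always passes, so even the
/// lowest-scored peer can serve one request at a time.
pub fn can_try_more_requests(requests: u32, score_ratio: f64, max_requests: u32) -> bool {
    f64::from(requests) <= score_ratio * f64::from(max_requests)
}
```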
Test plan
- `cargo check -p ethrex-p2p` compiles cleanly (default + rocksdb features)
- `cargo clippy -p ethrex-p2p` passes with no warnings (default + rocksdb features)
- `cargo test -p ethrex-p2p`: all 38 tests pass