feat: sparse trie as cache #21583
Merged
Add prune() method to SparseTrieInterface trait for cross-payload sparse trie caching. The method converts nodes beyond a specified depth to Hash nodes, reducing memory while preserving root hash.

Implementation details:
- SerialSparseTrie: DFS traversal with node-depth semantics
- ParallelSparseTrie: Prunes across upper/lower subtries
- ConfiguredSparseTrie: Delegates to inner implementation
- SparseStateTrie: Evicts storage tries by node count, then prunes

Also adds revealed_node_count() to track non-Hash nodes and DEFAULT_SPARSE_TRIE_PRUNE_DEPTH/DEFAULT_MAX_PRESERVED_STORAGE_TRIES constants for configuration.

Amp-Thread-ID: https://ampcode.com/threads/T-019bfa26-6d57-72df-9b0e-89dc32090861
- Save cleared revealed_paths HashSets to cleared_revealed_paths for reuse instead of discarding them (matching the pattern used by cleared_tries)
- Add metrics recording for prune operations:
  - prune_account_nodes_converted
  - prune_storage_nodes_converted
  - prune_storage_tries_cleared
  - prune_storage_tries_retained
  - post_prune_account_nodes
  - post_prune_storage_nodes

Amp-Thread-ID: https://ampcode.com/threads/T-019bfad0-85ab-703d-a712-01dbf2098358
Extract branch_changes_on_leaf_removal and extension_changes_on_leaf_removal into a shared leaf_removal module in reth-trie-sparse. These pure functions compute structural transformations needed when removing leaves from sparse tries.

Both SerialSparseTrie and ParallelSparseTrie now use the shared helpers, eliminating ~94 lines of duplicated logic while maintaining identical behavior.

Amp-Thread-ID: https://ampcode.com/threads/T-019bfaf4-f717-7331-80bb-95da8f58ac9b
- Fix branch_node_masks retention in parallel prune to match serial (use starts_with_pruned instead of is_strict_descendant)
- Use sort_unstable for pruned roots (stability not needed)
- Improve prune() trait doc with edge case behavior
- Move TrieMask import to test module where it's used

Amp-Thread-ID: https://ampcode.com/threads/T-019bfaf4-e801-71ce-b83b-4f2fadf0dd37
After pruning retained storage tries, their revealed_paths sets were not being cleared. This could cause subsequent multiproof/witness reveals to incorrectly skip nodes that were pruned away, leading to blinded-node errors.

Also clarified docstring: precondition requires root() specifically, and documented that prune clears update tracking state.

Amp-Thread-ID: https://ampcode.com/threads/T-019bfd7a-bb64-732f-b725-3df2431ab50b
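A minimal sketch of the failure mode described in this commit, under loud assumptions: `StorageTrie`, `Node`, `reveal`, and the `clear_revealed_paths` flag are hypothetical stand-ins, not the reth types. It only models a reveal-dedup set that goes stale when pruning collapses nodes without clearing it.

```rust
use std::collections::{HashMap, HashSet};

/// Toy node state: either fully revealed or collapsed to a hash stub.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Node {
    Revealed,
    Hash,
}

/// Hypothetical stand-in for one retained storage trie plus its reveal-dedup set.
#[derive(Default)]
struct StorageTrie {
    nodes: HashMap<Vec<u8>, Node>,
    revealed_paths: HashSet<Vec<u8>>,
}

impl StorageTrie {
    /// Reveals the node at `path`, skipping paths already recorded as revealed.
    fn reveal(&mut self, path: &[u8]) {
        if !self.revealed_paths.insert(path.to_vec()) {
            return; // already recorded: the incoming proof node is skipped
        }
        self.nodes.insert(path.to_vec(), Node::Revealed);
    }

    /// Collapses every node to a stub. Passing `false` models the old behaviour,
    /// where the dedup set survived the prune.
    fn prune(&mut self, clear_revealed_paths: bool) {
        for node in self.nodes.values_mut() {
            *node = Node::Hash;
        }
        if clear_revealed_paths {
            self.revealed_paths.clear();
        }
    }
}
```

With the set left intact, a second `reveal` of the same path is skipped and the node stays a stub; with the set cleared, the node is revealed again.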
Move prune() from SparseTrie trait to a new SparseTrieExt extension trait as specified in RETH-178. This makes pruning an opt-in capability:
- Create SparseTrieExt trait extending SparseTrie in traits.rs
- Only ParallelSparseTrie implements SparseTrieExt
- SerialSparseTrie keeps prune() as inherent method (not trait)
- SparseStateTrie::prune() now requires A: SparseTrieExt, S: SparseTrieExt bounds

Amp-Thread-ID: https://ampcode.com/threads/T-019bfdab-5e65-71ac-a8d2-73ccb6eb6409
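The opt-in shape this commit describes can be sketched roughly as below. The trait names come from the commit message; the method signatures, the toy `ParallelToy`/`SerialToy` types, and `prune_state` are assumptions made for illustration and are not the actual reth definitions.

```rust
/// Base capability every sparse trie has in this sketch.
trait SparseTrie {
    fn revealed_node_count(&self) -> usize;
}

/// Opt-in extension: only tries that support pruning implement it.
trait SparseTrieExt: SparseTrie {
    /// Drops nodes deeper than `max_depth`; returns how many were affected.
    /// (Assumed signature, for illustration only.)
    fn prune(&mut self, max_depth: usize) -> usize;
}

/// Toy tries that only record the depth of each revealed node.
struct ParallelToy {
    depths: Vec<usize>,
}
struct SerialToy {
    depths: Vec<usize>,
}

impl SparseTrie for ParallelToy {
    fn revealed_node_count(&self) -> usize {
        self.depths.len()
    }
}

impl SparseTrie for SerialToy {
    fn revealed_node_count(&self) -> usize {
        self.depths.len()
    }
}

// Only the parallel toy opts in; `SerialToy` deliberately has no impl.
impl SparseTrieExt for ParallelToy {
    fn prune(&mut self, max_depth: usize) -> usize {
        let before = self.depths.len();
        self.depths.retain(|&d| d <= max_depth);
        before - self.depths.len()
    }
}

/// Mirrors the `A: SparseTrieExt, S: SparseTrieExt` bounds: this only
/// compiles for trie types that opted into pruning.
fn prune_state<A: SparseTrieExt, S: SparseTrieExt>(
    account: &mut A,
    storages: &mut [S],
    max_depth: usize,
) -> usize {
    let mut affected = account.prune(max_depth);
    for storage in storages.iter_mut() {
        affected += storage.prune(max_depth);
    }
    affected
}
```

Calling `prune_state` with a `SerialToy` would be rejected at compile time, which is the point of putting `prune()` behind the extension trait.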
Combine the two-phase prune algorithm (collect roots, then convert) into a single DFS pass that converts eligible nodes to Hash stubs during traversal. Uses SmallVec to collect children before mutation to satisfy the borrow checker.

Amp-Thread-ID: https://ampcode.com/threads/T-019bfdc9-30c0-7328-b1c9-2f9ac8df5b7b
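For readers unfamiliar with the pattern, here is a simplified single-pass sketch. Everything in it is an assumption for illustration: the `Node` enum, `u64` hashes, `Vec<u8>` paths, and the depth rule (children of a node at `max_depth` become stubs) are toy choices, and a plain `Vec` stands in for the SmallVec mentioned above so the block needs only the standard library.

```rust
use std::collections::HashMap;

type Path = Vec<u8>;

#[derive(Debug, Clone, PartialEq, Eq)]
enum Node {
    /// Revealed branch: child paths plus a hash cached by an earlier root computation.
    Branch { children: Vec<Path>, hash: u64 },
    /// Revealed leaf with its cached hash.
    Leaf { hash: u64 },
    /// Blinded stub that only remembers the hash.
    Hash(u64),
}

impl Node {
    fn cached_hash(&self) -> u64 {
        match self {
            Node::Branch { hash, .. } | Node::Leaf { hash } | Node::Hash(hash) => *hash,
        }
    }
}

/// Converts nodes deeper than `max_depth` into `Hash` stubs in one DFS pass
/// and returns how many nodes were converted.
fn prune(nodes: &mut HashMap<Path, Node>, max_depth: u8) -> usize {
    let mut pruned_roots: Vec<Path> = Vec::new();
    let mut stack: Vec<(Path, u8)> = vec![(Vec::new(), 0)];

    while let Some((path, depth)) = stack.pop() {
        // Copy the child paths out first so the map is not borrowed while we mutate it.
        let children: Vec<Path> = match nodes.get(&path) {
            Some(Node::Branch { children, .. }) => children.clone(),
            _ => continue,
        };
        for child in children {
            if depth < max_depth {
                // Still within the preserved depth: keep descending.
                stack.push((child, depth + 1));
            } else if let Some(node) = nodes.get_mut(&child) {
                // Past the preserved depth: convert in place during traversal.
                if !matches!(*node, Node::Hash(_)) {
                    *node = Node::Hash(node.cached_hash());
                    pruned_roots.push(child);
                }
            }
        }
    }

    // Drop everything strictly below a converted node; the stubs themselves stay.
    nodes.retain(|p, _| {
        !pruned_roots
            .iter()
            .any(|root| p.len() > root.len() && p.starts_with(root))
    });
    pruned_roots.len()
}
```

The sketch still scans the map once at the end to drop orphaned descendants; it is only meant to show conversion happening during traversal rather than in a separate second phase.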
…rseTrie impl

- Remove SerialSparseTrie::prune() inherent method (~120 lines)
- Remove serial prune tests from sparse/trie.rs (~290 lines)
- Update parallel tests to use only ParallelSparseTrie
- Update SparseTrieExt trait doc to reflect only ParallelSparseTrie implements it
- Add large_account_value helper to parallel tests

This eliminates code duplication since prune() is only needed for ParallelSparseTrie in production (via SparseStateTrie::prune).

Amp-Thread-ID: https://ampcode.com/threads/T-019bfdba-500a-76ed-a7ce-91506d242e24
Remove the shared leaf_removal.rs module and keep the original inline code in SerialSparseTrie. The helper functions are kept as methods only in ParallelSparseTrie.

Amp-Thread-ID: https://ampcode.com/threads/T-019bfdde-ac35-76ae-9881-df28f26ddbe0
Removes prune_storage_tries_retained as it's derivable from other metrics. The post_prune_storage_nodes metric already captures retained size, and prune_storage_tries_cleared captures eviction activity.

Amp-Thread-ID: https://ampcode.com/threads/T-019bfdec-0ab8-75bc-959d-05f705fb701a
- Remove ShrinkConfig struct and DEFAULT_SHRINK_* constants (not in RETH-178 spec)
- Remove shrink_config field and related methods from ParallelSparseTrie
- Restore #[derive(Eq)] on ParallelSparseTrie (no more f64 equality issues)
- Fix early return bug: clear updates/prefix_set at start of prune() to ensure bookkeeping is always reset even when nothing is pruned

Amp-Thread-ID: https://ampcode.com/threads/T-019bfdfa-8ad0-756e-a93e-58bffd7c0db2
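The early-return fix in the last bullet can be illustrated with a small sketch. The `Trie` struct and its fields are hypothetical simplifications (the real `updates` and `prefix_set` have different types); the only point is the ordering of the reset relative to the early return.

```rust
use std::collections::HashSet;

/// Hypothetical trie carrying only the bookkeeping relevant to this fix.
#[derive(Default)]
struct Trie {
    /// Depth of each revealed node (toy representation).
    node_depths: Vec<u8>,
    /// Paths touched since the last root computation.
    prefix_set: HashSet<Vec<u8>>,
    /// Pending update records (toy representation).
    updates: Vec<Vec<u8>>,
}

impl Trie {
    /// Returns the number of nodes dropped.
    fn prune(&mut self, max_depth: u8) -> usize {
        // Reset bookkeeping first, so the early return below cannot skip it.
        self.prefix_set.clear();
        self.updates.clear();

        if self.node_depths.iter().all(|&d| d <= max_depth) {
            return 0; // nothing deep enough to prune
        }

        let before = self.node_depths.len();
        self.node_depths.retain(|&d| d <= max_depth);
        before - self.node_depths.len()
    }
}
```

Had the two `clear()` calls been placed after the early return, a no-op prune would have left stale `prefix_set` and `updates` entries behind.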
- Add explicit preconditions section (must call root() first)
- Document max_depth == 0 behavior
- Clarify depth counts nodes, not nibbles (extension nodes count as 1)
- Document that prefix_set and updates are cleared
- Simplify inline comment for sort

Amp-Thread-ID: https://ampcode.com/threads/T-019bfdff-fab0-77e2-b9c5-d10321e0b243
Remove prune-related metrics to simplify PR for review:
- Remove prune_account_nodes_converted, prune_storage_nodes_converted, prune_storage_tries_cleared, post_prune_account_nodes, post_prune_storage_nodes fields and histograms
- Remove record_prune() method
- Simplify prune() implementation by removing #[cfg(feature = "metrics")] blocks

Metrics can be tuned and added back after core algorithm review.

Amp-Thread-ID: https://ampcode.com/threads/T-019bfe0d-5ace-71b4-b66a-415b3962dd97
…parallelization

- Use bit manipulation to iterate only set bits in branch state_mask (trailing_zeros + clear lowest bit pattern), avoiding 16 iterations per branch
- Collect revealed subtrie indices before parallelization, only use rayon when >=4 subtries need processing to reduce scheduling overhead
- Add stronger fast-path: clear entire lower subtries when upper prune root is a prefix of subtrie path (O(1) vs O(n) retain scan)

Amp-Thread-ID: https://ampcode.com/threads/T-019bfe03-dfc5-7772-a3d9-a582075d3175
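The "trailing_zeros + clear lowest bit" pattern from the first bullet looks roughly like this. The function name `set_nibbles` and the plain `u16` mask are assumptions for the sketch; the actual code operates on the branch node's state mask type.

```rust
/// Returns the indices (0..16) of the set bits in a 16-bit branch mask,
/// lowest first, doing one loop iteration per set bit rather than 16.
fn set_nibbles(mask: u16) -> Vec<u8> {
    let mut remaining = mask;
    let mut out = Vec::with_capacity(mask.count_ones() as usize);
    while remaining != 0 {
        // Index of the lowest set bit.
        out.push(remaining.trailing_zeros() as u8);
        // Clear the lowest set bit; `remaining` is non-zero, so no underflow.
        remaining &= remaining - 1;
    }
    out
}
```

For a branch with only a few children, the loop body runs once per child instead of testing all 16 nibble positions.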
- Replace O(n) revealed_node_count() scan with O(1) nodes.len() for capacity estimation
- Fix SmallVec bulk-initialization: use new() + push instead of from_buf_and_len
- Narrow stack depth type from usize (8 bytes) to u8 (1 byte)

Amp-Thread-ID: https://ampcode.com/threads/T-019bfe23-0ba2-75ce-8d23-69b24212af5b
mediocregopher (Member) approved these changes on Jan 29, 2026 and left a comment:
Two small things but LGTM for merging, we can optimize after
/// Those are being moved into `account_updates` once storage roots
/// are revealed and/or calculated.
///
/// Invariant: for each entry in `pending_account_updates` account must either be already
Member
Is this comment right? I thought the account couldn't be in account_updates until we had a storage root, and if we had a storage root the update wouldn't be pending
Member
Author
yeah it could be there as Touched or Changed with outdated storage root
MultiProofMessage::StateUpdate(_, state) => {
    self.on_state_update(state);
}
MultiProofMessage::EmptyProof { sequence_number: _, state } => {
Member
are empty proofs possible if we're bypassing the multiproof task?
Member
Author
yeah it's unreachable now. we can clean it up later via a new message enum or by removing multiprooftask and message entirely
This was referenced May 27, 2026
Adds a new `SparseTrieCacheTask` that uses the in-memory sparse trie to drive proof fetching instead of relying on the `MultiProofTask`.