
perf(storage): batch trie updates across blocks in save_blocks #21142

Merged
yongkangc merged 6 commits into main from georgios/batch-trie-updates-v2
Jan 17, 2026

Conversation

@gakonst
Member

@gakonst gakonst commented Jan 16, 2026

Summary

Batches trie updates across all blocks in save_blocks instead of writing per-block.

Problem

Per #eng-perf profiling, write_trie_updates was taking ~25% of persistence time. The current implementation calls write_trie_updates_sorted once per block, opening/closing cursors N times.

In back-to-back (b2b) scenarios with 75-250 accumulated blocks, this overhead compounds significantly.

Solution

Accumulate trie updates across blocks using the existing extend_ref method, then write them all in a single batch:

// Accumulate across blocks
let mut accumulated_trie_updates: Option<TrieUpdatesSorted> = None;

for block in blocks {
    // ... other per-block writes ...
    
    match &mut accumulated_trie_updates {
        Some(acc) => acc.extend_ref(&trie_data.trie_updates),
        None => accumulated_trie_updates = Some((*trie_data.trie_updates).clone()),
    }
}

// Single batch write at end
if let Some(trie_updates) = &accumulated_trie_updates {
    self.write_trie_updates_sorted(trie_updates)?;
}

Expected Impact

  • ~50% reduction in write_trie_updates time for b2b scenarios
  • Reduces cursor open/close overhead from N to 1
  • Reduces MDBX transaction overhead

Testing

  • All existing reth-provider tests pass
  • Can be benchmarked with real block replay via reth-bench-compare

Closes RETH-168

Related

Previously, `write_trie_updates_sorted` was called once per block in the
save_blocks loop. This opened/closed cursors N times for N blocks.

This change accumulates trie updates across all blocks using `extend_ref`
and writes them in a single batch at the end. This reduces:
- Cursor open/close overhead from N to 1
- MDBX transaction overhead

For back-to-back block processing with 75-250 accumulated blocks (per
#eng-perf profiling), this significantly reduces the ~25% of persist time
spent in write_trie_updates.

Expected improvement: ~50% reduction in write_trie_updates for b2b scenarios.
@gakonst gakonst added C-perf A change motivated by improving speed, memory usage or disk footprint S-needs-benchmark This set of changes needs performance benchmarking to double-check that they help labels Jan 16, 2026
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Jan 16, 2026
@DaniPopes
Member

This is OK, but it's the same as the other "accumulation" functions: we shouldn't accumulate into a sorted vec, because it re-sorts on each iteration.

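The point about re-sorting is that accumulating k sorted lists by repeated extend-and-sort costs O(n·k·log n) overall, while a k-way merge is O(n log k). A minimal sketch of the k-way approach, using a toy sorted `(u64, u64)` list in place of the real `TrieUpdatesSorted` (the names and types here are illustrative, not the reth API):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// K-way merge of already-sorted key/value lists. For duplicate keys,
/// the entry from the later list wins, mirroring how newer blocks
/// override older trie updates. Toy stand-in, not `merge_batch` itself.
fn merge_batch(lists: &[Vec<(u64, u64)>]) -> Vec<(u64, u64)> {
    let mut heap = BinaryHeap::new();
    for (li, list) in lists.iter().enumerate() {
        if let Some(&(key, val)) = list.first() {
            // Reverse turns the max-heap into a min-heap on (key, list index).
            heap.push(Reverse((key, li, 0usize, val)));
        }
    }
    let mut out: Vec<(u64, u64)> = Vec::new();
    while let Some(Reverse((key, li, idx, val))) = heap.pop() {
        match out.last_mut() {
            // Equal keys pop in list order, so the later list overwrites.
            Some(last) if last.0 == key => last.1 = val,
            _ => out.push((key, val)),
        }
        if let Some(&(nk, nv)) = lists[li].get(idx + 1) {
            heap.push(Reverse((nk, li, idx + 1, nv)));
        }
    }
    out
}
```

Each element is pushed and popped from the heap once, giving the O(n log k) bound.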
gakonst and others added 2 commits January 16, 2026 20:03
Address review feedback: replace iterative extend_ref (which re-sorts on
each call, O(n*k)) with merge_batch which uses k-way merge for O(n log k)
complexity.

Also removes unnecessary intermediate clones by collecting Arc refs and
passing them directly to merge_batch.

Amp-Thread-ID: https://ampcode.com/threads/T-019bc863-a64e-73bd-93d5-fc65229fa862
Co-authored-by: Amp <amp@ampcode.com>
Add merge_batch_hybrid() to TrieUpdatesSorted and HashedPostStateSorted
which uses a hybrid algorithm:
- Small k (< 64): extend_ref loop with low constant factors
- Large k (≥ 64): k-way merge_batch for O(n log k) complexity

This consolidates the threshold logic that was duplicated in lazy_overlay.rs
and makes it available for save_blocks batching.

Amp-Thread-ID: https://ampcode.com/threads/T-019bc863-a64e-73bd-93d5-fc65229fa862
Co-authored-by: Amp <amp@ampcode.com>
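The small-k branch described in this commit can be sketched with the same kind of toy sorted list. `extend_sorted` below is a hypothetical stand-in for `extend_ref`, and it shows why this path only pays off below a threshold: every call re-sorts the whole accumulator.

```rust
/// Merge a second sorted list into `acc`, later entries overriding earlier
/// ones with the same key. Simplified stand-in for `extend_ref`; each call
/// re-sorts, which is why this path is only used for small batch counts.
fn extend_sorted(acc: &mut Vec<(u64, u64)>, other: &[(u64, u64)]) {
    acc.extend_from_slice(other);
    // Stable sort keeps `other`'s entries after `acc`'s for equal keys.
    acc.sort_by_key(|e| e.0);
    // Collapse consecutive duplicates, keeping the later (overriding) value.
    acc.dedup_by(|a, b| if a.0 == b.0 { b.1 = a.1; true } else { false });
}
```

For a handful of blocks the constant factors beat a heap-based merge; past the threshold the repeated sorting dominates.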
Member

@mediocregopher mediocregopher left a comment


This LGTM

@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Jan 16, 2026
prefix_sets: Default::default(),
}
// Collect all trie data first (blocks are newest-to-oldest)
let trie_data: Vec<_> = blocks.iter().map(|b| b.wait_cloned()).collect();
Member


Not sure if we can remove a collect here?

pub fn merge_batch_hybrid<'a>(states: impl IntoIterator<Item = &'a Self>) -> Self {
const MERGE_BATCH_THRESHOLD: usize = 64;

let states: Vec<_> = states.into_iter().collect();
Member


We can skip this collect again, since we already collected before this?

pub fn merge_batch_hybrid<'a>(updates: impl IntoIterator<Item = &'a Self>) -> Self {
const MERGE_BATCH_THRESHOLD: usize = 64;

let updates: Vec<_> = updates.into_iter().collect();
Member


Same here, I think we can probably remove this collect.

if save_mode.with_state() {
let start = Instant::now();
// Collect Arc refs first to extend their lifetime
let trie_updates: Vec<_> = blocks.iter().map(|b| b.trie_updates()).collect();
Member


Same here re: the collect. Check for a perf regression compared to the previous version.

Addresses review feedback from yongkangc: avoid internal collect() in
merge_batch_hybrid by changing signature to accept &[&Self] slice.

Callers that already have collected data (lazy_overlay, provider) now
pass the slice directly, avoiding a redundant collection inside the
merge function.

Amp-Thread-ID: https://ampcode.com/threads/T-019bc863-a64e-73bd-93d5-fc65229fa862
Co-authored-by: Amp <amp@ampcode.com>
}

if states.len() == 1 {
return (*states[0]).clone();
Member


Is this clone here bad? Not 100% sure why we need to return a clone.

Address yongkangc review feedback:
- Remove unused merge_batch_hybrid functions from reth-trie-common
- Inline optimized hybrid logic in lazy_overlay and provider
- Use Arc::make_mut for copy-on-write in small-k path (avoids clone if refcount is 1)
- Use Arc::try_unwrap to avoid final clone when possible
- Only collect for large-k path where k-way merge needs materialized data
- Single block case returns data directly without allocation

Amp-Thread-ID: https://ampcode.com/threads/T-019bc863-a64e-73bd-93d5-fc65229fa862
Co-authored-by: Amp <amp@ampcode.com>
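The `Arc::make_mut` / `Arc::try_unwrap` pattern from this commit can be shown in isolation. A hedged sketch with `Vec<u64>` standing in for the sorted trie structures; `accumulate` is a hypothetical helper, not reth code:

```rust
use std::sync::Arc;

/// Copy-on-write accumulation over Arc-wrapped per-block data.
/// `Arc::try_unwrap` takes ownership without cloning when the refcount
/// is exactly 1; `Arc::make_mut` clones the inner data only if someone
/// else still holds a reference. Illustrative only.
fn accumulate(blocks: Vec<Arc<Vec<u64>>>) -> Vec<u64> {
    let mut iter = blocks.into_iter();
    let Some(mut acc) = iter.next() else { return Vec::new() };
    for block in iter {
        // Clones the inner Vec at most once, and only if `acc` is shared.
        Arc::make_mut(&mut acc).extend_from_slice(&block);
    }
    // Avoid a final clone when we hold the only reference.
    Arc::try_unwrap(acc).unwrap_or_else(|arc| (*arc).clone())
}
```

If the caller is the sole owner of every `Arc` (the common case when blocks are about to be persisted and dropped), this path performs no data clones at all.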
@yongkangc yongkangc added this pull request to the merge queue Jan 17, 2026
Merged via the queue into main with commit c11c130 Jan 17, 2026
44 checks passed
@yongkangc yongkangc deleted the georgios/batch-trie-updates-v2 branch January 17, 2026 07:25
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Jan 17, 2026
gakonst added a commit that referenced this pull request Jan 17, 2026
This refactor addresses code duplication identified in PR #21142 by extracting
the hybrid merge algorithm into a reusable trait and helper functions.

## Changes

### New HybridMergeSorted trait (reth_trie::utils)
- Trait for sorted types supporting hybrid merging (extend_ref + merge_batch)
- Implemented for TrieUpdatesSorted and HashedPostStateSorted
- Two helper functions:
  - `hybrid_merge_arc_slice`: merges Arc-wrapped items from a slice
  - `hybrid_merge_arc_vec`: consumes Vec for Arc::try_unwrap optimization

### Algorithm details
- 0 items: returns default
- 1 item: returns Arc clone (no data allocation)
- < threshold: uses extend_ref loop with Arc::make_mut (copy-on-write)
- >= threshold: uses k-way merge_batch for O(n log k) complexity

### Updated call sites
1. **lazy_overlay.rs**: merge_blocks now uses hybrid_merge_arc_slice
2. **provider.rs**: save_blocks now batches trie updates using hybrid_merge_arc_vec,
   reducing cursor open/close overhead from N to 1

## Allocation behavior
- No regression vs original code
- Small batches: Arc::make_mut provides copy-on-write
- Large batches: single allocation for k-way merge result
- Arc::try_unwrap used to avoid clone when refcount is 1

## Expected performance impact
- ~50% reduction in write_trie_updates time for b2b scenarios (batched writes)
- Maintains same allocation characteristics as original code
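The trait extraction this follow-up describes could look roughly like the sketch below. The trait and function names follow the commit message, but the signatures and the toy `SortedKv` impl are assumptions, not the actual `reth-trie` code:

```rust
use std::sync::Arc;

/// Sketch of a `HybridMergeSorted`-style trait, with a toy sorted-Vec impl
/// standing in for `TrieUpdatesSorted` / `HashedPostStateSorted`.
trait HybridMergeSorted: Clone + Default {
    /// Merge `other` into `self`, later entries overriding earlier ones.
    fn extend_ref(&mut self, other: &Self);
}

#[derive(Clone, Default, Debug, PartialEq)]
struct SortedKv(Vec<(u64, u64)>);

impl HybridMergeSorted for SortedKv {
    fn extend_ref(&mut self, other: &Self) {
        self.0.extend_from_slice(&other.0);
        // Stable sort keeps `other`'s entries after `self`'s for equal keys.
        self.0.sort_by_key(|e| e.0);
        self.0.dedup_by(|a, b| if a.0 == b.0 { b.1 = a.1; true } else { false });
    }
}

/// `hybrid_merge_arc_vec` analogue: consumes the Vec so `Arc::try_unwrap`
/// can skip the clone for uniquely-owned items (small-k path only; the
/// real helper dispatches to a k-way merge above a threshold).
fn hybrid_merge_arc_vec<T: HybridMergeSorted>(items: Vec<Arc<T>>) -> T {
    let mut iter = items.into_iter();
    let Some(first) = iter.next() else { return T::default() };
    let mut acc = Arc::try_unwrap(first).unwrap_or_else(|a| (*a).clone());
    for item in iter {
        acc.extend_ref(&item);
    }
    acc
}
```

Putting the merge behind a trait lets both call sites (lazy_overlay and save_blocks) share the threshold and ownership logic instead of duplicating it.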
gakonst added a commit that referenced this pull request Jan 17, 2026
Batches trie updates across all blocks in `save_blocks` instead of writing
per-block, reducing cursor open/close overhead from N to 1.

## Changes

### lazy_overlay.rs
- Move MERGE_BATCH_THRESHOLD constant inside function
- Use data directly instead of Arc::clone (avoids unnecessary refcount bump)
- Move Arc::make_mut into the loop for proper copy-on-write semantics

### provider.rs
Add batched trie updates with hybrid merge algorithm:
- 0 blocks: default
- 1 block: Arc::try_unwrap to avoid clone if refcount is 1
- < 30 blocks: extend_ref with Arc::make_mut (copy-on-write)
- >= 30 blocks: k-way merge_batch for O(n log k) complexity

## Allocation Behavior (No Regression)

Small batches avoid collect() by using Arc::make_mut directly in the loop.
Only large batches (>= 30) collect Arcs for k-way merge.

## Expected Impact

- ~50% reduction in write_trie_updates time for b2b scenarios
- Maintains same allocation characteristics as original code

## Related

- Based on optimizations from Slack #eng-perf thread
- Learned from PR #21142 review: avoid unnecessary collect(), use Arc::make_mut
gakonst added a commit that referenced this pull request Jan 18, 2026
Batches trie updates across all blocks in `save_blocks` instead of writing
per-block, reducing cursor open/close overhead from N to 1.

## Changes

### lazy_overlay.rs
- Move MERGE_BATCH_THRESHOLD constant inside function
- Use data directly instead of Arc::clone (avoids unnecessary refcount bump)
- Move Arc::make_mut into the loop for proper copy-on-write semantics

### provider.rs
Add batched trie updates with hybrid merge algorithm:
- 0 blocks: default
- 1 block: Arc::try_unwrap to avoid clone if refcount is 1
- < 30 blocks: extend_ref with Arc::make_mut (copy-on-write)
- >= 30 blocks: k-way merge_batch for O(n log k) complexity

## Allocation Behavior (No Regression)

Small batches avoid collect() by using Arc::make_mut directly in the loop.
Only large batches (>= 30) collect Arcs for k-way merge.

## Expected Impact

- ~50% reduction in write_trie_updates time for b2b scenarios
- Maintains same allocation characteristics as original code

## Related

- Based on optimizations from Slack #eng-perf thread
- Learned from PR #21142 review: avoid unnecessary collect(), use Arc::make_mut
@mediocregopher
Member

Closes #20611

Vui-Chee added a commit to okx/reth that referenced this pull request Jan 20, 2026
* tag 'v1.10.1': (49 commits)
  chore: bump version to 1.10.1 (paradigmxyz#21188)
  chore: rename extend_ref methods on sorted data structures (paradigmxyz#21043)
  fix(flashblocks): Add flashblock ws connection retry period (paradigmxyz#20510)
  chore(bench): add --disable-tx-gossip to benchmark node args (paradigmxyz#21171)
  refactor(stages): reuse history index cache buffers in `collect_history_indices` (paradigmxyz#21017)
  feat(download): resumable snapshot downloads with auto-retry (paradigmxyz#21161)
  ci: update to tempoxyz (paradigmxyz#21176)
  chore: apply spelling and typo fixes (paradigmxyz#21182)
  docs: document minimal storage mode in pruning FAQ (paradigmxyz#21025)
  chore(deps): weekly `cargo update` (paradigmxyz#21167)
  feat(execution-types): add receipts_iter helper (paradigmxyz#21162)
  revert: undo Chain crate, add LazyTrieData to trie-common (paradigmxyz#21155)
  feat(engine): add new_payload_interval metric (start-to-start) (paradigmxyz#21159)
  feat(engine): add time_between_new_payloads metric (paradigmxyz#21158)
  fix(storage-api): gate reth-chain dependency behind std feature
  perf(storage): batch trie updates across blocks in save_blocks (paradigmxyz#21142)
  refactor: use ExecutionOutcome::single instead of tuple From (paradigmxyz#21152)
  chore(chain-state): reorganize deferred_trie.rs impl blocks (paradigmxyz#21151)
  feat(primitives-traits): add try_recover_signers for parallel batch recovery (paradigmxyz#21103)
  perf: make Chain use DeferredTrieData (paradigmxyz#21137)
  ...

5 participants