perf(storage): batch trie updates across blocks in save_blocks #21139
Conversation
Based on #eng-perf Slack discussions identifying key bottlenecks:

- `update_history_indices`: 26% of persist time
- `write_trie_updates`: 25.4%
- `write_trie_changesets`: 24.2%
- Execution cache contention under high throughput

New benchmarks:

- `execution_cache`: cache hit rates, contention, TIP-20 patterns
- `heavy_persistence`: accumulated blocks, history indices, state root
- `heavy_root`: parallel vs sync at scale, large storage tries

Includes a runner script and an optimization-opportunities doc.
Previously, `write_trie_updates_sorted` was called once per block in the `save_blocks` loop, opening and closing cursors N times for N blocks. This change accumulates trie updates across all blocks using `extend_ref` and writes them in a single batch at the end. This reduces:

- cursor open/close overhead from N to 1
- MDBX transaction overhead

For back-to-back block processing with 75-250 accumulated blocks (per #eng-perf profiling), this significantly reduces the ~25% of persist time spent in `write_trie_updates`. Expected improvement: ~50% reduction in `write_trie_updates` for b2b scenarios.
mediocregopher left a comment
All the changes that aren't in the `crates/storage/provider/src/providers/database/provider.rs` file should be left out of this PR.
```rust
// Accumulate trie updates across blocks to batch the write at the end.
// This reduces cursor open/close overhead from N calls to 1.
let mut accumulated_trie_updates: Option<TrieUpdatesSorted> = None;
```
This could just start as an empty `TrieUpdatesSorted`.
Amp-Thread-ID: https://ampcode.com/threads/T-019bc811-0850-7320-902c-52e64a671eb5
Co-authored-by: Amp <amp@ampcode.com>
Local Benchmark Results

The micro-benchmark shows modest gains in the overlay merge path. The real impact will be in the full `save_blocks` persistence path. To properly benchmark, run with real block data via `samply record -- reth re-execute --from 21000000 --to 21001000 ...`. The expected improvement grows with the number of blocks accumulated back-to-back.
Benchmark Results (Local)

Ran benchmarks on local machine:

Accumulated Blocks Benchmark (Overlay Merge)
State Root Sync vs Parallel
Notes

The trie batching optimization shows modest gains in these isolated benchmarks. The real impact is expected in the full `save_blocks` path with many accumulated blocks.
Recommended next step: run on a reth box with real block data (e.g. via `samply record`).
Closing in favor of #21106, which implements the same optimization with additional correctness for the trie changesets overlay. My implementation missed the overlay handling for trie changesets. The benchmarks and approach are the same: accumulate trie updates and batch-write at the end.
Summary
Batches trie updates across all blocks in `save_blocks` instead of writing per-block.

Problem
Per #eng-perf profiling, `write_trie_updates` was taking ~25% of persistence time. The current implementation calls `write_trie_updates_sorted` once per block, opening/closing cursors N times. In back-to-back (b2b) scenarios with 75-250 accumulated blocks, this overhead compounds significantly.
Solution
Accumulate trie updates across blocks using the existing `extend_ref` method, then write them all in a single batch.

Expected Impact
- ~50% reduction in `write_trie_updates` time for b2b scenarios

Testing
- `reth-provider` tests pass
- `cargo bench -p reth-engine-tree --bench heavy_persistence -- accumulated`

Related
- `georgios/history-indices-from-memory` (26% history indices optimization)