refactor(trie): extract hybrid merge algorithm for sorted trie types #21157
Closed
Batches trie updates across all blocks in `save_blocks` instead of writing per-block, reducing cursor open/close overhead from N to 1.

## Changes

### lazy_overlay.rs

- Move the `MERGE_BATCH_THRESHOLD` constant inside the function
- Use the data directly instead of `Arc::clone` (avoids an unnecessary refcount bump)
- Move `Arc::make_mut` into the loop for proper copy-on-write semantics

### provider.rs

Add batched trie updates with a hybrid merge algorithm:

- 0 blocks: default
- 1 block: `Arc::try_unwrap` to avoid a clone if the refcount is 1
- < 30 blocks: `extend_ref` with `Arc::make_mut` (copy-on-write)
- >= 30 blocks: k-way `merge_batch` for O(n log k) complexity

## Allocation Behavior (No Regression)

Small batches avoid `collect()` by using `Arc::make_mut` directly in the loop. Only large batches (>= 30) collect Arcs for the k-way merge.

## Expected Impact

- ~50% reduction in `write_trie_updates` time for b2b scenarios
- Maintains the same allocation characteristics as the original code

## Related

- Based on optimizations from the Slack #eng-perf thread
- Learned from PR #21142 review: avoid unnecessary `collect()`, use `Arc::make_mut`
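The hybrid dispatch can be sketched as follows. This is a standalone illustration, not the reth code: `TrieUpdates` is modeled as a plain `BTreeMap`, `merge_updates` is a hypothetical name, and the real `< 30` branch uses `extend_ref` (approximated here with `extend`).

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

// The PR moves this constant inside the function; shown at module level
// here for brevity. 30 is the cutoff named in the description.
const MERGE_BATCH_THRESHOLD: usize = 30;

// Toy stand-in for one block's sorted trie updates.
type TrieUpdates = BTreeMap<String, u64>;

/// Hypothetical sketch of the hybrid merge dispatch.
fn merge_updates(mut batch: Vec<Arc<TrieUpdates>>) -> TrieUpdates {
    match batch.len() {
        // 0 blocks: nothing to merge.
        0 => TrieUpdates::default(),
        // 1 block: take ownership without cloning when the refcount is 1.
        1 => Arc::try_unwrap(batch.pop().unwrap()).unwrap_or_else(|a| (*a).clone()),
        // Small batches: fold into the first map. Arc::make_mut clones the
        // accumulator only if another handle still shares it (copy-on-write).
        n if n < MERGE_BATCH_THRESHOLD => {
            let mut iter = batch.into_iter();
            let mut acc = iter.next().unwrap();
            for next in iter {
                // Later blocks overwrite earlier values for the same key.
                Arc::make_mut(&mut acc).extend(next.iter().map(|(k, v)| (k.clone(), *v)));
            }
            Arc::try_unwrap(acc).unwrap_or_else(|a| (*a).clone())
        }
        // Large batches: the real code runs a k-way merge_batch in
        // O(n log k); a simple fold stands in for it here.
        _ => {
            let mut out = TrieUpdates::default();
            for m in batch {
                out.extend(m.iter().map(|(k, v)| (k.clone(), *v)));
            }
            out
        }
    }
}
```

For example, merging a block with `{"a": 1}` followed by a block with `{"a": 2, "b": 3}` yields `{"a": 2, "b": 3}`: the later block's value wins for the shared key.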
**mattsse** (Collaborator) reviewed on Jan 19, 2026:

> I believe this has been merged
## Summary

Batches trie updates across all blocks in `save_blocks` instead of writing per-block, reducing cursor open/close overhead from N to 1.

## Problem

Per #eng-perf profiling, `write_trie_updates` was taking ~25% of persistence time. The current implementation calls `write_trie_updates_sorted` once per block, opening and closing cursors N times. In back-to-back (b2b) scenarios with 75-250 accumulated blocks, this overhead compounds significantly.
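A toy model of the cost being removed (hypothetical types, not the reth provider API): the per-block path pays one cursor open per block, while the batched path pays one for the whole run.

```rust
use std::cell::Cell;

/// Toy stand-in for a database that counts cursor opens.
struct Db {
    cursor_opens: Cell<usize>,
}

impl Db {
    fn new() -> Self {
        Db { cursor_opens: Cell::new(0) }
    }

    fn open_cursor(&self) -> Cursor<'_> {
        self.cursor_opens.set(self.cursor_opens.get() + 1);
        Cursor { _db: self }
    }
}

struct Cursor<'a> {
    _db: &'a Db,
}

impl Cursor<'_> {
    fn write(&mut self, _update: u64) {}
}

/// Before: one cursor per block, so N opens for N blocks.
fn write_per_block(db: &Db, blocks: &[u64]) {
    for &b in blocks {
        let mut c = db.open_cursor();
        c.write(b);
    }
}

/// After: merge first, then one cursor for the whole batch.
fn write_batched(db: &Db, blocks: &[u64]) {
    let mut c = db.open_cursor();
    for &b in blocks {
        c.write(b);
    }
}
```

With 250 accumulated blocks, the first path opens 250 cursors and the second opens exactly one; the write work itself is unchanged.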
## Solution

### lazy_overlay.rs

- Move the `MERGE_BATCH_THRESHOLD` constant inside the function
- Use the data directly instead of `Arc::clone` (avoids an unnecessary refcount bump)
- Move `Arc::make_mut` into the loop for proper copy-on-write semantics
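The `Arc::make_mut` point is the standard copy-on-write pattern: inside a loop it mutates the accumulator in place while it is uniquely owned, and only pays for a clone when another handle still shares the data. A minimal standalone demonstration on a plain `Vec` (the `push_cow` helper is illustrative, not reth code):

```rust
use std::sync::Arc;

/// Pushes onto the Vec behind `arc` with copy-on-write semantics, and
/// reports whether the write had to clone the inner value (true only
/// when another handle shared the allocation).
fn push_cow(arc: &mut Arc<Vec<u64>>, value: u64) -> bool {
    let before = Arc::as_ptr(arc);
    // make_mut mutates in place when the refcount is 1, otherwise it
    // clones the Vec into a fresh Arc and swaps it in.
    Arc::make_mut(arc).push(value);
    Arc::as_ptr(arc) != before
}
```

A uniquely owned `Arc` is mutated in place (no clone, same allocation); once a second `Arc::clone` handle is alive, the write clones first and the reader's view stays untouched.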
### provider.rs

Add batched trie updates with a hybrid merge algorithm:

- 0 blocks: default
- 1 block: `Arc::try_unwrap` to avoid a clone if the refcount is 1
- < 30 blocks: `extend_ref` with `Arc::make_mut` (copy-on-write)
- >= 30 blocks: k-way `merge_batch` for O(n log k) complexity
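The O(n log k) claim for the large-batch path comes from the classic k-way merge: keep one cursor per sorted input in a min-heap of size k, so each of the n total items costs one O(log k) heap operation. A generic standalone sketch over sorted `u64` vectors (not the `merge_batch` implementation, which merges keyed trie updates):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merges k sorted vectors into one sorted vector in O(n log k).
fn k_way_merge(inputs: Vec<Vec<u64>>) -> Vec<u64> {
    // Min-heap of (value, source index, position within source).
    let mut heap = BinaryHeap::new();

    // Seed the heap with the first element of each non-empty input.
    for (src, v) in inputs.iter().enumerate() {
        if let Some(&first) = v.first() {
            heap.push(Reverse((first, src, 0usize)));
        }
    }

    let total: usize = inputs.iter().map(|v| v.len()).sum();
    let mut out = Vec::with_capacity(total);

    // Repeatedly take the global minimum and advance that input's cursor.
    while let Some(Reverse((val, src, idx))) = heap.pop() {
        out.push(val);
        if let Some(&next) = inputs[src].get(idx + 1) {
            heap.push(Reverse((next, src, idx + 1)));
        }
    }
    out
}
```

The heap never holds more than k entries, which is why large batches beat repeated pairwise extending (O(n·k) in the worst case) once k grows past the threshold.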
Learned from PR #21142 review feedback:

- Avoid unnecessary `collect()` in the small-k path
- Use `Arc::make_mut` directly in the loop for copy-on-write
- Use `Arc::try_unwrap` to avoid the final clone when the refcount is 1

## Expected Impact

- ~50% reduction in `write_trie_updates` time for b2b scenarios

## Testing

- `reth-chain-state` and `reth-provider` tests pass

## Related