feat(persistence): batch write hashed_state#19990
duyquang6 wants to merge 3 commits into paradigmxyz:main from
Conversation
f6cfcb2 to 029724d
40f7bef to ec2786b
crates/trie/common/src/utils.rs
Outdated
```rust
            target.sort_unstable_by(|a, b| a.0.cmp(&b.0));
        }
    })
    .collect();
```
The previous implementation was specifically designed to avoid having to do a big collect like this; the resulting memory allocation from this collect dwarfs any ostensible speedup you get from not having to sort. I just did a bench comparing your implementation to the previous and this new one is about 2x slower for synthetic datasets:
You can see the bench here if you're curious
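To make the tradeoff concrete, here is a minimal sketch of the two strategies under discussion (the function names and the `(u64, u64)` element type are illustrative, not the reth code): strategy A appends then re-sorts, while strategy B allocates a full merged buffer up front, which is the large allocation the comment is about.

```rust
/// Strategy A: append `other`, then re-sort by key. Cheap incremental
/// allocation growth, but pays an O(n log n) sort over the combined data.
fn extend_then_sort(target: &mut Vec<(u64, u64)>, other: Vec<(u64, u64)>) {
    target.extend(other);
    target.sort_unstable_by(|a, b| a.0.cmp(&b.0));
}

/// Strategy B: single-pass merge of two already-sorted inputs into a
/// fresh buffer. No sort, but allocates target.len() + other.len()
/// up front — the "big collect" in question.
fn merge_extend(target: &mut Vec<(u64, u64)>, other: Vec<(u64, u64)>) {
    let mut out = Vec::with_capacity(target.len() + other.len());
    let mut a = std::mem::take(target).into_iter().peekable();
    let mut b = other.into_iter().peekable();
    loop {
        // Pick the smaller head key; drain the remaining side at the end.
        let take_a = match (a.peek(), b.peek()) {
            (Some(x), Some(y)) => x.0 <= y.0,
            (Some(_), None) => true,
            (None, Some(_)) => false,
            (None, None) => break,
        };
        let item = if take_a { a.next().unwrap() } else { b.next().unwrap() };
        out.push(item);
    }
    *target = out;
}
```

Both produce the same sorted result; which wins depends on input sizes and element width, which is exactly what the benches below probe.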
Very useful bench @mediocregopher.
When I first benchmarked the old version with the aggregated hashed state, the results were not good. That's why I suspected something was wrong with extend_ref or extend_sorted_vec.
When I ran your bench against my own custom merge version (not using merge_join_by), it only shines when the target size is smaller than the other size; across the overall cases the old version still wins. That gave me a hint for using this function well: keep the target size similar to or larger than the other size, so the old version benefits.
But overall (sizes similar, or target size > other size), the old version is still better.

I dumped the raw data of HashedPostStateSorted while benchmarking with native-transfer.
Here are the bench results for extend_ref; the new version is better for both in this test case.
You can double-check the bench here; the hashed state raw data is already attached.
Could the raw data have properties that the benchmark doesn't fully cover 🤔?
@duyquang6 these are interesting results, your benches for extend_sorted_vec_comparison/t10_o1000 conflict with what I originally saw in mine, but now I'm able to replicate, so there's some inconsistency there that I still need to figure out.
If yours is faster for t10_o1000, I expect it's because it does the full allocation up front, whereas mine likely does two larger allocations at the end with the extend calls.
What do you think about trying out something like:

```rust
// Where "50" is a made up number that needs to be tuned
if other.len() > target.len() * 50 {
    return extend_sorted_vec_custom(target, other);
}
extend_sorted_vec(target, other)
```

That way we might cover all cases better.
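A self-contained version of that dispatch idea might look like the following sketch (the ratio constant and both branch implementations are placeholders to be tuned, not the actual reth helpers):

```rust
/// Placeholder ratio: past this imbalance, a one-pass merge into a
/// fresh buffer beats appending and re-sorting. Needs real tuning.
const MERGE_RATio: usize = 50;

fn extend_sorted(target: &mut Vec<u64>, other: Vec<u64>) {
    if other.len() > target.len() * MERGE_RATio {
        // `other` dwarfs `target`: allocate once and merge both sorted inputs.
        let mut out = Vec::with_capacity(target.len() + other.len());
        let mut a = std::mem::take(target).into_iter().peekable();
        let mut b = other.into_iter().peekable();
        loop {
            let take_a = match (a.peek(), b.peek()) {
                (Some(x), Some(y)) => x <= y,
                (Some(_), None) => true,
                (None, Some(_)) => false,
                (None, None) => break,
            };
            out.push(if take_a { a.next().unwrap() } else { b.next().unwrap() });
        }
        *target = out;
    } else {
        // Comparable sizes: appending then re-sorting stays competitive.
        target.extend(other);
        target.sort_unstable();
    }
}
```

Either branch leaves `target` sorted; the threshold only decides which allocation pattern pays off for the given size ratio.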
Closes #20609

nvm, I got some bandwidth today, will work on this
I benchmarked with Vec<B256> instead of Vec<u64>, since that matches the real use case. Here are the results from an M1 Pro; they differ significantly from the u64 benchmarks:
The custom merge version is ~30% faster than the current extend_sorted_vec for B256 data. Summary: I tested three approaches for merging sorted vectors:
Key findings:
Why merge wins for B256 (the use case in HashedPostState):
Benchmark code: https://github.com/duyquang6/reth/blob/bench-sorted-extend/crates/trie/common/benches/extend_sorted_vec.rs
Implementation: https://github.com/duyquang6/reth/blob/bench-sorted-extend/crates/trie/common/src/utils.rs
Should I split this into 2 PRs? Since with batch write we can resolve #20609 first.
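The element-width effect can be sketched with a stand-in for B256 (a 32-byte big-endian key; this is an illustration, not the PR's benchmark code): a sort compares and moves 32-byte elements O(n log n) times, while one merge pass touches each element exactly once, so the wider the element, the more the merge saves.

```rust
/// Stand-in for alloy's B256: a 32-byte big-endian key, so byte-wise
/// lexicographic order matches numeric order.
type Key = [u8; 32];

/// Build a 32-byte key from a u64 (big-endian in the last 8 bytes).
fn key(n: u64) -> Key {
    let mut k = [0u8; 32];
    k[24..].copy_from_slice(&n.to_be_bytes());
    k
}

/// One linear pass: each 32-byte key is compared and moved once,
/// versus O(n log n) 32-byte compares/moves for extend-then-sort.
fn merge_keys(target: &mut Vec<Key>, other: Vec<Key>) {
    let mut out = Vec::with_capacity(target.len() + other.len());
    let mut a = std::mem::take(target).into_iter().peekable();
    let mut b = other.into_iter().peekable();
    loop {
        let take_a = match (a.peek(), b.peek()) {
            (Some(x), Some(y)) => x <= y,
            (Some(_), None) => true,
            (None, Some(_)) => false,
            (None, None) => break,
        };
        out.push(if take_a { a.next().unwrap() } else { b.next().unwrap() });
    }
    *target = out;
}
```

With 4x-wider keys than u64, both the comparator and every element move cost more, which is consistent with the merge pulling ahead in the B256 benchmarks even where it lost on u64 data.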
ec2786b to 4208a5c
4208a5c to 95c86c1
May I know what tool you used to do the erc20/native transfer spam?
Hi, we wrote a custom Rust script for benchmarking transaction throughput.
We've gone with a different approach in #21422 and confirmed a small perf improvement based on it. Further improvements can build on that work; going to close this for now.




As discussed in #19739 (comment):
Batch write of `hashed_state` is safe, so I created this PR to cherry-pick the old reverted commit.

Changes
*Note: the "after" results below were measured after also improving extend_sorted_vec, which will be an upcoming PR.
Before:
erc20 transfers spam: ~100ms
native transfers spam: ~50ms

After:
erc20 transfers spam: ~80ms
native transfers spam: ~30ms