feat(trie): MerkleChangeSets pipeline sync stage #18809

Merged
mediocregopher merged 28 commits into 18460-trie-changesets from mediocregopher/18464-trie-cs-pipeline on Oct 3, 2025

@mediocregopher (Member)

Closes #18464

This implements a new sync stage for computing the Accounts/StoragesTrieChangeSets tables. Only changesets up to the finalized block (or 64 blocks ago, if no block is marked finalized) are generated, as these are the only ones necessary for the Engine API simplifications which this change is a part of.

Unwinding is implemented by simply clearing the changesets tables past the target block.
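
As an illustration of the two rules in the description, here is a minimal sketch, not the code from this PR. It assumes a changesets table keyed by block number and modelled as a `BTreeMap`; `changeset_boundary`, `unwind_changesets`, and `FALLBACK_DEPTH` are hypothetical names chosen for the example.

```rust
use std::collections::BTreeMap;

/// Fallback distance from the tip when no block is marked finalized
/// (the 64 blocks mentioned in the description).
const FALLBACK_DEPTH: u64 = 64;

/// Hypothetical helper: the boundary block for changeset generation.
/// Uses the finalized block if one is known, otherwise `tip - 64`,
/// saturating at genesis.
fn changeset_boundary(tip: u64, finalized: Option<u64>) -> u64 {
    finalized.unwrap_or_else(|| tip.saturating_sub(FALLBACK_DEPTH))
}

/// Hypothetical unwind: clears every changeset entry for blocks past
/// `unwind_to` and returns how many blocks were cleared.
fn unwind_changesets<V>(table: &mut BTreeMap<u64, V>, unwind_to: u64) -> usize {
    match unwind_to.checked_add(1) {
        // `split_off` keeps keys below `first_removed` in `table` and
        // returns the rest, which is dropped here.
        Some(first_removed) => table.split_off(&first_removed).len(),
        // `unwind_to` is already the maximum block number: nothing to clear.
        None => 0,
    }
}
```

The real tables are keyed by block number plus a trie path or hashed address, so an actual unwind would walk a key range with a database cursor instead of splitting an in-memory map.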

When implementing trie changesets we'll need to be able to query the
HashedPostState revert just for specific blocks (in order to compute
their PrefixSets). This allows for doing that with minimal other
changes.

A range helper for BlockNumberAddress is added for convenience.
These methods will be used in various places when it's necessary to
delete changeset data. This includes pruning, unwinding, and also when
populating the data during pipeline sync in certain cases.
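
A rough sketch of what a range helper on a composite key can look like. The types here are simplified stand-ins defined for the example (a `u64` block number and a 20-byte array), not the types used in the PR, and the exact signature in the codebase may differ.

```rust
use std::ops::{Range, RangeInclusive};

/// Stand-in for a 20-byte address (simplified for this example).
type Address = [u8; 20];

/// Simplified stand-in for a (block number, address) table key.
/// The derived ordering compares the block number first, then the address.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct BlockNumberAddress(u64, Address);

impl BlockNumberAddress {
    /// Converts an inclusive block range into a half-open key range that
    /// covers every address in those blocks.
    /// Assumes `blocks.end() < u64::MAX`, so the `+ 1` cannot overflow.
    fn range(blocks: RangeInclusive<u64>) -> Range<Self> {
        Self(*blocks.start(), [0u8; 20])..Self(*blocks.end() + 1, [0u8; 20])
    }
}
```

With a helper of this shape, deleting changeset data for blocks 10 through 12 becomes a single key-range walk over `BlockNumberAddress::range(10..=12)`.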

A new type BlockNumberHashedAddressRange is implemented for convenience.
This allows for passing optional overlays into the
`write_trie_changesets` and `write_storage_trie_changesets` provider
methods.

The overlay is a TrieUpdates which is used to augment the state of the
trie db tables. Using the overlay we can write changesets as if the DB
is at a previous block, which will be used during pipeline sync.
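
To make the overlay idea concrete, here is a minimal sketch under simplifying assumptions: trie nodes are plain `u64` values, the db table is a `BTreeMap`, and `Overlay` and `node_with_overlay` are hypothetical names, not the `TrieUpdates` API.

```rust
use std::collections::{BTreeMap, HashMap};

/// Simplified stand-in for a trie node.
type Node = u64;
/// Simplified stand-in for a trie path.
type Path = Vec<u8>;

/// Hypothetical overlay: `Some(node)` is an updated node,
/// `None` marks a node that the overlay removes.
#[derive(Default)]
struct Overlay {
    nodes: HashMap<Path, Option<Node>>,
}

/// Reads a node as if the overlay had been applied on top of the db table.
fn node_with_overlay(db: &BTreeMap<Path, Node>, overlay: &Overlay, path: &[u8]) -> Option<Node> {
    match overlay.nodes.get(path) {
        // The overlay takes precedence, including recorded removals.
        Some(entry) => *entry,
        // No overlay entry for this path: fall through to the db.
        None => db.get(path).copied(),
    }
}
```

If the overlay holds the trie reverts back to an earlier block, reads through this function see the trie as it was at that block, which is the effect the commit message describes.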

Implementing this change required refactoring the
StorageTrieCurrentValuesIter utility to accept a TrieCursor rather than
a normal DbCursor. It also required implementing a TrieCursorIter which
wraps a TrieCursor into an Iterator, for passing in to
`storage_trie_wiped_changeset_iter`. Using both of these changes we
could use an InMemoryTrieCursor instead of a direct db cursor.
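
The cursor-to-iterator wrapping can be sketched roughly as follows. `SimpleCursor`, `CursorIter`, and `VecCursor` are hypothetical names for this example; the real `TrieCursor` trait has a richer interface (seeking, for instance) that is omitted here.

```rust
/// Hypothetical, simplified cursor trait: yields the next entry,
/// or `None` once the cursor is exhausted.
trait SimpleCursor {
    type Item;
    type Error;
    fn next_entry(&mut self) -> Result<Option<Self::Item>, Self::Error>;
}

/// Adapter exposing a cursor as an `Iterator` of fallible items.
struct CursorIter<C> {
    cursor: C,
    done: bool,
}

impl<C> CursorIter<C> {
    fn new(cursor: C) -> Self {
        Self { cursor, done: false }
    }
}

impl<C: SimpleCursor> Iterator for CursorIter<C> {
    type Item = Result<C::Item, C::Error>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done {
            return None;
        }
        match self.cursor.next_entry() {
            Ok(Some(item)) => Some(Ok(item)),
            // Exhausted: stop and stay stopped.
            Ok(None) => {
                self.done = true;
                None
            }
            // Surface the error once, then stop iterating.
            Err(err) => {
                self.done = true;
                Some(Err(err))
            }
        }
    }
}

/// In-memory cursor over a vector, used only to exercise the adapter.
struct VecCursor {
    entries: Vec<(Vec<u8>, u64)>,
    pos: usize,
}

impl SimpleCursor for VecCursor {
    type Item = (Vec<u8>, u64);
    type Error = String;

    fn next_entry(&mut self) -> Result<Option<Self::Item>, Self::Error> {
        let entry = self.entries.get(self.pos).cloned();
        if entry.is_some() {
            self.pos += 1;
        }
        Ok(entry)
    }
}
```

Because the adapter is generic over the cursor, code that consumes the iterator does not need to know whether it is backed by the database or by an in-memory cursor.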

@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Oct 1, 2025
@github-actions github-actions bot added the labels A-db (Related to the database), A-engine (Related to the engine implementation), A-trie (Related to Merkle Patricia Trie implementation), C-enhancement (New feature or request), and C-perf (A change motivated by improving speed, memory usage or disk footprint) on Oct 1, 2025
@mediocregopher mediocregopher changed the title MerkleChangeSets pipeline sync stage feat(trie): MerkleChangeSets pipeline sync stage Oct 1, 2025
@mattsse (Collaborator) left a comment

not the expert when it comes to lower level merkle stuff, but I believe I could follow along here and thanks to the docs all of this makes sense to me

still need @Rjected and @shekhirin for reviews here

Comment on lines +149 to +163
// We need to distinguish a full revert and a per-block revert. A full revert reverts
// changes starting at db tip all the way to a block. A per-block revert only reverts
// a block's changes.
//
// We need to calculate the full HashedPostState reverts for every block in the target
// range. The full HashedPostState revert for block N can be calculated as:
//
// ```
// // where `extend` overwrites any shared keys
// state_revert(N) = state_revert(N + 1).extend(per_block_state_revert(N))
// ```
//
// We need per-block reverts to calculate the prefix set for each individual block. By using
// the per-block reverts to calculate full reverts on-the-fly we can save a bunch of memory.
Collaborator

this is very helpful
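
For readers following the thread, the recurrence in the quoted code comment can be sketched like this. It is a simplified illustration, not the code under review: state is modelled as a map from a key to the value that key had before the block ran, and `Revert` and `for_each_full_revert` are hypothetical names.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a state revert: key -> value before the block(s) ran.
type Revert = HashMap<&'static str, u64>;

/// Walks the blocks from the newest to the oldest, folding each per-block
/// revert into one running full revert. `per_block[i]` is the per-block
/// revert of the i-th block in the target range (ascending order).
/// `visit` receives the block index and the full revert for that block.
fn for_each_full_revert(per_block: &[Revert], mut visit: impl FnMut(usize, &Revert)) {
    let mut running = Revert::new();
    for (idx, revert) in per_block.iter().enumerate().rev() {
        // state_revert(N) = state_revert(N + 1).extend(per_block_state_revert(N));
        // `extend` overwrites shared keys, so the older block's value wins.
        running.extend(revert.iter().map(|(k, v)| (*k, *v)));
        visit(idx, &running);
    }
}
```

Only one running map is alive at any time, which is where the memory saving mentioned in the code comment comes from; the per-block reverts stay available for computing each block's prefix set.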

@jenpaff jenpaff linked an issue Oct 1, 2025 that may be closed by this pull request
///
/// Handles Merkle trie changesets for storage and accounts.
#[value(name = "merkle-changesets")]
MerkleChangeSets,
Member

this is really useful

@yongkangc (Member) left a comment

overall logic flows quite easily for me which is great! left some comments and questions

Co-authored-by: Alexey Shekhirin <5773434+shekhirin@users.noreply.github.com>
@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Oct 3, 2025
mediocregopher and others added 2 commits October 3, 2025 15:40
Co-authored-by: Alexey Shekhirin <5773434+shekhirin@users.noreply.github.com>
@shekhirin (Member) left a comment

Send it 🚀

@mediocregopher mediocregopher merged commit bc60ab1 into 18460-trie-changesets Oct 3, 2025
33 checks passed
@mediocregopher mediocregopher deleted the mediocregopher/18464-trie-cs-pipeline branch October 3, 2025 14:04
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Oct 3, 2025
@jenpaff jenpaff moved this from Done to Completed in Reth Tracker Oct 6, 2025
Labels

A-db (Related to the database), A-engine (Related to the engine implementation), A-trie (Related to Merkle Patricia Trie implementation), C-enhancement (New feature or request), C-perf (A change motivated by improving speed, memory usage or disk footprint)

Projects

Status: Completed

Development

Successfully merging this pull request may close these issues.

Trie ChangeSets: Pipeline sync

4 participants