feat: Write trie changesets to DB on engine persistence #18584

Merged
mediocregopher merged 3 commits into 18460-trie-changesets from mediocregopher/18465-trie-cs-engine-persistence
Sep 25, 2025

@mediocregopher
Member

Closes #18465

This introduces two new provider methods for writing the trie changeset data based on the current Accounts/StorageTrie dataset. These will be used in the engine API (as added by this PR) as well as in the pipeline sync.

There is also a slight refactor: I added a helper for generating the storage changesets in the same module as the existing helper for storage reverts, which was alone in a module called `bundle_state`. I renamed the module to `changeset_utils` since everything inside has to do with populating changeset tables.

@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Sep 19, 2025
@github-actions github-actions bot added A-db Related to the database A-engine Related to the engine implementation A-trie Related to Merkle Patricia Trie implementation C-enhancement New feature or request C-perf A change motivated by improving speed, memory usage or disk footprint labels Sep 19, 2025
trie.take_present().ok_or(ProviderError::MissingTrieUpdates(block_hash))?;

// sort trie updates and insert changesets
// TODO(mediocregopher): We should rework `write_trie_updates` to also accept a
Member Author

I think this will be done as part of #18467.

@mediocregopher mediocregopher marked this pull request as ready for review September 22, 2025 10:02
Member

@Rjected Rjected left a comment

i like the rename, wondering if we can make the loop simpler

Comment on lines +58 to +100
loop {
    match (self.paths.peek(), self.cursor_current.as_mut()) {
        (None, _) => {
            // If there are no more paths then there is no further possible output.
            return None
        }
        (Some(path), None) => {
            // If there is a path but the cursor is empty then that path has no node.
            let path = *path;
            self.paths.next();
            return Some(Ok((path, None)))
        }
        (Some(path), Some((cursor_path, cursor_node))) => {
            // There is both a path and a cursor value, compare their paths.
            match path.cmp(cursor_path) {
                Ordering::Less => {
                    // If the path is behind the cursor then there is no value for that
                    // path, increment the path iter and produce None.
                    let path = *path;
                    self.paths.next();
                    return Some(Ok((path, None)))
                }
                Ordering::Equal => {
                    // If the target path and cursor's path match then there is a value for
                    // that path, increment the path iter and return the value. We don't
                    // increment the cursor here, that will be handled on the next call to
                    // `next` in the `Ordering::Greater` branch if the path iter is not
                    // None.
                    let cursor_node = core::mem::take(cursor_node);
                    self.paths.next();
                    return Some(Ok((*cursor_path, Some(cursor_node))))
                }
                Ordering::Greater => {
                    // If the path is ahead of the cursor then seek the cursor forward and
                    // loop. The cursor will either seek to the path or beyond it.
                    let path = *path;
                    if let Err(err) = self.seek_cursor(path) {
                        return Some(Err(err))
                    }
                }
            }
        }
    }
Member

does this ever loop more than once? the final branch is the only one that doesn't return, and it looks like as long as seek_cursor works as expected, we should return right after that. If so, maybe we can do something like:

let curr_cursor = self.cursor_current.as_mut();

let Some(curr_path) = self.paths.peek() else {
    return None
};

if curr_cursor.is_some_and(|(cursor_path, _)| curr_path > cursor_path) {
    if let Err(err) = self.seek_cursor(*curr_path) {
        return Some(Err(err))
    }
}

// do big match on curr_cursor

Member Author

Good call, I refactored this to get rid of the loop, and ended up being able to remove the Peekable over the paths iterator as well 🔥
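
For readers following the thread, here is a minimal sketch of what a loop-free `next` without `Peekable` could look like. It is not the code that was merged. `CurrentValues`, the `u32` paths, the `String` nodes, and the `BTreeMap` standing in for the DB cursor are all assumptions made for illustration, and error handling is omitted (the real iterator yields `Result`s). It assumes the paths arrive in ascending order.

```rust
use std::collections::BTreeMap;

/// Hypothetical iterator: for each requested path, yields the node currently
/// stored at that path, or `None` if nothing is stored there.
struct CurrentValues<'a, I> {
    /// Requested paths, assumed to be sorted ascending.
    paths: I,
    /// Stand-in for the trie table behind a DB cursor.
    nodes: &'a BTreeMap<u32, String>,
    /// Entry the cursor currently points at; `None` once it is exhausted.
    cursor_current: Option<(u32, String)>,
}

impl<'a, I: Iterator<Item = u32>> CurrentValues<'a, I> {
    fn new(paths: I, nodes: &'a BTreeMap<u32, String>) -> Self {
        let mut this = Self { paths, nodes, cursor_current: None };
        // Position the cursor on the first entry of the table.
        this.seek_cursor(u32::MIN);
        this
    }

    /// Moves the cursor to the first entry at or after `path`.
    fn seek_cursor(&mut self, path: u32) {
        self.cursor_current =
            self.nodes.range(path..).next().map(|(k, v)| (*k, v.clone()));
    }
}

impl<I: Iterator<Item = u32>> Iterator for CurrentValues<'_, I> {
    type Item = (u32, Option<String>);

    fn next(&mut self) -> Option<Self::Item> {
        // No more paths means no more output.
        let path = self.paths.next()?;

        // Seek only when the cursor is behind the path. After the seek the
        // cursor is at or beyond the path, so one comparison is enough.
        if self.cursor_current.as_ref().is_some_and(|(cursor_path, _)| *cursor_path < path) {
            self.seek_cursor(path);
        }

        let node = match &self.cursor_current {
            Some((cursor_path, node)) if *cursor_path == path => Some(node.clone()),
            // Cursor exhausted or already past the path: nothing stored there.
            _ => None,
        };
        Some((path, node))
    }
}
```

Because the seek happens before the comparison, each call consumes exactly one path and returns, which is why neither the loop nor the peek is needed in this sketch.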

Comment on lines +131 to +135
let merged = merge_join_by(curr_values_of_changed, all_nodes, |a, b| match (a, b) {
    (Err(_), _) => Ordering::Less,
    (_, Err(_)) => Ordering::Greater,
    (Ok(a), Ok(b)) => a.0.cmp(&b.0),
});
Member

i see, so this merge join fn is essentially the diff operation
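
To illustrate why a merge join behaves like a diff, here is a small std-only sketch of the idea. The PR itself uses `merge_join_by` from `itertools`. The `Joined` enum and `merge_join` function below are hypothetical stand-ins written for this example, and both inputs are assumed to be sorted by key with no errors in the stream.

```rust
use std::cmp::Ordering;

/// Result of joining two sorted sequences on their keys.
#[derive(Debug, PartialEq)]
enum Joined<K, V> {
    /// Key present only in the left input.
    Left((K, V)),
    /// Key present in both inputs.
    Both((K, V), (K, V)),
    /// Key present only in the right input.
    Right((K, V)),
}

/// Walks two key-sorted inputs in lockstep and classifies every key.
fn merge_join<K: Ord, V>(left: Vec<(K, V)>, right: Vec<(K, V)>) -> Vec<Joined<K, V>> {
    let mut left = left.into_iter().peekable();
    let mut right = right.into_iter().peekable();
    let mut out = Vec::new();

    loop {
        // Decide which side holds the smaller key; an exhausted side loses.
        let ord = match (left.peek(), right.peek()) {
            (None, None) => break,
            (Some(_), None) => Ordering::Less,
            (None, Some(_)) => Ordering::Greater,
            (Some(l), Some(r)) => l.0.cmp(&r.0),
        };
        match ord {
            Ordering::Less => out.extend(left.next().map(Joined::Left)),
            Ordering::Greater => out.extend(right.next().map(Joined::Right)),
            Ordering::Equal => {
                out.extend(left.next().zip(right.next()).map(|(l, r)| Joined::Both(l, r)))
            }
        }
    }
    out
}
```

Reading the output as a diff: `Left` entries exist only in the first input, `Right` entries only in the second, and `Both` entries share a key, so their values can be compared directly.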

@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Sep 23, 2025
Member

@Rjected Rjected left a comment

lgtm

@mediocregopher mediocregopher merged commit a7d0798 into 18460-trie-changesets Sep 25, 2025
37 checks passed
@mediocregopher mediocregopher deleted the mediocregopher/18465-trie-cs-engine-persistence branch September 25, 2025 14:33
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Sep 25, 2025