feat: Write trie changesets to DB on engine persistence#18584
Merged
mediocregopher merged 3 commits into18460-trie-changesetsfrom Sep 25, 2025
Merged
Conversation
Closes #18465 This introduces two new provider methods for writing the trie changeset data based on the current Accounts/StorageTrie dataset. These will be used in the engine API (as added by this PR) as well as in the pipeline sync. There is also a slight refactor; I added a helper for generating the storage changesets in the same module space as the existing helper for storage reverts, which was alone in a module called `bundle_state`. I renamed the module to `changeset_utils` since everything in side has to do with populating changeset tables.
mediocregopher
commented
Sep 19, 2025
| trie.take_present().ok_or(ProviderError::MissingTrieUpdates(block_hash))?; | ||
|
|
||
| // sort trie updates and insert changesets | ||
| // TODO(mediocregopher): We should rework `write_trie_updates` to also accept a |
shekhirin
reviewed
Sep 22, 2025
Rjected
reviewed
Sep 22, 2025
Member
Rjected
left a comment
There was a problem hiding this comment.
i like the rename, wondering if we can make the loop simpler
Comment on lines
+58
to
+100
| loop { | ||
| match (self.paths.peek(), self.cursor_current.as_mut()) { | ||
| (None, _) => { | ||
| // If there are no more paths then there is no further possible output. | ||
| return None | ||
| } | ||
| (Some(path), None) => { | ||
| // If there is a path but the cursor is empty then that path has no node. | ||
| let path = *path; | ||
| self.paths.next(); | ||
| return Some(Ok((path, None))) | ||
| } | ||
| (Some(path), Some((cursor_path, cursor_node))) => { | ||
| // There is both a path and a cursor value, compare their paths. | ||
| match path.cmp(cursor_path) { | ||
| Ordering::Less => { | ||
| // If the path is behind the cursor then there is no value for that | ||
| // path, increment the path iter and produce None. | ||
| let path = *path; | ||
| self.paths.next(); | ||
| return Some(Ok((path, None))) | ||
| } | ||
| Ordering::Equal => { | ||
| // If the target path and cursor's path match then there is a value for | ||
| // that path, increment the path iter and return the value. We don't | ||
| // increment the cursor here, that will be handled on the next call to | ||
| // `next` in the `Ordering::Greater` branch if the path iter is not | ||
| // None. | ||
| let cursor_node = core::mem::take(cursor_node); | ||
| self.paths.next(); | ||
| return Some(Ok((*cursor_path, Some(cursor_node)))) | ||
| } | ||
| Ordering::Greater => { | ||
| // If the path is ahead of the cursor then seek the cursor forward and | ||
| // loop. The cursor will either seek to the path or beyond it. | ||
| let path = *path; | ||
| if let Err(err) = self.seek_cursor(path) { | ||
| return Some(Err(err)) | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } |
Member
There was a problem hiding this comment.
does this ever loop more than once? the final branch is the only one that doesn't return, and it looks like as long as seek_cursor works as expected, we should return right after that. If so, maybe we can do something like:
let curr_cursor = self.cursor_current.as_mut();
let Some(curr_path) = self.paths.peek() else {
return None
};
if curr_cursor.is_some_and(|(cursor_path, _)| curr_path > cursor_path) {
if let Err(err) = self.seek_cursor(*path) {
return Some(Err(err))
}
}
// do big match on curr_cursor
Member
Author
There was a problem hiding this comment.
Good call, I refactored this to get rid of the loop, and ended up being able to remove the Peekable over the paths iterator as well 🔥
Comment on lines
+131
to
+135
| let merged = merge_join_by(curr_values_of_changed, all_nodes, |a, b| match (a, b) { | ||
| (Err(_), _) => Ordering::Less, | ||
| (_, Err(_)) => Ordering::Greater, | ||
| (Ok(a), Ok(b)) => a.0.cmp(&b.0), | ||
| }); |
Member
There was a problem hiding this comment.
i see, so this merge join fn is essentially the diff operation
shekhirin
approved these changes
Sep 23, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #18465
This introduces two new provider methods for writing the trie changeset data based on the current Accounts/StorageTrie dataset. These will be used in the engine API (as added by this PR) as well as in the pipeline sync.
There is also a slight refactor; I added a helper for generating the storage changesets in the same module space as the existing helper for storage reverts, which was alone in a module called
bundle_state. I renamed the module tochangeset_utilssince everything in side has to do with populating changeset tables.