refactor(core): semantic journalling #28880#2081
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Refactors the state journaling system to make journal operations more semantic and to centralize snapshot/revision bookkeeping inside the journal, alongside some storage journaling semantics improvements and related test updates.
Changes:
- Moves snapshot/revert revision tracking from
StateDBintojournal(snapshot,revertToSnapshot,reset, copy behavior). - Introduces more semantic journal helper methods (e.g.,
logChange,refundChange,storageChange,touchChange, access list helpers) and updates call sites. - Refines storage dirty/pending handling to better model “revert-to-origin” behavior, and updates snapshot/fuzz tests accordingly.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| core/state/statedb.go | Delegates snapshot/revert and various mutations to semantic journal methods; changes AddPreimage behavior. |
| core/state/journal.go | Adds revision tracking + snapshot/reset APIs; introduces semantic append helpers; adjusts change structs. |
| core/state/state_object.go | Updates storage access/mutation flow to track original values and clean up dirty/pending slots when reverted. |
| core/state/statedb_test.go | Renames an action label and adds a guard around SetCode in randomized snapshot tests. |
| core/state/statedb_fuzz_test.go | Renames an action label in fuzz-driven state tests. |
Comments suppressed due to low confidence (2)
core/state/statedb_test.go:350
- The action label was renamed to "SetStorage", but the action still calls StateDB.SetState (set a single storage slot), not StateDB.SetStorage (replace entire storage). This makes snapshot test failure output misleading; consider renaming to something like "SetState"/"SetSlot" to match the invoked API.
name: "SetStorage",
fn: func(a testAction, s *StateDB) {
var key, val common.Hash
binary.BigEndian.PutUint16(key[:], uint16(a.args[0]))
binary.BigEndian.PutUint16(val[:], uint16(a.args[1]))
s.SetState(addr, key, val)
},
core/state/statedb_fuzz_test.go:79
- The action label was renamed to "SetStorage", but the fuzz action still calls StateDB.SetState (set a single storage slot). This name collides with the existing StateDB.SetStorage API and can be confusing when interpreting fuzz traces; consider renaming to "SetState"/"SetSlot".
{
name: "SetStorage",
fn: func(a testAction, s *StateDB) {
var key, val common.Hash
binary.BigEndian.PutUint16(key[:], uint16(a.args[0]))
binary.BigEndian.PutUint16(val[:], uint16(a.args[1]))
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
240e56a to
f37fcb8
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func (s *stateObject) getState(key common.Hash) (common.Hash, common.Hash) { | ||
| origin := s.GetCommittedState(key) | ||
| value, dirty := s.dirtyStorage[key] | ||
| if dirty { | ||
| return value | ||
| return value, origin |
There was a problem hiding this comment.
getState now calls GetCommittedState before checking dirtyStorage, meaning a read-only GetState of an already-dirty slot still incurs the committed-state lookup (and potentially a trie read the first time). Consider restoring the fast path for GetState (check dirtyStorage first) and only loading the committed/origin value in the write path (SetState) when it’s actually needed for journalling.
| logs: make(map[common.Hash][]*types.Log, len(s.logs)), | ||
| logSize: s.logSize, | ||
| preimages: maps.Clone(s.preimages), | ||
| journal: s.journal.copy(), | ||
| validRevisions: slices.Clone(s.validRevisions), | ||
| nextRevisionId: s.nextRevisionId, | ||
| } |
There was a problem hiding this comment.
StateDB.Copy still states that snapshots of the copied state cannot be applied, but journal.copy() now clones validRevisions/nextRevisionId, which makes snapshot IDs and the revision stack carry over into the copy. Either reset snapshot metadata on copy (to preserve the documented behavior) or update the Copy docs/expectations accordingly.
This is a follow-up to ethereum#29520, and a preparatory PR to a more thorough change in the journalling system. This PR hides the journal-implementation details away, so that the statedb invokes methods like `JournalCreate`, instead of explicitly appending journal-events in a list. This means that it's up to the journal whether to implement it as a sequence of events or aggregate/merge events. This PR also makes it so that management of valid snapshots is moved inside the journal, exposed via the methods `Snapshot() int` and `RevertToSnapshot(revid int, s *StateDB)`. JournalSetCode journals the setting of code: it is implicit that the previous values were "no code" and emptyCodeHash. Therefore, we can simplify the setCode journal. The self-destruct journalling is a bit strange: we allow the selfdestruct operation to be journalled several times. This makes it so that we also are forced to store whether the account was already destructed. What we can do instead, is to only journal the first destruction, and after that only journal balance-changes, but not journal the selfdestruct itself. This simplifies the journalling, so that internals about state management does not leak into the journal-API. Preimages were, for some reason, integrated into the journal management, despite not being a consensus-critical data structure. This PR undoes that. --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>
f37fcb8 to
b904d22
Compare
Proposed changes
This is a follow-up to #1934, and a preparatory PR to a more thorough change in the journalling system.
API methods instead of append operations
This PR hides the journal-implementation details away, so that the statedb invokes methods like JournalCreate, instead of explicitly appending journal-events in a list. This means that it's up to the journal whether to implement it as a sequence of events or aggregate/merge events.
Snapshot-management inside the journal
This PR also makes it so that management of valid snapshots is moved inside the journal, exposed via the methods Snapshot() int and RevertToSnapshot(revid int, s *StateDB).
SetCode
JournalSetCode journals the setting of code: it is implicit that the previous values were "no code" and emptyCodeHash. Therefore, we can simplify the setCode journal.
Selfdestruct
The self-destruct journalling is a bit strange: we allow the selfdestruct operation to be journalled several times. This makes it so that we also are forced to store whether the account was already destructed.
What we can do instead, is to only journal the first destruction, and after that only journal balance-changes, but not journal the selfdestruct itself.
This simplifies the journalling, so that internals about state management does not leak into the journal-API.
Preimages
Preimages were, for some reason, integrated into the journal management, despite not being a consensus-critical data structure. This PR undoes that.
Ref: ethereum#28880
Types of changes
What types of changes does your code introduce to XDC network?
Put an
✅in the boxes that applyImpacted Components
Which parts of the codebase does this PR touch?
Put an
✅in the boxes that applyChecklist
Put an
✅in the boxes once you have confirmed below actions (or provide reasons on not doing so) that