Skip to content

feat(stateless): make witness generation conform to the draft specs#22289

Merged
mattsse merged 9 commits into
paradigmxyz:mainfrom
jsign:jsign/witness-generation-testing
Apr 13, 2026
Merged

feat(stateless): make witness generation conform to the draft specs#22289
mattsse merged 9 commits into
paradigmxyz:mainfrom
jsign:jsign/witness-generation-testing

Conversation

@jsign
Copy link
Copy Markdown
Contributor

@jsign jsign commented Feb 17, 2026

This PR contains changes to Reth to conform to the witness generations specs draft. These specs are still in flux (PR 1 and PR 3). We are working with the EF STEEL team to officialize them.

I'm leaving this PR in draft for a bit to get some feedback about the changes, mostly reg the last paragraph of this PR description.

You don't need to understand the specs fully to review this PR, since all changes must make sense on their own, which I'll help to explain in this PR description and comments.

Note that other required fixes aren't in this repo but in the newly created paradigmxyz/stateless (PR there to be created soon).

Below, I’ll summarise the changes, but I’ll also create PR comments extending the explanation a bit more for each case:

  • Avoid including empty bytecode case in the witness (i.e. rlp encoded 0x80 ).
  • Avoid including empty MPT node references (i.e. rlp encoded 0x80).
  • Avoid including bytecodes generated during execution (i.e., only pre-state bytecodes are needed by a stateless validator).
  • Avoid including storage tries root node if the accessed account never accessed any storage slots.
  • Change both ParallelTrie and StatelessTrie (but this now lies in stateless repo thus in other PR) post-root calculation to apply insertions/updates first and deletions after to minimise MPT nodes included in the witness — this is related to minimising cases of sibling branch compression inclusions.
  • Conform to the ordering of nodes, bytecodes and ancestors in ExecutionWitness as defined in the spec (basically, lexicographic order for the first two, and blockNum asc for the latter).

There is only one change that makes this new execution witness version a breaking change: the new post-state root update ordering (5th bullet point above). If someone uses an execution witness generated by Reth done with this PR, with an older stateless trie implementation, it might find that the post-state root can’t be calculated because the older version of the trie might ask for more MPT nodes than strictly needed.

We could fix this by adding more code to still support both witness generation styles for a while until people update their code (prob also involves modifying RPC debug_executionWitness to have an extra optional param to ask for “canonical form” or the current/mainnet style, or create a new one). If we also want to keep the behavior for other fields (i.e. still keep the bytecodes bloating and other cases, it can become a bit messy since basically all code fixes in this PR should be guarded in some kind of if canonical_witness or similar. I guess doable but can get tricky.

Note that the reverse is okay — the new stateless trie (that now lives in paradigmxyz/stateless) can support older-style execution witnesses, since these extra MPT nodes are totally okay to be there (they won’t be used). I want to gauge a bit what the Reth team prefers to do on this front.

Copy link
Copy Markdown
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ty, some suggestions

also please let me know if you have any suggestions for how to make this easier, will try my best to keep the stateless crate and this here in sync

Comment thread crates/revm/src/witness.rs Outdated
Comment thread crates/revm/src/witness.rs Outdated
@github-project-automation github-project-automation Bot moved this from Backlog to In Progress in Reth Tracker Feb 17, 2026
Copy link
Copy Markdown
Contributor Author

@jsign jsign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattsse adding some extra comments to help understanding.

Comment thread crates/revm/src/witness.rs
Comment thread crates/revm/src/witness.rs Outdated
Comment thread crates/storage/provider/src/providers/state/historical.rs Outdated
Comment thread crates/trie/db/tests/witness.rs Outdated
Comment thread crates/trie/trie/src/witness.rs Outdated
Comment thread crates/trie/trie/src/witness.rs Outdated
Comment thread crates/trie/trie/src/witness.rs Outdated
@jsign jsign changed the title Make witness generation conform to the draft specs feat(stateless): make witness generation conform to the draft specs Feb 17, 2026
@jsign jsign marked this pull request as ready for review February 18, 2026 00:22
@jsign jsign requested a review from mattsse February 18, 2026 20:39
Copy link
Copy Markdown
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense

maybe @mediocregopher wants to do a quick sanity review

@github-actions github-actions Bot added the S-stale This issue/PR is stale and will close with no further activity label Mar 17, 2026
@mediocregopher
Copy link
Copy Markdown
Member

Regarding the backwards compat question: I don't really have enough context on how TrieWitness is being used by other clients to understand how disruptive this would be for others. Since it's just a matter of doing upserts or removals first I think including some kind of flag from RPC wouldn't be significantly disruptive.

Regarding changes as a whole I don't see any problems at a logical level. However since #22922 the TrieWitness has been completely rewritten and you'll need to redo your changes to it, unfortunately. I had already separated upserts/removals in that PR, but in the opposite order in order to retain maximum backwards compatibility. As already mentioned we can reverse this easily enough if that's the direction the winds are blowing. I don't think there's any other features mentioned here that will be difficult to accomplish with the new implementation, it just needs to be done.

@jsign
Copy link
Copy Markdown
Contributor Author

jsign commented Apr 5, 2026

@mediocregopher, thanks! I'll be working to rebase this PR on top of latest main and add a flag to support both legacy and canonical witness. I'll ping again whenever it's ready for some eyes.

Copy link
Copy Markdown
Contributor Author

@jsign jsign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mediocregopher, this is ready for new eyes:

  • Rebased on top of master
  • Adjusted for the new V2 stuff
  • Added ExecutionWitnessMode so both styles Legacy (current) and Canonical (new) behavior can co-exist.

I resolved all previous comments, and created new ones so you only see fresh ones after this re-work.

cc @figtracer

&self,
_input: TrieInput,
_target: HashedPostState,
_mode: reth_trie::ExecutionWitnessMode,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExecutionWitnessMode is a new enum to allow generating/using both witness types: Legacy (current), and Canonical (new).

At the debug_executionWitness level there is a new parameter that allows to select this. It is an Option so not providing it defaults to Legacy for backwards compat.

)
.collect();
/// Records the state after execution using the given witness generation mode.
pub fn record_executed_state<DB>(&mut self, statedb: &State<DB>, mode: ExecutionWitnessMode) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we have both.

.map(|hm| hm.into_values().collect())
.with_execution_witness_mode(mode);
let witness =
if mode.is_canonical() { witness } else { witness.always_include_root_node() };
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note always_include_root_node() now depends on mode.

if mode.is_canonical() { witness } else { witness.always_include_root_node() };
witness.compute(target).map_err(ProviderError::from).map(|hm| {
let mut values: Vec<_> = hm.into_values().collect();
if mode.is_canonical() {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, reg sorting.

Comment on lines +55 to +65
let canonical_empty_witness = TrieWitness::new(
DatabaseTrieCursorFactory::<_, A>::new(provider.tx_ref()),
DatabaseHashedCursorFactory::new(provider.tx_ref()),
)
.with_execution_witness_mode(ExecutionWitnessMode::Canonical)
.compute(HashedPostState {
accounts: HashMap::from_iter([(hashed_address, Some(Account::default()))]),
storages: HashMap::default(),
})
.unwrap();
assert!(canonical_empty_witness.is_empty());
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a refresher: compared to Legacy above, the empty root hash must not be in the witness since only by the hash clients can know which is the RLP-representation.

Comment on lines +252 to +284
let target_state = HashedPostState {
accounts: HashMap::from_iter([(
hashed_address,
Some(Account { balance: U256::from(2), ..Default::default() }),
)]),
storages: HashMap::default(),
};

let legacy_witness = TrieWitness::new(
DatabaseTrieCursorFactory::<_, A>::new(provider.tx_ref()),
DatabaseHashedCursorFactory::new(provider.tx_ref()),
)
.with_execution_witness_mode(ExecutionWitnessMode::Legacy)
.compute(target_state.clone())
.unwrap();
assert!(legacy_witness.contains_key(&state_root));
for node in multiproof.account_subtree.values() {
assert_eq!(legacy_witness.get(&keccak256(node)), Some(node));
}
assert!(legacy_witness.contains_key(&storage_root));

let canonical_witness = TrieWitness::new(
DatabaseTrieCursorFactory::<_, A>::new(provider.tx_ref()),
DatabaseHashedCursorFactory::new(provider.tx_ref()),
)
.with_execution_witness_mode(ExecutionWitnessMode::Canonical)
.compute(target_state)
.unwrap();
assert!(canonical_witness.contains_key(&state_root));
for node in multiproof.account_subtree.values() {
assert_eq!(canonical_witness.get(&keccak256(node)), Some(node));
}
assert!(!canonical_witness.contains_key(&storage_root));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is covering another behavior difference between Legacy and Canonical: if the storage trie of an account doesn't change:

  • Legacy always include the storage trie root node RLP (not sure why, but that's how it works)
  • Canonical, doesn't since isn't required for anything.

}

#[test]
fn canonical_mode_handles_mixed_storage_inserts_and_removals() {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last remaining diff between Legacy and Canonical: how apply state diffs when calculating post-state root.

As quick context:

  • Legacy: applies removals first and then insert/updates -- my interpretation is that this is done to provide the most "compatibility" with users since it will maximize the number of potential nodes that might be required.
  • Canonical: applies insert/updates first and then removals. This is done to minimize branch compressions, thus minimize the witness size as much as possible. This implies that whoever is using this witness should also be a "canonical" one in the sense (i.e. apply insert/updates firsta nd removals after).

let address = Address::random();
let hashed_address = keccak256(address);

// Pre-state storage root is a branch with two children at nibbles 1 and 2.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR: we have a branch node with two children, and we have some state-diff that will add a new children and remove an existing one.

This setup allows to see if the sibiling is or isn't in the witness depending on the mode.

Comment thread crates/revm/src/witness.rs Outdated
Comment thread crates/trie/common/src/execution_witness.rs Outdated
Comment thread crates/trie/common/src/execution_witness.rs Outdated
@mediocregopher mediocregopher requested a review from mattsse April 8, 2026 10:01
jsign added 3 commits April 8, 2026 10:52
Signed-off-by: jsign <jsign.uy@gmail.com>
Copy link
Copy Markdown
Member

@mediocregopher mediocregopher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small nit but lgtm

Comment thread crates/trie/common/src/execution_witness.rs Outdated
Co-authored-by: Brian Picciano <me@mediocregopher.com>
@mattsse mattsse enabled auto-merge April 13, 2026 09:39
@mattsse mattsse added this pull request to the merge queue Apr 13, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 13, 2026
@mattsse mattsse merged commit a05960a into paradigmxyz:main Apr 13, 2026
34 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in Reth Tracker Apr 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-stale This issue/PR is stale and will close with no further activity

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants