refactor(engine): simplify InvalidBlockWitnessHook::on_invalid_block for better testability#18696
refactor(engine): simplify InvalidBlockWitnessHook::on_invalid_block for better testability#18696yongkangc merged 11 commits intoparadigmxyz:mainfrom 0xKarl98:main
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR refactors the InvalidBlockWitnessHook::on_invalid_block method by breaking it down into smaller, focused functions to improve testability and maintainability. The refactoring separates concerns into distinct methods for data collection, proof generation, witness operations, and validation.
- Extracts block re-execution logic into
re_execute_blockmethod - Splits validation logic into separate methods for bundle state and state root validation
- Adds comprehensive test coverage for the newly extracted methods
Reviewed Changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| crates/engine/invalid-block-hooks/src/witness.rs | Refactored main hook implementation by extracting helper methods and adding extensive test coverage |
| crates/engine/invalid-block-hooks/Cargo.toml | Added test dependencies required for the new test suite |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
| use reth_revm::db::{BundleAccount,BundleState}; |
There was a problem hiding this comment.
Missing space after comma in the import statement. Should be BundleAccount, BundleState.
| struct DataCollector; | ||
|
|
||
| impl DataCollector { | ||
| fn collect(mut db: State<StateProviderDatabase<Box<dyn StateProvider>>>) -> eyre::Result<(BTreeMap<B256, Bytes>, BTreeMap<B256, Bytes>, reth_trie::HashedPostState, BundleState)> { |
There was a problem hiding this comment.
[nitpick] The return type tuple is complex and unclear. Consider creating a struct to represent the collected data with named fields like codes, preimages, hashed_state, and bundle_state.
| struct ProofGenerator; | ||
|
|
||
| impl ProofGenerator { | ||
| fn generate(codes: BTreeMap<B256, Bytes>, preimages: BTreeMap<B256, Bytes>, hashed_state: reth_trie::HashedPostState, state_provider: Box<dyn StateProvider>) -> eyre::Result<ExecutionWitness> { |
There was a problem hiding this comment.
[nitpick] This function has too many parameters. Consider using a struct to group related parameters or accepting the output from DataCollector::collect directly.
| Ok(path) | ||
| } | ||
|
|
||
| fn save_diff<T: std::fmt::Debug>(&self, filename: &str, old: &T, new: &T) -> eyre::Result<PathBuf> { |
There was a problem hiding this comment.
The generic constraint should use PartialEq in addition to Debug to ensure the values can be compared, or document that this function only displays differences without verifying they actually differ.
| fn save_diff<T: std::fmt::Debug>(&self, filename: &str, old: &T, new: &T) -> eyre::Result<PathBuf> { | |
| fn save_diff<T: std::fmt::Debug + PartialEq>(&self, filename: &str, old: &T, new: &T) -> eyre::Result<PathBuf> { |
yongkangc
left a comment
There was a problem hiding this comment.
overall much simpler, left some questions.
Think the logic is not exactly equivalent?
| } | ||
|
|
||
| fn save_file<T: Serialize>(&self, filename: String, value: &T) -> eyre::Result<PathBuf> { | ||
| fn save_json<T: serde::Serialize>(&self, filename: &str, data: &T) -> eyre::Result<PathBuf> { |
There was a problem hiding this comment.
| fn save_json<T: serde::Serialize>(&self, filename: &str, data: &T) -> eyre::Result<PathBuf> { | |
| fn save_file<T: Serialize>(&self, filename: &str, data: &T) -> eyre::Result<PathBuf> { |
Think it might be better to reduce changes here. Don't see a reason to change file name?
There was a problem hiding this comment.
Yeah i was kind of thinking to combine save_json and save_diff into one function , as to reduce the loc here .
But seems should keep them the same as before would be better
| fn save_diff<T: std::fmt::Debug>(&self, filename: &str, old: &T, new: &T) -> eyre::Result<PathBuf> { | ||
| let path = self.output_directory.join(filename); | ||
| let content = Comparison::new(old, new).to_string(); | ||
| File::create(&path)?.write_all(content.as_bytes())?; | ||
| Ok(path) | ||
| } |
There was a problem hiding this comment.
Is there a reason why we cant keep it simliar to the prev implementation here?
i.e
/// Saves the diff of two values into a file with the given name in the output directory.
fn save_diff<T: PartialEq + Debug>(
&self,
filename: String,
original: &T,
new: &T,
) -> eyre::Result<PathBuf> {
let path = self.output_directory.join(filename);
let diff = Comparison::new(original, new);
File::create(&path)?.write_all(diff.to_string().as_bytes())?;
Ok(path)
}There was a problem hiding this comment.
Yeah will restore it
| &response, | ||
| )?; | ||
| /// Handles witness generation, saving, and comparison with healthy node | ||
| fn handle_witness_operations( |
There was a problem hiding this comment.
i like the idea of this function, makes it much easier to see
| let codes = db | ||
| .cache | ||
| .contracts | ||
| .values() | ||
| .map(|code| code.original_bytes()) | ||
| .chain( | ||
| // cache state does not have all the contracts, especially when | ||
| // a contract is created within the block | ||
| // the contract only exists in bundle state, therefore we need | ||
| // to include them as well | ||
| bundle_state.contracts.values().map(|code| code.original_bytes()), | ||
| ) | ||
| .collect(); | ||
|
|
||
| // Grab all account proofs for the data accessed during block execution. | ||
| // | ||
| // Note: We grab *all* accounts in the cache here, as the `BundleState` prunes | ||
| // referenced accounts + storage slots. | ||
| let mut hashed_state = db.database.hashed_post_state(&bundle_state); | ||
| for (address, account) in db.cache.accounts { |
There was a problem hiding this comment.
where did all these function logic go?
There was a problem hiding this comment.
This locates at the DataCollector impl
|
also @0xKarl98 could you also fix the ci as well? |
|
@yongkangc May you review this again ? |
| struct DataCollector; | ||
|
|
||
| #[derive(Debug, PartialEq, Eq)] | ||
| struct BundleStateSorted { | ||
| /// Account state | ||
| pub state: BTreeMap<Address, BundleAccountSorted>, | ||
| /// All created contracts in this block. | ||
| pub contracts: BTreeMap<B256, Bytecode>, | ||
| /// Changes to revert | ||
| /// | ||
| /// **Note**: Inside vector is *not* sorted by address. | ||
| /// | ||
| /// But it is unique by address. | ||
| pub reverts: Vec<Vec<(Address, AccountRevertSorted)>>, | ||
| /// The size of the plain state in the bundle state | ||
| pub state_size: usize, | ||
| /// The size of reverts in the bundle state | ||
| pub reverts_size: usize, | ||
| } | ||
| impl DataCollector { |
There was a problem hiding this comment.
can we remove this struct? seems like the struct might be a shallwo abstraction
There was a problem hiding this comment.
You mean DataCollector ? If so , will do , and will also deal with proof generator in the same way
|
@yongkangc Have done that requested changes |
| pub reverts_size: usize, | ||
| } | ||
| impl DataCollector { | ||
| fn collect( |
There was a problem hiding this comment.
could we leave clear and concise docstring for this?
There was a problem hiding this comment.
This part of code has been removed in previous touchups
| type CollectionResult = | ||
| (BTreeMap<B256, Bytes>, BTreeMap<B256, Bytes>, reth_trie::HashedPostState, BundleState); | ||
|
|
||
| fn sort_bundle_state_for_comparison(bundle_state: &BundleState) -> serde_json::Value { |
There was a problem hiding this comment.
could we leave a line of docstring here thats clear and concise explaining what this does?
Rjected
left a comment
There was a problem hiding this comment.
thanks! mostly style comments and suggestions about using serde_json::Value so far
| type CollectionResult = | ||
| (BTreeMap<B256, Bytes>, BTreeMap<B256, Bytes>, reth_trie::HashedPostState, BundleState); | ||
|
|
||
| /// Converts bundle state to sorted JSON format for deterministic comparison | ||
| fn sort_bundle_state_for_comparison(bundle_state: &BundleState) -> serde_json::Value { | ||
| serde_json::json!({ | ||
| "state": bundle_state.state.iter().map(|(addr, acc)| { | ||
| (addr, serde_json::json!({ | ||
| "info": acc.info, | ||
| "original_info": acc.original_info, | ||
| "storage": BTreeMap::from_iter(acc.storage.clone()), | ||
| "status": acc.status | ||
| })) | ||
| }).collect::<BTreeMap<_, _>>(), | ||
| "contracts": BTreeMap::from_iter(bundle_state.contracts.clone()), | ||
| "reverts": bundle_state.reverts.iter().map(|block| { | ||
| block.iter().map(|(addr, rev)| { | ||
| (addr, serde_json::json!({ | ||
| "account": rev.account, | ||
| "storage": BTreeMap::from_iter(rev.storage.clone()), | ||
| "previous_status": rev.previous_status, | ||
| "wipe_storage": rev.wipe_storage | ||
| })) | ||
| }).collect::<BTreeMap<_, _>>() | ||
| }).collect::<Vec<_>>(), | ||
| "state_size": bundle_state.state_size, | ||
| "reverts_size": bundle_state.reverts_size | ||
| }) |
There was a problem hiding this comment.
IMO we should not be manually constructing a serde_json::Value ever, if this is supposed to be a serializable version of BundleStateSorted then we should just derive serialize for it
| // Convert bundle state to sorted format for deterministic comparison | ||
| let bundle_state_sorted = sort_bundle_state_for_comparison(re_executed_state); | ||
| let output_state_sorted = sort_bundle_state_for_comparison(original_state); |
There was a problem hiding this comment.
I preferred the old structs for this
| if witness != &healthy_node_witness { | ||
| let diff_path = self.save_diff( | ||
| format!("{}.witness.diff", block_prefix), | ||
| witness, | ||
| &healthy_node_witness, | ||
| )?; |
There was a problem hiding this comment.
for this and the other fns where format! calls are used directly as arguments, I preferred the old way because it was more explicit e.g.:
let filename = format!(....);
let diff_path = self.save_diff(filename, ...);
There was a problem hiding this comment.
Please check it out again
yongkangc
left a comment
There was a problem hiding this comment.
lgtm! @Rjected @shekhirin @mediocregopher any other thoughts?
|
@yongkangc Think we can merge it now ? |
.github/assets/check_wasm.sh
Outdated
| reth-cli-util | ||
| reth-db | ||
| reth-db-api | ||
| reth-ecies | ||
| reth-network-api | ||
| reth-nippy-jar | ||
| reth-node-types | ||
| reth-rpc-server-types | ||
| reth-trie-db | ||
| reth-zstd-compressors |
There was a problem hiding this comment.
If not add them , run bash .github/assets/check_wasm.sh will return error and this will resolve the failed WASM while didn't touch other files
This WASM job failure didn't get encountered several days ago
There was a problem hiding this comment.
And i have fixed other new lint errors after you approved the merge
There was a problem hiding this comment.
We should probably not add these here, and instead figure out why they failed
There was a problem hiding this comment.
Have started investigated the failure reason and try not add these :)
|
@yongkangc Last review , i have just pulled @mediocregopher's new fix on CI |
|
@yongkangc @mediocregopher @Rjected For this build Volc , i've run the same command on local , it ran successfully : Can you figure why this fails on github actions ? |
Rjected
left a comment
There was a problem hiding this comment.
lgtm other than the check_wasm additions
.github/assets/check_wasm.sh
Outdated
| reth-cli-util | ||
| reth-db | ||
| reth-db-api | ||
| reth-ecies | ||
| reth-network-api | ||
| reth-nippy-jar | ||
| reth-node-types | ||
| reth-rpc-server-types | ||
| reth-trie-db | ||
| reth-zstd-compressors |
There was a problem hiding this comment.
We should probably not add these here, and instead figure out why they failed
Think so , i've swiched codebase to where before i made the merge , i.e. Apply Rejected Suggestions But this problem persists , there might be some other root causes for this |
|
@yongkangc @Rjected @shekhirin I re-run the Github CI tests , and it turns out that the wasm check run well this time , then i have removed them . For book/build in CI fails , i checked recent merged commits , they also get operation cancelled in this check So i think this pr is ready to merge and we'd better to set up another PR to resolve this timeout in book/build CI check !
|
yeah given this was failing on main I think that job failing is ok / unrelated to changes here. |
|
@yongkangc May you help merge it again ? :) |
* docs: yellowpaper sections in consensus implementation (paradigmxyz#18881) * fix(era-utils): fix off-by-one for Excluded end bound in process_iter (paradigmxyz#18731) Co-authored-by: Roman Hodulák <roman.hodulak@polyglot-software.com> * refactor: eliminate redundant allocation in precompile cache example (paradigmxyz#18886) * chore: relax `ChainSpec` impls (paradigmxyz#18894) * chore: make clippy happy (paradigmxyz#18900) * fix(trie): Reveal extension child when extension is last remaining child of a branch (paradigmxyz#18891) * chore(node): simplify EngineApiExt bounds by removing redundant constraints (paradigmxyz#18905) * refactor(engine): separate concerns in on_forkchoice_updated for better maintainability (paradigmxyz#18661) Co-authored-by: Nathaniel Bajo <nathanielbajo@Nathaniels-MacBook-Pro.local> Co-authored-by: YK <chiayongkang@hotmail.com> Co-authored-by: Brian Picciano <me@mediocregopher.com> * feat(provider): add get_account_before_block to ChangesetReader (paradigmxyz#18898) * refactor: replace collect().is_empty() with next().is_none() in tests (paradigmxyz#18902) Co-authored-by: Alexey Shekhirin <5773434+shekhirin@users.noreply.github.com> * ci: cache hive simulator images to reduce prepare-hive job time (paradigmxyz#18899) * docs: duplicate comment in Eip4844PoolTransactionError (paradigmxyz#18858) * chore: align node_config threshold constant (paradigmxyz#18914) * feat: wait for new blocks when build is in progress (paradigmxyz#18831) Co-authored-by: Roman Hodulák <roman.hodulak@polyglot-software.com> * perf(tree): worker pooling for storage in multiproof generation (paradigmxyz#18887) Co-authored-by: Brian Picciano <me@mediocregopher.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Alexey Shekhirin <5773434+shekhirin@users.noreply.github.com> * chore(grafana): use precompile address as legend (paradigmxyz#18913) * refactor: remove needless collect() calls in trie tests (paradigmxyz#18937) * fix(examples): change method to launch with debug capabilities (paradigmxyz#18946) * fix(example): launch with debug capabilities (paradigmxyz#18947) * fix(testsuite): Fix unused updates in e2e-test-utils (paradigmxyz#18953) * fix(payload): correct Debug label for PayloadTimestamp in PayloadServiceCommand (paradigmxyz#18954) * chore(rpc): Moves `SequencerMetrics` into `reth-optimism-rpc` (paradigmxyz#18921) * refactor: replace println! with structured logging in test_vectors (paradigmxyz#18956) * refactor(cli): use structured logging (tracing) in p2p command (paradigmxyz#18957) * perf(tree): add elapsed time to parallel state root completion log (paradigmxyz#18959) * fix(trie): Properly upsert into StoragesTrie in repair-trie (paradigmxyz#18941) * fix: misleading error message in db list: show actual table name (paradigmxyz#18896) * fix: remove noisy stderr prints in ERA1 cleanup (EraClient::delete_outside_range) (paradigmxyz#18895) * fix: use max B256 for upper bound in empty-storage check (paradigmxyz#18962) * ci: remove reproducible build from release.yml (paradigmxyz#18958) * chore(rpc): Remove redundant U256::from in suggested_priority_fee (paradigmxyz#18969) * chore(ci): update eest 7594 issue link in hive expected failures file (paradigmxyz#18976) * perf(tests): remove redundant format! in ef-tests run_only (paradigmxyz#18909) * feat(cli): enable traces export via `tracing-otlp` cli arg (paradigmxyz#18242) Co-authored-by: Dan Cline <6798349+Rjected@users.noreply.github.com> * feat: allow otlp level to be configurable (paradigmxyz#18981) * chore(optimism): remove unnecessary Debug bounds from header generics (paradigmxyz#18989) * refactor: convert satisfy_base_fee_ids to use closure (paradigmxyz#18979) * chore: bump otlp crates (paradigmxyz#18984) * fix(network): prevent metric leak in outgoing message queue on session teardown (paradigmxyz#18847) * chore: remove unused imports in blockchain_provider (paradigmxyz#18867) Co-authored-by: Matthias Seitz <matthias.seitz@outlook.de> * fix(stateless): enforce BLOCKHASH ancestor header limit (paradigmxyz#18920) * chore(evm): mark ExecuteOutput as unused and slated for removal (paradigmxyz#18754) * refactor: unify `Pipeline` creation codepaths (paradigmxyz#18955) * fix(engine): flatten storage cache (paradigmxyz#18880) * perf(tree): worker pooling for account proofs (paradigmxyz#18901) Co-authored-by: Brian Picciano <me@mediocregopher.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Alexey Shekhirin <5773434+shekhirin@users.noreply.github.com> * refactor(storage): fix ChainStateKey enum variant name (paradigmxyz#18992) * refactor(trie): remove proof task manager (paradigmxyz#18934) Co-authored-by: Brian Picciano <me@mediocregopher.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Alexey Shekhirin <5773434+shekhirin@users.noreply.github.com> * fix: required optimism primitives features in db-api (paradigmxyz#19005) * refactor(engine): simplify InvalidBlockWitnessHook::on_invalid_block for better testability (paradigmxyz#18696) * chore: replace poll_next_unpin loop with poll_recv_many (paradigmxyz#18978) * fix: Set Era pipeline stage to last checkpoint when there is no target (paradigmxyz#19000) * ci: Add tests for Paris scenario in hive.yml (paradigmxyz#19013) * chore: bump book timeout (paradigmxyz#19016) * feat: add metrics for safe and finalized block heights (paradigmxyz#18987) * chore(privitives-traits): remove unused serde derives and camelCase attribute (paradigmxyz#19014) * chore: refactor loop in `add_new_transactions` (paradigmxyz#19006) * chore(ci): bump hive eest to v5.3.0 (paradigmxyz#19021) * feat(devp2p): make eth p2p networkId configurable (paradigmxyz#19020) Co-authored-by: frankudoags <frankudoags.com> * chore: remove unused Args struct from exex-subscription example (paradigmxyz#19019) * feat: add pending sequence as pub (paradigmxyz#19022) * chore: bump alloy-core (paradigmxyz#19026) * fix: unused warnings for tracing (paradigmxyz#19025) * fix: respect cli blob size setting (paradigmxyz#19024) * feat(engine): deprecate TestPipelineBuilder::with_executor_results (paradigmxyz#19017) * perf: background init of workers (paradigmxyz#19012) * chore(ci): update expected failures (paradigmxyz#19034) * fix: use header type generic for mask (paradigmxyz#19037) * fix: correct `Compact` impl for `Option` (paradigmxyz#19042) * chore: increase versioned hash index cache (paradigmxyz#19038) * chore(primitives-traits): relax SignerRecoverable bounds for Extended<B,T> (paradigmxyz#19045) * feat: bump revm (paradigmxyz#18999) * fix: resolve upstream merge Signed-off-by: Gregory Edison <gregory.edison1993@gmail.com> * bump revm * update deps and fix lints * update openvm deps * skip exex wal storage test * pin revm tag scroll-v91 * bump openvm compat --------- Signed-off-by: Gregory Edison <gregory.edison1993@gmail.com> Co-authored-by: josé v <52646071+Peponks9@users.noreply.github.com> Co-authored-by: Forostovec <ilonaforostovec22@gmail.com> Co-authored-by: Roman Hodulák <roman.hodulak@polyglot-software.com> Co-authored-by: Skylar Ray <137945430+sky-coderay@users.noreply.github.com> Co-authored-by: Arsenii Kulikov <klkvrr@gmail.com> Co-authored-by: Léa Narzis <78718413+lean-apple@users.noreply.github.com> Co-authored-by: Brian Picciano <me@mediocregopher.com> Co-authored-by: radik878 <radikpadik76@gmail.com> Co-authored-by: William Nwoke <willteey@gmail.com> Co-authored-by: Nathaniel Bajo <nathanielbajo@Nathaniels-MacBook-Pro.local> Co-authored-by: YK <chiayongkang@hotmail.com> Co-authored-by: Dan Cline <6798349+Rjected@users.noreply.github.com> Co-authored-by: Merkel Tranjes <140164174+rnkrtt@users.noreply.github.com> Co-authored-by: Alexey Shekhirin <5773434+shekhirin@users.noreply.github.com> Co-authored-by: Federico Gimenez <fgimenez@users.noreply.github.com> Co-authored-by: stevencartavia <112043913+stevencartavia@users.noreply.github.com> Co-authored-by: emmmm <155267286+eeemmmmmm@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: MIHAO PARK <wetkeyboard17@gmail.com> Co-authored-by: Tilak Madichetti <tilak.madichetti@gmail.com> Co-authored-by: Emilia Hane <elsaemiliaevahane@gmail.com> Co-authored-by: maradini77 <140460067+maradini77@users.noreply.github.com> Co-authored-by: sashaodessa <140454972+sashaodessa@users.noreply.github.com> Co-authored-by: Alvarez <140459501+prestoalvarez@users.noreply.github.com> Co-authored-by: MozirDmitriy <dmitriymozir@gmail.com> Co-authored-by: drhgencer <gancer16@gmail.com> Co-authored-by: Matthias Seitz <matthias.seitz@outlook.de> Co-authored-by: anim001k <140460766+anim001k@users.noreply.github.com> Co-authored-by: Julian Meyer <julian.meyer@coinbase.com> Co-authored-by: Karl Yu <43113774+0xKarl98@users.noreply.github.com> Co-authored-by: Jennifer <jenpaff0@gmail.com> Co-authored-by: Ivan Wang <314130948@qq.com> Co-authored-by: GarmashAlex <garmasholeksii@gmail.com> Co-authored-by: Udoagwa Franklin <54338168+frankudoags@users.noreply.github.com> Co-authored-by: Luca Provini <luca.provini@usemerkle.com> Co-authored-by: Galoretka <galoretochka@gmail.com> Co-authored-by: sashass1315 <sashass1315@gmail.com> Co-authored-by: frisitano <francesco.risitano95@gmail.com>


Closes: #18513
@yongkangc Please check if it meets the simplification requirement