Skip to content

chore(trie): add e2e for reorg#18293

Closed
yongkangc wants to merge 49 commits intomainfrom
yk/e2e_reorg_trie
Closed

chore(trie): add e2e for reorg#18293
yongkangc wants to merge 49 commits intomainfrom
yk/e2e_reorg_trie

Conversation

@yongkangc
Copy link
Member

@yongkangc yongkangc commented Sep 5, 2025

Summary

  • add trie_updates_test.rs, an end-to-end regression test that reproduces the trie-update loss fixed in refactor(engine): persistence logic #18318
  • ensure canonical blocks retain their trie updates after a short reorg by exercising the e2e TestBuilder flow
  • guard against a recurrence of the state-root mismatch that stalled mainnet nodes in early September 2024

Scenario Under Test (trie_updates_test.rs)

  1. Spin up a single-node mainnet chain spec with a pre-funded storage contract whose slots hash to shared prefixes, creating branch nodes identical to the production incident.
  2. Produce block_1, which writes to those slots, capture the block, and assert its trie updates are persisted.
  3. Produce block_2, which rewrites the same slots (mirroring the canonical block after the 23003311 fork), capture it, and assert trie updates are still present.
  4. The assertions use AssertMissingTrieUpdates so that any regression that drops trie updates for canonical descendants stops the test.

@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Sep 5, 2025
@yongkangc yongkangc force-pushed the yk/e2e_reorg_trie branch 2 times, most recently from e3dca93 to 07c7a42 Compare September 5, 2025 12:53
@yongkangc yongkangc marked this pull request as ready for review September 5, 2025 13:00
@yongkangc yongkangc marked this pull request as draft September 8, 2025 04:00
Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at the new test I get the idea what this does

unsure if we need all of this additional setup, looks like we do, wdyt @fgimenez

@yongkangc
Copy link
Member Author

looking at the new test I get the idea what this does

unsure if we need all of this additional setup, looks like we do, wdyt @fgimenez

@mattsse realising that my approach might have to change to a simpler one due to issues mocking of genesis state and recreate the other blocks, as it results in blockhash mismatch.

instead of the current approach in the test, we would have to do something like
     1. Genesis: Deploy storage contract with initial storage
     2. Block 1 (canonical): Transaction that sets storage slot 0x0f to value A
     3. Block 1' (fork): Transaction that sets storage slot 0x0f to value B (creates trie node)
     4. Reorg: Make Block 1' canonical
     5. Block 2: Transaction that clears storage slot 0x0f (should delete trie node)

@yongkangc yongkangc self-assigned this Sep 8, 2025
@yongkangc yongkangc moved this from Backlog to In Progress in Reth Tracker Sep 8, 2025
@fgimenez
Copy link
Member

fgimenez commented Sep 9, 2025

instead of the current approach in the test, we would have to do something like
1. Genesis: Deploy storage contract with initial storage
2. Block 1 (canonical): Transaction that sets storage slot 0x0f to value A
3. Block 1' (fork): Transaction that sets storage slot 0x0f to value B (creates trie node)
4. Reorg: Make Block 1' canonical
5. Block 2: Transaction that clears storage slot 0x0f (should delete trie node)

sgtm, this should be doable with the current actions plus maybe the new action that allows adding custom txs?

/// This action sends transactions to the node's transaction pool and then
/// triggers block production to include them.
#[derive(Debug)]
pub struct ProduceBlockWithTransactions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@yongkangc yongkangc marked this pull request as ready for review September 10, 2025 06:48
@yongkangc yongkangc marked this pull request as draft September 10, 2025 07:54
@yongkangc yongkangc marked this pull request as ready for review September 10, 2025 08:07
@yongkangc yongkangc requested a review from rkrasiuk as a code owner September 10, 2025 08:07
// For single-block chains, preserve trie updates for testing
// This maintains backwards compatibility while allowing tests to access trie
// updates
let new = if new.len() == 1 {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shekhirin @mediocregopher @Rjected would appreciate a look over here as its what is required to replicate the test.

However as with anything with the trie state would need a double look

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my understanding of single block chains - the chain contains exactly one executed block

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afaict this is ok, the only behavior change is that CanonStateNotification::Commit contains trie updates for single block chains now, which the documentation states is the case anyway.

- Added transaction utility functions for e2e testing
- Implemented new action module for transaction operations
- Enhanced engine API and block production actions
- Created comprehensive trie corruption test suite
- Added test JSON payloads and testdata for reorg scenarios
- Updated test paths to use testdata directory structure
- Introduced SendPayloadFromJson action to facilitate sending payloads from embedded JSON data.
- Updated engine API to support new action and modified existing payload handling logic.
- Enhanced test suite to utilize SendPayloadFromJson for improved test scenarios.
- Added reth-trie-common as a dependency in Cargo.toml and Cargo.lock.
- Reformatted comments for clarity and consistency.
- Consolidated import statements for better organization.
- Enhanced logging messages for improved debugging.
- Streamlined function definitions and removed redundant code.
- Updated test structure to ensure better maintainability.
- Removed unused module `transaction_utils` from the testsuite.
- Updated comments for clarity and consistency across action implementations.
- Changed method signatures to `const fn` for `with_node_idx` and `with_expected_status` for better performance.
- Streamlined error handling and logging messages in transaction-related actions.
- Consolidated import statements for better organization in the trie corruption test.
- Reformatted logging statements for better readability and consistency in `ProduceBlockWithTransactionsViaEngineAPI`.
- Enhanced clarity in the code structure by adjusting the formatting of variable assignments and comments, improving overall maintainability.
- Updated comments to provide clearer context for synthetic trie updates during testing scenarios.
- Fix clippy warnings by handling empty VERGEN_CARGO_FEATURES macro
- Add missing HeaderProvider trait import in node.rs
- Fix Engine API version compatibility issues:
  - Implement dynamic version detection based on payload attributes
  - Add v2 to v3 payload conversion for internal storage compatibility
  - Handle parent_beacon_block_root field properly for different forks
- Fix documentation warning by escaping Mutex in doc comment
- Update test chain specs to use Cancun fork for v3 API support
- Fix clippy warnings: remove unused imports, fix documentation backticks, remove unnecessary conversions
- Fix Engine API compatibility: use v2 API instead of v3 for better compatibility
- Update test configurations to use Shanghai instead of Cancun
- Remove parent_beacon_block_root from payload attributes to match v2 API
- Convert ExecutionPayloadV3 to ExecutionPayloadInputV2 for API calls
);
}
} else {
println!("Node {}: No trie updates in committed chain", node_idx);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to remove after debugging

- Modified multiple test cases to activate the Cancun fork, replacing previous Shanghai activations.
- Ensured compatibility with the latest test configurations for improved testing accuracy.
- Replaced the obsolete `trie_fork_test` with `trie_updates_test` to focus on validating trie updates in canonical blocks.
- Adjusted the `ProduceBlockWithTransactionsViaEngineAPI` to ensure proper handling of trie updates.
- Updated the `parent_beacon_block_root` in transaction payload attributes to align with the latest block structure.
- Enhanced test configurations to ensure compatibility with the Cancun fork and improve testing accuracy.
- Renamed and restructured storage configurations to improve clarity and organization in trie update tests.
- Introduced `BlockStorageValues` to encapsulate storage values for individual blocks, enhancing the readability of test setups.
- Updated comments to reflect changes in storage slot configurations and their purpose in testing scenarios.
- Updated comments and naming conventions in the `trie_updates_test` to enhance clarity regarding storage slot configurations.
- Adjusted descriptions to better reflect the purpose of the storage slots in relation to Merkle Patricia Trie behavior during updates.
- Removed unused imports from the `trie` module in the test suite actions, retaining only `AssertMissingTrieUpdates` for clarity and maintainability.
- Removed unnecessary state root fallback and always compare trie updates settings in the `trie_updates_test` setup for improved clarity and simplicity.
- Retained the legacy state root configuration to maintain compatibility with trie update optimizations.
  engine payloads by relying on the shared payload-
  attributes trait instead of hard Into conversions.
- Remove unused tracing::error import
- Replace error! macro with eprintln! to fix compilation
- Remove all emojis from output messages per project standards
- Maintain test functionality while fixing CI issues
Required for compilation after rebase. The op-alloy-rpc-types-engine
dependency needs the serde feature enabled to match the configuration
of alloy-rpc-types-engine.
Copy link
Member

@mediocregopher mediocregopher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a bunch of comments, mostly nitpicks.

One thing I'd note is that another change that should be possible as part of #18460 is that ExExNotifications will have trie updates on them as well. I'm not sure if that would be easier to work into the tests than the canon notifications, maybe it's the same, but just something that came to mind reading this.

// For single-block chains, preserve trie updates for testing
// This maintains backwards compatibility while allowing tests to access trie
// updates
let new = if new.len() == 1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afaict this is ok, the only behavior change is that CanonStateNotification::Commit contains trie updates for single block chains now, which the documentation states is the case anyway.

// Try multiple possible locations
let candidates = vec![
std::path::Path::new(file_path).to_path_buf(),
std::path::Path::new("../../..").join(file_path),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these extra candidates trying to search? If file_path is the crates dir won't these be looking outside the reth project?


/// Action that asserts whether a block has missing trie updates
#[derive(Debug)]
pub struct AssertMissingTrieUpdates {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: really we're asserting that the updates are present, so AssertTrieUpdatesPresent would be a better name

.get(&self.block_tag)
.ok_or_else(|| eyre::eyre!("Unknown block tag {}", self.block_tag))?;

eprintln!("\n=== CHECKING TRIE UPDATES STATUS FOR BLOCK {} ===", self.block_tag);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these left in intentionally? They seem like they'd be noisy

where
Engine: EngineTypes,
{
fn execute<'a>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method will get simplified quite a bit once #18460 is done, but overall definitely still valuable to be able to assert that any trie updates were generated 👍

node_clients.push(crate::testsuite::NodeClient::new(rpc, auth, url));
}

// We'll handle node context setting in a different way since nodes don't implement Clone
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment still relevant?

);

tokio::spawn(async move {
println!("Starting trie update forwarder for node {}", node_idx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover println

@@ -37,7 +39,7 @@ async fn test_apply_with_import() -> Result<()> {
.unwrap(),
)
.london_activated()
.shanghai_activated()
.cancun_activated()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this deliberate? cancun is already activated here

.with_tree_config(
TreeConfig::default()
.with_persistence_threshold(0) // Persist blocks immediately so trie updates are kept
.with_legacy_state_root(true), /* Use legacy state root to bypass trie update
Copy link
Member

@mediocregopher mediocregopher Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this with_legacy_state_root should be necessary, would be better to leave it out if possible

vergen_build_timestamp: Cow::Owned(env!("VERGEN_BUILD_TIMESTAMP").to_string()),
vergen_cargo_target_triple: Cow::Owned(env!("VERGEN_CARGO_TARGET_TRIPLE").to_string()),
vergen_cargo_features: Cow::Owned(env!("VERGEN_CARGO_FEATURES").to_string()),
vergen_cargo_features: Cow::Owned(String::new()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not deliberate?

The test directory was named 'e2e-testsuite' but Cargo.toml expected
'e2e_testsuite' (with underscore). This mismatch was causing CI failures.

- Renamed tests/e2e-testsuite/ to tests/e2e_testsuite/
- Updated Cargo.toml path to match renamed directory
- Remove unnecessary abstractions and helper structs
- Focus test on the actual reorg scenario from the incident
- Test now properly creates fork → canonical → next block sequence
- Use actual contract address from the incident
- Remove unused genesis.json file
- Make test clearer and more maintainable
@yongkangc yongkangc moved this from In Progress to Backlog in Reth Tracker Oct 6, 2025
@yongkangc yongkangc closed this Oct 15, 2025
@github-project-automation github-project-automation bot moved this from Backlog to Done in Reth Tracker Oct 15, 2025
@jenpaff jenpaff moved this from Done to Out of Scope in Reth Tracker Oct 15, 2025
@mediocregopher mediocregopher deleted the yk/e2e_reorg_trie branch February 25, 2026 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Out of Scope

Development

Successfully merging this pull request may close these issues.

4 participants