@marioevz (Member) commented Jan 6, 2026

🗒️ Description

WIP: Based on #1937 but using #1934.

🔗 Related Issues or PRs

N/A.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

@marioevz force-pushed the feat/depth-bench-without-deploys branch from 63f135d to c92cfae on January 6, 2026 22:28
Comment on lines 23 to 24
- AttackOrchestrator.sol and Verifier.sol:
https://gist.github.com/CPerezz/8686da933fa5c045fbdf7c31e20e6c71
Contributor

Why were those removed? Just curious, as they are indeed the contracts used within this test to perform the attack and verify the execution.

@marioevz (Member, Author) commented Jan 7, 2026

For AttackOrchestrator.sol, Solidity was using more opcodes than necessary and made it difficult to see which ones were actually being used, which in turn made it difficult to estimate gas.
For Verifier.sol, we could do the verification via the post object in the test, so there is no need to create and call another contract, spending gas that could instead be used by the attacks.

@CPerezz (Contributor) left a comment

I have been thinking about some of the suggestions from @jochem-brouwer in #1976, and I think some of them apply here. Would you like me to address them?

@jochem-brouwer (Member) commented

@CPerezz I think you are referring to #1961 and you are right, those indeed apply here. I'll write comments on it and will publish a review in an hour or so 😄 👍

@jochem-brouwer (Member) left a comment

I think @CPerezz is referring to #1961. I think the format there cleans up the test and makes it easier to reason about. I left comments and pointers to it.

The main thing I think we should do ASAP is wrap this attack template in a helper file somewhere. Attacking specific contracts created by a CREATE2 factory is a common pattern; only the attack itself (using an EVM opcode on them, or CALLing them with certain data) differs.

(If something is unclear, let me know 😄 👍 Happy to help or give pointers.)

self.value.to_bytes(32, "big")
+ self.start.to_bytes(32, "big")
+ self.end.to_bytes(32, "big")
+ self.initcode_hash
Member

I believe all 4 of these can be removed from calldata as with the method in #1961. This also raises the point for a further refactor short- to mid-term where we create a template for the following pattern:

We execute certain code with one input argument: the target address. This target address is a target of a CREATE2 factory. This is done in a loop with the exit condition: if the gas is below some threshold, exit the loop. The salt is increased every time the loop runs again. After the loop, the next salt is stored into storage. The initial salt is read from this storage.

This has the following benefits:

  1. We do not have to calculate the start salt and the end salt here. To calculate the start and end salt we need to know the gas costs of each loop (which are usually dynamic and hard to calculate).
  2. Encoding this data into calldata makes the transactions parallelizable. Using storage this is not possible, because the next transaction depends on the storage slot value left by the previous one.
  3. Although we waste slightly more gas in the EVM, this is negligible compared to the operations we use in these tests (storage writes).

The value here can be read from the attack contract (this seems to be the target key to write). Small note: the name "value" is confusing here, because it could also refer to the tx value, e.g. CALLVALUE. The start/end values are handled by the contract (in the EVM). initcode_hash is also a constant, so we can hardcode it in the contract (no need to put it in calldata).

The way to do this is to take the contract from #1961 (these lines: https://github.com/ethereum/execution-specs/pull/1961/changes#diff-88ac263a5a41126dcb0c95cc6939a105f972f0a9fd526ecaae4f085f01f96d0aR118-R152) and edit it so that it hardcodes the initcode and replaces the EXTCODESIZE in the loop with an Op.CALL that calls attack(uint256) here.
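
The loop pattern described in this comment can be sketched as a plain-Python simulation. This is not the EVM bytecode from #1961: the names `run_attack_loop` and `SALT_SLOT`, the dict standing in for contract storage, and the gas figures are all hypothetical and only illustrate the control flow.

```python
from typing import Callable, Dict, List

SALT_SLOT = 0  # hypothetical storage slot that holds the next salt to use


def run_attack_loop(
    storage: Dict[int, int],
    gas: int,
    gas_threshold: int,
    attack: Callable[[int], int],
) -> List[int]:
    """Simulate one transaction and return the salts that were attacked.

    `attack(salt)` stands in for the per-target operation and returns the
    gas it consumed. `gas_threshold` should be at least the worst-case cost
    of one iteration so the loop never starts an attack it cannot finish.
    """
    salt = storage.get(SALT_SLOT, 0)  # initial salt is read from storage
    attacked: List[int] = []
    while gas >= gas_threshold:  # exit once gas drops below the threshold
        gas -= attack(salt)
        attacked.append(salt)
        salt += 1  # salt increases on every iteration
    storage[SALT_SLOT] = salt  # next salt is stored for the following tx
    return attacked


if __name__ == "__main__":
    storage: Dict[int, int] = {}
    print(run_attack_loop(storage, 100, 30, lambda salt: 25))
    print(run_attack_loop(storage, 100, 30, lambda salt: 25))
```

Because the second call starts from the salt the first call left in storage, the two simulated transactions must run in order, which is the dependency mentioned in point 2 above.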

Member

Note: this CREATE2 attack pattern is something we have seen in many places (and also in some small variations, with the same end goal), so we should template this attack at some point so we can re-use it and iterate faster using the same code 😄 👍

@marioevz (Member, Author) commented Jan 8, 2026

The downside of the code created in build_attack_contract is that we have to deploy a fresh contract each time we run execute again: because it stores the last salt to storage, we cannot start from zero when the test is executed from the beginning.

This version can reuse the same contract not only across re-executions of the same test but in all tests (as long as the factory pre-deploy address is the same).

I think the slightly higher calldata cost is worth it in order to speed up test execution.
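
As a rough illustration of the calldata-based approach, the sketch below encodes the four fields in the order shown in the diff (value, start, end, initcode hash) and splits a salt space into per-transaction ranges. The helper names `encode_attack_calldata` and `split_salt_ranges` are hypothetical, and treating `end` as exclusive is an assumption; the PR may define the range differently.

```python
from typing import List, Tuple


def encode_attack_calldata(
    value: int, start: int, end: int, initcode_hash: bytes
) -> bytes:
    """Concatenate three 32-byte big-endian words and the 32-byte hash."""
    if len(initcode_hash) != 32:
        raise ValueError("initcode_hash must be exactly 32 bytes")
    return (
        value.to_bytes(32, "big")
        + start.to_bytes(32, "big")
        + end.to_bytes(32, "big")
        + initcode_hash
    )


def split_salt_ranges(total: int, per_tx: int) -> List[Tuple[int, int]]:
    """Split [0, total) into half-open (start, end) ranges (assumed layout)."""
    if per_tx <= 0:
        raise ValueError("per_tx must be positive")
    return [(s, min(s + per_tx, total)) for s in range(0, total, per_tx)]


if __name__ == "__main__":
    placeholder_hash = b"\xab" * 32  # placeholder, not a real initcode hash
    for start, end in split_salt_ranges(5000, 1980):
        data = encode_attack_calldata(7, start, end, placeholder_hash)
        print(start, end, len(data))
```

Each range carries everything the transaction needs, so no transaction reads state written by another and the same deployed contract can serve every run.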

+ self.initcode_hash
)

def calculate_inner_call_cost(self, fork: Fork) -> int:
Member

Note: with the new format this can be removed, as we let the EVM decide whether to do "one more loop" or to exit.

return inner_call_cost

def calculate_gas(self, fork: Fork) -> int:
"""Calculate the exact gas this attack transaction will use."""
Member

Same here: gas calculations are no longer necessary with this other logic.

to=ATTACK_ORCHESTRATOR_ADDRESS,
gas_limit=self.calculate_tx_gas_limit(fork),
sender=sender,
data=self.calldata(),
Member

If we switch to the new format here, the calldata is empty, and we can therefore use BenchmarkTestFiller to handle the split logic.

def add_post_verification(
self, post: Alloc, mined_contract_file: MinedContractFile
) -> None:
"""Add the post-verification transaction to the post-state."""
Member

The post verification from #1961:

  • Writes current salt to slot 0
  • Writes EXTCODESIZE of target to slot 1.

The EXTCODESIZE check verifies that the target contract is deployed. Additionally, we can verify that slot 0 is "at least some number", so that we know at least X attacks have run.
(Note: if the tx fails or runs out of gas, these slots may have been written, but the writes are then reverted.)
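
A minimal sketch of that check, written against a plain mapping rather than the framework's post-state objects. The function name `attack_succeeded` and the `min_attacks` parameter are hypothetical; the slot layout (slot 0 for the salt, slot 1 for the EXTCODESIZE result) follows the description above.

```python
from typing import Mapping


def attack_succeeded(post_storage: Mapping[int, int], min_attacks: int) -> bool:
    """Check the two slots described above on the attack contract's storage.

    A reverted or out-of-gas transaction leaves both slots unwritten, so
    missing keys are treated as zero and the check fails.
    """
    current_salt = post_storage.get(0, 0)  # slot 0: salt after the loop
    target_code_size = post_storage.get(1, 0)  # slot 1: EXTCODESIZE of target
    return target_code_size > 0 and current_salt >= min_attacks


if __name__ == "__main__":
    print(attack_succeeded({0: 1980, 1: 512}, min_attacks=1000))
    print(attack_succeeded({}, min_attacks=1000))
```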

@marioevz force-pushed the feat/depth-bench-without-deploys branch from c92cfae to 47f4a63 on January 9, 2026 17:17
This PR introduces comprehensive benchmarks to test Ethereum clients under
worst-case scenarios involving extremely deep state and account tries.

The attack scenario:
- Pre-deployed contracts with deep storage tries (depth=9) maximizing traversal costs
- CREATE2-based deterministic addressing for reproducible benchmarks
- AttackOrchestrator contract that batches up to 2,510 attacks per transaction
- Tests measure state root recomputation impact when modifying deep slots

Key components:
- depth_9.sol, depth_10.sol: Contracts with deep storage tries
- s9_acc3.json: Pre-computed CREATE2 addresses and auxiliary accounts (15k contracts)
- AttackOrchestrator.sol: Optimized attack coordinator (3,650 gas per attack)
- deep_branch_testing.py: EEST test harness for pre-deployed contracts
- README.md: Complete documentation and setup instructions

Performance optimizations:
- Reduced gas forwarding from 50k to 3,650 per attack (8.3x throughput increase)
- MAX_ATTACKS_PER_TX increased from 303 to 2,510
- Precise EVM opcode cost analysis with safety margins
- Read init_code_hash directly from JSON instead of recompiling

Deployment setup and instructions available at:
https://gist.github.com/CPerezz/44d521c0f9e6adf7d84187a4f2c11978

This benchmark helps identify performance bottlenecks in state trie handling
and validates client implementations under extreme depth conditions.
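
The CREATE2-based deterministic addressing mentioned above follows the address formula from EIP-1014. The sketch below is a minimal illustration, not the code used in this PR: `create2_address` is a hypothetical helper, and the Keccak-256 function must be supplied by the caller because Python's standard `hashlib.sha3_256` is SHA3-256, which is a different function.

```python
from typing import Callable


def create2_address(
    deployer: bytes,
    salt: int,
    init_code_hash: bytes,
    keccak256: Callable[[bytes], bytes],
) -> bytes:
    """Return the 20-byte CREATE2 address per EIP-1014.

    address = keccak256(0xff ++ deployer ++ salt ++ keccak256(init_code))[12:]

    `init_code_hash` is the already-computed keccak256 of the init code, and
    `keccak256` is a caller-provided Keccak-256 implementation (assumption:
    it returns a 32-byte digest).
    """
    if len(deployer) != 20:
        raise ValueError("deployer must be a 20-byte address")
    if len(init_code_hash) != 32:
        raise ValueError("init_code_hash must be 32 bytes")
    preimage = b"\xff" + deployer + salt.to_bytes(32, "big") + init_code_hash
    digest = keccak256(preimage)
    return digest[12:]  # last 20 bytes of the 32-byte digest
```

Since the address depends only on the deployer, the salt, and the init code hash, a list of target addresses can be recomputed from the hash alone without recompiling or redeploying anything.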
The attack() call was forwarding only 3650 gas, which is insufficient
for SSTORE operations on cold storage slots. SSTORE requires:
- 2100 gas for cold slot access
- 2900 gas for the write itself (the rate for changing a slot whose original value is nonzero; a zero-to-nonzero write costs 20,000)
- Plus dispatch overhead (~200 gas)

Updated to forward 5300 gas to ensure SSTORE succeeds.
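
The arithmetic behind that change can be checked with a few lines of Python. The constants are the figures quoted in the message above (the dispatch overhead is approximate); the function names are hypothetical.

```python
# Figures quoted in the commit message above.
COLD_SLOT_ACCESS = 2100
SSTORE_WRITE = 2900
DISPATCH_OVERHEAD = 200  # approximate ("~200" in the message)

OLD_FORWARDED_GAS = 3650
NEW_FORWARDED_GAS = 5300


def required_gas() -> int:
    """Gas one attack() call needs according to the quoted figures."""
    return COLD_SLOT_ACCESS + SSTORE_WRITE + DISPATCH_OVERHEAD


def headroom(forwarded: int) -> int:
    """Positive means spare gas; negative means the call runs out of gas."""
    return forwarded - required_gas()


if __name__ == "__main__":
    print("required:", required_gas())
    print("old headroom:", headroom(OLD_FORWARDED_GAS))
    print("new headroom:", headroom(NEW_FORWARDED_GAS))
```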
Adds a minimal Verifier contract that checks if a target contract's
deepest storage slot was updated to the expected attack value. This
enables the test to verify attack success without expensive post-state
checks on all attacked contracts.

The verify() function calls getDeepest() on the target and compares
the returned value against the expected attack value.
… gas

Major refactor of the depth benchmark test for execute mode:

- Remove stubs dependency; derive contract addresses directly from
  init_code_hash + Nick's deployer using CREATE2 formula
- Deploy AttackOrchestrator and Verifier as part of test execution
- Dynamically compute NUM_CONTRACTS based on gas_benchmark_value
- Add verification transaction at end of block to confirm attack success
- Fix gas constants based on empirical measurements:
  - GAS_PER_ATTACK: 8014 -> 8050 (measured ~8042)
  - MAX_ATTACKS_PER_TX: 1990 -> 1980 (safety margin)
  - TX_OVERHEAD: 22900 -> 22600 (more accurate)

The previous gas constants caused all attack transactions to run out
of gas, as the 28 gas/attack shortfall compounded over 1990 attacks
to ~55k gas deficit.
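
The deficit quoted above can be reproduced as follows. The constants come from the message (the measured cost is approximate), and `budget_margin` is a hypothetical helper used only for this check.

```python
# Figures quoted in the commit message above.
MEASURED_GAS_PER_ATTACK = 8042  # approximate ("~8042" in the message)
OLD_GAS_PER_ATTACK = 8014
NEW_GAS_PER_ATTACK = 8050
OLD_MAX_ATTACKS_PER_TX = 1990
NEW_MAX_ATTACKS_PER_TX = 1980


def budget_margin(budget_per_attack: int, measured: int, attacks: int) -> int:
    """Total spare gas across one transaction; negative means a deficit."""
    return (budget_per_attack - measured) * attacks


if __name__ == "__main__":
    old = budget_margin(
        OLD_GAS_PER_ATTACK, MEASURED_GAS_PER_ATTACK, OLD_MAX_ATTACKS_PER_TX
    )
    new = budget_margin(
        NEW_GAS_PER_ATTACK, MEASURED_GAS_PER_ATTACK, NEW_MAX_ATTACKS_PER_TX
    )
    print("old margin:", old)
    print("new margin:", new)
```

With the old constants the per-attack shortfall of 28 gas adds up to a deficit of 55,720 gas over 1,990 attacks, matching the "~55k" figure; the new constants leave a small positive margin instead.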
- Embed AttackOrchestrator and Verifier bytecode directly in Python
- Add download_mined_asset() to fetch JSON/SOL files from GitHub
- Cache downloaded files locally in .cache/ directory
- Remove local .sol and .json asset files (now downloaded on demand)
- Update test parameters to use (10, 6) available from GitHub
- Add gist reference for contract sources

Contract sources: https://gist.github.com/CPerezz/8686da933fa5c045fbdf7c31e20e6c71
Mined assets: https://github.com/CPerezz/worst_case_miner/tree/master/mined_assets
- Remove unused ATTACK_SELECTOR constant
- Extract magic numbers to named constants (gas limits, fees, etc.)
- Add zero contracts validation to prevent edge case bugs
- Fix unused fork parameter (rename to _fork)
- Replace print warning with warnings.warn
- Fix docstring math discrepancy (~2,742 not 2,750)
- Fix line length issues and add proper type annotations
@marioevz force-pushed the feat/depth-bench-without-deploys branch from 8afabff to 42e4830 on January 9, 2026 18:06
codecov bot commented Jan 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.33%. Comparing base (8c9e889) to head (fee6e2e).
⚠️ Report is 3 commits behind head on forks/amsterdam.

Additional details and impacted files
@@               Coverage Diff                @@
##           forks/amsterdam    #1976   +/-   ##
================================================
  Coverage            86.33%   86.33%           
================================================
  Files                  538      538           
  Lines                34557    34557           
  Branches              3222     3222           
================================================
  Hits                 29835    29835           
  Misses                4148     4148           
  Partials               574      574           
Flag Coverage Δ
unittests 86.33% <ø> (ø)

@marioevz force-pushed the feat/depth-bench-without-deploys branch from 42e4830 to fee6e2e on January 9, 2026 18:57
@marioevz marioevz marked this pull request as ready for review January 9, 2026 19:15
@LouisTsai-Csie LouisTsai-Csie self-requested a review January 19, 2026 13:27
4 participants