feat(test-benchmark): add worst-case depth attack benchmarks for Ethereum state tries using deterministic deploy #1976
base: forks/amsterdam
63f135d to c92cfae
| - AttackOrchestrator.sol and Verifier.sol: | ||
| https://gist.github.com/CPerezz/8686da933fa5c045fbdf7c31e20e6c71 |
Why were those removed? Just curious, as they are the contracts this test uses to perform the attack and verify the execution.
For AttackOrchestrator.sol, Solidity was emitting more opcodes than necessary, which made it hard to see which ones were actually being used and, in turn, hard to estimate gas.
For Verifier.sol, we can do the verification via the post object in the test, so there is no need to create and call another contract and spend gas that could instead be used by the attacks.
CPerezz
left a comment
I have been thinking about some of the suggestions from @jochem-brouwer in #1976 and I think some of them apply here. Would you like me to address them?
jochem-brouwer
left a comment
I think @CPerezz is referring to #1961. I think the format there cleans up the test and makes it easier to reason about. I left comments and pointers to it.
The main thing I think we should do ASAP is wrap this attack template in a helper file somewhere. Attacking specific contracts deployed through a CREATE2 factory is a common pattern; what differs is the attack itself (use an EVM opcode on them, or CALL them with certain data).
(If something is unclear let me know. Happy to help or give pointers.)
packages/testing/src/execution_testing/cli/pytest_commands/plugins/execute/pre_alloc.py
| self.value.to_bytes(32, "big") | ||
| + self.start.to_bytes(32, "big") | ||
| + self.end.to_bytes(32, "big") | ||
| + self.initcode_hash |
I believe all 4 of these can be removed from calldata, as with the method in #1961. This also raises the point of a further short- to mid-term refactor where we create a template for the following pattern:
We execute certain code with one input argument: the target address. This target address is an address created by a CREATE2 factory. This is done in a loop with the exit condition: if the remaining gas is below some threshold, exit the loop. The salt is increased every time the loop is re-run. After the loop, the next salt is written to storage. The initial salt is read from this storage.
This has the following benefits:
- We do not have to calculate the start salt and the end salt here. To calculate them we need to know the gas cost of each loop iteration (which is usually dynamic and hard to calculate).
- Encoding this data into calldata makes the transactions parallelizable. Using storage this is not possible, because each transaction depends on the storage slot left by the previous one.
- Although we waste slightly more gas in the EVM, this is negligible compared to the operations we use in these tests (storage writes).
The value here can be read from the attack contract (this seems to be the target key to write). Small note: the name "value" is confusing here, because it could also refer to the tx value, e.g. CALLVALUE. Start/end bytes are handled by the contract (in EVM). initcode_hash is also a constant that we can hardcode in the contract (no need to put it in calldata).
The way to do this is to take the contract from #1961 (these lines: https://github.com/ethereum/execution-specs/pull/1961/changes#diff-88ac263a5a41126dcb0c95cc6939a105f972f0a9fd526ecaae4f085f01f96d0aR118-R152) and edit it so that it hardcodes the initcode and changes the EXTCODESIZE in the loop to an Op.CALL which calls the attack(uint256) here.
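The loop pattern described in this comment can be sketched in plain Python as a rough model, not as the contract from #1961. The names `run_attack_loop`, `SALT_SLOT` and the `attack` callback are hypothetical; `attack(salt)` stands in for "derive the CREATE2 target for this salt and hit it" and returns the gas that iteration consumed.

```python
from typing import Callable, Dict

SALT_SLOT = 0  # hypothetical storage slot that holds the next salt


def run_attack_loop(
    storage: Dict[int, int],
    gas_available: int,
    gas_threshold: int,
    attack: Callable[[int], int],
) -> int:
    """Model one transaction of the salt loop; return the number of attacks."""
    salt = storage.get(SALT_SLOT, 0)  # the initial salt is read from storage
    first_salt = salt
    # Exit condition: stop once the remaining gas drops below the threshold.
    while gas_available >= gas_threshold:
        gas_available -= attack(salt)  # gas used by this iteration
        salt += 1  # the salt increases on every iteration
    storage[SALT_SLOT] = salt  # the next transaction resumes from here
    return salt - first_salt


# Two consecutive transactions sharing the same (simulated) contract storage.
storage: Dict[int, int] = {}
first_tx = run_attack_loop(storage, 100_000, 10_000, lambda salt: 8_000)
second_tx = run_attack_loop(storage, 100_000, 10_000, lambda salt: 8_000)
```

In this model the threshold has to be at least the cost of one iteration, otherwise the last iteration could run out of gas. The second call picks up at the salt the first one stored, which is why transactions using this scheme depend on each other.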
Note: this CREATE2 attack pattern is something we have seen in many places (and also in some small variations, with the same end goal), so we should template this attack at some point so we can re-use it and iterate faster using the same code.
The downside of the code created in build_attack_contract is that we have to deploy a fresh contract each time we run execute again: because it stores the last salt to storage, we cannot start from zero when the test is executed from the beginning.
This version can reuse the same contract not only across re-executions of the same test but in all tests (as long as the factory pre-deploy address is the same).
I think the slightly higher calldata cost is worth it in order to speed up test execution.
| + self.initcode_hash | ||
| ) | ||
|
|
||
| def calculate_inner_call_cost(self, fork: Fork) -> int: |
Note: with the new format this can be removed, as we let the EVM decide whether to do "one more loop" or to exit.
| return inner_call_cost | ||
|
|
||
| def calculate_gas(self, fork: Fork) -> int: | ||
| """Calculate the exact gas this attack transaction will use.""" |
Same here, gas calculations are not necessary anymore with this other logic
| to=ATTACK_ORCHESTRATOR_ADDRESS, | ||
| gas_limit=self.calculate_tx_gas_limit(fork), | ||
| sender=sender, | ||
| data=self.calldata(), |
If we switch to the new format here, calldata is empty, and we can therefore use BenchmarkTestFiller to handle the split logic.
| def add_post_verification( | ||
| self, post: Alloc, mined_contract_file: MinedContractFile | ||
| ) -> None: | ||
| """Add the post-verification transaction to the post-state.""" |
The post verification from #1961:
- Writes the current salt to slot 0.
- Writes the EXTCODESIZE of the target to slot 1.
The EXTCODESIZE check verifies that the target contract is deployed. Additionally, we can verify that slot 0 is "at least some number", so that we know at least X attacks have run.
(Note: if the tx fails or OOGs, these slots could be written but are then reverted.)
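As a rough illustration of the check described above, the two slots can be validated with a small helper. This is a sketch over a plain dict of post-state storage, not the EEST `post` API; `attack_succeeded`, `SALT_SLOT` and `CODESIZE_SLOT` are hypothetical names, and it assumes the salt started at 0 so that slot 0 equals the number of attacks.

```python
from typing import Dict

SALT_SLOT = 0      # slot 0: current salt, as described in the comment
CODESIZE_SLOT = 1  # slot 1: EXTCODESIZE of the target


def attack_succeeded(post_storage: Dict[int, int], min_attacks: int) -> bool:
    """Return True if the post-state storage shows that the attack ran."""
    # A non-zero EXTCODESIZE means the target contract is deployed.
    target_deployed = post_storage.get(CODESIZE_SLOT, 0) > 0
    # Assumes the salt started at 0, so slot 0 counts the attacks performed.
    enough_attacks = post_storage.get(SALT_SLOT, 0) >= min_attacks
    return target_deployed and enough_attacks
```

A reverted or out-of-gas transaction leaves both slots unset, so the helper returns False for an empty storage dict.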
c92cfae to 47f4a63
This PR introduces comprehensive benchmarks to test Ethereum clients under worst-case scenarios involving extremely deep state and account tries.
The attack scenario:
- Pre-deployed contracts with deep storage tries (depth=9) maximizing traversal costs
- CREATE2-based deterministic addressing for reproducible benchmarks
- AttackOrchestrator contract that batches up to 2,510 attacks per transaction
- Tests measure state root recomputation impact when modifying deep slots
Key components:
- depth_9.sol, depth_10.sol: Contracts with deep storage tries
- s9_acc3.json: Pre-computed CREATE2 addresses and auxiliary accounts (15k contracts)
- AttackOrchestrator.sol: Optimized attack coordinator (3,650 gas per attack)
- deep_branch_testing.py: EEST test harness for pre-deployed contracts
- README.md: Complete documentation and setup instructions
Performance optimizations:
- Reduced gas forwarding from 50k to 3,650 per attack (8.3x throughput increase)
- MAX_ATTACKS_PER_TX increased from 303 to 2,510
- Precise EVM opcode cost analysis with safety margins
- Read init_code_hash directly from JSON instead of recompiling
Deployment setup and instructions available at: https://gist.github.com/CPerezz/44d521c0f9e6adf7d84187a4f2c11978
This benchmark helps identify performance bottlenecks in state trie handling and validates client implementations under extreme depth conditions.
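The throughput figure in this commit message can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the commit message.
OLD_FORWARDED_GAS = 50_000
NEW_FORWARDED_GAS = 3_650
OLD_MAX_ATTACKS_PER_TX = 303
NEW_MAX_ATTACKS_PER_TX = 2_510

# Throughput is measured in attacks per transaction.
throughput_ratio = NEW_MAX_ATTACKS_PER_TX / OLD_MAX_ATTACKS_PER_TX

# The forwarded gas shrinks by a larger factor than the throughput grows.
forwarded_gas_ratio = OLD_FORWARDED_GAS / NEW_FORWARDED_GAS

print(f"throughput: {throughput_ratio:.1f}x")
print(f"forwarded gas: {forwarded_gas_ratio:.1f}x")
```

The 8.3x matches the ratio of attacks per transaction (2,510 / 303) rather than the ratio of forwarded gas, presumably because each attack also pays overhead outside the forwarded amount.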
The attack() call was forwarding only 3650 gas, which is insufficient for SSTORE operations on cold storage slots. SSTORE requires:
- 2100 gas for cold slot access
- 2900 gas for updating an existing non-zero slot (a zero-to-nonzero write would cost 20000 instead)
- Plus dispatch overhead (~200 gas)
Updated to forward 5300 gas to ensure SSTORE succeeds.
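The gas budget behind this fix can be written out as a short calculation. This is a sketch using only the figures from the commit message; the constant names are illustrative and the dispatch overhead is the approximate value quoted there.

```python
# Figures quoted in the commit message.
COLD_SLOT_ACCESS = 2_100   # cold storage slot surcharge
SLOT_WRITE = 2_900         # SSTORE write cost as quoted
DISPATCH_OVERHEAD = 200    # approximate, per the commit message

OLD_FORWARDED_GAS = 3_650
NEW_FORWARDED_GAS = 5_300

required = COLD_SLOT_ACCESS + SLOT_WRITE + DISPATCH_OVERHEAD
old_shortfall = required - OLD_FORWARDED_GAS   # gas missing before the fix
new_headroom = NEW_FORWARDED_GAS - required    # spare gas after the fix

print(required, old_shortfall, new_headroom)
```

With these numbers the inner call needs about 5,200 gas, so 3,650 falls short by 1,550 and 5,300 leaves roughly 100 gas of headroom.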
Adds a minimal Verifier contract that checks if a target contract's deepest storage slot was updated to the expected attack value. This enables the test to verify attack success without expensive post-state checks on all attacked contracts. The verify() function calls getDeepest() on the target and compares the returned value against the expected attack value.
… gas
Major refactor of the depth benchmark test for execute mode:
- Remove stubs dependency; derive contract addresses directly from init_code_hash + Nick's deployer using CREATE2 formula
- Deploy AttackOrchestrator and Verifier as part of test execution
- Dynamically compute NUM_CONTRACTS based on gas_benchmark_value
- Add verification transaction at end of block to confirm attack success
- Fix gas constants based on empirical measurements:
  - GAS_PER_ATTACK: 8014 -> 8050 (measured ~8042)
  - MAX_ATTACKS_PER_TX: 1990 -> 1980 (safety margin)
  - TX_OVERHEAD: 22900 -> 22600 (more accurate)
The previous gas constants caused all attack transactions to run out of gas, as the 28 gas/attack shortfall compounded over 1990 attacks to ~55k gas deficit.
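The deficit described in this commit message follows directly from the quoted constants. A minimal sketch of that arithmetic, with illustrative variable names and the measured cost taken as the approximate 8042 stated above:

```python
# Figures quoted in the commit message.
OLD_GAS_PER_ATTACK = 8_014
NEW_GAS_PER_ATTACK = 8_050
MEASURED_GAS_PER_ATTACK = 8_042  # approximate empirical measurement
OLD_MAX_ATTACKS_PER_TX = 1_990
NEW_MAX_ATTACKS_PER_TX = 1_980

# Before the fix: each attack was under-budgeted, and the gap compounds.
per_attack_shortfall = MEASURED_GAS_PER_ATTACK - OLD_GAS_PER_ATTACK
total_deficit = per_attack_shortfall * OLD_MAX_ATTACKS_PER_TX

# After the fix: each attack carries a small surplus instead.
per_attack_margin = NEW_GAS_PER_ATTACK - MEASURED_GAS_PER_ATTACK
total_margin = per_attack_margin * NEW_MAX_ATTACKS_PER_TX

print(per_attack_shortfall, total_deficit, per_attack_margin, total_margin)
```

The old constants leave 28 gas per attack unaccounted for, which adds up to 55,720 gas over 1,990 attacks, in line with the "~55k" quoted above.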
- Embed AttackOrchestrator and Verifier bytecode directly in Python
- Add download_mined_asset() to fetch JSON/SOL files from GitHub
- Cache downloaded files locally in .cache/ directory
- Remove local .sol and .json asset files (now downloaded on demand)
- Update test parameters to use (10, 6) available from GitHub
- Add gist reference for contract sources
Contract sources: https://gist.github.com/CPerezz/8686da933fa5c045fbdf7c31e20e6c71
Mined assets: https://github.com/CPerezz/worst_case_miner/tree/master/mined_assets
- Remove unused ATTACK_SELECTOR constant
- Extract magic numbers to named constants (gas limits, fees, etc.)
- Add zero contracts validation to prevent edge case bugs
- Fix unused fork parameter (rename to _fork)
- Replace print warning with warnings.warn
- Fix docstring math discrepancy (~2,742 not 2,750)
- Fix line length issues and add proper type annotations
8afabff to 42e4830
Codecov Report
All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## forks/amsterdam #1976 +/- ##
================================================
Coverage 86.33% 86.33%
================================================
Files 538 538
Lines 34557 34557
Branches 3222 3222
================================================
Hits 29835 29835
Misses 4148 4148
Partials 574 574
42e4830 to fee6e2e
Description
WIP: Based on #1937 but using #1934.
Related Issues or PRs
N/A.
Checklist
- tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks: uvx tox -e static
- type(scope):.
- mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
- @ported_from marker.
Cute Animal Picture