Witness generation fixes when comparing to Reth#3
Conversation
… enough gas for it
| msg_data = tx.data | ||
| code = get_account(block_env.state, tx.to).code | ||
| track_bytecode_access(block_env.state, code) | ||
| track_bytecode_access(block_env.state, code, tx.to) |
There was a problem hiding this comment.
track_bytecode_access now checks if the provided address in the new parameter existed in the pre-state of the block. If it doesn't exist, this bytecode won't be in the witness since the bytecode was generated during block execution.
Probably with the official state tracker we might approach it differently -- just explaining the code change here.
| ( | ||
| disable_precompiles, | ||
| code_address, | ||
| code, |
There was a problem hiding this comment.
This was the most tricky thing to fix in the spec, which is related to the point I mentioned in the PR description about avoiding bloating the execution witness in OOG situations.
The access_delegation function always did an account access and bytecode fetching before charging gas for a potential CALL/CALLCODE/DELEGATECALL/STATICCALL.
While this is nice to simplify things in the spec, revm is way more optimzed to avoid this unrequired state access before it is sure it can charge a potential COLD_ACCCESS. If you don't have enough gas for that charge, in theory you shouldn't have done the db access since is a DoS vector.
This means the spec must do the same, since if you still do it (apart that is a DoS vector) it will make the state tracker include it in the witness.
Most of this file changes are consequences of avoiding this access in access_delegation, and leaving that db access after the gas charging has passed -- done inside the generic_call which is "as late as possible" when the account data is needed (instead of "sooner than needed" which created this problem).
| evm.accessed_addresses.add(beneficiary) | ||
| gas_cost += GAS_COLD_ACCOUNT_ACCESS | ||
|
|
||
| charge_gas(evm, gas_cost) |
There was a problem hiding this comment.
We need to charge gas in parts here. Because in L573 we do an account access. If you already didn't have gas to reach that point, that state access must be avoided to avoid bloating the witness.
|
|
||
| charge_gas(evm, gas_cost) | ||
|
|
||
| if evm.message.is_static: |
There was a problem hiding this comment.
We must move this check as soon as possible too -- if you do SELFDESTRUCT in a STATICCALL context you must fail since if the value isn't zero, this isn't allowed.
This is also done to avoid the bloating that is_account_alive / get_account in L574 would do if we don't fail fast enough.
| if authority_code and not is_valid_delegation(authority_code): | ||
| continue | ||
| if authority_code: | ||
| track_bytecode_access(state, authority_code, authority) |
There was a problem hiding this comment.
This tracking of bytecode access in 7702 was missing.
| ) | ||
|
|
||
| modify_state(block_env.state, wd.address, increase_recipient_balance) | ||
| if wd.amount != 0: |
There was a problem hiding this comment.
If the withdrawal is for 0 ETH, then we avoid this call to avoid bloating the witness. The stateles validator can skip this withdrawal if won't modify the state, thus the witness data isn't required.
| return state._witness_state is not None | ||
|
|
||
|
|
||
| def _debug_print_witness_state(ws: WitnessState) -> None: |
There was a problem hiding this comment.
This is just a helper that I needed to debug all things -- we can remove it eventually.
| # 1. Traverse all accessed and dirty storage keys on the pre-state | ||
| # MPTs to capture pre-state trie nodes in the witness. This must | ||
| # happen before any writes since writes mutate the tree in-place. | ||
| all_storage_reads: Dict[Address, Set[Bytes32]] = {} | ||
| for address, keys in ws.accessed_storage.items(): | ||
| all_storage_reads.setdefault(address, set()).update(keys) | ||
| for address, keys in ws.dirty_storage.items(): | ||
| all_storage_reads.setdefault(address, set()).update(keys) |
There was a problem hiding this comment.
This is related to the first bullet I mentioned in the PR description.
Any pre-state MPT proof should be captured before any state tree change. If we don't do this, we might capture "transient MPT nodes" that we edited while post-state root calculation.
| value: Bytes | ||
| _hash: Optional[Bytes] = None # Cached hash, invalidated on change | ||
| _rlp: Optional[Bytes] = None # Cached RLP encoding | ||
| _dirty: bool = False # True if created during execution (not pre-state) |
There was a problem hiding this comment.
As mentioned in the PR description, this is a new boolena marker for all types of nodes in the MPT tree.
This is required, since whenever we capture potential sibilings in branch compressions we must be sure they existed in the pre-state tree. If they were edited or created during post-state root calculation, they must not be in the witness.
| @@ -0,0 +1,132 @@ | |||
| """ | |||
There was a problem hiding this comment.
I think these tests are a bit self describing.
|
|
||
| Returns | ||
| ------- | ||
| delegation : `Tuple[bool, Address, Bytes, Uint]` |
There was a problem hiding this comment.
small nit: Tuple[bool, Address, Uint]
This PR contains fixes to the witness generation spec:
dirtyfield intoIncrementalMPTnodes to track which ones were created or edited during post-state root calculation. This is a subtle need to correctly add sibiling nodes during branch compressions. Before the fix, we always included these sibiling nodes. This is not correct, since we must only include sibiling nodes if and only if they existed in the pre-state trie. If they were created during post-state root execution (say, in a previous storage slot update) then they are “runtime created nodes” that the stateless execution will have thus not require in the witness.Apart from spec fixes, I added two new tests that are very simple but cover two corner cases that wasn’t detected (by pure chance) in the ~9k existing tests:
I'm tempted to add more tests, but trying to keep that away from this PR and can keep stacking on top in separate ones.
Note: all these bugs were found by running the 9k tests with spec execution witness against Reth. Reth would run the test in stateful mode, generate its own witness and compare it against the fixture execution witness. Reth also had other bugs that the spec helped to find, but those live in other PRs.
Note: I keep this as a separate PR on top of #1 to help have a timeline on how things happened, and also keep #1 frozen since I know some people in STEEL already review it.