Skip to content
This repository was archived by the owner on Jul 5, 2024. It is now read-only.
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,5 +15,5 @@
- [Random Linear Combination](./design/random-linear-combinaion.md)
- [Full Runnable Code](./design/random-linear-combinaion/full-runnable-code.md)
- [Recursion](./design/recursion.md)
- [State Write Reversion Note 1](./design/state-write-reversion.md)
- [State Write Reversion Note 2](./design/state-write-reversion2.md)
- [Reversible Write Reversion Note 1](./design/reversible-write-reversion.md)
- [Reversible Write Reversion Note 2](./design/reversible-write-reversion2.md)
Comment thread
han0110 marked this conversation as resolved.
40 changes: 20 additions & 20 deletions src/architecture/evm-circuit.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
EVM circuit iterates over transactions included in the proof to verify that each execution step of a transaction is valid. Basically the scale of a step is the same as in the EVM, so usually we handle one opcode per step, except those opcodes like `SHA3` or `CALLDATACOPY` that operate on variable size of memory, which would require multiple "virtual" steps.

> The scale of a step somehow could be different depends on the approach, an extreme case is to implement a VM with reduced instruction set (like TinyRAM) to emulate EVM, which would have a much smaller step, but not sure how it compares to current approach.
>
>
> **han**

To verify if a step is valid, we first enumerate all possible execution results of a step in the EVM including success and error cases, and then build a custom constraint to verify that the step transition is correct for each execution result.
Expand Down Expand Up @@ -62,15 +62,15 @@ flowchart LR
- **ReturnStep** = [ ExplicitReturn | Error ]
- Set of states that halt the execution of a call and return to the caller
or go to the next tx.
- **ExecStep**:
- **ExecStep**:
- 1-1 mapping with a GethExecStep for opcodes that map to a single gadget
with a single step. Example: `ADD`, `MUL`, `DIV`, `CREATE2`.
- **ExecMetaStep**:
- **ExecMetaStep**:
- N-1 mapping with a GethExecStep for opcodes that share the same gadget
(due to similarity) with a single step. For example `{ADD, SUB}`,
`{PUSH*}`, `{DUP*}` and `{SWAP*}`.
A good example on how these are grouped is the `StackOnlyOpcode` struct.
- **ExecSubStep**:
- **ExecSubStep**:
- 1-N mapping with a GethExecStep for opcodes that deal with dynamic size
arrays for which multiple steps are generated.
- `CALLDATACOPY` -> CopyToMemory
Expand All @@ -79,15 +79,15 @@ flowchart LR
- `EXTCODECOPY` -> IN PROGRESS
- `SHA3` -> IN PROGRESS
- `LOGN` -> CopyToLog
- **ExplicitReturn**:
- **ExplicitReturn**:
- 1-1 mapping with a GethExecStep for opcodes that return from a call
without exception.
- **Error** = [ ErrorEnoughGas | ErrorOutOfGas ]
- Set of states that are associated with exceptions caused by opcodes.
- **ErrorEnoughGas**:
- **ErrorEnoughGas**:
- Set of error states that are unrelated to out of gas. Example:
`InvalidOpcode`, `StackOverflow`, `InvalidJump`.
- **ErrorOutOfGas**:
- **ErrorOutOfGas**:
- Set of error states for opcodes that run out of gas. For each opcode
(sometimes group of opcodes) that has dynamic memory gas usage, there is
a specific **ErrorOutOfGas** error state.
Expand All @@ -97,9 +97,9 @@ flowchart LR
- End of a block (serves also as padding for the rest of the state step slots)


> In the current implementation, we ask the opcode implementer to also implement error cases, which seems to be a redundant effort.
> In the current implementation, we ask the opcode implementer to also implement error cases, which seems to be a redundant effort.
> But by doing this, they can focus more on opcode's success case. Also error cases are usually easier to verify, so I think it also reduces the overall implementation complexity.
>
>
> **han**

## Random access data
Expand All @@ -112,11 +112,11 @@ For read-write access data, EVM circuit looks up State circuit with a sequential

For read-only access data, EVM circuit looks-up Bytecode circuit, Tx circuit and Call circuit directly.

## State write reversion
## Reversible write reversion

In EVM, state writes can be reverted if any call fails. There are many kinds of state writes, a complete list can be found [here](https://github.com/ethereum/go-ethereum/blob/master/core/state/journal.go#L87-L141).
In EVM, reversible writes can be reverted if any call fails. There are many kinds of reversible writes, a complete list can be found [here](https://github.com/ethereum/go-ethereum/blob/master/core/state/journal.go#L87-L141).

In EVM circuit, each call is attached with a flag (`is_persistent`) to know if it succeeds or not. So ideally, we only need to do reversion on those kinds of state writes which affect future execution before reversion:
In EVM circuit, each call is attached with a flag (`is_persistent`) to know if it succeeds or not. So ideally, we only need to do reversion on these kinds of reversible writes which affect future execution before reversion:

- `TxAccessListAccount`
- `TxAccessListStorageSlot`
Expand All @@ -131,20 +131,20 @@ On some others we don't need to do reversion because they don't affect future ex
- `AccountDestructed`

> Another tag is `TxLog`, which also doesn't affect future execution. It should be explained where to write such record to after we decide where to build receipt trie.
>
>
> **han**

To enable state write reversion, we need some meta information of a call:
To enable reversible write reversion, we need some meta information of a call:

1. `is_persistent` - To know if we need reversion or not.
2. `rw_counter_end_of_reversion` - To know at which point in the future we should revert.
3. `state_write_counter` - To know how many state writes we have done until now.
3. `reversible_write_counter` - To know how many reversible writes we have done until now.

Then at each state write, we first check if `is_persistent` is `0`, if so we do an extra state write at `rw_counter_end_of_reversion - state_write_counter` with the old value, which reverts the state write in a reverse order.
Then at each reversible write, we first check if `is_persistent` is `0`, if so we do an extra reversible write at `rw_counter_end_of_reversion - reversible_write_counter` with the old value, which reverts the reversible write in a reverse order.

For more notes on state write reversion see:
- [Design Notes, State Write Reversion Note 1](../design/state-write-reversion.md)
- [Design Notes, State Write Reversion Note 2](../design/state-write-reversion2.md)
For more notes on reversible write reversion see:
- [Design Notes, Reversible Write Reversion Note 1](../design/reversible-write-reversion.md)
- [Design Notes, Reversible Write Reversion Note 2](../design/reversible-write-reversion2.md)

## Opcode fetching

Expand All @@ -164,7 +164,7 @@ Before we fetch opcode from any source, it checks if the index is in the given r
EVM supports internal call triggered by opcodes. In EVM circuit, the opcodes (like `CALL` or `CREATE`) that trigger internal call, will:
- Save their own `call_state` into State circuit.
- Setup next call's context.
- Initialize next step's `call_state` to start a new environment.
- Initialize next step's `call_state` to start a new environment.

Then the opcodes (like `RETURN` or `REVERT`) and error cases that halt, will restore caller's `call_state` and set it back to next step.

Expand Down
8 changes: 4 additions & 4 deletions src/architecture/state-circuit.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ To prevent any malicious insertion of access record, we also verify the amount o

## Read-write unit grouping

The first thing to ensure data is consistent between different writes is to give each data an unique identifier, then group data chunks by the unique identifier. And finally, then sort them by order of access `rw_counter`.
The first thing to ensure data is consistent between different writes is to give each data an unique identifier, then group data chunks by the unique identifier. And finally, then sort them by order of access `rw_counter`.

Here are all kinds of data with their unique identifier:

Expand All @@ -43,7 +43,7 @@ EVM's memory expands implicitly, for example, when the memory is empty and it en

The implicit expansion behavior makes even the simple `MLOAD` and `MSTORE` complicated in EVM circuit, so we have a trick to outsource the effort to State circuit by constraining the first record of each memory unit to be a write or have value `0`. It saves the variable amount of effort to expand memory and ignore those never used memory, only used memory addresses will be initlized with `0` so as lazy initialization.

> This concept is also used in another case: the opcode `SELFDESTRUCT` also has ability to update the variable amount of data. It resets the `balance`, `nonce`, `code_hash`, and every `storage_slot` even if it's not used in the step. So for each state under account, we can add a `revision_id` handle such case, see [Design Notes, State Write Reversion Note2, SELFDESTRUCT](./state-write-reversion2.md#selfdestruct) for details.
> This concept is also used in another case: the opcode `SELFDESTRUCT` also has ability to update the variable amount of data. It resets the `balance`, `nonce`, `code_hash`, and every `storage_slot` even if it's not used in the step. So for each state under account, we can add a `revision_id` handle such case, see [Design Notes, Reversible Write Reversion Note2, SELFDESTRUCT](./reversible-write-reversion2.md#selfdestruct) for details.
> ==TODO== Convert this into an issue for discussion
>
> **han**
Expand All @@ -56,8 +56,8 @@ The implicit expansion behavior makes even the simple `MLOAD` and `MSTORE` compl

==TODO== Explain each tag

<!--
##### `tx_access_list_account`
<!--
##### `tx_access_list_account`
##### `tx_access_list_storage_slot`
##### `tx_refund`
##### `account_nonce`
Expand Down
80 changes: 80 additions & 0 deletions src/design/reversible-write-reversion.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
# Reversible Write Reversion

Reversible write reversion might be the most tricky part to explain of the EVM circuit. This note aims to illustrate how the current approach works with some diagrams, and collect all the other approaches for comparison.

## Revert or not

With full execution trace of a block, if we iterate over it once, we can know if each call (including create) is successful or not, and then determine which reversible writes are persistent, and which are not.

So each call could be annotated with 2 tags:

- `is_success` - If this call ends with `STOP` or `RETURN`
- `is_persistent` - If this call and all its caller have `is_success == true`

Only reversible writes inside a call with `is_persistent == true` will be applied to the new state. Reversible writes in a call with `is_persistent == false` will be reverted at the closest call that has `is_success == false`.

## Current approach

Since the only requirement of a read/write access is `rw_counter` uniqueness, we are not restricted to only do read/writes with sequential `rw_counter` in a step, instead we can do read/write with any `rw_counter`, as long as we don't break the `rw_counter` uniqueness requirement.

We ask the prover to tell us each call's information including:

- `is_success` - Described above
- `is_persistent` - Described above
- `rw_counter_end_of_reversion` - The `rw_counter` at the end of reversion of the call if it has `is_persistent == false`

In EVM circuit we track the value `reversible_write_counter` to count how many reversible writes have been made so far. This value is initialized at `0` of each call.

With `is_persistent`, `rw_counter_end_of_reversion` and `reversible_write_counter`, we can do the reversible write with its corresponding reversion at the same step, because we know at which point it should happen. The pseudo code of reversible write looks like:

```python
rw_table.lookup(
rw_counter=rw_counter,
reversible_write=reversible_write, # write to new value
)

if not is_persistent:
rw_table.lookup(
rw_counter=rw_counter_end_of_reversion - reversible_write_counter,
reversible_write=reversible_write.reverted(), # write to previous value
)

rw_counter += 1
reversible_write_counter += 1
```

Note that the we are increasing the `reversible_write_counter`, so the reversions are applied in reverse order in `rw_table`, which is exactly the behavior we want.

Another important check is to ensure `rw_counter_end_of_reversion` is correct. At the step that does the reversions, we check if there is a gap of `rw_counter` to the next step, where the gap size is exactly the `reversible_write_counter`. And in the end of the gap, the `rw_counter` is exactly `rw_counter_end_of_reversion`. The pseudo code looks like:

```python
if not is_persistent:
assert rw_counter_end_of_reversion == rw_counter + reversible_write_counter
rw_counter = call.rw_counter_end_of_reversion + 1
```

With illustration:

![](./state-write-reversion_reversion-simple.png)

The step that does the reversions is also responsible for reversions of its successful callees. Note that each callee's `reversible_write_counter` is initialized at `0`. To make sure they can do the reversion at correct `rw_counter`, we need to propagate the `rw_counter_end_of_reversion` to be itself minus the current accumulated `reversible_write_counter`. The pseudo code looks like:

```python
if not is_persistent and callee_is_success:
assert callee_rw_counter_end_of_reversion \
== rw_counter_end_of_reversion - reversible_write_counter
```

![](./state-write-reversion_reversion-nested.png)

At the end of successful callee, we accumulate the `reversible_write_counter` to its caller.

### Adjustment for `SELFDESTRUCT`

See [Design Notes, Reversible Write Reversion Note2, SELFDESTRUCT](./reversible-write-reversion2.md#selfdestruct)

## Other approaches

### With `revision_id`

TODO
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
# State Write Reversion Note 2
# Reversible Write Reversion Note 2

# ZKEVM - State Circuit Extension - `StateDB`

## Reversion

In EVM, there are multiple kinds of `StateDB` update that could be reverted when any internal call fails.
In EVM, there are multiple kinds of `StateDB` updates that could be reverted when any internal call fails.

- `tx_access_list_account` - `(tx_id, address) -> accessed`
- `tx_access_list_storage_slot` - `(tx_id, address, storage_slot) -> accessed`
Expand All @@ -24,7 +24,7 @@ The complete list can be found [here](https://github.com/ethereum/go-ethereum/bl

The actions that write to the `StateDB` inside the red box will also revert themselves in the revert section (red circle), but in reverse order.

Each call needs to know its `rw_counter_end_of_revert_section` to revert with the correct `rw_counter`. If callee is a success call but in some red box (`is_persistent=0`), we need to copy caller's `rw_counter_end_of_revert_section` and `state_write_counter` to callee's.
Each call needs to know its `rw_counter_end_of_revert_section` to revert with the correct `rw_counter`. If callee is a success call but in some red box (`is_persistent=0`), we need to copy caller's `rw_counter_end_of_revert_section` and `reversible_write_counter` to callee's.

## `SELFDESTRUCT`

Expand Down
80 changes: 0 additions & 80 deletions src/design/state-write-reversion.md

This file was deleted.