From c8f54155d0046a41cd335476e95e2b1b1e8db664 Mon Sep 17 00:00:00 2001 From: MirandaWood Date: Thu, 7 Nov 2024 15:29:20 +0000 Subject: [PATCH 1/3] docs: update docs for blobs --- docs/docs/migration_notes.md | 19 ++ .../blobs.md | 180 ++++++++++++++++++ .../published-data.md | 59 +++--- docs/docs/protocol-specs/logs/index.md | 2 + .../protocol-specs/rollup-circuits/index.md | 53 +----- .../rollup-circuits/merge-rollup.md | 6 +- .../rollup-circuits/root-rollup.md | 4 +- docs/docs/protocol-specs/state/wonky-tree.md | 4 +- .../common_errors/sandbox-errors.md | 4 +- docs/sidebars.js | 1 + 10 files changed, 246 insertions(+), 86 deletions(-) create mode 100644 docs/docs/protocol-specs/data-publication-and-availability/blobs.md diff --git a/docs/docs/migration_notes.md b/docs/docs/migration_notes.md index 84912b635308..f5e8ff35cdc5 100644 --- a/docs/docs/migration_notes.md +++ b/docs/docs/migration_notes.md @@ -238,6 +238,25 @@ For this reason we've decided to rename it: To reduce loading times, the package `@aztec/noir-contracts.js` no longer exposes all artifacts as its default export. Instead, it exposes a `ContractNames` variable with the list of all contract names available. To import a given artifact, use the corresponding export, such as `@aztec/noir-contracts.js/FPC`. +### Blobs +We now publish all DA in EVM blobs rather than calldata. This replaces all code that touched the `txsEffectsHash`. +In the rollup circuits, instead of hashing each child circuit's `txsEffectsHash` to form a tree, we track tx effects by absorbing them into a sponge for blob data (hence the name: `spongeBlob`). 
This sponge is treated like the state trees in that we check each rollup circuit 'follows' the next: + +```diff +- let txs_effects_hash = sha256_to_field(left.txs_effects_hash, right.txs_effects_hash); ++ assert(left.end_sponge_blob.eq(right.start_sponge_blob)); ++ let start_sponge_blob = left.start_sponge_blob; ++ let end_sponge_blob = right.end_sponge_blob; +``` +This sponge is used in the block root circuit to confirm that an injected array of all `txEffects` does match those rolled up so far in the `spongeBlob`. Then, the `txEffects` array is used to construct and prove opening of the polynomial representing the blob commitment on L1 (this is done efficiently thanks to the Barycentric formula). +On L1, we publish the array as a blob and verify the above proof of opening. This confirms that the tx effects in the rollup circuit match the data in the blob: + +```diff +- bytes32 txsEffectsHash = TxsDecoder.decode(_body); ++ bytes32 blobHash = _validateBlob(blobInput); +``` +Where `blobInput` contains the proof of opening and evaluation calculated in the block root rollup circuit. It is then stored and used as a public input to verifying the epoch proof. + ## 0.67.0 ### L2 Gas limit of 6M enforced for public portion of TX diff --git a/docs/docs/protocol-specs/data-publication-and-availability/blobs.md b/docs/docs/protocol-specs/data-publication-and-availability/blobs.md new file mode 100644 index 000000000000..39c05d3d2e82 --- /dev/null +++ b/docs/docs/protocol-specs/data-publication-and-availability/blobs.md @@ -0,0 +1,180 @@ +--- +title: Blobs +--- + +## Implementation + +### Technical Background + +Essentially, we replace publishing all a tx's effects in calldata with publishing in a blob. Any data inside a blob is *not available* to the EVM so we cannot simply hash the same data on L1 and in the rollup circuits, and check the hash matches, as we do now. 
+ +Instead, publishing a blob makes the `blobhash` available: + +```solidity +/** +* blobhash(i) returns the versioned_hash of the i-th blob associated with _this_ transaction. +* bytes[0:1]: 0x01 +* bytes[1:32]: the last 31 bytes of the sha256 hash of the kzg commitment C. +*/ +bytes32 blobHash; +assembly { + blobHash := blobhash(0) +} +``` + +Where the commitment $C$ is a KZG commitment to the data inside the blob over the BLS12-381 curve. There are more details [here](https://notes.ethereum.org/@vbuterin/proto_danksharding_faq#What-format-is-blob-data-in-and-how-is-it-committed-to) on exactly what this is, but briefly, given a set of 4096 data points inside a blob, $d_i$, we define the polynomial $p$ as: + +$$p(\omega^i) = d_i.$$ + +In the background, this polynomial is found by interpolating the $d_i$ s (evaluations) against the $\omega^i$ s (points), where $\omega^{4096} = 1$ (i.e. is a 4096th root of unity). + +This means our blob data $d_i$ is actually the polynomial $p$ given in evaluation form. Working in evaluation form, particularly when the polynomial is evaluated at roots of unity, gives us a [host of benefits](https://dankradfeist.de/ethereum/2021/06/18/pcs-multiproofs.html#evaluation-form). One of those is that we can commit to the polynomial (using a precomputed trusted setup for secret $s$ and BLS12-381 generator $G_1$) with a simple linear combination: + +$$ C = p(s)G_1 = p(sG_1) = \sum_{i = 0}^{4095} d_i l_i(sG_1),$$ + +where $l_i(x)$ are the [Lagrange polynomials](https://dankradfeist.de/ethereum/2021/06/18/pcs-multiproofs.html#lagrange-polynomials). The details for us are not important - the important part is that we can commit to our blob by simply multiplying each data point by the corresponding element of the Lagrange-basis trusted setup and summing the result! + +### Proving DA + +So to prove that we are publishing the correct tx effects, we just do this sum in the circuit, and check the final output is the same $C$ given by the EVM, right? 
Wrong. The commitment is over BLS12-381, so we would be calculating hefty wrong-field elliptic curve operations. + +Thankfully, there is a more efficient way, already implemented in the [`blob`](https://github.com/AztecProtocol/aztec-packages/tree/master/noir-projects/noir-protocol-circuits/crates/blob) crate in aztec-packages. + +Our goal is to efficiently show that our tx effects accumulated in the rollup circuits are the same $d_i$ s in the blob committed to by $C$ on L1. To do this, we can provide an *opening proof* for $C$. In the circuit, we evaluate the polynomial at a challenge value $z$ and return the result: $p(z) = y$. We then construct a [KZG proof](https://dankradfeist.de/ethereum/2020/06/16/kate-polynomial-commitments.html#kate-proofs) in TypeScript of this opening (which is actually a commitment to the quotient polynomial $q(x)$), and verify it on L1 using the [point evaluation precompile](https://eips.ethereum.org/EIPS/eip-4844#point-evaluation-precompile) added as part of EIP-4844. It has inputs: + +- `versioned_hash`: The `blobhash` for this $C$ +- `z`: The challenge value +- `y`: The claimed evaluation value at `z` +- `commitment`: The commitment $C$ +- `proof`: The KZG proof of opening + +It checks: + +- `assert kzg_to_versioned_hash(commitment) == versioned_hash` +- `assert verify_kzg_proof(commitment, z, y, proof)` + +As long as we use our tx effect fields as the $d_i$ values inside the circuit, and use the same $y$ and $z$ in the public inputs of the Honk L1 verification as input to the precompile, we have shown that $C$ indeed commits to our data. Note: I'm glossing over some details here which are explained in the links above (particularly the 'KZG Proof' and 'host of benefits' links). + +But isn't evaluating $p(z)$ in the circuit also a bunch of very slow wrong-field arithmetic? No! Well, yes, but not as much as you'd think! + +To evaluate $p$ in evaluation form at some value not in its domain (i.e.
not one of the $\omega^i$ s), we use the [barycentric formula](https://dankradfeist.de/ethereum/2021/06/18/pcs-multiproofs.html#evaluating-a-polynomial-in-evaluation-form-on-a-point-outside-the-domain): + +$$p(z) = A(z)\sum_{i=0}^{4095} \frac{d_i}{A'(\omega^i)} \frac{1}{z - \omega^i}.$$ + +What's $A(x)$, you ask? Doesn't matter! One of the nice properties we get by defining $p$ as an interpolation over the roots of unity is that the above formula simplifies to: + +$$p(z) = \frac{z^{4096} - 1}{4096} \sum_{i=0}^{4095} \frac{d_i\omega^i}{z - \omega^i}.$$ + +We can precompute all the $\omega^i$, $-\omega^i$ s and $4096^{-1}$; the $d_i$ s are our tx effects, and $z$ is the challenge point (discussed more below). This means computing $p(z)$ is theoretically 4096 wrong-field multiplications and 4096 wrong-field divisions, far fewer than would be required for BLS12-381 elliptic curve operations. + +### Rollup Circuits + +#### Base + +We need to pass up *something* encompassing the tx effects to the rollup circuits, so they can be used as $d_i$ s when we prove the blob opening. The simplest option would be to `poseidon2` hash the tx effects instead and pass those up, but that has some issues: + +- If we have one hash per base rollup (i.e. per tx), we have an ever-increasing list of hashes to manage. +- If we hash these in pairs, then we need to recreate the rollup structure when we prove the blob. + +The latter is doable, but means encoding some maximum number of txs, `N`, to loop over and potentially wasting gates for blocks with fewer than `N` txs. For instance, if we chose `N = 96`, a block with only 2 txs would still have to loop 96 times. Plus, a block could never have more than 96 transactions without a fork. + +Instead, we manage state in the vein of `PartialStateReference`, where we provide a `start` and `end` state in each base and subsequent merge rollup circuits check that they follow on from one another. 
The base circuits themselves simply prove that adding the data of its tx indeed moves the state from `start` to `end`. + +To encompass all the tx effects, we use a `poseidon2` sponge and absorb each field. We also track the number of fields added to ensure we don't overflow the blob (4096 BLS fields, which *can* fit 4112 BN254 fields, but adding the mapping between these is complex). Given that this struct is a sponge used for a blob, I have named it: + +```rs +global IV: Field = (FIELDS_PER_BLOB as Field) * 18446744073709551616; + +struct SpongeBlob { + sponge: Poseidon2, + fields: u32, +} + +impl SpongeBlob { + fn new() -> Self { + Self { + sponge: Poseidon2::new(IV), + fields: 0, + } + } + // Add fields to the sponge + fn absorb(&mut self, input: [Field; N], in_len: u32) { + // in_len is all non-0 input + for i in 0..in_len { + self.sponge.absorb(input[i]); + } + self.fields += in_len; + } + // Finalise the sponge and output poseidon2 hash of all fields absorbed + fn squeeze(&mut self) -> Field { + self.sponge.squeeze() + } +} +``` + +To summarise: each base circuit starts with a `start` `SpongeBlob` instance, which is either blank or from the preceding circuit, then calls `.absorb()` with the tx effects as input. Just like the output `BaseOrMergeRollupPublicInputs` has a `start` and `end` `PartialStateReference`, it will also have a `start` and `end` `SpongeBlob`. + +#### Merge + +We simply check that the `left`'s `end` `SpongeBlob` == the `right`'s `start` `SpongeBlob`, and assign the output's `start` `SpongeBlob` to be the `left`'s and the `end` `SpongeBlob` to be the `right`'s. + +#### Block Root + +The current route is to inline the blob functionality inside the block root circuit. + + +First, we must gather all our tx effects ($d_i$ s). These will be injected as private inputs to the circuit and checked against the `SpongeBlob`s from the pair of `BaseOrMergeRollupPublicInputs` that we know contain all the effects in the block's txs. 
Like the merge circuit, the block root checks that the `left`'s `end` `SpongeBlob` == the `right`'s `start` `SpongeBlob`. + +It then calls `squeeze()` on the `right`'s `end` `SpongeBlob` to produce the hash of all effects that will be in the blob. Let's call this `h`. The raw injected tx effects are `poseidon2` hashed and we check that the result matches `h`. We now have our set of $d_i$ s. + +We now need to produce a challenge point `z`. This value must encompass the two 'commitments' used to represent the blob data: $C$ and `h` (see [here](https://notes.ethereum.org/@vbuterin/proto_danksharding_faq#Moderate-approach-works-with-any-ZK-SNARK) for more on the method). We simply provide $C$ as a public input to the block root circuit, and compute `z = poseidon2(h, C)`. + +The block root now has all the inputs required to call the blob functionality described above. Along with the usual `BlockRootOrBlockMergePublicInputs`, we also have `BlobPublicInputs`: $C$, $z$, and $y$. + + + +### L1 Contracts + +#### Rollup + +The function `propose()` takes in these `BlobPublicInputs` and a ts generated `kzgProof` alongside its usual inputs for proposing a new L2 block. The transaction also includes our blob sidecar(s). We verify the `BlobPublicInputs` correspond to the sidecars by calling EVM's point evaluation precompile: + +```solidity + // input for the blob precompile + bytes32[] input; + // extract the blobhash from the one submitted earlier: + input[0] = blobHashes[blockHash]; + input[1] = z; + input[2] = y; + input[3] = C; + // the opening proof is computed in ts and inserted here + input[4] = kzgProof; + + // Staticcall the point eval precompile https://eips.ethereum.org/EIPS/eip-4844#point-evaluation-precompile : + (bool success, bytes memory data) = address(0x0a).staticcall(input); + require(success, "Point evaluation precompile failed"); +``` + +We have now linked the `BlobPublicInputs` ($C$, $z$, and $y$) to a published EVM blob. 
We still need to show that these inputs were generated in our rollup circuits corresponding to the blocks we claim. For each proposed block, we store them: + +```solidity +blobPublicInputs[blockNumber] = BlobPublicInputs({ + z, + y, + c, +}); +``` + +Then, when the epoch proof is submitted in `submitEpochRootProof()`, we access these to verify the ZKP: + +```solidity + // blob_public_inputs + for (uint256 i = 0; i < _epochSize; i++) { + uint256 j = currentIndex + i; + publicInputs[j] = blobPublicInputs[previousBlockNumber + i + 1].z; + publicInputs[j + 1] = blobPublicInputs[previousBlockNumber + i + 1].y; + publicInputs[j + 2] = blobPublicInputs[previousBlockNumber + i + 1].c; + } +``` + +Note that we do not need to check that our $C$ matches the `blobhash` - the precompile does this for us. diff --git a/docs/docs/protocol-specs/data-publication-and-availability/published-data.md b/docs/docs/protocol-specs/data-publication-and-availability/published-data.md index 9a72572e1b0f..d9ab988b1f61 100644 --- a/docs/docs/protocol-specs/data-publication-and-availability/published-data.md +++ b/docs/docs/protocol-specs/data-publication-and-availability/published-data.md @@ -7,39 +7,32 @@ The "Effects" of a transaction are the collection of state changes and metadata | Field | Type | Description | | -------------------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------ | | `revertCode` | `RevertCode` | Indicates the reason for reverting in public application logic. 0 indicates success. | -| `note_hashes` | `Tuple` | The note hashes to be inserted into the note hash tree. | +| `transactionFee` | `Fr` | The transaction fee, denominated in FPA. | +| `noteHashes` | `Tuple` | The note hashes to be inserted into the note hash tree. | | `nullifiers` | `Tuple` | The nullifiers to be inserted into the nullifier tree. 
| -| `l2_to_l2_msgs` | `Tuple` | The L2 to L1 messages to be inserted into the messagebox on L1. | -| `public_data_writes` | `Tuple` | Public data writes to be inserted into the public data tree | -| `encrypted_logs` | `TxL2Logs` | Buffers containing the emitted encrypted logs. | -| `unencrypted_logs` | `TxL2Logs` | Buffers containing the emitted unencrypted logs. | +| `l2ToL1Msgs` | `Tuple` | The L2 to L1 messages to be inserted into the messagebox on L1. | +| `publicDataWrites` | `Tuple` | Public data writes to be inserted into the public data tree | +| `noteEncryptedLogs` | `TxL2Logs` | Buffers containing the emitted note logs. | +| `encryptedLogs` | `TxL2Logs` | Buffers containing the emitted encrypted logs. | +| `unencryptedLogs` | `TxL2Logs` | Buffers containing the emitted unencrypted logs. | -Each can have several transactions. Thus, an block is presently encoded as: +To publish the above data, we must convert it into arrays of BLS12 fields for EVM defined blobs. The encoding is defined as: -| byte start | num bytes | name | -| -------------------------------------------------------------------------------------------------------- | --------- | --------------------------------------- | -| 0x0 | 0x4 | len(newL1ToL2Msgs) (denoted a) | -| 0x4 | a \* 0x20 | newL1ToL2Msgs | -| 0x4 + a \* 0x20 = tx0Start | 0x4 | len(numTxs) (denoted t) | -| | | TxEffect 0 | -| tx0Start | 0x20 | revertCode | -| tx0Start + 0x20 | 0x1 | len(noteHashes) (denoted b) | -| tx0Start + 0x20 + 0x1 | b \* 0x20 | noteHashes | -| tx0Start + 0x20 + 0x1 + b \* 0x20 | 0x1 | len(nullifiers) (denoted c) | -| tx0Start + 0x20 + 0x1 + b \* 0x20 + 0x1 | c \* 0x20 | nullifiers | -| tx0Start + 0x20 + 0x1 + b \* 0x20 + 0x1 + c \* 0x20 | 0x1 | len(l2ToL1Msgs) (denoted d) | -| tx0Start + 0x20 + 0x1 + b \* 0x20 + 0x1 + c \* 0x20 + 0x1 | d \* 0x20 | l2ToL1Msgs | -| tx0Start + 0x20 + 0x1 + b \* 0x20 + 0x1 + c \* 0x20 + 0x1 + d \* 0x20 | 0x1 | len(newPublicDataWrites) (denoted e) | -| tx0Start + 0x20 + 0x1 + b 
\* 0x20 + 0x1 + c \* 0x20 + 0x1 + d \* 0x20 + 0x01 | e \* 0x40 | newPublicDataWrites | -| tx0Start + 0x20 + 0x1 + b \* 0x20 + 0x1 + c \* 0x20 + 0x1 + d \* 0x20 + 0x01 + e \* 0x40 | 0x04 | byteLen(newEncryptedLogs) (denoted f) | -| tx0Start + 0x20 + 0x1 + b \* 0x20 + 0x1 + c \* 0x20 + 0x1 + d \* 0x20 + 0x01 + e \* 0x40 + 0x4 | f | newEncryptedLogs | -| tx0Start + 0x20 + 0x1 + b \* 0x20 + 0x1 + c \* 0x20 + 0x1 + d \* 0x20 + 0x01 + e \* 0x40 + 0x4 + f | 0x04 | byteLen(newUnencryptedLogs) (denoted g) | -| tx0Start + 0x20 + 0x1 + b \* 0x20 + 0x1 + c \* 0x20 + 0x1 + d \* 0x20 + 0x01 + e \* 0x40 + 0x4 + f + 0x4 | g | newUnencryptedLogs | -| | | }, | -| | | TxEffect 1 | -| | | ... | -| | | }, | -| | | ... | -| | | TxEffect (t - 1) | -| | | ... | -| | | }, | +| field start | num fields | name | contents | +| ----------------------------------------------------- | ---------- | ---------------------- | ---------------------------------------------------------------------------- | +| 0 | 1 | Tx Start | TX_START_PREFIX, total len, REVERT_CODE_PREFIX, revertCode | +| 1 | 1 | Tx Fee | TX_FEE_PREFIX, transactionFee | +| 2 | 1 | Notes Start | (If notes exist) NOTES_PREFIX, noteHashes.len() | +| 3 | n | Notes | (If notes exist) noteHashes | +| 3 + n | 1 | Nullifiers Start | (If nullifiers exist) NULLIFIERS_PREFIX, nullifiers.len() | +| 3 + n + 1 | m | Nullifiers | (If nullifiers exist) nullifiers | +| 3 + n + 1 + m | 1 | L2toL1Messages Start | (If msgs exist) L2_L1_MSGS_PREFIX, l2ToL1Msgs.len() | +| 3 + n + 1 + m + 1 | l | L2toL1Messages | (If msgs exist) l2ToL1Msgs | +| 3 + n + 1 + m + 1 + l | 1 | PublicDataWrites Start | (If writes exist) PUBLIC_DATA_UPDATE_REQUESTS_PREFIX, publicDataWrites.len() | +| 3 + n + 1 + m + 1 + l + 1 | p | PublicDataWrites | (If writes exist) publicDataWrites | +| 3 + n + 1 + m + 1 + l + 1 + p | 1 | Note Logs Start | (If note logs exist) NOTE_ENCRYPTED_LOGS_PREFIX, noteEncryptedLogs.len() | +| 3 + n + 1 + m + 1 + l + 1 + p + 1 | nl | Note Logs | (If note 
logs exist) noteEncryptedLogs | +| 3 + n + 1 + m + 1 + l + 1 + p + 1 + nl | 1 | Encrypted Logs Start | (If encrypted logs exist) ENCRYPTED_LOGS_PREFIX, encryptedLogs.len() | +| 3 + n + 1 + m + 1 + l + 1 + p + 1 + nl + 1 | el | Encrypted Logs | (If encrypted logs exist) encryptedLogs | +| 3 + n + 1 + m + 1 + l + 1 + p + 1 + nl + 1 + el | 1 | Unencrypted Logs Start | (If unencrypted logs exist) UNENCRYPTED_LOGS_PREFIX, unencryptedLogs.len() | +| 3 + n + 1 + m + 1 + l + 1 + p + 1 + nl + 1 + el + 1 | ul | Unencrypted Logs | (If unencrypted logs exist) unencryptedLogs | diff --git a/docs/docs/protocol-specs/logs/index.md b/docs/docs/protocol-specs/logs/index.md index bb4f3fff2559..8d23a5f9d92f 100644 --- a/docs/docs/protocol-specs/logs/index.md +++ b/docs/docs/protocol-specs/logs/index.md @@ -54,6 +54,8 @@ A function can emit an arbitrary number of logs, provided they don't exceed the + + To minimize the on-chain verification data size, protocol circuits aggregate log hashes. The end result is a single hash within the base rollup proof, encompassing all logs of the same type. Each protocol circuit outputs two values for each log type: diff --git a/docs/docs/protocol-specs/rollup-circuits/index.md b/docs/docs/protocol-specs/rollup-circuits/index.md index 1c7ca403926a..b15a04102365 100644 --- a/docs/docs/protocol-specs/rollup-circuits/index.md +++ b/docs/docs/protocol-specs/rollup-circuits/index.md @@ -465,55 +465,17 @@ graph LR To ensure that state is made available, we could broadcast all of a block's input data as public inputs of the final root rollup proof, but a proof with so many public inputs would be very expensive to verify onchain. Instead, we can reduce the number of public inputs by committing to the block's body and iteratively "build" up the commitment at each rollup circuit iteration. 
-At the very end, we will have a commitment to the transactions that were included in the block (`TxsHash`), the messages that were sent from L2 to L1 (`OutHash`) and the messages that were sent from L1 to L2 (`InHash`). +At the very end, we will have a commitment to the transactions that were included in the block (`txs_effects_hash`, calculated by squeezing a Poseidon2 sponge which iteratively absorbed each tx's effects), the messages that were sent from L2 to L1 (`OutHash`) and the messages that were sent from L1 to L2 (`InHash`). -To check that the body is published an Aztec node can simply reconstruct the hashes from available data. -Since we define finality as the point where the block is validated and included in the state of the [validating light node](../l1-smart-contracts/index.md), we can define a block as being "available" if the validating light node can reconstruct the commitment hashes. +The block body is published on L1 in a `blob`. We link this `blob` (an array of fields of our data, committed to by the EVM) to the `txs_effects_hash` above by proving that our effects, which form the preimage of the `txs_effects_hash`, produce the same commitment as the one calculated in the EVM. -Since the `InHash` is directly computed by the `Inbox` contract on L1, the data is obviously available to the contract without doing any more work. -Furthermore, the `OutHash` is a computed from a subset of the data in `TxsHash` so if it is possible to reconstruct `TxsHash` it is also possible to reconstruct `OutHash`. +Since we define finality as the point where the block is validated and included in the state of the [validating light node](../l1-smart-contracts/index.md), we can define a block as being "available" if the validating light node can reconstruct the commitment hashes and validate the blob. + +Since the `InHash` is directly computed by the `Inbox` contract on L1, the data is obviously available to the contract without doing any more work. 
The `OutHash` is published and used to verify the final epoch proof as a public input. Since we strive to minimize the compute requirements to prove blocks, we amortize the commitment cost across the full tree. We can do so by building merkle trees of partial "commitments", whose roots are ultimately computed in the final root rollup circuit. -Below, we outline the `TxsHash` merkle tree that is based on the `TxEffect`s and a `OutHash` which is based on the `l2_to_l1_msgs` (cross-chain messages) for each transaction, with four transactions in this rollup. -While the `TxsHash` implicitly includes the `OutHash` we need it separately such that it can be passed to the `Outbox` for consumption by the portals with minimal work. - -```mdx -import { Mermaid } from '@docusaurus/theme-mermaid'; - - -graph BT - R[TxsHash] - M0[Hash 0-1] - M1[Hash 2-3] - B0[Hash 0.0-0.1] - B1[Hash 1.0-1.1] - B2[Hash 2.0-2.1] - B3[Hash 3.0-3.1] - K0[TxEffect 0.0] - K1[TxEffect 0.1] - K2[TxEffect 1.0] - K3[TxEffect 1.1] - K4[TxEffect 2.0] - K5[TxEffect 2.1] - K6[TxEffect 3.0] - K7[TxEffect 3.1] - - M0 --> R - M1 --> R - B0 --> M0 - B1 --> M0 - B2 --> M1 - B3 --> M1 - K0 --> B0 - K1 --> B0 - K2 --> B1 - K3 --> B1 - K4 --> B2 - K5 --> B2 - K6 --> B3 - K7 --> B3 -``` +Below, we outline the `OutHash` and `InHash` merkle trees that are based on the `l2_to_l1_msgs` (cross-chain messages) and `l1_to_l2_msgs` for each transaction respectively, with four transactions in this rollup. ```mdx import { Mermaid } from '@docusaurus/theme-mermaid'; @@ -605,8 +567,7 @@ graph BT K7 --> B3 ``` -While the `TxsHash` merely require the data to be published and known to L1, the `InHash` and `OutHash` needs to be computable on L1 as well. -This reason require them to be efficiently computable on L1 while still being non-horrible inside a snark - leading us to rely on SHA256. +The `InHash` and `OutHash` need to be efficiently computable on L1 while still being non-horrible inside a snark - leading us to rely on SHA256. 
The L2 to L1 messages from each transaction form a variable height tree. In the diagram above, transactions 0 and 3 have four messages, so require a tree with two layers, whereas the others only have two messages and so require a single layer tree. The base rollup calculates the root of this tree and passes it as the to the next layer. Merge rollups simply hash both of these roots together and pass it up as the `OutHash`. diff --git a/docs/docs/protocol-specs/rollup-circuits/merge-rollup.md b/docs/docs/protocol-specs/rollup-circuits/merge-rollup.md index b63ac550b3da..5e5af0dbedae 100644 --- a/docs/docs/protocol-specs/rollup-circuits/merge-rollup.md +++ b/docs/docs/protocol-specs/rollup-circuits/merge-rollup.md @@ -13,6 +13,8 @@ A[MergeRollupInputs] --> C[MergeRollupCircuit] --> B[BaseOrMergeRollupPublicInpu ``` + + ## Overview Below is a subset of the data structures figure from earlier for easy reference. @@ -91,14 +93,16 @@ def MergeRollupCircuit( assert left.public_inputs.constants == right.public_inputs.constants assert left.public_inputs.end == right.public_inputs.start assert left.public_inputs.num_txs >= right.public_inputs.num_txs + assert left.public_inputs.end_sponge == right.public_inputs.start_sponge return BaseOrMergeRollupPublicInputs( type=1, num_txs=left.public_inputs.num_txs + right.public_inputs.num_txs, - txs_effect_hash=SHA256(left.public_inputs.txs_effect_hash | right.public_inputs.txs_effect_hash), out_hash=SHA256(left.public_inputs.out_hash | right.public_inputs.out_hash), start=left.public_inputs.start, end=right.public_inputs.end, + start_sponge=left.public_inputs.start_sponge, + end_sponge=right.public_inputs.end_sponge, constants=left.public_inputs.constants ) ``` diff --git a/docs/docs/protocol-specs/rollup-circuits/root-rollup.md b/docs/docs/protocol-specs/rollup-circuits/root-rollup.md index f4185de2b892..23e6f5a5aed3 100644 --- a/docs/docs/protocol-specs/rollup-circuits/root-rollup.md +++ 
b/docs/docs/protocol-specs/rollup-circuits/root-rollup.md @@ -19,6 +19,8 @@ For rollup purposes, the node we want to convince of the correctness is the [val This might practically happen through a series of "squisher" circuits that will wrap the proof in another proof that is cheaper to verify on-chain. For example, wrapping a ultra-plonk proof in a standard plonk proof. ::: + + ## Overview ```mdx @@ -192,6 +194,7 @@ def RootRollupCircuit( assert left.public_inputs.constants == right.public_inputs.constants assert left.public_inputs.end == right.public_inputs.start assert left.public_inputs.num_txs >= right.public_inputs.num_txs + assert left.public_inputs.end_sponge == right.public_inputs.start_sponge assert parent.state.partial == left.public_inputs.start @@ -216,7 +219,6 @@ def RootRollupCircuit( last_archive = left.public_inputs.constants.last_archive, content_commitment: ContentCommitment( num_txs=left.public_inputs.num_txs + right.public_inputs.num_txs, - txs_effect_hash=SHA256(left.public_inputs.txs_effect_hash | right.public_inputs.txs_effect_hash), in_hash = l1_to_l2_roots.public_inputs.sha_root, out_hash = SHA256(left.public_inputs.out_hash | right.public_inputs.out_hash), ), diff --git a/docs/docs/protocol-specs/state/wonky-tree.md b/docs/docs/protocol-specs/state/wonky-tree.md index fb7953f470d8..ca9f356aaf2e 100644 --- a/docs/docs/protocol-specs/state/wonky-tree.md +++ b/docs/docs/protocol-specs/state/wonky-tree.md @@ -267,7 +267,7 @@ graph BT M2_c --> R_c ``` -The tree is reconstructed to check the `txs_effects_hash` (= the root of a wonky tree given by leaves of each tx's `tx_effects`) on L1. We also reconstruct it to provide a membership path against the stored `out_hash` (= the root of a wonky tree given by leaves of each tx's L2 to L1 message tree root) for consuming a L2 to L1 message. 
+The tree is reconstructed to provide a membership path against the stored `out_hash` (= the root of a wonky tree given by leaves of each tx's L2 to L1 message tree root) for consuming a L2 to L1 message. Currently, this tree is built via the orchestrator given the number of transactions to rollup. Each 'node' is assigned a level (0 at the root) and index in that level. The below function finds the parent level: @@ -304,7 +304,7 @@ The while loop triggers and shifts up our node to `level = 2` and `index = 2`. T ### Flexible wonky trees -We can also encode the structure of _any_ binary merkle tree by tracking `number_of_branches` and `number_of_leaves` for each node in the tree. This encoding was originally designed for [logs](../logs/index.md) before they were included in the `txs_effects_hash`, so the below explanation references the leaves stored in relation to logs and transactions. +We can also encode the structure of _any_ binary merkle tree by tracking `number_of_branches` and `number_of_leaves` for each node in the tree. This encoding was originally designed for [logs](../logs/index.md), so the below explanation references the leaves stored in relation to logs and transactions. The benefit of this method as opposed to the one above is allowing for any binary structure and therefore allowing for 'skipping' leaves with no information. However, the encoding grows as the tree grows, by at least 2 bytes per node. The above implementation only requires the number of leaves to be encoded, which will likely only require a single field to store. 
diff --git a/docs/docs/reference/developer_references/common_errors/sandbox-errors.md b/docs/docs/reference/developer_references/common_errors/sandbox-errors.md index 41414eda80d2..39a48507dc20 100644 --- a/docs/docs/reference/developer_references/common_errors/sandbox-errors.md +++ b/docs/docs/reference/developer_references/common_errors/sandbox-errors.md @@ -185,9 +185,7 @@ Users may create a proof against a historical state in Aztec. The rollup circuit ## Sequencer Errors -- "Txs effects hash mismatch" - the sequencer assembles a block and sends it to the rollup circuits for proof generation. Along with the proof, the circuits return the hash of the transaction effects that must be sent to the Rollup contract on L1. Before doing so, the sequencer sanity checks that this hash is equivalent to the transaction effects hash of the block that it submitted. This could be a bug in our code e.g. if we are ordering things differently in circuits and in our transaction/block (e.g. incorrect ordering of encrypted logs or queued public calls). Easiest way to debug this is by printing the txs effects hash of the block both on the TS (in l2Block.getTxsEffectsHash()) and noir side (in the base rollup) - -- "\$\{treeName\} tree root mismatch" - like with txs effects hash mismatch, it validates that the root of the tree matches the output of the circuit simulation. The tree name could be Public data tree, Note Hash Tree, Contract tree, Nullifier tree or the L1ToL2Message tree, +- "\$\{treeName\} tree root mismatch" - The sequencer validates that the root of the tree matches the output of the circuit simulation. The tree name could be Public data tree, Note Hash Tree, Contract tree, Nullifier tree or the L1ToL2Message tree, - "\$\{treeName\} tree next available leaf index mismatch" - validating a tree's root is not enough. It also checks that the `next_available_leaf_index` is as expected. This is the next index we can insert new values into. 
Note that for the public data tree, this test is skipped since as it is a sparse tree unlike the others. diff --git a/docs/sidebars.js b/docs/sidebars.js index ae40d1e6a513..fb1cd78eb291 100644 --- a/docs/sidebars.js +++ b/docs/sidebars.js @@ -288,6 +288,7 @@ export default { items: [ "protocol-specs/data-publication-and-availability/overview", "protocol-specs/data-publication-and-availability/published-data", + "protocol-specs/data-publication-and-availability/blobs", ], }, { From 08e72e04f514ec97dddaa86f7360ac050dc42d7d Mon Sep 17 00:00:00 2001 From: MirandaWood Date: Tue, 4 Feb 2025 16:55:23 +0000 Subject: [PATCH 2/3] docs: update blob docs --- .../blobs.md | 44 +++++++++++++------ 1 file changed, 31 insertions(+), 13 deletions(-) diff --git a/docs/docs/protocol-specs/data-publication-and-availability/blobs.md b/docs/docs/protocol-specs/data-publication-and-availability/blobs.md index 39c05d3d2e82..b888aadec254 100644 --- a/docs/docs/protocol-specs/data-publication-and-availability/blobs.md +++ b/docs/docs/protocol-specs/data-publication-and-availability/blobs.md @@ -80,7 +80,7 @@ The latter is doable, but means encoding some maximum number of txs, `N`, to loo Instead, we manage state in the vein of `PartialStateReference`, where we provide a `start` and `end` state in each base and subsequent merge rollup circuits check that they follow on from one another. The base circuits themselves simply prove that adding the data of its tx indeed moves the state from `start` to `end`. -To encompass all the tx effects, we use a `poseidon2` sponge and absorb each field. We also track the number of fields added to ensure we don't overflow the blob (4096 BLS fields, which *can* fit 4112 BN254 fields, but adding the mapping between these is complex). Given that this struct is a sponge used for a blob, I have named it: +To encompass all the tx effects, we use a `poseidon2` sponge and absorb each field. 
We also track the number of fields added to ensure we don't overflow the blobs (4096 BLS fields per blob, with configurable `BLOBS_PER_BLOCK`). Given that this struct is a sponge used for a blob, I have named it: ```rs global IV: Field = (FIELDS_PER_BLOB as Field) * 18446744073709551616; @@ -125,12 +125,16 @@ The current route is to inline the blob functionality inside the block root circ First, we must gather all our tx effects ($d_i$ s). These will be injected as private inputs to the circuit and checked against the `SpongeBlob`s from the pair of `BaseOrMergeRollupPublicInputs` that we know contain all the effects in the block's txs. Like the merge circuit, the block root checks that the `left`'s `end` `SpongeBlob` == the `right`'s `start` `SpongeBlob`. -It then calls `squeeze()` on the `right`'s `end` `SpongeBlob` to produce the hash of all effects that will be in the blob. Let's call this `h`. The raw injected tx effects are `poseidon2` hashed and we check that the result matches `h`. We now have our set of $d_i$ s. +It then calls `squeeze()` on the `right`'s `end` `SpongeBlob` to produce the hash of all effects that will be in the block. Let's call this `h`. The raw injected tx effects are `poseidon2` hashed and we check that the result matches `h`. We now have our set of $d_i$ s. We now need to produce a challenge point `z`. This value must encompass the two 'commitments' used to represent the blob data: $C$ and `h` (see [here](https://notes.ethereum.org/@vbuterin/proto_danksharding_faq#Moderate-approach-works-with-any-ZK-SNARK) for more on the method). We simply provide $C$ as a public input to the block root circuit, and compute `z = poseidon2(h, C)`. +Note that with multiple blobs per block, each blob uses the same `h` but has a unique `C`. Since `h` does encompass all fields in the blob (plus some more) and the uniqueness of `C` ensures the uniqueness of `z`, this is acceptable. 
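The start/end chaining and field counting described above can be sketched in plain Rust. This is an illustrative model only, not the Noir `SpongeBlob`: the `u64` mixing step is a toy stand-in for `poseidon2`, `BLOBS_PER_BLOCK = 3` is an assumed value, and `RollupOutput` and `merge` are hypothetical names.

```rust
// Illustrative model only. The mixing step is NOT poseidon2.
const FIELDS_PER_BLOB: usize = 4096;
const BLOBS_PER_BLOCK: usize = 3; // assumed value; configurable in the source

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SpongeBlob {
    state: u64,    // toy stand-in for the poseidon2 sponge state
    fields: usize, // number of fields absorbed so far
}

impl SpongeBlob {
    fn new() -> Self {
        SpongeBlob { state: 0, fields: 0 }
    }

    // Absorb tx effects, refusing to exceed the block's total blob capacity.
    fn absorb(&self, input: &[u64]) -> Result<SpongeBlob, String> {
        let fields = self.fields + input.len();
        if fields > FIELDS_PER_BLOB * BLOBS_PER_BLOCK {
            return Err(format!("blob capacity exceeded: {} fields", fields));
        }
        let mut state = self.state;
        for x in input {
            // Toy mixing, order-sensitive like a real sponge.
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(*x)
                .rotate_left(17);
        }
        Ok(SpongeBlob { state, fields })
    }
}

// Hypothetical stand-in for the sponge part of a rollup's public inputs.
struct RollupOutput {
    start: SpongeBlob,
    end: SpongeBlob,
}

// A merge only succeeds if the right child starts where the left child ended.
fn merge(left: &RollupOutput, right: &RollupOutput) -> Result<RollupOutput, String> {
    if left.end != right.start {
        return Err("left.end sponge does not match right.start".to_string());
    }
    Ok(RollupOutput { start: left.start, end: right.end })
}
```

Because each child proves only its own `start` to `end` transition, the merged `end` sponge equals the sponge obtained by absorbing every field in order, which is what the block root later squeezes to get `h`.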
+ The block root now has all the inputs required to call the blob functionality described above. Along with the usual `BlockRootOrBlockMergePublicInputs`, we also have `BlobPublicInputs`: $C$, $z$, and $y$. +Each blob in the block has its own set of `BlobPublicInputs`. Currently, each set is propagated up to the Root circuit and verified on L1 against its blob. In future, we want to combine the instances of `BlobPublicInputs` so the contract only has to call the precompile once per block. + ### L1 Contracts @@ -155,26 +159,40 @@ The function `propose()` takes in these `BlobPublicInputs` and a ts generated `k require(success, "Point evaluation precompile failed"); ``` -We have now linked the `BlobPublicInputs` ($C$, $z$, and $y$) to a published EVM blob. We still need to show that these inputs were generated in our rollup circuits corresponding to the blocks we claim. For each proposed block, we store them: +We have now linked the `BlobPublicInputs` ($C$, $z$, and $y$) to a published EVM blob. We still need to show that these inputs were generated in our rollup circuits corresponding to the blocks we claim. To avoid storing `BLOBS_PER_BLOCK * 4` fields per block, we hash all the `BlobPublicInputs` to `blobPublicInputsHash`. +For each proposed block, we store this hash: ```solidity -blobPublicInputs[blockNumber] = BlobPublicInputs({ - z, - y, - c, -}); +rollupStore.blobPublicInputsHashes[blockNumber] = blobPublicInputsHash; ``` -Then, when the epoch proof is submitted in `submitEpochRootProof()`, we access these to verify the ZKP: +Then, when the epoch proof is submitted in `submitEpochRootProof()`, we inject the raw `BlobPublicInputs`, hash them, and check this matches each block's `blobPublicInputsHash`.
We use these to verify the ZKP: ```solidity // blob_public_inputs + uint256 blobOffset = 0; for (uint256 i = 0; i < _epochSize; i++) { - uint256 j = currentIndex + i; - publicInputs[j] = blobPublicInputs[previousBlockNumber + i + 1].z; - publicInputs[j + 1] = blobPublicInputs[previousBlockNumber + i + 1].y; - publicInputs[j + 2] = blobPublicInputs[previousBlockNumber + i + 1].c; + uint8 blobsInBlock = uint8(_blobPublicInputs[blobOffset++]); + for (uint256 j = 0; j < Constants.BLOBS_PER_BLOCK; j++) { + if (j < blobsInBlock) { + // z + publicInputs[offset++] = bytes32(_blobPublicInputs[blobOffset:blobOffset += 32]); + // y + (publicInputs[offset++], publicInputs[offset++], publicInputs[offset++]) = + bytes32ToBigNum(bytes32(_blobPublicInputs[blobOffset:blobOffset += 32])); + // c[0] + publicInputs[offset++] = + bytes32(uint256(uint248(bytes31(_blobPublicInputs[blobOffset:blobOffset += 31])))); + // c[1] + publicInputs[offset++] = + bytes32(uint256(uint136(bytes17(_blobPublicInputs[blobOffset:blobOffset += 17])))); + } else { + offset += Constants.BLOB_PUBLIC_INPUTS; + } + } } ``` +Notice that if a block needs fewer than `BLOBS_PER_BLOCK` blobs, we don't waste gas on calling the precompile or assigning public inputs for the unused blobs. If we incorrectly claim that, for example, the block used 2 blobs when it actually used 3, the proof would not verify: the circuit's `BlobPublicInputs` for the third blob would be non-empty, but the loop above would leave those slots unassigned (see `offset += Constants.BLOB_PUBLIC_INPUTS`). + Note that we do not need to check that our $C$ matches the `blobhash` - the precompile does this for us.
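The slot-skipping behaviour of the loop above can be modelled with a minimal Rust sketch. It makes several assumptions: `BLOBS_PER_BLOCK = 3` is an arbitrary choice, `BLOB_PUBLIC_INPUTS = 6` is inferred from the six slots the loop assigns per blob (one for `z`, three for `y`, two for `c`), and `BlobInputs` and `pack_block` are hypothetical names using `u64` in place of real field elements.

```rust
// Illustrative model of the public-input layout; not the contract code.
const BLOBS_PER_BLOCK: usize = 3; // assumed value
const BLOB_PUBLIC_INPUTS: usize = 6; // inferred: z (1) + y (3 limbs) + c (2 parts)

#[derive(Clone, Copy)]
struct BlobInputs {
    z: u64,
    y: [u64; 3],
    c: [u64; 2],
}

// Lay out one block's blob inputs. Unused blob slots stay zeroed and are
// skipped by advancing the offset, as in the Solidity `else` branch.
fn pack_block(blobs: &[BlobInputs]) -> Vec<u64> {
    assert!(blobs.len() <= BLOBS_PER_BLOCK);
    let mut out = vec![0u64; BLOBS_PER_BLOCK * BLOB_PUBLIC_INPUTS];
    let mut offset = 0;
    for j in 0..BLOBS_PER_BLOCK {
        if let Some(b) = blobs.get(j) {
            out[offset] = b.z;
            offset += 1;
            for limb in b.y.iter() {
                out[offset] = *limb;
                offset += 1;
            }
            for part in b.c.iter() {
                out[offset] = *part;
                offset += 1;
            }
        } else {
            offset += BLOB_PUBLIC_INPUTS;
        }
    }
    out
}
```

In this model, packing only two of three real blobs yields a vector that differs from the one the circuit committed to, which mirrors why an under-claimed blob count makes the proof fail.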
From 6061d0d86ea18ca309c989f413e8996d8cdffbd9 Mon Sep 17 00:00:00 2001 From: MirandaWood Date: Fri, 7 Feb 2025 09:36:56 +0000 Subject: [PATCH 3/3] docs: clarification --- docs/docs/migration_notes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/docs/migration_notes.md b/docs/docs/migration_notes.md index 46df824406f7..3ff86e966394 100644 --- a/docs/docs/migration_notes.md +++ b/docs/docs/migration_notes.md @@ -260,7 +260,7 @@ For this reason we've decided to rename it: To reduce loading times, the package `@aztec/noir-contracts.js` no longer exposes all artifacts as its default export. Instead, it exposes a `ContractNames` variable with the list of all contract names available. To import a given artifact, use the corresponding export, such as `@aztec/noir-contracts.js/FPC`. ### Blobs -We now publish all DA in EVM blobs rather than calldata. This replaces all code that touched the `txsEffectsHash`. +We now publish the majority of DA in L1 blobs rather than calldata, with only contract class logs remaining as calldata. This replaces all code that touched the `txsEffectsHash`. In the rollup circuits, instead of hashing each child circuit's `txsEffectsHash` to form a tree, we track tx effects by absorbing them into a sponge for blob data (hence the name: `spongeBlob`). This sponge is treated like the state trees in that we check each rollup circuit 'follows' the next: ```diff