Skip to content

feat(avm): merkle db hints (part 1)#12922

Merged
fcarreiro merged 11 commits intomasterfrom
fc/avm-merkle-hints
Mar 26, 2025
Merged

feat(avm): merkle db hints (part 1)#12922
fcarreiro merged 11 commits intomasterfrom
fc/avm-merkle-hints

Conversation

@fcarreiro
Copy link
Contributor

@fcarreiro fcarreiro commented Mar 21, 2025

This PR adds AVM hints for the following merkle operations

  • getPreviousValueIndex to get the low leaf index
  • getLeafPreimage to get the low leaf preimage
  • getSiblingPath to get the low leaf sibling path

On the C++ side, the operations are called slightly differently. I'm using the C++ world state db names.

This PR also separates the C++ DB interfaces into low level (basically the equivalent of the TS merkleops) and high level (equivalent of the public trees db/journal). This needed to be done because loose low level operations cannot necessarily be constrained. We usually need more context, and a coarser granularity. Therefore the idea is that low level ops are hinted (and unconstrained), and high level ops are constrained.

Hinting is currently tested via the deserialization tests, and it should be used by the bulk test, but we never get there (beyond bytecode processing). So some things might still be wrong.

I'm trying to get this out as quick as possible to unblock others.

Initializing HintedRawContractDB with...
 * contractInstances: 6
 * contractClasses: 3
 * bytecodeCommitments: 3
Initializing HintedRawMerkleDB with...
 * get_sibling_path hints: 3
 * get_previous_value_index hints: 27
 * get_leaf_preimage hints_public_data_tree: 3

PS: there's probably a lot of duplication happening now in the hints. We'll have to eventually deduplicate.

Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@fcarreiro fcarreiro force-pushed the fc/avm-merkle-hints branch 2 times, most recently from e187f39 to 95e145e Compare March 21, 2025 14:43
{
return std::hash<uint64_t>()(ff.data[0]) ^ (std::hash<uint64_t>()(ff.data[1]) << 1) ^
(std::hash<uint64_t>()(ff.data[2]) << 2) ^ (std::hash<uint64_t>()(ff.data[3]) << 3);
return bb::utils::hash_as_tuple(ff.data[0], ff.data[1], ff.data[2], ff.data[3]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the original one. I think this one should work as well, just please review hash_as_tuple.

{
// See https://stackoverflow.com/questions/7110301/generic-hash-for-tuples-in-unordered-map-unordered-set.
size_t seed = 0;
((seed ^= std::hash<Ts>()(ts) + 0x9e3779b9 + (seed << 6) + (seed >> 2)), ...);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewer: could you check that the semantics match the intended in the link?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay to me, but I am a C++ noob!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know this variadic templates were a thing. Interesting! looks good

@fcarreiro fcarreiro force-pushed the fc/avm-merkle-hints branch 3 times, most recently from 2ecf2fe to ed9ab7e Compare March 25, 2025 15:25
@fcarreiro fcarreiro marked this pull request as ready for review March 25, 2025 15:32
private:
MerkleDBInterface& raw_merkle_db;
LowLevelMerkleDBInterface& raw_merkle_db;
// TODO: when you have a merkle gadget, consider marking it "mutable" so that read can be const.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one will take in the merkle gadget(s).

Copy link
Contributor

@dbanks12 dbanks12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Looks quite clean even despite the many TODOs/FIXMEs

{
// See https://stackoverflow.com/questions/7110301/generic-hash-for-tuples-in-unordered-map-unordered-set.
size_t seed = 0;
((seed ^= std::hash<Ts>()(ts) + 0x9e3779b9 + (seed << 6) + (seed >> 2)), ...);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay to me, but I am a C++ noob!

TxProofValidator,
} from '@aztec/p2p';
import { ProtocolContractAddress } from '@aztec/protocol-contracts';
import { readPublicState } from '@aztec/simulator/server';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that we are exposing too much from src/public if this readPublicState is accessible 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think originally it wasn't, it didn't even exist, but I think Palla created it to avoid code duplication. I'd rather have code duplication (in this case) than exposing things from PublicTreesDB!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked and yes public/index.ts and avm/index.ts export many things that probably shouldn't. We should review when we have time.

Comment on lines +108 to +114
): Promise<
| {
index: bigint;
alreadyPresent: boolean;
}
| undefined
> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does that leading | do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's just formatting

Comment on lines 322 to 326

// Will be patched/padded at the end of this fn
const endTreeSnapshots = new TreeSnapshots(
endStateReference.l1ToL2MessageTree,
endStateReference.partial.noteHashTree,
endStateReference.partial.nullifierTree,
endStateReference.partial.publicDataTree,
);

// FIXME: We are first creating the PIs with the wrong endTreeSnapshots, then patching them.
// This is because we need to know the lengths of the accumulated data arrays to pad them.
// We should refactor this to avoid this hack.
// We should just get the info we need from the trace, and create the rest of the PIs here.
const avmCircuitPublicInputs = this.trace.toAvmCircuitPublicInputs(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed!

Comment on lines +154 to 171
// Hint for MerkleTreeDB.getLeafPreimage.
// NOTE: I need this factory because in order to get hold of the schema, I need an actual instance of the class,
// having the type doesn't suffice since TS does type erasure in the end.
function AvmGetLeafPreimageHintFactory<T extends IndexedTreeLeaf>(klass: {
schema: z.ZodSchema;
new (...args: any[]): T;
}) {
return class AvmGetLeafPreimageHint {
constructor(
public readonly hintKey: AppendOnlyTreeSnapshot,
// params (tree id will be implicit)
public readonly index: bigint,
// return
public readonly leaf: T,
public readonly nextIndex: bigint,
public readonly nextValue: Fr,
) {}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand this comment, but I'm also not sure I need to....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically he can't do T::schema like you'd do in a language where generic types (and types in general) exist in runtime

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so he needs the actual T passed as a real argument

@fcarreiro fcarreiro force-pushed the fc/avm-merkle-hints branch from 594c71e to c63332b Compare March 25, 2025 17:01
Copy link
Contributor

@sirasistant sirasistant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Two details only

{
// See https://stackoverflow.com/questions/7110301/generic-hash-for-tuples-in-unordered-map-unordered-set.
size_t seed = 0;
((seed ^= std::hash<Ts>()(ts) + 0x9e3779b9 + (seed << 6) + (seed >> 2)), ...);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know this variadic templates were a thing. Interesting! looks good

@fcarreiro fcarreiro force-pushed the fc/avm-merkle-hints branch from 3eaee75 to 690a654 Compare March 26, 2025 10:09
@fcarreiro fcarreiro force-pushed the fc/avm-merkle-hints branch from 690a654 to 08ba375 Compare March 26, 2025 10:33
@fcarreiro fcarreiro force-pushed the fc/avm-merkle-hints branch from d6b4ede to 783bd19 Compare March 26, 2025 11:28
@fcarreiro fcarreiro merged commit 34ec9e8 into master Mar 26, 2025
8 checks passed
Copy link
Contributor Author

Merge activity

  • Mar 26, 8:08 AM EDT: A user merged this pull request with Graphite.

@fcarreiro fcarreiro deleted the fc/avm-merkle-hints branch March 26, 2025 12:08
DanielKotov pushed a commit that referenced this pull request Mar 27, 2025
This PR adds AVM hints for the following merkle operations
* getPreviousValueIndex to get the low leaf index
* getLeafPreimage to get the low leaf preimage
* getSiblingPath to get the low leaf sibling path

On the C++ side, the operations are called slightly differently. I'm using the C++ world state db names.

This PR also separates the C++ DB interfaces into low level (basically the equivalent of the TS merkleops) and high level (equivalent of the public trees db/journal). This needed to be done because loose low level operations cannot necessarily be constrained. We usually need more context, and a coarser granularity. Therefore the idea is that low level ops are hinted (and unconstrained), and high level ops are constrained.

Hinting is currently tested via the deserialization tests, and it should be used by the bulk test, but we never get there (beyond bytecode processing). So some things might still be wrong.

I'm trying to get this out as quick as possible to unblock others.

```
Initializing HintedRawContractDB with...
 * contractInstances: 6
 * contractClasses: 3
 * bytecodeCommitments: 3
Initializing HintedRawMerkleDB with...
 * get_sibling_path hints: 3
 * get_previous_value_index hints: 27
 * get_leaf_preimage hints_public_data_tree: 3
```

PS: there's probably a lot of duplication happening now in the hints. We'll have to eventually deduplicate.
DanielKotov pushed a commit that referenced this pull request Mar 27, 2025
This PR adds AVM hints for the following merkle operations
* getPreviousValueIndex to get the low leaf index
* getLeafPreimage to get the low leaf preimage
* getSiblingPath to get the low leaf sibling path

On the C++ side, the operations are called slightly differently. I'm using the C++ world state db names.

This PR also separates the C++ DB interfaces into low level (basically the equivalent of the TS merkleops) and high level (equivalent of the public trees db/journal). This needed to be done because loose low level operations cannot necessarily be constrained. We usually need more context, and a coarser granularity. Therefore the idea is that low level ops are hinted (and unconstrained), and high level ops are constrained.

Hinting is currently tested via the deserialization tests, and it should be used by the bulk test, but we never get there (beyond bytecode processing). So some things might still be wrong.

I'm trying to get this out as quick as possible to unblock others.

```
Initializing HintedRawContractDB with...
 * contractInstances: 6
 * contractClasses: 3
 * bytecodeCommitments: 3
Initializing HintedRawMerkleDB with...
 * get_sibling_path hints: 3
 * get_previous_value_index hints: 27
 * get_leaf_preimage hints_public_data_tree: 3
```

PS: there's probably a lot of duplication happening now in the hints. We'll have to eventually deduplicate.
PhilWindle pushed a commit that referenced this pull request Mar 27, 2025
🤖 I have created a new Aztec Packages release
---


##
[0.82.3](v0.82.2...v0.82.3)
(2025-03-27)


### Features

* `msgpack` encoding for `Program` and `WitnessStack`
([#12841](#12841))
([1e58eb1](1e58eb1))
* 64 bit log type id, 64 bit log metadata
([#12956](#12956))
([20d734a](20d734a))
* AVM parsing tag validation
([#12936](#12936))
([56b1f0d](56b1f0d))
* **avm:** add calldata & returndata to context
([#13008](#13008))
([f03b2e5](f03b2e5))
* **avm:** merkle db hints (part 1)
([#12922](#12922))
([34ec9e8](34ec9e8))
* **avm:** merkle hints (part 2)
([#13077](#13077))
([fbbc6c7](fbbc6c7))
* **avm:** vm2 initial context
([#12972](#12972))
([e2b1361](e2b1361))
* benchmark avm simulator
([#12985](#12985))
([00fae1b](00fae1b))
* client flows benchmarks
([#13007](#13007))
([9bf7568](9bf7568))
* gas benchmark for "normal usage"
([#13073](#13073))
([4eb1156](4eb1156))
* Implement merkle writes in the merkle check gadget
([#13050](#13050))
([c94fe50](c94fe50))
* LogEncryption trait
([#12942](#12942))
([0b7e564](0b7e564))
* Node snapshot sync
([#12927](#12927))
([afde851](afde851)),
closes
[#12926](#12926)
* **p2p:** capture all gossipsub metrics
([#12930](#12930))
([cc940cb](cc940cb))
* Prover node snapshot sync
([#13097](#13097))
([1e77efb](1e77efb))
* staking asset handler
([#12968](#12968))
([af48184](af48184)),
closes
[#12932](#12932)
* stream crs data to disk
([#12996](#12996))
([d016e4d](d016e4d)),
closes
[#12948](#12948)
* track failed tests. add flake.
([f4936d7](f4936d7))
* Track test history.
([#13037](#13037))
([036bb32](036bb32))
* track total tx fee
([#12601](#12601))
([9612a4e](9612a4e))
* Validators sentinel
([#12818](#12818))
([770695c](770695c))


### Bug Fixes

* added #[derive(Eq)] to EcdsaPublicKeyNote
([#12966](#12966))
([0c21c74](0c21c74))
* Allow use of local blob sink client
([#13025](#13025))
([ba8d654](ba8d654))
* **avm:** semicolons are hard
([#12999](#12999))
([8871c83](8871c83))
* bootstrap network and sponsored fpc devnet
([#13044](#13044))
([8a47d8b](8a47d8b))
* Bump tsc target
([#13052](#13052))
([985e83b](985e83b))
* cycle_group fuzzer
([#12921](#12921))
([69f426e](69f426e))
* **docs:** Fix import errors in aztec.js tutorial
([#12969](#12969))
([856208a](856208a))
* **docs:** Load token artifact from the compiled source in the sample
dapp tutorial
([#12802](#12802))
([0838084](0838084)),
closes
[#12810](#12810)
* **docs:** Update sponsored fpc docs to use 82.2 syntax
([#13054](#13054))
([e5d425b](e5d425b))
* **e2e:** p2p
([#13002](#13002))
([1ece539](1ece539))
* extend e2e 2 pxes timeout. strip color codes for error_regex.
([73820e4](73820e4))
* flake
([6cc9e81](6cc9e81))
* fuzzer on staking asset handler constructor test
([#13101](#13101))
([d936285](d936285))
* invalid getCommittee function
([#13072](#13072))
([327341f](327341f))
* mac publish should use clang 18 like x-compiler, and use it
([#12983](#12983))
([7b83c45](7b83c45))
* make circuit parsing deterministic
([#11772](#11772))
([76ef873](76ef873))
* parse away trailing slash from consensus host
([#12577](#12577))
([6701806](6701806))
* prerelease versions should be pushed to install.aztec.network
([#13086](#13086))
([c4e6039](c4e6039))
* smoke
([#13060](#13060))
([7756b15](7756b15))
* some flake additions
([58638f1](58638f1))
* sponsored fpc arg parsed correctly
([#12976](#12976))
([#12977](#12977))
([a85f530](a85f530))
* starting the sandbox with no pxe should still deploy initial test
accounts
([#13047](#13047))
([d92d895](d92d895))
* Syntax error when running tests via jest after tsc build
([#13051](#13051))
([f972db9](f972db9))
* Use the correct image in aztec start
([#13058](#13058))
([06285cd](06285cd))
* yolo fix
([91e2f4b](91e2f4b))
* yolo fix nightly
([b3b3259](b3b3259))
* yolo fix obvious thing to track fails.
([2fee630](2fee630))
* yolo flakes
([e3b030a](e3b030a))
* yolo set -x
([bfd3205](bfd3205))
* yolo we suspect the halt is making tests fail that would have passed
([04e3fa2](04e3fa2))


### Miscellaneous

* `getIndexedTaggingSecretAsSender` oracle cleanup
([#13015](#13015))
([8e71e55](8e71e55))
* Add a script to generate cpp files for AVM2
([#13091](#13091))
([7bb43a9](7bb43a9))
* add default native proving for cli wallet
([#12855](#12855))
([c0f773c](c0f773c))
* add default native proving for cli wallet retry
([#13028](#13028))
([b2f4785](b2f4785))
* Alpha testnet into master
([#13033](#13033))
([d98fdbd](d98fdbd))
* AVM TS - move tag validation outside of instruction constructors
([#13038](#13038))
([45548ab](45548ab)),
closes
[#12934](#12934)
* **avm:** final codegen nuking
([#13089](#13089))
([9c82f3f](9c82f3f))
* **avm:** remove codegen (all but flavor)
([#13079](#13079))
([e1f2bdd](e1f2bdd))
* **bb:** minor acir buf C++ improvements
([#13042](#13042))
([1ebd044](1ebd044))
* boxes dep cleanup
([#12979](#12979))
([6540b7c](6540b7c))
* **ci:** less catch all e2e_p2p flakes
([#12737](#12737))
([2134634](2134634))
* comprehensive cleanup of translator flavor and use inheritance
properly in flavors
([#13041](#13041))
([dc5f78f](dc5f78f))
* compress storage footprint
([#12871](#12871))
([58c110f](58c110f))
* display warning when installing bb versions &lt; 0.82.0
([#13027](#13027))
([7247fe7](7247fe7))
* **docs:** Update docs on fees and various other updates
([#12929](#12929))
([1dec907](1dec907))
* dump dmesg/net/cpu/mem usage at end of ci run
([#12967](#12967))
([8877792](8877792))
* fix governance util issue
([#13043](#13043))
([d768d26](d768d26))
* redundant if in affine from projective constructor
([#13045](#13045))
([3a7ba2d](3a7ba2d))
* remove addition of dummy ops in mock circuit producer
([#13003](#13003))
([a64d1dc](a64d1dc))
* remove dummy ops in decider pk
([#13049](#13049))
([da6d021](da6d021))
* replace relative paths to noir-protocol-circuits
([e1b88f6](e1b88f6))
* replace relative paths to noir-protocol-circuits
([849b4b0](849b4b0))
* replace relative paths to noir-protocol-circuits
([18a02d6](18a02d6))
* Revert "chore: add default native proving for cli wallet
([#12855](#12855))"
([#13013](#13013))
([98e2576](98e2576))
* Speed up and deflake sentinel test
([#13078](#13078))
([27f1eca](27f1eca))
* **testnet:** making consensus host mandatory input
([#12716](#12716))
([d47c74a](d47c74a))
* towards no more mock op_queues
([#12984](#12984))
([fefffa7](fefffa7))
* update bb version for noir 1.0.0-beta.0+
([#13026](#13026))
([dd68074](dd68074))
* update CODEOWNERS to reflect new sync method
([#12998](#12998))
([a3d1915](a3d1915))


### Documentation

* Add fees to cli reference
([#12884](#12884))
([4a0fd58](4a0fd58))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants