feat: hash contract class logs with poseidon + add preimage to blob (#12061)
Conversation
// TODO(MW): use chunked poseidon if cheaper? Use variable poseidon with length of non-0 fields?
// Safety: TODO(MW): constraining below length in the base rollup. Is this good/safe?
let length = unsafe { find_index_hint_from_end(log_to_emit, |elem| !is_empty(elem)) };
let log_hash =
    std::hash::poseidon2::Poseidon2::hash(log_to_emit, CONTRACT_CLASS_LOG_SIZE_IN_FIELDS);
// Safety: the below only exists to broadcast the raw log, so we can provide it to the base rollup later to be constrained.
unsafe { emit_contract_class_unencrypted_log_private(contract_address, log, counter) };
Two Qs here:
- I'm just hashing the entire (padded) log here because it's easy and clean. I could hash just the N input fields. This is a bit more expensive in the circuits (not by much, I'd use the chunk variable hash) but might be a lot faster elsewhere. I already have code to hash N of M fields, and check the remaining M - N fields are all zero.
- I'm carrying the log length (N) throughout the kernels to track gas since we are only adding N things to the blob. I'm not sure this is even required atm because logs are set to cost 0? I'm also not constraining this length until the base rollup when we verify the hash. I'm not sure if this is ok.
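The "hash just the N input fields, and check the remaining M - N fields are all zero" idea can be sketched roughly as follows. This is purely illustrative: SHA-256 stands in for the variable-length Poseidon2 chunk hash, `bigint` stands in for field elements, and the function name is made up.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical sketch: hash only the first `length` fields, and enforce that
// every field beyond `length` is zero. In-circuit this would be a constraint;
// here it is a runtime check. SHA-256 stands in for Poseidon2.
function hashSubarrayVariable(fields: bigint[], length: number): string {
  for (let i = length; i < fields.length; i++) {
    if (fields[i] !== 0n) {
      throw new Error(`non-zero field at index ${i} beyond claimed length ${length}`);
    }
  }
  return createHash('sha256')
    .update(fields.slice(0, length).map(f => f.toString(16)).join(','))
    .digest('hex');
}

// A padded log: 3 real fields followed by zero padding.
const paddedLog = [7n, 8n, 9n, 0n, 0n];
const digest = hashSubarrayVariable(paddedLog, 3);
console.log(digest.length); // 64 hex chars
```

With this shape, the padded and unpadded forms of the same log hash identically, while a non-zero field hiding past the claimed length is rejected.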
I'd potentially go with the unpadded approach, since our recent discussions (via that hackmd) suggest that eventually (probably in a separate PR), we don't actually need to hash in this circuit at all.
We probably do need to track the length, because ultimately these logs should cost a relative fortune. And eventually (as per prev paragraph) only the Base will need to constrain this length (because we'll remove constrained hashing of this log from private-land).
because we'll remove constrained hashing of this log from private-land
Shall I just do this now, and make the oracle hash again? I don't remember now - but wasn't there some reason we weren't happy with the original oracle hash even though we verified the hash later?
Sorry for the delay; I need to sort out my notification settings.
Shall I just do this now, and make the oracle hash again?
Can do! It can either be an oracle, or just make the function unconstrained? But you messaged 14 hours ago, so maybe you've already chosen an approach, which I'll find out when I read further down this page :)
wasn't there some reason we weren't happy with the original oracle hash even though we verified the hash later?
I can't remember 😬
Just FYI, Alvaro found the unpadded hash is insecure unless we also constrain the length in the registerer (which just adds gates; a fixed-length hash is about the same gate count). From @sirasistant:
I think that opens the attack vector:
- Private chooses a shorter length
- Private only hashes the shorter length
- Base rollup receives the shorter preimage
- Hashes match
- We publish incomplete bytecode
I will revert to a fixed-length hash.
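The attack vector above can be made concrete with a small stand-in sketch (SHA-256 in place of Poseidon2; all names illustrative): if nothing binds the claimed length, the base rollup cannot tell a truncated preimage from the full one.

```typescript
import { createHash } from 'node:crypto';

// Stand-in for a variable-length hash that covers only the first `n` fields.
const hashPrefix = (fields: bigint[], n: number): string =>
  createHash('sha256').update(fields.slice(0, n).map(String).join(',')).digest('hex');

const fullBytecodeLog = [1n, 2n, 3n, 4n]; // honest, complete log
const truncatedLog = [1n, 2n];            // attacker publishes only a prefix

// Private side maliciously claims length 2 and hashes only that prefix;
// the base rollup, given the truncated preimage, recomputes the same hash.
console.log(hashPrefix(fullBytecodeLog, 2) === hashPrefix(truncatedLog, 2)); // true
```

Constraining the claimed length where the log is produced (the registerer), or hashing the whole fixed-length padded array, is what closes this gap.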
yarn-project/archiver/src/archiver/kv_archiver_store/log_store.ts (resolved, outdated)
yarn-project/archiver/src/archiver/memory_archiver_store/memory_archiver_store.ts (resolved, outdated)
Tracking an address alongside the contract class log to aid filters in the archiver. The address is always constrained to be that of the registerer, but many archiver tests use random addresses, so they brick without ugly hacks.
👀
Sorry - this file is changed by yarn format and I forgot to remove from staged - nothing to do with PR. If the next CI run passes, I'll remove it.
iAmMichaelConnor
left a comment
Wooooh I read it all! Thanks for doing all this! It's quite scary how much of the codebase needs to be touched to change a very very thin "slice" of functionality. It's essentially just changing a public input of the private context, and it blows up into thousands of lines.
Anyway, I've written comments, and a lot of them will be because this is the first time I'm looking at most of these files. :)
let log_hash = poseidon2_hash_subarray_variable(log_to_emit, length);
// Safety: the below only exists to broadcast the raw log, so we can provide it to the base rollup later to be constrained.
unsafe {
    emit_contract_class_unencrypted_log_private(contract_address, log_to_emit, counter);
Forewarning, I still don't really know my way around the codebase or our naming conventions.
Forewarning, sometimes my review comments are pedantic.
I think I've seen a naming pattern for other oracles which just inform typescript about something (e.g. when lining up a hint for some later kernel / rollup circuit): the word "notify" is used. Perhaps this oracle function should be called notify_....
I've not really touched the oracle that much, but since we actually add state in ts which is different to the state tracked here (raw log vs hashed log), maybe emit still stands. I think notify is a different pattern, but I'm unsure?
@@ -217,13 +220,18 @@ pub contract ContractClassRegisterer {
// cannot prove non-registration. Therefore, it is possible that a malicious oracle might prevent sequencers
I think some of this comment is out of date now, since your special poseidon2_hash_subarray_variable function is able to hash the whole thing in-circuit, so the purpose of the oracle is now just to inform typescript, rather than its previous purpose as a hack to compute the log_hash.
True, I was hoping the reviewer would give me some guidance on this. If we move the hash to the oracle / unconstrained as planned, this comment is still correct, isn't it? Shall I change it, then change it back?
noir-projects/noir-protocol-circuits/crates/types/src/utils/arrays.nr (two resolved, outdated threads)
// TODO(MW): Lazily used /2 instead of CONTRACT_CLASS_LOG_DATA_SIZE_IN_FIELDS, because this keeps overfilling block blobs
// Below line gives error 'Type instantiation is excessively deep and possibly infinite. ts(2589)'
// makeTuple(CONTRACT_CLASS_LOG_DATA_SIZE_IN_FIELDS, Fr.random);
const fields = Array.from({ length: Math.ceil(CONTRACT_CLASS_LOG_DATA_SIZE_IN_FIELDS / 2) }, () => Fr.random());
return new ContractClassLog(
  await AztecAddress.random(),
  fields.concat(Array.from({ length: Math.floor(CONTRACT_CLASS_LOG_DATA_SIZE_IN_FIELDS / 2) }, () => Fr.ZERO)),
);
Maybe comment something like "Only half of the contract class log's fields get filled with random values here (ideally all of the fields would be filled with random values), because...". It took my brain a sec to understand this.
Can we not ignore the error, like I saw you do elsewhere?
No - it's a nightmare because they are tuples, not arrays. I couldn't get tuples defined in our usual way working at all
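The half-random construction being discussed can be expressed as a plain-array helper. Everything here is a hypothetical illustration: `halfRandomFields` and `FIELD_COUNT` are made-up names, and `bigint` stands in for `Fr`; the point is just that ceil(N/2) random fields plus floor(N/2) zeros always reconstitute exactly N fields, for odd or even N.

```typescript
// Hypothetical helper mirroring the workaround above: fill the first half of a
// fixed-size field array with random values and pad the rest with zeros, using
// plain arrays to avoid deep tuple instantiation (TS2589). `bigint` stands in for Fr.
function halfRandomFields(size: number, random: () => bigint): bigint[] {
  const filled = Array.from({ length: Math.ceil(size / 2) }, random);
  const padding = Array.from({ length: Math.floor(size / 2) }, () => 0n);
  // ceil(size/2) + floor(size/2) === size, so the total length is always `size`.
  return filled.concat(padding);
}

const FIELD_COUNT = 9; // stand-in for CONTRACT_CLASS_LOG_DATA_SIZE_IN_FIELDS
const fields = halfRandomFields(FIELD_COUNT, () => 1n + BigInt(Math.floor(Math.random() * 100)));
console.log(fields.length); // 9
```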
}

async hash() {
  return await poseidon2Hash(this.getEmittedFields());
Maybe comment that this is one of two approaches to hashing this data, to disambiguate from the weird AVM way of hashing the data.
I don't think these logs are ever hashed in the AVM way at all? Information about functions is hashed in the AVM way, but not these logs AFAIK
/**
 * Gets either revertible or non revertible contract class logs emitted by this tx.
 * @param revertible - true for revertible only logs, false for non revertible only logs.
Oh gosh, I didn't even pay attention to any logic that discerned between rev and non-rev in the .nr files...
I didn't need to change it because before this PR we use a log hash in the kernels. After this PR we... use a log hash in the kernels :)
LeilaWang
left a comment
Agree with some of Mike's comments, but maybe they can be addressed in followup prs. It looks great to me!
After this PR (AztecProtocol/aztec-packages#12061) we were overflowing the trace on contract class registrations. This was not caught by tests due to:
- Insufficient tests with full proving (and the ones we have deploy without proving!)
- Insufficient WASM testing: this overflow pushed the trace over 4GB
Somewhat clean implementation (was "quick and dirty") which:
- Hashes contract class logs with poseidon and constrains them in the registerer.

EDIT: Added log validation to tx in this commit: cb739a2
EDIT 2: We may not need to constrain the hash in the registerer, so can revert to the oracle hashing the log and leaving the constraining to the base rollup in a future PR.