fix!: Prefix HeapVector with its size in ForeignCallParams #9288
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Test Suite Duration'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.
| Benchmark suite | Current: 2c4dfd3 | Previous: ee66c98 | Ratio |
|---|---|---|---|
| test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_rollup-lib | 3 s | 2 s | 1.50 |
| test_report_zkpassport_noir_rsa_ | 2 s | 1 s | 2 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @TomAFrench
Okay, so I misunderstood this at first. The formula I came up with to derive the number of items in the slice involves calculating a type-dependent flattened size, where tuples and structs are flattened, but an array's length and content are treated as 1. We use this size to divide the capacity inserted in the VM, which gives us the number of items we need to consume in total to clear the slice from the data.
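The formula above can be sketched as follows. This is a minimal, hypothetical illustration: `PrintableType` here is a simplified stand-in for the real type enum, and the function names are illustrative, not the codebase's API.

```rust
// Simplified stand-in for the real printable-type enum.
enum PrintableType {
    Field,
    Tuple(Vec<PrintableType>),
    Struct(Vec<PrintableType>),
    Array { length: usize },
}

// Type-dependent flattened size: tuples and structs are flattened
// recursively, while an array counts as a single item regardless of
// its length or contents.
fn flattened_size(typ: &PrintableType) -> usize {
    match typ {
        PrintableType::Field => 1,
        PrintableType::Tuple(fields) | PrintableType::Struct(fields) => {
            fields.iter().map(flattened_size).sum()
        }
        PrintableType::Array { .. } => 1,
    }
}

// Number of items in a slice: the capacity recorded by the VM divided
// by the flattened size of one element.
fn slice_item_count(capacity: usize, element: &PrintableType) -> usize {
    capacity / flattened_size(element)
}

fn main() {
    // A slice of (Field, Field) tuples with capacity 6 holds 3 items.
    let tuple = PrintableType::Tuple(vec![PrintableType::Field, PrintableType::Field]);
    assert_eq!(slice_item_count(6, &tuple), 3);
    // An array field contributes 1 no matter its length.
    assert_eq!(flattened_size(&PrintableType::Array { length: 10 }), 1);
}
```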
Changes to Brillig bytecode sizes
🧾 Summary (10% most significant diffs)
Full diff report 👇
Changes to number of Brillig opcodes executed
🧾 Summary (10% most significant diffs)
Full diff report 👇
Changes to circuit sizes
🧾 Summary (10% most significant diffs)
Full diff report 👇
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Compilation Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.
| Benchmark suite | Current: e551e90 | Previous: 7408cbe | Ratio |
|---|---|---|---|
| sha512-100-bytes | 1.951 s | 1.553 s | 1.26 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @TomAFrench
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Opcode count'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.10.
| Benchmark suite | Current: e551e90 | Previous: 7408cbe | Ratio |
|---|---|---|---|
| sha512-100-bytes | 22169 opcodes | 13173 opcodes | 1.68 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @TomAFrench
vezenovm
left a comment
So it seems you ultimately decided to pass the capacity (when reading the foreign call inputs in the VM) due to the call inputs all being flattened?
```rust
Some(field_iterator.consumed() + capacity)
} else {
    None
let flattened_capacity = field_iterator
```
I definitely prefer this approach to that of the parent. It didn't make sense to me that we should have to toggle `with_length_prefix` for both arrays and slices. As we have the types, we should know that a slice always has a capacity prefix and an array never has that prefix.
Let me explain the difficulties I saw with fully reasoning out which parameters need to be padded and which don't. There might be a way to do it; I haven't fully thought it through.
So we have a number of `ForeignCallParam`s, and where one is an `Array`, it might come from a `HeapVector` or a `HeapArray`; we don't know without looking at the type. However, the `PrintableType`s we have are of a different granularity: say we have a `Tuple` as a parameter for the print, and it happens to contain a `Slice` and an `Array` field, interspersed with some others. These are going to appear as separate `ForeignCallParam`s, but to tell which one is the `Slice` and which one is an `Array`, we have to unpack the `Tuple` and match it to the parameters.
Now, if one of the `Tuple` fields is itself a `Tuple`, then I think it gets flattened out as well, so we can't just iterate over the parameters; we have to recursively unpack the types until they match the fields. It seemed messy.
`decode_` does it by recursing over the type tree while consuming flattened data, so I thought that if I prefixed the vectors, I would know where to expect a prefix and where not to. But it turned out to be rather artificial, as the `Array` itself can contain complex data, and only the top one is prefixed.
Then I tried to prefix at the source, and doing so makes much more sense: we attach not some artificial "byte size" but an actual type-dependent vector capacity. That made me finally understand what this capacity even means. It isn't trivial, but at least it depends on the types, not the size of the data, which is easier to attach in the interpreter, for example.
The call inputs are not all flattened, although I kind of wish they were, because it's still a bit hard to understand why something is or isn't flattened. Maybe it reflects how the variables are in the SSA? For example, the members of a tuple are passed as separate parameters, but a slice of tuples is passed as a single array. And then there is the revert data, which is already flattened when the printer sees it, but cannot contain slices. The output of a foreign call also comes flattened. Debug variable content is stored flattened. I thought that if we include the capacity at the source in the VM, we can limit this special handling to slices in a way that is easy to explain, without having to radically rewrite the decoding part to match types with fields precisely. Not sure if this is an acceptable break in backwards compatibility, but I wanted to throw it out there. Doing it this way limits the scope of this change; for example, debug variables automatically get it.
vezenovm
left a comment
I'm ok with this change, but I know @TomAFrench wanted to take a look before we merged. Either way we should make sure we do not break any Aztec oracles before merging.
I hacked together an alternative, #9321, that avoids prepending a capacity to slice foreign call inputs, which we could consider as well. It instead removes any padding from the vector after calling into `get_memory_values`.
Closing in favour of #9321
Description
Problem*
#9287 tries to handle slices only in the printing process, which is a bit clunky and full of caveats. This is an alternative approach which prefixes slices when the Brillig VM reads them from memory.
Summary*
Changes `brillig_vm::VM::get_memory_values` to prefix `HeapVector` data by its size, which should be the capacity of slices. By doing it in `get_memory_values`, before the values are converted to `ForeignCallParam::Array`, we can easily limit this handling to just slices (but not `HeapArray`, where the length is known), and then nested types are not a problem during decoding.

This is a breaking change because any foreign call handler decoding slices will need to expect 1 extra field in the data.
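A minimal sketch of the idea, assuming a simplified `ValueOrArray`-style input and plain `u64` fields (the real types and function signatures in the VM differ): vectors get their size prepended at the source, while arrays keep their old encoding.

```rust
// Simplified stand-in for the VM's memory-input variants.
enum MemoryInput {
    HeapArray(Vec<u64>),
    HeapVector(Vec<u64>),
}

// Sketch of the get_memory_values behaviour described above:
// only HeapVector data is prefixed with its size (the slice
// capacity); HeapArray lengths are statically known, so arrays
// are passed through unchanged.
fn get_memory_values(input: MemoryInput) -> Vec<u64> {
    match input {
        MemoryInput::HeapArray(values) => values,
        MemoryInput::HeapVector(values) => {
            let mut out = vec![values.len() as u64];
            out.extend(values);
            out
        }
    }
}

fn main() {
    // Arrays: unchanged encoding.
    assert_eq!(get_memory_values(MemoryInput::HeapArray(vec![1, 2])), vec![1, 2]);
    // Vectors: one extra leading field carrying the size.
    assert_eq!(get_memory_values(MemoryInput::HeapVector(vec![1, 2])), vec![2, 1, 2]);
}
```

This is why the change is breaking: a foreign call handler that previously read two fields for a two-element slice now has to skip (or use) the leading size field first.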
Additional Context
Documentation*
Check one:
PR Checklist*
`cargo fmt` on default settings.