
fix!: Prefix HeapVector with its size in ForeignCallParams#9288

Closed
aakoshh wants to merge 22 commits into master from af/9271-fix-print-slice-alt

Conversation

@aakoshh
Contributor

@aakoshh aakoshh commented Jul 23, 2025

Description

Problem*

#9287 tries to handle slices only in the printing process, which is a bit clunky and full of caveats. This is an alternative approach which prefixes slices when the Brillig VM reads them from memory.

Summary*

Changes `brillig_vm::VM::get_memory_values` to prefix `HeapVector` data with its size, which for slices should be their capacity. By doing this in `get_memory_values`, before the values are converted to `ForeignCallParam::Array`, we can easily limit the handling to slices and leave `HeapArray`, whose length is statically known, untouched; nested types are then not a problem during decoding.

This is a breaking change because any foreign call handler decoding slices would need to expect 1 extra field in the data.
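As a hypothetical illustration of what the extra field means for a handler (the function name and data layout here are made up for the sketch, not Noir's actual decoding API):

```rust
// Hypothetical sketch of the breaking change from a handler's point of view:
// a slice parameter now arrives with one extra leading field holding its size.
fn decode_slice(fields: &[u64]) -> (u64, &[u64]) {
    // Before this change: `fields` held only the slice's flattened data.
    // After: fields[0] is the size prefix and the data follows it.
    let size = fields[0];
    (size, &fields[1..])
}

fn main() {
    let param = [3, 10, 20, 30]; // size-prefixed slice [10, 20, 30]
    let (size, data) = decode_slice(&param);
    assert_eq!(size, 3);
    assert_eq!(data, &[10, 20, 30][..]);
}
```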

Additional Context

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@aakoshh aakoshh changed the title Try prefixing just slices fix!: Prefix HeapVector with its length in ForeignCallParams Jul 23, 2025
Contributor

@github-actions github-actions bot left a comment


⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Test Suite Duration'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

| Benchmark suite | Current: 2c4dfd3 | Previous: ee66c98 | Ratio |
|---|---|---|---|
| test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_rollup-lib | 3 s | 2 s | 1.50 |
| test_report_zkpassport_noir_rsa_ | 2 s | 1 s | 2 |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@aakoshh aakoshh marked this pull request as ready for review July 23, 2025 13:46
@aakoshh aakoshh marked this pull request as draft July 23, 2025 14:12
@aakoshh
Contributor Author

aakoshh commented Jul 23, 2025

Okay, so I misunderstood the size in `get_memory_values`. Initially I thought it was the capacity, but it turns out to be a bit more involved: for a slice of tuples it's capacity*item_size, e.g. `[(1,2), (3,4)]` has a size of 4. Then I wanted to implement `flattened_size` for the `PrintableType` and divide by it to get the number of items I need to consume; however, I didn't know what to do for `Enum`, where the number of fields depends on the tag at runtime, and it turns out it crashes if I try to print it in #9294

Then for some reason I thought the size would correspond to the number of fields in the data (or actually I inherited that approach from #9287), but that's not true: `[[1,2], [3,4]]` has a size of 2, i.e. 2 arrays in the slice, and `[(0, true, [1, 2]), (3, false, [5, 6])]` has a size of 6, despite all 8 fields being in the slice: the arrays only contribute once.

The formula I came up with to derive the number of items in the slice involves calculating a type-dependent flattened size, where tuples and structs are flattened, but an array's length and content are treated as 1. We use this size to divide the capacity inserted by the VM, which gives us the number of items we need to consume in total to clear the slice from the data.
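A minimal sketch of that type-dependent flattened size, using a simplified stand-in enum rather than Noir's actual `PrintableType`:

```rust
// Simplified stand-in for Noir's PrintableType, just enough for the sketch.
#[derive(Debug)]
enum PrintableType {
    Field,
    Array { length: usize },
    Tuple { fields: Vec<PrintableType> },
}

// Tuples (and structs) are flattened; an array contributes 1 regardless of
// its length or contents, matching the counting rule described above.
fn flattened_size(typ: &PrintableType) -> usize {
    match typ {
        PrintableType::Field => 1,
        PrintableType::Array { .. } => 1,
        PrintableType::Tuple { fields } => fields.iter().map(flattened_size).sum(),
    }
}

fn main() {
    // [(0, true, [1, 2]), (3, false, [5, 6])]: each tuple item flattens to 3
    // (two fields plus one for the array), so a capacity of 6 divided by 3
    // yields the 2 items in the slice.
    let item = PrintableType::Tuple {
        fields: vec![
            PrintableType::Field,
            PrintableType::Field, // the bool, treated as a single field
            PrintableType::Array { length: 2 },
        ],
    };
    let capacity = 6;
    println!("{}", capacity / flattened_size(&item)); // prints 2
}
```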

@github-actions
Contributor

github-actions bot commented Jul 23, 2025

Changes to Brillig bytecode sizes

Generated at commit: ed6eeb8912364a6f192bcdb8b64654d954da76ff, compared to commit: d88542d6ba7ca633f7456fb9346ac3cfcff7b70d

🧾 Summary (10% most significant diffs)

| Program | Brillig opcodes (+/-) | % |
|---|---|---|
| regression_9271_inliner_max | +420 ❌ | +88.79% |
| regression_9271_inliner_min | +420 ❌ | +88.79% |
| regression_9271_inliner_zero | +420 ❌ | +88.79% |

Full diff report 👇

| Program | Brillig opcodes (+/-) | % |
|---|---|---|
| regression_9271_inliner_max | 893 (+420) | +88.79% |
| regression_9271_inliner_min | 893 (+420) | +88.79% |
| regression_9271_inliner_zero | 893 (+420) | +88.79% |

@github-actions
Contributor

github-actions bot commented Jul 23, 2025

Changes to number of Brillig opcodes executed

Generated at commit: ed6eeb8912364a6f192bcdb8b64654d954da76ff, compared to commit: d88542d6ba7ca633f7456fb9346ac3cfcff7b70d

🧾 Summary (10% most significant diffs)

| Program | Brillig opcodes (+/-) | % |
|---|---|---|
| regression_9271_inliner_max | +422 ❌ | +94.41% |
| regression_9271_inliner_min | +422 ❌ | +94.41% |
| regression_9271_inliner_zero | +422 ❌ | +94.41% |

Full diff report 👇

| Program | Brillig opcodes (+/-) | % |
|---|---|---|
| regression_9271_inliner_max | 869 (+422) | +94.41% |
| regression_9271_inliner_min | 869 (+422) | +94.41% |
| regression_9271_inliner_zero | 869 (+422) | +94.41% |

@github-actions
Contributor

github-actions bot commented Jul 23, 2025

Changes to circuit sizes

Generated at commit: ed6eeb8912364a6f192bcdb8b64654d954da76ff, compared to commit: d88542d6ba7ca633f7456fb9346ac3cfcff7b70d

🧾 Summary (10% most significant diffs)

| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| regression_9271 | +1 ❌ | +33.33% | 0 ➖ | 0.00% |

Full diff report 👇

| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| regression_9271 | 4 (+1) | +33.33% | 18 (0) | 0.00% |

Contributor

@github-actions github-actions bot left a comment


⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Compilation Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

| Benchmark suite | Current: e551e90 | Previous: 7408cbe | Ratio |
|---|---|---|---|
| sha512-100-bytes | 1.951 s | 1.553 s | 1.26 |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

Contributor

@github-actions github-actions bot left a comment


⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Opcode count'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.10.

| Benchmark suite | Current: e551e90 | Previous: 7408cbe | Ratio |
|---|---|---|---|
| sha512-100-bytes | 22169 opcodes | 13173 opcodes | 1.68 |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

Contributor

@vezenovm vezenovm left a comment


So it seems you ultimately decided to pass the capacity (when reading the foreign call inputs in the VM) due to the call inputs all being flattened?

```rust
Some(field_iterator.consumed() + capacity)
} else {
    None
```

```rust
let flattened_capacity = field_iterator
```
Contributor


I definitely prefer this approach to that of the parent. It didn't make sense to me that we should have to toggle with_length_prefix for both arrays and slices. As we have the types we should know that a slice always has a capacity prefix and an array never has that prefix.

Contributor Author


Let me explain the difficulties I saw with fully reasoning out which parameters need to be prefixed and which don't. It might be that there is a way to do it; I haven't fully thought it through.

So we have a number of `ForeignCallParam`s, and where one is an `Array` it might come from a `HeapVector` or a `HeapArray`; we can't tell without looking at the type. However, the `PrintableType`s we have are of a different granularity: say we have a `Tuple` as a parameter for the print, and it happens to contain a `Slice` and an `Array` field, interspersed with some others. These are going to appear as separate `ForeignCallParam`s, but to tell which one is the `Slice` and which is the `Array`, we have to unpack the `Tuple` and match it against the parameters.

Now, if one of the `Tuple` fields is itself a `Tuple`, I think it gets flattened out as well, so we can't just iterate over the parameters; we have to recursively unpack the types until they match the fields. It seemed messy.

`decode_` does it by recursing over the type tree while consuming flattened data, so I thought that if I prefixed the vectors, I would know where to expect a prefix and where not to. But that turned out to be rather artificial, as the `Array` itself can contain complex data, with only the top one prefixed.

Then I tried to prefix at the source, and doing so makes much more sense: we attach not some artificial "byte size" but an actual type-dependent vector capacity. That made me finally understand what this capacity even means, which isn't trivial, but at least it depends on the types, not the size of the data, which makes it easier to attach in the interpreter, for example.
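A hypothetical sketch of that recursive decoding scheme, with simplified stand-in types and names (not Noir's actual API), showing how a decoder recurses over a type tree while consuming flattened fields:

```rust
// Simplified stand-ins for the type and value trees, just for the sketch.
#[derive(Debug, PartialEq)]
enum Type {
    Field,
    Tuple(Vec<Type>),
}

#[derive(Debug, PartialEq)]
enum Value {
    Field(u64),
    Tuple(Vec<Value>),
}

// Decode one value of the given type, consuming as many flat fields as the
// type requires; nested tuples recurse, consuming fields in order.
fn decode(fields: &mut impl Iterator<Item = u64>, typ: &Type) -> Value {
    match typ {
        Type::Field => Value::Field(fields.next().expect("ran out of fields")),
        Type::Tuple(types) => {
            Value::Tuple(types.iter().map(|t| decode(fields, t)).collect())
        }
    }
}

fn main() {
    // (1, (2, 3)) arrives as the flat data [1, 2, 3].
    let typ = Type::Tuple(vec![Type::Field, Type::Tuple(vec![Type::Field, Type::Field])]);
    let mut data = [1u64, 2, 3].into_iter();
    let value = decode(&mut data, &typ);
    assert_eq!(
        value,
        Value::Tuple(vec![
            Value::Field(1),
            Value::Tuple(vec![Value::Field(2), Value::Field(3)])
        ])
    );
}
```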

@aakoshh
Contributor Author

aakoshh commented Jul 23, 2025

So it seems you ultimately decided to pass the capacity (when reading the foreign call inputs in the VM) due to the call inputs all being flattened?

The call inputs are not all flattened, although I kind of wish they were, because it's still a bit hard to understand why something is or isn't flattened. Maybe it follows how the variables in the SSA are?

For example, the members of a tuple are passed as separate parameters, but a slice of tuples is passed as a single array.

And then there is the revert data, which is already flattened when the printer sees it, but cannot contain slices. The output of a foreign call also comes flattened. Debug variable content is stored flattened.

I thought that if we include the capacity at the source in the VM, we can limit this special handling to slices in a way that is easy to explain, without radically rewriting the decoding part to match types with fields precisely. I'm not sure if this is an acceptable break in backwards compatibility, but I wanted to throw it out there. Doing it here also limits the scope of the change; for example, debug variables automatically get it.

@aakoshh aakoshh marked this pull request as ready for review July 23, 2025 19:39
@aakoshh aakoshh requested a review from a team July 23, 2025 19:40
@aakoshh aakoshh changed the title fix!: Prefix HeapVector with its length in ForeignCallParams fix!: Prefix HeapVector with its size in ForeignCallParams Jul 23, 2025
@aakoshh aakoshh changed the base branch from af/9271-fix-print-slice to master July 24, 2025 06:57
Contributor

@vezenovm vezenovm left a comment


I'm ok with this change, but I know @TomAFrench wanted to take a look before we merged. Either way we should make sure we do not break any Aztec oracles before merging.

I hacked together an alternative, #9321, that avoids prepending a capacity to slice foreign call inputs, which we could consider as well. It instead removes any padding from the vector after calling into `get_memory_values`.

@aakoshh
Contributor Author

aakoshh commented Jul 25, 2025

Closing in favour of #9321

@aakoshh aakoshh closed this Jul 25, 2025