feat(SSA): always simplify AsSlice intrinsic in ACIR functions#10279
feat(SSA): always simplify AsSlice intrinsic in ACIR functions#10279
Conversation
Changes to circuit sizes
🧾 Summary (10% most significant diffs)
Full diff report 👇
|
There was a problem hiding this comment.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Test Suite Duration'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.
| Benchmark suite | Current: 87176a2 | Previous: 2f73a86 | Ratio |
|---|---|---|---|
test_report_zkpassport_noir-ecdsa_ |
3 s |
2 s |
1.50 |
test_report_zkpassport_noir_rsa_ |
2 s |
1 s |
2 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @TomAFrench
There was a problem hiding this comment.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Execution Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.
| Benchmark suite | Current: cce2476 | Previous: e34f4ee | Ratio |
|---|---|---|---|
rollup-block-root-single-tx |
0.003 s |
0.002 s |
1.50 |
rollup-checkpoint-merge |
0.004 s |
0.003 s |
1.33 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @TomAFrench
There was a problem hiding this comment.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Compilation Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.
| Benchmark suite | Current: 5872785 | Previous: ca04e9c | Ratio |
|---|---|---|---|
private-kernel-reset |
8.442 s |
6.98 s |
1.21 |
sha512-100-bytes |
1.907 s |
1.58 s |
1.21 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @TomAFrench
vezenovm
left a comment
There was a problem hiding this comment.
I am unsure whether constant instruction simplification is the right place for this code. Upon inserting an AsSlice we will always generate this alternative code. At that point this logic may as well just live in SSA gen. If we figure out why there is a Brillig byte code regression maybe we could even get rid of the AsSlice intrinsic entirely (and thus the as_slice_length pass as well).
| // ``` | ||
| // | ||
| // We don't do this for Brillig because it sometimes leads to more opcodes. | ||
| if dfg.runtime().is_acir() { |
There was a problem hiding this comment.
Could we reduce nesting here with an inverted condition and early return
| Instruction::MakeArray { elements, typ: Type::Slice(element_types.clone()) }; | ||
| let slice_length = dfg.make_constant(length.into(), NumericType::length_type()); | ||
| let new_slice = | ||
| dfg.insert_instruction_and_results(make_array, block, None, call_stack).first(); |
There was a problem hiding this comment.
There is a make_array helper function
I think what happens is that we need code to copy each of the element from the array into the slice. The bigger the array, more ops are needed for this. While with the current |
There was a problem hiding this comment.
ACVM Benchmarks
Details
| Benchmark suite | Current: cce2476 | Previous: e34f4ee | Ratio |
|---|---|---|---|
purely_sequential_opcodes |
262581 ns/iter (± 582) |
263115 ns/iter (± 1711) |
1.00 |
perfectly_parallel_opcodes |
231922 ns/iter (± 4599) |
232833 ns/iter (± 1560) |
1.00 |
perfectly_parallel_batch_inversion_opcodes |
2794302 ns/iter (± 1864) |
2796347 ns/iter (± 10387) |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Test Suite Duration
Details
| Benchmark suite | Current: cce2476 | Previous: e34f4ee | Ratio |
|---|---|---|---|
test_report_AztecProtocol_aztec-packages_noir-projects_aztec-nr |
117 s |
128 s |
0.91 |
test_report_AztecProtocol_aztec-packages_noir-projects_noir-contracts |
156 s |
131 s |
1.19 |
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_blob |
308 s |
287 s |
1.07 |
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_private-kernel-lib |
239 s |
226 s |
1.06 |
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_types |
123 s |
125 s |
0.98 |
test_report_noir-lang_noir-bignum_ |
161 s |
164 s |
0.98 |
test_report_noir-lang_noir_bigcurve_ |
336 s |
319 s |
1.05 |
test_report_noir-lang_sha256_ |
14 s |
15 s |
0.93 |
test_report_noir-lang_sha512_ |
13 s |
14 s |
0.93 |
test_report_zkpassport_noir-ecdsa_ |
1 s |
1 s |
1 |
test_report_zkpassport_noir_rsa_ |
1 s |
2 s |
0.50 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Execution Time
Details
| Benchmark suite | Current: cce2476 | Previous: e34f4ee | Ratio |
|---|---|---|---|
private-kernel-inner |
0.011 s |
0.011 s |
1 |
private-kernel-reset |
0.146 s |
0.147 s |
0.99 |
private-kernel-tail |
0.009 s |
0.009 s |
1 |
rollup-block-root-first-empty-tx |
0.003 s |
0.003 s |
1 |
rollup-block-root-single-tx |
0.003 s |
0.002 s |
1.50 |
rollup-block-root |
0.004 s |
0.004 s |
1 |
rollup-checkpoint-merge |
0.004 s |
0.003 s |
1.33 |
rollup-checkpoint-root-single-block |
11.4 s |
11.3 s |
1.01 |
rollup-checkpoint-root |
11.6 s |
11.2 s |
1.04 |
rollup-root |
0.004 s |
0.004 s |
1 |
rollup-tx-base-private |
0.299 s |
0.298 s |
1.00 |
rollup-tx-base-public |
0.239 s |
0.238 s |
1.00 |
rollup-tx-merge |
0.002 s |
0.002 s |
1 |
semaphore-depth-10 |
0.009 s |
0.009 s |
1 |
sha512-100-bytes |
0.049 s |
0.076 s |
0.64 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Compilation Time
Details
| Benchmark suite | Current: cce2476 | Previous: e34f4ee | Ratio |
|---|---|---|---|
private-kernel-inner |
1.72 s |
1.746 s |
0.99 |
private-kernel-reset |
7 s |
7.32 s |
0.96 |
private-kernel-tail |
1.412 s |
1.356 s |
1.04 |
rollup-block-root-first-empty-tx |
1.41 s |
1.43 s |
0.99 |
rollup-block-root-single-tx |
1.41 s |
1.34 s |
1.05 |
rollup-block-root |
1.43 s |
1.43 s |
1 |
rollup-checkpoint-merge |
1.462 s |
1.452 s |
1.01 |
rollup-checkpoint-root-single-block |
200 s |
210 s |
0.95 |
rollup-checkpoint-root |
207 s |
196 s |
1.06 |
rollup-root |
1.494 s |
1.688 s |
0.89 |
rollup-tx-base-private |
21.3 s |
18.08 s |
1.18 |
rollup-tx-base-public |
76.56 s |
77.18 s |
0.99 |
rollup-tx-merge |
1.382 s |
1.362 s |
1.01 |
semaphore-depth-10 |
0.803 s |
0.784 s |
1.02 |
sha512-100-bytes |
1.665 s |
1.907 s |
0.87 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Opcode count
Details
| Benchmark suite | Current: cce2476 | Previous: e34f4ee | Ratio |
|---|---|---|---|
private-kernel-inner |
14544 opcodes |
14544 opcodes |
1 |
private-kernel-reset |
70415 opcodes |
70415 opcodes |
1 |
private-kernel-tail |
11680 opcodes |
11680 opcodes |
1 |
rollup-block-root-first-empty-tx |
1364 opcodes |
1364 opcodes |
1 |
rollup-block-root-single-tx |
1048 opcodes |
1048 opcodes |
1 |
rollup-block-root |
2409 opcodes |
2409 opcodes |
1 |
rollup-checkpoint-merge |
2130 opcodes |
2130 opcodes |
1 |
rollup-checkpoint-root-single-block |
962015 opcodes |
962015 opcodes |
1 |
rollup-checkpoint-root |
963375 opcodes |
963375 opcodes |
1 |
rollup-root |
2630 opcodes |
2630 opcodes |
1 |
rollup-tx-base-private |
263908 opcodes |
263908 opcodes |
1 |
rollup-tx-base-public |
245185 opcodes |
245185 opcodes |
1 |
rollup-tx-merge |
1486 opcodes |
1486 opcodes |
1 |
semaphore-depth-10 |
5699 opcodes |
5699 opcodes |
1 |
sha512-100-bytes |
13173 opcodes |
13173 opcodes |
1 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Artifact Size
Details
| Benchmark suite | Current: cce2476 | Previous: e34f4ee | Ratio |
|---|---|---|---|
private-kernel-inner |
714.3 KB |
714.3 KB |
1 |
private-kernel-reset |
1864 KB |
1864 KB |
1 |
private-kernel-tail |
546.2 KB |
546.2 KB |
1 |
rollup-block-root-first-empty-tx |
179.6 KB |
179.6 KB |
1 |
rollup-block-root-single-tx |
177.9 KB |
177.9 KB |
1 |
rollup-block-root |
257.9 KB |
257.9 KB |
1 |
rollup-checkpoint-merge |
370.6 KB |
370.6 KB |
1 |
rollup-checkpoint-root-single-block |
27640.2 KB |
27640.2 KB |
1 |
rollup-checkpoint-root |
27685.9 KB |
27685.9 KB |
1 |
rollup-root |
411.4 KB |
411.4 KB |
1 |
rollup-tx-base-private |
4909.3 KB |
4909.3 KB |
1 |
rollup-tx-base-public |
4555.4 KB |
4555.4 KB |
1 |
rollup-tx-merge |
186.1 KB |
186.1 KB |
1 |
semaphore-depth-10 |
570.9 KB |
570.9 KB |
1 |
sha512-100-bytes |
506.3 KB |
506.3 KB |
1 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Execution Memory
Details
| Benchmark suite | Current: cce2476 | Previous: e34f4ee | Ratio |
|---|---|---|---|
private-kernel-inner |
253.63 MB |
253.63 MB |
1 |
private-kernel-reset |
287.24 MB |
287.24 MB |
1 |
private-kernel-tail |
243.5 MB |
243.5 MB |
1 |
rollup-block-root |
337.87 MB |
337.87 MB |
1 |
rollup-checkpoint-merge |
336.67 MB |
336.67 MB |
1 |
rollup-checkpoint-root-single-block |
1020 MB |
1020 MB |
1 |
rollup-checkpoint-root |
1020 MB |
1020 MB |
1 |
rollup-root |
337.87 MB |
337.87 MB |
1 |
rollup-tx-base-private |
451.42 MB |
451.42 MB |
1 |
rollup-tx-base-public |
466.49 MB |
466.49 MB |
1 |
rollup-tx-merge |
336.13 MB |
336.13 MB |
1 |
semaphore_depth_10 |
73.7 MB |
73.7 MB |
1 |
sha512_100_bytes |
71.96 MB |
71.96 MB |
1 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Compilation Memory
Details
| Benchmark suite | Current: cce2476 | Previous: e34f4ee | Ratio |
|---|---|---|---|
private-kernel-inner |
254.47 MB |
254.47 MB |
1 |
private-kernel-reset |
493.18 MB |
493.18 MB |
1 |
private-kernel-tail |
244.63 MB |
244.63 MB |
1 |
rollup-block-root-first-empty-tx |
339.38 MB |
339.38 MB |
1 |
rollup-block-root-single-tx |
337.41 MB |
337.41 MB |
1 |
rollup-block-root |
340.43 MB |
340.43 MB |
1 |
rollup-checkpoint-merge |
339.76 MB |
339.76 MB |
1 |
rollup-checkpoint-root-single-block |
5910 MB |
5910 MB |
1 |
rollup-checkpoint-root |
5920 MB |
5920 MB |
1 |
rollup-root |
341.31 MB |
341.31 MB |
1 |
rollup-tx-base-private |
1070 MB |
1070 MB |
1 |
rollup-tx-base-public |
2890 MB |
2890 MB |
1 |
rollup-tx-merge |
336.59 MB |
336.59 MB |
1 |
semaphore_depth_10 |
92.18 MB |
92.18 MB |
1 |
sha512_100_bytes |
185.49 MB |
185.48 MB |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
Yeah I was assuming there was some additional overhead from array gets + make_array vs. as_slice in Brillig gen. The regressions in #10279 (comment) don't look horrible to be honest. Maybe we are ok with just eating that overhead in exchange for simplifying the compiler. I do think this ACIR change can be moved to SSA gen either way though. |
Do you mean, instead of simplifying AsSlice we handle it during ssa-gen and directly output these chain of instructions?
In that program I tried changing the input array: the larger it is, the larger the Brillig code gets. For example right now it's 238 instructions with 20 elements, but if we bump it to 200 elements it's 1138 instructions. So... I'm not sure if we should go forward with it, I don't know if test_programs are representative of real programs. Ideally, we translate The SSA for this program: fn main(array: [Field; 3]) -> pub Field {
let slice = array.as_slice();
let slice = slice.insert(0, 1);
slice[0]
}is: The last In the test program I didn't think it was the same case as before, but it seems it is! All those We can see that the array gets that were on I think I'm going to try this optimization in a separate PR. It seems relatively simple and we don't end up generating a huge list of |
|
Out of curiosity, I tried the above in this PR: #10295 Initially
So by now #10295 is a monster PR doing a bit of each of the above, just to see how much things can be optimized. I'll probably close this as at this point it might be risky to introduce these changes, but at least I was able to confirm that these optimizations can impact performance. |
Yes. The reason I say this is because the instruction simplification logic you added operates over all value kinds (constant and dynamic). So when we insert an
I also think it would be ideal translate
Yes this looks like it would be good as a separate optimization. |

Description
Problem
Resolves #10278
Summary
Just trying this out, out of curiosity, because it's relatively simple to implement.
Additional Context
Documentation
Check one:
PR Checklist
cargo fmton default settings.