Skip to content

fix(ssa): Constant fold Brillig calls using the SSA interpreter#9365

Closed
vezenovm wants to merge 17 commits intomasterfrom
mv/const-eval-with-ssa-interpreter
Closed

fix(ssa): Constant fold Brillig calls using the SSA interpreter#9365
vezenovm wants to merge 17 commits intomasterfrom
mv/const-eval-with-ssa-interpreter

Conversation

@vezenovm
Copy link
Contributor

@vezenovm vezenovm commented Jul 30, 2025

Description

Problem*

Resolves #9015
Resolves #9097

This also resolves a panic from a Noir sync when attempting to debug log an array value (the same error as #9097)

Let's take the snippet from #9097:

fn main() {
    // Safety:
    let result = unsafe { foo([0]) };
    println(result);
}

unconstrained fn foo(bytes: [u8; 1]) -> Field {
    for _ in 0..bytes.len() {
        let _ = bytes[0];
    }
    0
}

We can see the following SSA when we fold constants with Brillig:

Details

After Inlining Brillig Calls (1) (step 42):
acir(inline) impure fn main f0 {
  b0():
    v1 = make_array [u8 0] : [u8; 1]                    // src/main.nr:3:32
    call f2(u1 1, Field 0)                              // std/lib.nr:40:9
    return
}
brillig(inline) predicate_pure fn foo f1 {
  b0(v0: [u8; 1]):
    inc_rc v0                                           // src/main.nr:8:17
    return Field 0
}
brillig(inline) impure fn print_unconstrained f2 {
  b0(v0: u1, v1: Field):
    v13 = make_array b"{\"kind\":\"field\"}"            // std/lib.nr:40:9
    call print(v0, v1, v13, u1 0)                       // std/lib.nr:34:5
    return
}

Remove Unreachable Instructions (2) (step 43): 0 ms
After Remove Unreachable Instructions (2) (step 43):
acir(inline) impure fn main f0 {
  b0():
    v1 = make_array [u8 0] : [u8; 1]                    // src/main.nr:3:32
    call f1(u1 1, Field 0)                              // std/lib.nr:40:9
    return
}
brillig(inline) predicate_pure fn print_unconstrained f1 {
  b0(v0: u1, v1: Field):
    v13 = make_array b"{\"kind\":\"field\"}"            // std/lib.nr:40:9
    call print(v0, v1, v13, u1 0)                       // std/lib.nr:34:5
    return
}

We can see that one Brillig function was folding into f0. However, the print remains. However, the Brillig context structure contains all the artifacts already has a set mapping for function ids to Brillig artifacts. Thus, when we remove functions after Brillig gen and change their IDs (e.g., like when normalizing) we end up fetching an artifact for the wrong function when it comes times to generate an entry point. So we were fetching

Summary*

I realized this could have been fixed for #9097 by simply updating the IDs point to the Brillig artifacts as part of normalization. However, this felt brittle and I felt the problem laid out above still feels like it could come up elsewhere.

Aside the benefit of avoiding extra Brillig compilation, switching to using the SSA interpreter for constant folding Brillig entry points also avoids these kinds of situations where Brillig context needs to be updated due to secondary SSA passes post Brillig gen.

Additional Context

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@vezenovm vezenovm marked this pull request as draft July 30, 2025 23:41
@vezenovm
Copy link
Contributor Author

regression_7451 +2 ❌ +200.00% +2,848 ❌ +16752.94%
inline_decompose_hint_brillig_call +15 ❌ +1500.00% +74 ❌ +435.29%
strings +1 ❌ +3.45% +3 ❌ +9.09%

Hmm looks like we are getting some regressions. I was not expecting any regression so I will have to investigate before we can merge.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Test Suite Duration'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_rollup-lib 909 s 682 s 1.33
test_report_noir-lang_sha512_ 216 s 13 s 16.62
test_report_zkpassport_noir_rsa_ 1 s 0 s +∞

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@github-actions
Copy link
Contributor

github-actions bot commented Jul 31, 2025

Changes to Brillig bytecode sizes

Generated at commit: de310253ba179dc73ad18b4b234ed8381668b804, compared to commit: ee2ac1a04e85134079fbe557d4029d3f64aff7b6

🧾 Summary (10% most significant diffs)

Program Brillig opcodes (+/-) %
global_var_regression_simple_inliner_max -1 ✅ -2.56%
global_var_regression_simple_inliner_zero -1 ✅ -2.56%
binary_operator_overloading_inliner_max -26 ✅ -6.79%

Full diff report 👇
Program Brillig opcodes (+/-) %
nested_array_in_slice_inliner_max 953 (-2) -0.21%
nested_array_in_slice_inliner_min 809 (-2) -0.25%
nested_array_in_slice_inliner_zero 809 (-2) -0.25%
array_oob_regression_7965_inliner_max 259 (-2) -0.77%
array_oob_regression_7965_inliner_min 259 (-2) -0.77%
array_oob_regression_7965_inliner_zero 259 (-2) -0.77%
higher_order_functions_inliner_zero 625 (-5) -0.79%
nested_array_dynamic_inliner_max 1,723 (-15) -0.86%
nested_array_dynamic_inliner_min 1,431 (-15) -1.04%
nested_array_dynamic_inliner_zero 1,431 (-15) -1.04%
global_consts_inliner_max 197 (-4) -1.99%
global_consts_inliner_zero 197 (-4) -1.99%
global_var_regression_simple_inliner_max 38 (-1) -2.56%
global_var_regression_simple_inliner_zero 38 (-1) -2.56%
binary_operator_overloading_inliner_max 357 (-26) -6.79%

@github-actions
Copy link
Contributor

github-actions bot commented Jul 31, 2025

Changes to number of Brillig opcodes executed

Generated at commit: de310253ba179dc73ad18b4b234ed8381668b804, compared to commit: ee2ac1a04e85134079fbe557d4029d3f64aff7b6

🧾 Summary (10% most significant diffs)

Program Brillig opcodes (+/-) %
higher_order_functions_inliner_zero -12 ✅ -1.03%
global_var_regression_simple_inliner_max -1 ✅ -3.23%
global_var_regression_simple_inliner_zero -1 ✅ -3.23%

Full diff report 👇
Program Brillig opcodes (+/-) %
poseidon_bn254_hash_width_3_inliner_max 137,347 (+234) +0.17%
regression_5252_inliner_max 760,074 (+1,200) +0.16%
poseidonsponge_x5_254_inliner_max 152,905 (+240) +0.16%
poseidon_bn254_hash_width_3_inliner_zero 167,759 (+234) +0.14%
global_consts_inliner_max 1,371 (-4) -0.29%
global_consts_inliner_zero 1,371 (-4) -0.29%
array_oob_regression_7965_inliner_max 313 (-1) -0.32%
array_oob_regression_7965_inliner_min 313 (-1) -0.32%
array_oob_regression_7965_inliner_zero 313 (-1) -0.32%
nested_array_dynamic_inliner_min 2,828 (-10) -0.35%
nested_array_dynamic_inliner_zero 2,828 (-10) -0.35%
nested_array_dynamic_inliner_max 2,644 (-10) -0.38%
higher_order_functions_inliner_zero 1,153 (-12) -1.03%
global_var_regression_simple_inliner_max 30 (-1) -3.23%
global_var_regression_simple_inliner_zero 30 (-1) -3.23%

@vezenovm vezenovm added the bench-show Display benchmark results on PR label Jul 31, 2025
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACVM Benchmarks

Details
Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
purely_sequential_opcodes 246544 ns/iter (± 770) 246963 ns/iter (± 660) 1.00
perfectly_parallel_opcodes 218489 ns/iter (± 3724) 219542 ns/iter (± 2180) 1.00
perfectly_parallel_batch_inversion_opcodes 2779355 ns/iter (± 2092) 2776812 ns/iter (± 1780) 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compilation Time

Details
Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
private-kernel-inner 1.794 s 1.686 s 1.06
private-kernel-reset 7.714 s 7.822 s 0.99
private-kernel-tail 1.378 s 1.296 s 1.06
rollup-base-private 15.94 s 15.34 s 1.04
rollup-base-public 13.64 s 13.32 s 1.02
rollup-block-root-empty 21.5 s 20.48 s 1.05
rollup-block-root-single-tx 231 s 189 s 1.22
rollup-block-root 229 s 211 s 1.09
rollup-merge 1.314 s 1.378 s 0.95
rollup-root 1.438 s 1.456 s 0.99
semaphore-depth-10 0.806 s 0.759 s 1.06
sha512-100-bytes 1.788 s 1.747 s 1.02

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Artifact Size

Details
Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
private-kernel-inner 709.1 KB 709.1 KB 1
private-kernel-reset 2033.1 KB 2033.1 KB 1
private-kernel-tail 535.9 KB 535.9 KB 1
rollup-base-private 4313.9 KB 4314.6 KB 1.00
rollup-base-public 3327.1 KB 3327.1 KB 1
rollup-block-root-empty 3841.7 KB 3842.8 KB 1.00
rollup-block-root-single-tx 30722.2 KB 30722.5 KB 1.00
rollup-block-root 30771.3 KB 30771.4 KB 1.00
rollup-merge 183 KB 183 KB 1
rollup-root 388.9 KB 388.9 KB 1
semaphore-depth-10 632.1 KB 632.1 KB 1
sha512-100-bytes 525.2 KB 525.2 KB 1

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execution Time

Details
Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
private-kernel-inner 0.014 s 0.014 s 1
private-kernel-reset 0.155 s 0.154 s 1.01
private-kernel-tail 0.01 s 0.01 s 1
rollup-base-private 0.264 s 0.264 s 1
rollup-base-public 0.164 s 0.162 s 1.01
rollup-block-root 13.2 s 12.9 s 1.02
rollup-merge 0.002 s 0.002 s 1
rollup-root 0.004 s 0.004 s 1
semaphore-depth-10 0.019 s 0.019 s 1
sha512-100-bytes 0.101 s 0.096 s 1.05

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opcode count

Details
Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
private-kernel-inner 14792 opcodes 14792 opcodes 1
private-kernel-reset 68868 opcodes 68868 opcodes 1
private-kernel-tail 11176 opcodes 11176 opcodes 1
rollup-base-private 221419 opcodes 221419 opcodes 1
rollup-base-public 159972 opcodes 159972 opcodes 1
rollup-block-root-empty 68110 opcodes 68110 opcodes 1
rollup-block-root-single-tx 963865 opcodes 963865 opcodes 1
rollup-block-root 965153 opcodes 965153 opcodes 1
rollup-merge 1411 opcodes 1411 opcodes 1
rollup-root 2631 opcodes 2631 opcodes 1
semaphore-depth-10 5763 opcodes 5763 opcodes 1
sha512-100-bytes 13173 opcodes 13173 opcodes 1

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execution Memory

Details
Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
private-kernel-inner 205.02 MB 205.02 MB 1
private-kernel-reset 241.84 MB 241.84 MB 1
private-kernel-tail 194.26 MB 194.26 MB 1
rollup-base-private 498.48 MB 498.48 MB 1
rollup-base-public 431.12 MB 431.12 MB 1
rollup-block-root 1500 MB 1500 MB 1
rollup-merge 326.44 MB 326.44 MB 1
rollup-root 328.85 MB 328.85 MB 1
semaphore_depth_10 69.51 MB 69.51 MB 1
sha512_100_bytes 55 MB 55 MB 1

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compilation Memory

Details
Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
private-kernel-inner 238.23 MB 238.26 MB 1.00
private-kernel-reset 547.42 MB 547.41 MB 1.00
private-kernel-tail 212.14 MB 212.14 MB 1
rollup-base-private 1340 MB 1340 MB 1
rollup-base-public 1420 MB 1420 MB 1
rollup-block-root-empty 1140 MB 1090 MB 1.05
rollup-block-root-single-tx 10670 MB 9340 MB 1.14
rollup-block-root 10670 MB 9350 MB 1.14
rollup-merge 330.52 MB 330.52 MB 1
rollup-root 339.74 MB 339.69 MB 1.00
semaphore_depth_10 104.78 MB 104.76 MB 1.00
sha512_100_bytes 234.74 MB 234.84 MB 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test Suite Duration

Details
Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
test_report_AztecProtocol_aztec-packages_noir-projects_aztec-nr 103 s 103 s 1
test_report_AztecProtocol_aztec-packages_noir-projects_noir-contracts 114 s 113 s 1.01
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_blob 194 s 168 s 1.15
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_private-kernel-lib 209 s 220 s 0.95
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_reset-kernel-lib 34 s 32 s 1.06
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_rollup-lib 909 s 682 s 1.33
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_types 100 s 101 s 0.99
test_report_noir-lang_noir-bignum_ 399 s 413 s 0.97
test_report_noir-lang_noir_bigcurve_ 353 s 299 s 1.18
test_report_noir-lang_noir_json_parser_ 55 s
test_report_noir-lang_sha512_ 216 s 13 s 16.62
test_report_zkpassport_noir-ecdsa_ 1 s 2 s 0.50
test_report_zkpassport_noir_rsa_ 1 s 0 s +∞

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Collaborator

@asterite asterite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@vezenovm vezenovm marked this pull request as ready for review July 31, 2025 20:09
@vezenovm
Copy link
Contributor Author

vezenovm commented Jul 31, 2025

sha512_100_bytes +221 ❌ +1.68% +443 ❌ +1.12%

Looks like we still have a minor regression here. Most likely due to us not folding Brillig calls as we constant fold all the other instructions. I'm going to look into a fixpoint loop quickly to see if it removes the regression. Otherwise, I'm going to merge as this regression is small and we in fact see improvements in other tests. Optimizing this can then come in a separate PR as it would most likely require changing the interpreter interface to not borrow Ssa so that we can evaluate calls during the main constant_folding pass.

@vezenovm
Copy link
Contributor Author

I ended up making a fixed point loop in constant_folding. The loop is contained in a top-level method in constant_folding and fold_constant_brillig_calls is now a submodule of constant_folding.

@vezenovm vezenovm enabled auto-merge July 31, 2025 21:33
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Compilation Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: c987274 Previous: ee2ac1a Ratio
rollup-block-root-single-tx 231 s 189 s 1.22

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@vezenovm
Copy link
Contributor Author

I ended up making a fixed point loop in constant_folding. The loop is contained in a top-level method in constant_folding and fold_constant_brillig_calls is now a submodule of constant_folding.

ugh, we seem to be getting large time regressions in the sha512 library https://github.com/noir-lang/noir/actions/runs/16660549208/job/47156670574?pr=9365. Perhaps we do just have to bite the bullet and change the interpreter interface.

@vezenovm
Copy link
Contributor Author

vezenovm commented Jul 31, 2025

ugh, we seem to be getting large time regressions in the sha512 library https://github.com/noir-lang/noir/actions/runs/16660549208/job/47156670574?pr=9365. Perhaps we do just have to bite the bullet and change the interpreter interface.

If we want to get this fix in and are ok with a small circuit size regression I will just revert to e83fe33 where everything is green. Otherwise, I will take a look at how we can change the interpreter interface to avoid borrowing the entire Ssa for this case.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Execution Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: 6ccd460 Previous: d7e2c17 Ratio
private-kernel-inner 0.02 s 0.016 s 1.25

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

//! This improves constant folding across function boundaries, particularly for small utility
//! Brillig functions that can be inlined
//!
//! Note: Only calls to Brillig functions from ACIR functions (Brillig entry points) are evaluated here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem correct based on the changes to brillig bytecode sizes shown in the benchmarls.

@vezenovm
Copy link
Contributor Author

vezenovm commented Sep 2, 2025

Superseded by #9655

@vezenovm vezenovm closed this Sep 2, 2025
auto-merge was automatically disabled September 2, 2025 13:47

Pull request was closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bench-show Display benchmark results on PR

Projects

None yet

3 participants