chore(inlining): Mark functions with <= 10 instructions and no control flow as inline always #8533

Merged
aakoshh merged 11 commits into master from
mv/mark-more-simple-funcs-to-inline
May 19, 2025

@vezenovm vezenovm commented May 15, 2025

Description

Problem*

Experimenting with resolving #8457

Summary*

We have a pass that inlines functions with a single instruction (or none). In general, small functions without any control flow should be inlined, since doing so almost always enables further optimizations.

This PR changes inline_functions_with_at_most_one_instruction to inline_simple_functions.

Per the doc comments in this PR, a function is considered simple when all of the following hold:

  • It contains no more than 10 instructions
  • It has only a single block (i.e. no control flow or conditional branches)
  • It is not marked with the no predicates inline type
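
The three conditions above can be read as a single predicate. Below is a minimal sketch in Rust, assuming a simplified stand-in for the compiler's SSA types. `Function`, `InlineType`, `MAX_INSTRUCTIONS`, and `is_simple_function` are hypothetical names chosen for illustration, not the actual identifiers in the Noir codebase.

```rust
/// Hypothetical name for the instruction limit; the PR description gives 10.
const MAX_INSTRUCTIONS: usize = 10;

/// Simplified stand-in for a function's inline type (assumed for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum InlineType {
    Inline,
    NoPredicates,
}

/// Simplified stand-in for an SSA function: one instruction count per basic block.
struct Function {
    inline_type: InlineType,
    block_instruction_counts: Vec<usize>,
}

/// Returns true when the function meets all three conditions listed above.
fn is_simple_function(func: &Function) -> bool {
    // A single block means no control flow or conditional branches.
    let single_block = func.block_instruction_counts.len() == 1;
    let total: usize = func.block_instruction_counts.iter().sum();
    single_block
        && total <= MAX_INSTRUCTIONS
        && func.inline_type != InlineType::NoPredicates
}
```

Under these assumptions, a one-block function with 3 instructions qualifies, while a function with several blocks, more than 10 instructions, or the no predicates inline type does not.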

Additional Context

See #8533 (comment) to understand the regressions for to_bytes_consistent_inliner_min.

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@vezenovm vezenovm added the bench-show Display benchmark results on PR label May 15, 2025

@github-actions github-actions bot left a comment

ACVM Benchmarks

Details
Benchmark suite Current: 8071d98 Previous: 1feac18 Ratio
purely_sequential_opcodes 255409 ns/iter (± 1185) 253639 ns/iter (± 322) 1.01
perfectly_parallel_opcodes 225339 ns/iter (± 3433) 222904 ns/iter (± 9810) 1.01
perfectly_parallel_batch_inversion_opcodes 3212357 ns/iter (± 4012) 3564404 ns/iter (± 6706) 0.90

This comment was automatically generated by workflow using github-action-benchmark.

…ine' into mv/mark-more-simple-funcs-to-inline

github-actions bot commented May 15, 2025

Changes to Brillig bytecode sizes

Generated at commit: 5064eee99fc9e74bc1253b784126ba637c4c81ec, compared to commit: 1feac18e7d54673f78d0d2d32c6c09d470a3841c

🧾 Summary (10% most significant diffs)

Program Brillig opcodes (+/-) %
to_bytes_consistent_inliner_min +53 ❌ +76.81%
wrapping_operations_inliner_min -58 ✅ -51.79%
regression_4124_inliner_min -28 ✅ -51.85%
inline_decompose_hint_brillig_call_inliner_min -96 ✅ -54.24%
missing_closure_env_inliner_min -31 ✅ -54.39%
trait_impl_base_type_inliner_min -129 ✅ -56.09%
fold_distinct_return_inliner_min -55 ✅ -59.78%
global_var_multiple_entry_points_nested_inliner_min -55 ✅ -59.78%
regression_unsafe_no_predicates_inliner_min -53 ✅ -60.92%
brillig_acir_as_brillig_inliner_min -70 ✅ -64.22%
brillig_calls_inliner_min -117 ✅ -64.29%
closures_mut_ref_inliner_min -66 ✅ -66.00%
import_inliner_min -80 ✅ -66.67%
regression_6734_inliner_min -48 ✅ -70.59%
assert_inliner_min -134 ✅ -83.75%
brillig_rc_regression_6123_inliner_zero -111 ✅ -84.73%

Full diff report 👇
Program Brillig opcodes (+/-) %
to_bytes_consistent_inliner_min 122 (+53) +76.81%
reference_counts_slices_inliner_0_inliner_min 1,216 (+157) +14.83%
reference_counts_inliner_max_inliner_min 1,048 (+120) +12.93%
reference_counts_inliner_0_inliner_min 1,048 (+120) +12.93%
reference_counts_inliner_min_inliner_min 1,048 (+120) +12.93%
pedersen_commitment_inliner_min 163 (+16) +10.88%
simple_shield_inliner_zero 678 (+62) +10.06%
brillig_cow_inliner_min 334 (+19) +6.03%
reference_counts_inliner_0_inliner_zero 896 (+49) +5.79%
reference_counts_inliner_min_inliner_zero 896 (+49) +5.79%
reference_counts_inliner_max_inliner_zero 896 (+49) +5.79%
reference_counts_slices_inliner_0_inliner_zero 1,005 (+49) +5.13%
uhashmap_inliner_max 11,589 (+3) +0.03%
hashmap_inliner_max 17,278 (-12) -0.07%
poseidonsponge_x5_254_inliner_min 3,010 (-10) -0.33%
poseidon_bn254_hash_width_3_inliner_min 4,742 (-20) -0.42%
regression_5252_inliner_min 3,376 (-20) -0.59%
uhashmap_inliner_zero 6,846 (-51) -0.74%
hashmap_inliner_min 8,760 (-81) -0.92%
regression_6674_1_inliner_zero 185 (-2) -1.07%
regression_6674_3_inliner_max 474 (-6) -1.25%
simple_shield_inliner_min 698 (-10) -1.41%
hashmap_inliner_zero 7,716 (-112) -1.43%
slices_inliner_zero 1,628 (-30) -1.81%
slice_coercion_inliner_zero 333 (-7) -2.06%
strings_inliner_zero 893 (-19) -2.08%
regression_6674_3_inliner_zero 410 (-9) -2.15%
array_dedup_regression_inliner_min 259 (-6) -2.26%
regression_bignum_inliner_min 319 (-8) -2.45%
slices_inliner_min 2,166 (-55) -2.48%
regression_6674_2_inliner_zero 185 (-5) -2.63%
conditional_1_inliner_min 517 (-14) -2.64%
conditional_1_inliner_zero 517 (-14) -2.64%
loop_break_regression_8319_inliner_min 209 (-6) -2.79%
pedersen_check_inliner_zero 406 (-12) -2.87%
brillig_pedersen_inliner_zero 406 (-12) -2.87%
conditional_regression_short_circuit_inliner_min 231 (-7) -2.94%
while_loop_break_regression_8521_inliner_min 176 (-6) -3.30%
conditional_regression_short_circuit_inliner_zero 204 (-7) -3.32%
array_to_slice_inliner_min 873 (-30) -3.32%
fold_numeric_generic_poseidon_inliner_min 577 (-20) -3.35%
7_function_inliner_zero 442 (-16) -3.49%
global_consts_inliner_min 230 (-9) -3.77%
regression_1144_1169_2399_6609_inliner_zero 891 (-35) -3.78%
global_slice_rc_regression_8259_inliner_min 225 (-9) -3.85%
uhashmap_inliner_min 7,261 (-300) -3.97%
brillig_cow_regression_inliner_min 1,212 (-55) -4.34%
to_be_bytes_inliner_min 200 (-10) -4.76%
hash_to_field_inliner_min 188 (-10) -5.05%
regression_6674_2_inliner_min 244 (-13) -5.06%
conditional_2_inliner_min 145 (-8) -5.23%
multi_scalar_mul_inliner_min 288 (-16) -5.26%
fold_2_to_17_inliner_min 343 (-20) -5.51%
regression_1144_1169_2399_6609_inliner_min 995 (-61) -5.78%
merkle_insert_inliner_min 379 (-24) -5.96%
global_array_rc_regression_8259_inliner_min 99 (-7) -6.60%
reference_only_used_as_alias_inliner_min 223 (-16) -6.69%
array_sort_inliner_min 452 (-33) -6.80%
regression_3394_inliner_min 115 (-9) -7.26%
regression_7128_inliner_min 124 (-10) -7.46%
slice_regex_inliner_min 1,861 (-152) -7.55%
ram_blowup_regression_inliner_min 230 (-20) -8.00%
simple_print_inliner_min 202 (-18) -8.18%
array_len_inliner_min 111 (-10) -8.26%
pedersen_hash_inliner_min 263 (-24) -8.36%
higher_order_functions_inliner_zero 646 (-60) -8.50%
signed_division_inliner_zero 203 (-19) -8.56%
prelude_inliner_min 190 (-18) -8.65%
fmtstr_with_global_inliner_min 106 (-11) -9.40%
loop_inliner_min 121 (-13) -9.70%
regression_3051_inliner_min 165 (-18) -9.84%
regression_method_cannot_be_found_inliner_min 73 (-9) -10.98%
fold_call_witness_condition_inliner_min 113 (-14) -11.02%
6_array_inliner_zero 341 (-43) -11.20%
fold_2_to_17_inliner_zero 315 (-40) -11.27%
brillig_pedersen_inliner_min 406 (-54) -11.74%
pedersen_check_inliner_min 406 (-54) -11.74%
derive_inliner_zero 283 (-38) -11.84%
comptime_variable_at_runtime_inliner_min 79 (-11) -12.22%
higher_order_functions_inliner_min 1,110 (-159) -12.53%
global_var_regression_entry_points_inliner_zero 62 (-9) -12.68%
hint_black_box_inliner_min 322 (-47) -12.74%
signed_division_inliner_min 203 (-30) -12.88%
hint_black_box_inliner_zero 312 (-47) -13.09%
strings_inliner_min 903 (-142) -13.59%
regression_6451_inliner_min 50 (-8) -13.79%
unary_operator_overloading_inliner_min 62 (-10) -13.89%
regression_6674_1_inliner_min 221 (-36) -14.01%
global_var_func_with_multiple_entry_points_inliner_zero 44 (-9) -16.98%
6_array_inliner_min 341 (-76) -18.23%
regression_7195_inliner_zero 39 (-9) -18.75%
fold_after_inlined_calls_inliner_min 34 (-8) -19.05%
regression_6674_3_inliner_min 674 (-163) -19.47%
regression_5045_inliner_min 132 (-33) -20.00%
conditional_regression_661_inliner_zero 118 (-30) -20.27%
to_le_bytes_inliner_min 125 (-32) -20.38%
unsafe_range_constraint_inliner_min 29 (-8) -21.62%
regression_7836_inliner_min 47 (-13) -21.67%
regression_7836_inliner_zero 47 (-13) -21.67%
conditional_regression_661_inliner_min 118 (-36) -23.38%
assign_mutation_in_lvalue_inliner_min 71 (-22) -23.66%
submodules_inliner_min 29 (-9) -23.68%
regression_11294_inliner_min 288 (-91) -24.01%
inline_never_basic_inliner_min 28 (-9) -24.32%
global_var_entry_point_used_in_another_entry_inliner_zero 58 (-19) -24.68%
regression_7744_inliner_min 56 (-19) -25.33%
global_var_regression_entry_points_inliner_min 73 (-25) -25.51%
regression_8235_inliner_min 35 (-12) -25.53%
7_function_inliner_min 452 (-158) -25.90%
fold_basic_inliner_min 28 (-10) -26.32%
1327_concrete_in_generic_inliner_min 24 (-9) -27.27%
embedded_curve_ops_inliner_min 305 (-121) -28.40%
global_var_multiple_entry_points_nested_inliner_zero 37 (-15) -28.85%
generics_inliner_min 170 (-69) -28.87%
slice_coercion_inliner_min 333 (-138) -29.30%
regression_7195_inliner_min 39 (-17) -30.36%
brillig_nested_arrays_inliner_zero 141 (-65) -31.55%
derive_inliner_min 358 (-168) -31.94%
5_over_inliner_min 42 (-20) -32.26%
traits_in_crates_1_inliner_min 28 (-14) -33.33%
traits_in_crates_2_inliner_min 28 (-14) -33.33%
references_inliner_zero 148 (-84) -36.21%
4_sub_inliner_min 38 (-22) -36.67%
binary_operator_overloading_inliner_min 224 (-130) -36.72%
brillig_rc_regression_6123_inliner_min 143 (-93) -39.41%
fold_basic_nested_call_inliner_min 30 (-20) -40.00%
debug_logs_inliner_min 5,318 (-3,661) -40.77%
brillig_nested_arrays_inliner_min 141 (-98) -41.00%
brillig_calls_array_inliner_zero 98 (-70) -41.67%
brillig_calls_array_inliner_min 112 (-83) -42.56%
references_inliner_min 230 (-171) -42.64%
global_var_entry_point_used_in_another_entry_inliner_min 58 (-44) -43.14%
global_var_func_with_multiple_entry_points_inliner_min 44 (-34) -43.59%
to_bytes_integration_inliner_min 91 (-85) -48.30%
regression_4088_inliner_min 26 (-26) -50.00%
wrapping_operations_inliner_min 54 (-58) -51.79%
regression_4124_inliner_min 26 (-28) -51.85%
inline_decompose_hint_brillig_call_inliner_min 81 (-96) -54.24%
missing_closure_env_inliner_min 26 (-31) -54.39%
trait_impl_base_type_inliner_min 101 (-129) -56.09%
fold_distinct_return_inliner_min 37 (-55) -59.78%
global_var_multiple_entry_points_nested_inliner_min 37 (-55) -59.78%
regression_unsafe_no_predicates_inliner_min 34 (-53) -60.92%
brillig_acir_as_brillig_inliner_min 39 (-70) -64.22%
brillig_calls_inliner_min 65 (-117) -64.29%
closures_mut_ref_inliner_min 34 (-66) -66.00%
import_inliner_min 40 (-80) -66.67%
regression_6734_inliner_min 20 (-48) -70.59%
assert_inliner_min 26 (-134) -83.75%
brillig_rc_regression_6123_inliner_zero 20 (-111) -84.73%

github-actions bot commented May 15, 2025

Changes to number of Brillig opcodes executed

Generated at commit: 5064eee99fc9e74bc1253b784126ba637c4c81ec, compared to commit: 1feac18e7d54673f78d0d2d32c6c09d470a3841c

🧾 Summary (10% most significant diffs)

Program Brillig opcodes (+/-) %
while_loop_break_regression_8521_inliner_min +99 ❌ +123.75%
regression_7195_inliner_min -49 ✅ -61.25%
global_var_func_with_multiple_entry_points_inliner_min -57 ✅ -61.29%
missing_closure_env_inliner_min -38 ✅ -63.33%
global_var_entry_point_used_in_another_entry_inliner_min -83 ✅ -64.34%
wrapping_operations_inliner_min -87 ✅ -65.41%
brillig_nested_arrays_inliner_min -274 ✅ -66.99%
brillig_calls_inliner_min -122 ✅ -68.54%
closures_mut_ref_inliner_min -79 ✅ -69.91%
import_inliner_min -92 ✅ -73.02%
brillig_acir_as_brillig_inliner_min -83 ✅ -73.45%
regression_6734_inliner_min -50 ✅ -73.53%
global_var_multiple_entry_points_nested_inliner_min -95 ✅ -75.40%
assert_inliner_min -105 ✅ -82.68%
brillig_rc_regression_6123_inliner_zero -177 ✅ -90.77%

Full diff report 👇
Program Brillig opcodes (+/-) %
while_loop_break_regression_8521_inliner_min 179 (+99) +123.75%
loop_break_regression_8319_inliner_min 722 (+99) +15.89%
to_bytes_consistent_inliner_min 510 (+33) +6.92%
pedersen_commitment_inliner_min 193 (+9) +4.89%
fold_call_witness_condition_inliner_min 70 (+2) +2.94%
fold_2_to_17_inliner_zero 1,041,023 (+9,369) +0.91%
regression_5252_inliner_min 917,163 (+29) +0.00%
uhashmap_inliner_max 144,425 (+3) +0.00%
poseidonsponge_x5_254_inliner_min 183,742 (-14) -0.01%
poseidon_bn254_hash_width_3_inliner_min 168,102 (-28) -0.02%
hashmap_inliner_max 51,498 (-18) -0.03%
regression_6674_3_inliner_max 1,206 (-6) -0.50%
fold_numeric_generic_poseidon_inliner_min 4,693 (-28) -0.59%
regression_6674_1_inliner_zero 687 (-5) -0.72%
conditional_1_inliner_min 1,908 (-14) -0.73%
conditional_1_inliner_zero 1,908 (-14) -0.73%
to_be_bytes_inliner_min 1,889 (-14) -0.74%
regression_6674_3_inliner_zero 1,146 (-9) -0.78%
global_consts_inliner_min 1,416 (-13) -0.91%
regression_6674_2_inliner_zero 687 (-8) -1.15%
ram_blowup_regression_inliner_min 286,550 (-3,598) -1.24%
brillig_cow_regression_inliner_min 194,580 (-2,446) -1.24%
7_function_inliner_zero 2,132 (-29) -1.34%
to_le_bytes_inliner_min 1,017 (-14) -1.36%
regression_7128_inliner_min 1,009 (-14) -1.37%
simple_shield_inliner_zero 2,117 (-32) -1.49%
slices_inliner_zero 2,739 (-42) -1.51%
hash_to_field_inliner_min 881 (-14) -1.56%
array_to_slice_inliner_min 1,928 (-34) -1.73%
regression_1144_1169_2399_6609_inliner_zero 2,257 (-42) -1.83%
slices_inliner_min 3,805 (-75) -1.93%
regression_bignum_inliner_min 462 (-12) -2.53%
fold_2_to_17_inliner_min 1,041,095 (-27,981) -2.62%
regression_1144_1169_2399_6609_inliner_min 2,550 (-72) -2.75%
brillig_cow_inliner_min 1,216 (-38) -3.03%
slice_coercion_inliner_zero 330 (-11) -3.23%
regression_6674_2_inliner_min 864 (-32) -3.57%
uhashmap_inliner_zero 168,672 (-8,207) -4.64%
higher_order_functions_inliner_zero 1,151 (-74) -6.04%
array_len_inliner_min 202 (-14) -6.48%
regression_6674_1_inliner_min 837 (-60) -6.69%
hint_black_box_inliner_min 859 (-65) -7.03%
brillig_pedersen_inliner_zero 588 (-45) -7.11%
pedersen_check_inliner_zero 588 (-45) -7.11%
array_sort_inliner_min 988 (-76) -7.14%
hint_black_box_inliner_zero 845 (-65) -7.14%
reference_only_used_as_alias_inliner_min 250 (-20) -7.41%
loop_inliner_min 185 (-15) -7.50%
regression_unsafe_no_predicates_inliner_min 24 (-2) -7.69%
hashmap_inliner_zero 66,723 (-5,561) -7.69%
regression_5045_inliner_min 23 (-2) -8.00%
uhashmap_inliner_min 176,586 (-18,673) -9.56%
7_function_inliner_min 2,146 (-227) -9.57%
regression_3394_inliner_min 114 (-13) -10.24%
simple_print_inliner_min 202 (-26) -11.40%
prelude_inliner_min 190 (-26) -12.04%
fmtstr_with_global_inliner_min 105 (-15) -12.50%
to_bytes_integration_inliner_min 2,068 (-319) -13.36%
regression_3051_inliner_min 165 (-26) -13.61%
strings_inliner_zero 1,568 (-248) -13.66%
merkle_insert_inliner_min 2,454 (-394) -13.83%
pedersen_hash_inliner_min 397 (-65) -14.07%
regression_6674_3_inliner_min 1,533 (-251) -14.07%
higher_order_functions_inliner_min 1,786 (-312) -14.87%
regression_method_cannot_be_found_inliner_min 72 (-13) -15.29%
simple_shield_inliner_min 1,832 (-339) -15.61%
hashmap_inliner_min 73,335 (-13,603) -15.65%
derive_inliner_zero 289 (-58) -16.71%
global_var_regression_entry_points_inliner_zero 54 (-13) -19.40%
unary_operator_overloading_inliner_min 56 (-14) -20.00%
conditional_regression_661_inliner_zero 114 (-29) -20.28%
regression_6451_inliner_min 40 (-12) -23.08%
multi_scalar_mul_inliner_min 57,064 (-17,560) -23.53%
6_array_inliner_zero 1,427 (-468) -24.70%
signed_division_inliner_zero 158 (-53) -25.12%
brillig_pedersen_inliner_min 588 (-202) -25.57%
pedersen_check_inliner_min 588 (-202) -25.57%
assign_mutation_in_lvalue_inliner_min 75 (-26) -25.74%
regression_11294_inliner_min 1,205 (-420) -25.85%
generics_inliner_min 236 (-85) -26.48%
global_var_func_with_multiple_entry_points_inliner_zero 36 (-13) -26.53%
conditional_regression_661_inliner_min 114 (-44) -27.85%
slice_coercion_inliner_min 330 (-128) -27.95%
regression_7836_inliner_min 36 (-14) -28.00%
regression_7836_inliner_zero 36 (-14) -28.00%
global_array_rc_regression_8259_inliner_min 134 (-54) -28.72%
fold_after_inlined_calls_inliner_min 26 (-12) -31.58%
references_inliner_zero 248 (-118) -32.24%
global_slice_rc_regression_8259_inliner_min 260 (-133) -33.84%
submodules_inliner_min 25 (-13) -34.21%
unsafe_range_constraint_inliner_min 23 (-12) -34.29%
embedded_curve_ops_inliner_min 316 (-169) -34.85%
inline_never_basic_inliner_min 24 (-13) -35.14%
global_var_multiple_entry_points_nested_inliner_zero 31 (-17) -35.42%
strings_inliner_min 1,582 (-881) -35.77%
fold_basic_inliner_min 24 (-14) -36.84%
1327_concrete_in_generic_inliner_min 22 (-13) -37.14%
references_inliner_min 386 (-229) -37.24%
signed_division_inliner_min 158 (-98) -38.28%
derive_inliner_min 351 (-238) -40.41%
6_array_inliner_min 1,427 (-1,053) -42.46%
traits_in_crates_1_inliner_min 24 (-18) -42.86%
traits_in_crates_2_inliner_min 24 (-18) -42.86%
binary_operator_overloading_inliner_min 228 (-172) -43.00%
debug_logs_inliner_min 5,342 (-4,152) -43.73%
5_over_inliner_min 36 (-28) -43.75%
regression_7195_inliner_zero 31 (-25) -44.64%
array_dedup_regression_inliner_min 675 (-558) -45.26%
brillig_nested_arrays_inliner_zero 135 (-115) -46.00%
global_var_entry_point_used_in_another_entry_inliner_zero 46 (-40) -46.51%
slice_regex_inliner_min 4,383 (-3,842) -46.71%
4_sub_inliner_min 34 (-30) -46.88%
comptime_variable_at_runtime_inliner_min 79 (-72) -47.68%
global_var_regression_entry_points_inliner_min 69 (-65) -48.51%
trait_impl_base_type_inliner_min 123 (-130) -51.38%
brillig_calls_array_inliner_zero 96 (-103) -51.76%
fold_basic_nested_call_inliner_min 26 (-28) -51.85%
brillig_calls_array_inliner_min 114 (-124) -52.10%
brillig_rc_regression_6123_inliner_min 204 (-227) -52.67%
regression_4124_inliner_min 22 (-32) -59.26%
fold_distinct_return_inliner_min 29 (-43) -59.72%
inline_decompose_hint_brillig_call_inliner_min 75 (-112) -59.89%
regression_4088_inliner_min 22 (-34) -60.71%
regression_7195_inliner_min 31 (-49) -61.25%
global_var_func_with_multiple_entry_points_inliner_min 36 (-57) -61.29%
missing_closure_env_inliner_min 22 (-38) -63.33%
global_var_entry_point_used_in_another_entry_inliner_min 46 (-83) -64.34%
wrapping_operations_inliner_min 46 (-87) -65.41%
brillig_nested_arrays_inliner_min 135 (-274) -66.99%
brillig_calls_inliner_min 56 (-122) -68.54%
closures_mut_ref_inliner_min 34 (-79) -69.91%
import_inliner_min 34 (-92) -73.02%
brillig_acir_as_brillig_inliner_min 30 (-83) -73.45%
regression_6734_inliner_min 18 (-50) -73.53%
global_var_multiple_entry_points_nested_inliner_min 31 (-95) -75.40%
assert_inliner_min 22 (-105) -82.68%
brillig_rc_regression_6123_inliner_zero 18 (-177) -90.77%

@github-actions github-actions bot left a comment

Compilation Time

Details
Benchmark suite Current: 8071d98 Previous: 1feac18 Ratio
private-kernel-inner 2.372 s 2.314 s 1.03
private-kernel-reset 6.782 s 8.546 s 0.79
private-kernel-tail 1.088 s 1.066 s 1.02
rollup-base-private 16.08 s 16.04 s 1.00
rollup-base-public 12.86 s 12.88 s 1.00
rollup-block-root-empty 1.364 s 1.306 s 1.04
rollup-block-root-single-tx 125 s 125 s 1
rollup-block-root 128 s 121 s 1.06
rollup-merge 1.132 s 1.098 s 1.03
rollup-root 1.776 s 1.7 s 1.04
semaphore-depth-10 0.835 s 0.836 s 1.00

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions bot left a comment

Artifact Size

Details
Benchmark suite Current: 8071d98 Previous: 1feac18 Ratio
private-kernel-inner 1127.1 KB 1127.1 KB 1
private-kernel-reset 2051.1 KB 2051.1 KB 1
private-kernel-tail 583.1 KB 583.1 KB 1
rollup-base-private 5123.1 KB 5123.2 KB 1.00
rollup-base-public 3945.1 KB 3941.8 KB 1.00
rollup-block-root-empty 256.8 KB 256.8 KB 1
rollup-block-root-single-tx 25674.8 KB 25679 KB 1.00
rollup-block-root 25714.3 KB 25706 KB 1.00
rollup-merge 181.7 KB 181.7 KB 1
rollup-root 477.8 KB 473.4 KB 1.01
semaphore-depth-10 636.3 KB 636.3 KB 1

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions bot left a comment

Execution Time

Details
Benchmark suite Current: 8367c35 Previous: 1feac18 Ratio
private-kernel-inner 0.028 s 0.028 s 1
private-kernel-reset 0.16 s 0.185 s 0.86
private-kernel-tail 0.011 s 0.011 s 1
rollup-base-private 0.305 s 0.304 s 1.00
rollup-base-public 0.192 s 0.192 s 1
rollup-block-root 11.5 s 12.1 s 0.95
rollup-merge 0.004 s 0.004 s 1
rollup-root 0.013 s 0.013 s 1
semaphore-depth-10 0.02 s 0.02 s 1

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions bot left a comment

Compilation Memory

Details
Benchmark suite Current: 8071d98 Previous: 1feac18 Ratio
private-kernel-inner 316.51 MB 316.51 MB 1
private-kernel-reset 572.22 MB 572.31 MB 1.00
private-kernel-tail 231.44 MB 231.43 MB 1.00
rollup-base-private 1440 MB 1440 MB 1
rollup-base-public 1470 MB 1470 MB 1
rollup-block-root-empty 400.74 MB 402.96 MB 0.99
rollup-block-root-single-tx 7220 MB 7220 MB 1
rollup-block-root 7220 MB 7220 MB 1
rollup-merge 385.03 MB 385.03 MB 1
rollup-root 446.38 MB 445.96 MB 1.00
semaphore_depth_10 128.47 MB 128.47 MB 1

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions bot left a comment

Execution Memory

Details
Benchmark suite Current: 8071d98 Previous: 1feac18 Ratio
private-kernel-inner 241.29 MB 241.29 MB 1
private-kernel-reset 263.55 MB 263.55 MB 1
private-kernel-tail 216.78 MB 216.78 MB 1
rollup-base-private 549.27 MB 549.27 MB 1
rollup-base-public 541.72 MB 541.72 MB 1
rollup-block-root 1450 MB 1450 MB 1
rollup-merge 370.98 MB 370.98 MB 1
rollup-root 376.99 MB 376.91 MB 1.00
semaphore_depth_10 93.04 MB 93.04 MB 1

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions bot left a comment

Test Suite Duration

Details
Benchmark suite Current: 8071d98 Previous: 1feac18 Ratio
test_report_AztecProtocol_aztec-packages_noir-projects_aztec-nr 66 s 65 s 1.02
test_report_AztecProtocol_aztec-packages_noir-projects_noir-contracts 99 s 100 s 0.99
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_blob 44 s 48 s 0.92
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_private-kernel-lib 194 s 192 s 1.01
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_rollup-lib 183 s 182 s 1.01
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_types 61 s 60 s 1.02
test_report_noir-lang_noir-bignum_ 353 s 375 s 0.94
test_report_noir-lang_noir_bigcurve_ 225 s 227 s 0.99
test_report_noir-lang_sha512_ 30 s 30 s 1
test_report_zkpassport_noir_rsa_ 23 s 23 s 1

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions github-actions bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Test Suite Duration'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: 5f4cd2d Previous: e2a52b7 Ratio
test_report_noir-lang_noir_bigcurve_ 284 s 230 s 1.23
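
The alert above compares the ratio of the current measurement to the previous one against the 1.20 threshold. A minimal sketch of that check, assuming the ratio is simply current divided by previous and is printed to two decimals; `ratio`, `round2`, `exceeds_threshold`, and `ALERT_THRESHOLD` are hypothetical names, not part of github-action-benchmark.

```rust
/// Alert threshold quoted in the benchmark comment above.
const ALERT_THRESHOLD: f64 = 1.20;

/// Ratio of the current measurement to the previous one (above 1.0 means slower).
fn ratio(current: f64, previous: f64) -> f64 {
    current / previous
}

/// Rounds to two decimal places, which is how the ratios appear to be printed.
fn round2(x: f64) -> f64 {
    (x * 100.0).round() / 100.0
}

/// True when the slowdown exceeds the alert threshold.
fn exceeds_threshold(current: f64, previous: f64) -> bool {
    ratio(current, previous) > ALERT_THRESHOLD
}
```

For the row above, `ratio(284.0, 230.0)` is about 1.23, which exceeds 1.20 and therefore triggers the alert.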

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

vezenovm commented May 15, 2025

So we have this size regression:

to_bytes_consistent_inliner_min +53 ❌ +76.81%

And this execution trace regression:

to_bytes_consistent_inliner_min 510 (+33) +6.92%
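
The two percentages quoted above follow from the reports' `current (+delta)` columns: the previous value is `current - delta`, and the percentage is the delta relative to that previous value. A small sketch to recheck them; `percent_change` is a hypothetical helper, not part of the benchmark tooling.

```rust
/// Percentage change, given the current value and its delta from the previous value.
fn percent_change(current: i64, delta: i64) -> f64 {
    // The report lists `current (+delta)`, so the baseline is current - delta.
    let previous = current - delta;
    100.0 * delta as f64 / previous as f64
}
```

With the rows from the full reports, `percent_change(122, 53)` gives roughly 76.81 (bytecode size) and `percent_change(510, 33)` roughly 6.92 (opcodes executed).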

We have this final SSA on master:

brillig(inline) predicate_pure fn main f0 {
  b0(v0: Field):
    v3 = call f1(v0) -> [u8; 31]
    v5 = call f1(Field 2040124) -> [u8; 31]
    jmp b1(u32 0)
  b1(v1: u32):
    v8 = lt v1, u32 31
    jmpif v8 then: b2, else: b3
  b2():
    v9 = array_get v5, index v1 -> u8
    v10 = array_get v3, index v1 -> u8
    constrain v9 == v10
    v12 = unchecked_add v1, u32 1
    jmp b1(v12)
  b3():
    return
}
brillig(inline) predicate_pure fn to_be_bytes f1 {
  b0(v0: Field):
    v3 = call to_be_radix(v0, u32 256) -> [u8; 31]
    return v3
}

And this SSA on this PR:

brillig(inline) predicate_pure fn main f0 {
  b0(v0: Field):
    v4 = call to_be_radix(v0, u32 256) -> [u8; 31]
    v9 = make_array [u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 0, u8 31, u8 33, u8 60] : [u8; 31]
    jmp b1(u32 0)
  b1(v1: u32):
    v12 = lt v1, u32 31
    jmpif v12 then: b2, else: b3
  b2():
    v13 = array_get v9, index v1 -> u8
    v14 = array_get v4, index v1 -> u8
    constrain v13 == v14
    v16 = unchecked_add v1, u32 1
    jmp b1(v16)
  b3():
    return
}

The second call to f1 has a constant input, so after inlining it gets simplified to a single make_array. Because to_be_radix is a single Brillig opcode, the simplified SSA actually leads to a larger bytecode size as well as a longer execution trace. A similar case was already captured in issue #8321.
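
To see where the `make_array` contents come from, the constant can be decomposed by hand. The sketch below is a stand-in for what a big-endian radix-256 decomposition yields on a constant; it assumes a `u128` in place of a field element, and `to_be_bytes_fixed` is a hypothetical helper rather than a compiler function.

```rust
/// Big-endian base-256 decomposition of `value` into exactly `len` bytes.
/// Uses u128 instead of a field element, so it only covers small constants.
/// Digits that do not fit into `len` bytes are silently dropped in this sketch.
fn to_be_bytes_fixed(mut value: u128, len: usize) -> Vec<u8> {
    let mut out = vec![0u8; len];
    // Fill from the least significant byte, which sits at the end of the array.
    for slot in out.iter_mut().rev() {
        *slot = (value & 0xff) as u8;
        value >>= 8;
    }
    out
}
```

For the constant `2040124` (`0x1F213C`) and a length of 31, this yields 28 zero bytes followed by `31, 33, 60`, matching the array literal in the SSA on this PR.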

Given the other benefits in this PR, I think it is OK to accept this edge case, especially as it only occurs for _inliner_min, an aggressiveness setting developers are unlikely to use in practice (an aggressiveness of zero is the more useful setting).

@vezenovm vezenovm marked this pull request as ready for review May 15, 2025 17:22
@vezenovm vezenovm requested a review from a team May 15, 2025 17:22

@asterite asterite left a comment
Looks good!

@jfecher jfecher left a comment

We should wait to merge this until late Friday since it touches audit code

@TomAFrench TomAFrench left a comment

LGTM, will defer to @michaeljklein or @aakoshh when it comes to hitting merge.

@aakoshh aakoshh left a comment

I have no problem with merging changes into the audited code; I'm just writing a post-audit report at this point. That said, some extra comments and moving the magic number into a named constant would help explain in the code why we went from 1 to 10 in particular.

@vezenovm vezenovm requested a review from aakoshh May 19, 2025 16:26
@aakoshh aakoshh enabled auto-merge May 19, 2025 16:36

@aakoshh aakoshh left a comment

👍

@aakoshh aakoshh added this pull request to the merge queue May 19, 2025
Merged via the queue into master with commit e33b900 May 19, 2025
118 checks passed
@aakoshh aakoshh deleted the mv/mark-more-simple-funcs-to-inline branch May 19, 2025 17:04
github-merge-queue bot pushed a commit to AztecProtocol/aztec-packages that referenced this pull request May 22, 2025
Automated pull of nightly from the
[noir](https://github.com/noir-lang/noir) programming language, a
dependency of Aztec.
BEGIN_COMMIT_OVERRIDE
fix(licm): Account for nested loops being control dependent when
analyzing outer loops (noir-lang/noir#8593)
chore(refactor): Switch unreachable function removal to use centralized
call graph (noir-lang/noir#8578)
chore(test): Allow lambdas in fuzzing
(noir-lang/noir#8584)
chore: use insta for execution_success stdout
(noir-lang/noir#8576)
chore: use generator instead of zero for ec-add predicate
(noir-lang/noir#8552)
fix: use predicate expression as binary result
(noir-lang/noir#8583)
fix(ssa): Do not generate apply functions when no lambda variants exist
(noir-lang/noir#8573)
chore: put `nargo expand` snapshosts in the same directory
(noir-lang/noir#8577)
chore: Use FxHashMap for TypeBindings
(noir-lang/noir#8574)
chore(experimental): use larger stack size for parsing
(noir-lang/noir#8347)
chore: use insta snapshots for compile_failure stderr
(noir-lang/noir#8569)
chore(inlining): Mark functions with <= 10 instructions and no control
flow as inline always (noir-lang/noir#8533)
chore(ssa): Add weighted edges to call graph, move callers and callees
methods to call graph (noir-lang/noir#8513)
fix(frontend): Override to allow empty array input
(noir-lang/noir#8568)
fix: avoid logging all unused params in DIE pass
(noir-lang/noir#8566)
chore: bump external pinned commits
(noir-lang/noir#8562)
chore(deps): bump base-x from 3.0.9 to 3.0.11
(noir-lang/noir#8555)
chore(fuzz): Call function pointers
(noir-lang/noir#8531)
feat: C++ codegen for msgpack
(noir-lang/noir#7716)
feat(performance): brillig array set optimization
(noir-lang/noir#8550)
chore(fuzz): AST fuzzer to use function valued arguments (Part 1)
(noir-lang/noir#8514)
fix(licm): Check whether the loop is executed when hoisting with a
predicate (noir-lang/noir#8546)
feat: Implement $crate (noir-lang/noir#8537)
fix: add offset to ArrayGet
(noir-lang/noir#8536)
chore: remove some unused enum variants and functions
(noir-lang/noir#8538)
fix: disallow `()` in entry points
(noir-lang/noir#8529)
chore: Remove println in ssa interpreter
(noir-lang/noir#8528)
fix: don't overflow when casting signed value to u128
(noir-lang/noir#8526)
chore(performance): Enable hoisting pure with predicate calls
(noir-lang/noir#8522)
feat(fuzz): AST fuzzing with SSA interpreter
(noir-lang/noir#8436)
chore: Add u1 ops to interpreter, convert Value panics to errors
(noir-lang/noir#8469)
chore: Release Noir(1.0.0-beta.6)
(noir-lang/noir#8438)
chore(fuzz): AST generator to add `ctx_limit` to all functions
(noir-lang/noir#8507)
fix(inlining): Use centralized CallGraph structure for inline info
computation (noir-lang/noir#8489)
fix: remove private builtins from `Field` impl
(noir-lang/noir#8496)
feat: primitive types are no longer keywords
(noir-lang/noir#8470)
fix: parenthesized pattern, and correct 1-element tuple printing
(noir-lang/noir#8482)
fix: fix visibility of methods in `std::meta`
(noir-lang/noir#8497)
fix: Change `can_be_main` to be recursive
(noir-lang/noir#8501)
chore: add SSA interpreter test for higher order functions
(noir-lang/noir#8486)
fix(frontend)!: Ban zero sized arrays and strings as program input
(noir-lang/noir#8491)
fix!: remove `to_be_radix` and `to_le_radix` from stdlib interface
(noir-lang/noir#8495)
END_COMMIT_OVERRIDE

---------

Co-authored-by: AztecBot <tech@aztecprotocol.com>
Co-authored-by: Tom French <15848336+TomAFrench@users.noreply.github.com>
AztecBot added a commit to AztecProtocol/aztec-nr that referenced this pull request May 23, 2025
Automated pull of nightly from the
[noir](https://github.com/noir-lang/noir) programming language, a
dependency of Aztec.
BEGIN_COMMIT_OVERRIDE
fix(licm): Account for nested loops being control dependent when
analyzing outer loops (noir-lang/noir#8593)
chore(refactor): Switch unreachable function removal to use centralized
call graph (noir-lang/noir#8578)
chore(test): Allow lambdas in fuzzing
(noir-lang/noir#8584)
chore: use insta for execution_success stdout
(noir-lang/noir#8576)
chore: use generator instead of zero for ec-add predicate
(noir-lang/noir#8552)
fix: use predicate expression as binary result
(noir-lang/noir#8583)
fix(ssa): Do not generate apply functions when no lambda variants exist
(noir-lang/noir#8573)
chore: put `nargo expand` snapshots in the same directory
(noir-lang/noir#8577)
chore: Use FxHashMap for TypeBindings
(noir-lang/noir#8574)
chore(experimental): use larger stack size for parsing
(noir-lang/noir#8347)
chore: use insta snapshots for compile_failure stderr
(noir-lang/noir#8569)
chore(inlining): Mark functions with <= 10 instructions and no control
flow as inline always (noir-lang/noir#8533)
chore(ssa): Add weighted edges to call graph, move callers and callees
methods to call graph (noir-lang/noir#8513)
fix(frontend): Override to allow empty array input
(noir-lang/noir#8568)
fix: avoid logging all unused params in DIE pass
(noir-lang/noir#8566)
chore: bump external pinned commits
(noir-lang/noir#8562)
chore(deps): bump base-x from 3.0.9 to 3.0.11
(noir-lang/noir#8555)
chore(fuzz): Call function pointers
(noir-lang/noir#8531)
feat: C++ codegen for msgpack
(noir-lang/noir#7716)
feat(performance): brillig array set optimization
(noir-lang/noir#8550)
chore(fuzz): AST fuzzer to use function valued arguments (Part 1)
(noir-lang/noir#8514)
fix(licm): Check whether the loop is executed when hoisting with a
predicate (noir-lang/noir#8546)
feat: Implement $crate (noir-lang/noir#8537)
fix: add offset to ArrayGet
(noir-lang/noir#8536)
chore: remove some unused enum variants and functions
(noir-lang/noir#8538)
fix: disallow `()` in entry points
(noir-lang/noir#8529)
chore: Remove println in ssa interpreter
(noir-lang/noir#8528)
fix: don't overflow when casting signed value to u128
(noir-lang/noir#8526)
chore(performance): Enable hoisting pure with predicate calls
(noir-lang/noir#8522)
feat(fuzz): AST fuzzing with SSA interpreter
(noir-lang/noir#8436)
chore: Add u1 ops to interpreter, convert Value panics to errors
(noir-lang/noir#8469)
chore: Release Noir(1.0.0-beta.6)
(noir-lang/noir#8438)
chore(fuzz): AST generator to add `ctx_limit` to all functions
(noir-lang/noir#8507)
fix(inlining): Use centralized CallGraph structure for inline info
computation (noir-lang/noir#8489)
fix: remove private builtins from `Field` impl
(noir-lang/noir#8496)
feat: primitive types are no longer keywords
(noir-lang/noir#8470)
fix: parenthesized pattern, and correct 1-element tuple printing
(noir-lang/noir#8482)
fix: fix visibility of methods in `std::meta`
(noir-lang/noir#8497)
fix: Change `can_be_main` to be recursive
(noir-lang/noir#8501)
chore: add SSA interpreter test for higher order functions
(noir-lang/noir#8486)
fix(frontend)!: Ban zero sized arrays and strings as program input
(noir-lang/noir#8491)
fix!: remove `to_be_radix` and `to_le_radix` from stdlib interface
(noir-lang/noir#8495)
END_COMMIT_OVERRIDE

---------

Co-authored-by: AztecBot <tech@aztecprotocol.com>
Co-authored-by: Tom French <15848336+TomAFrench@users.noreply.github.com>
Thunkar pushed a commit to AztecProtocol/aztec-packages that referenced this pull request May 23, 2025

Labels

bench-show Display benchmark results on PR

5 participants