chore(ssa): Assert that array offset are not set#9948

Closed
aakoshh wants to merge 7 commits into master from af/array-idx-offset
Conversation

@aakoshh
Contributor

@aakoshh aakoshh commented Sep 22, 2025

Description

Problem*

Part of auditing brillig_array_get_and_set.

Summary*

Add assertions that ArrayGet::offset and ArraySet::offset are None in some operations where they weren't handled.

For example Instruction::has_side_effects uses DataFlowGraph::is_safe_index to test that a constant index is less than the length of the array, however it did not take into account that the offset could have made the index equal to the logical size.

Another example is optimize_length_one_array_read: it was called after we established that the index is not a constant, so the offset should be None even if the pass was executed already, and yet it received an offset as a parameter and used it with a constant zero index, which wouldn't make sense for anything but None.

We could fix this by accounting for the offset and subtracting its value. This PR instead just panics, as this pass should not be followed up by anything that accesses these functions at the moment.
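
As a rough illustration of the two options above, here is a minimal sketch using hypothetical stand-in types, not the compiler's real `DataFlowGraph` API. It assumes an array offset shifts the stored index by one, as in the 10 -> 11 example discussed later in this thread. All function names are made up for the example.

```rust
/// Hypothetical stand-in for the offset recorded on an array instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ArrayOffset {
    None,
    Array,
}

impl ArrayOffset {
    /// Assumed shift applied to the stored index (1 for arrays).
    fn to_u32(self) -> u32 {
        match self {
            ArrayOffset::None => 0,
            ArrayOffset::Array => 1,
        }
    }
}

/// Naive check: compares the stored index with the logical length and
/// ignores any offset that was already folded into the index.
fn is_safe_index_naive(stored_index: u32, len: u32) -> bool {
    stored_index < len
}

/// Alternative fix: subtract the offset to recover the logical index first.
fn is_safe_index_with_offset(stored_index: u32, len: u32, offset: ArrayOffset) -> bool {
    match stored_index.checked_sub(offset.to_u32()) {
        Some(logical_index) => logical_index < len,
        // A stored index smaller than the offset cannot be a valid element.
        None => false,
    }
}

/// What this PR does instead: panic if an offset is present at all.
fn is_safe_index_asserting(stored_index: u32, len: u32, offset: ArrayOffset) -> bool {
    assert_eq!(offset, ArrayOffset::None, "offset should not be set here");
    stored_index < len
}
```

With a length-3 array, the last element has logical index 2 and stored index 3 once the offset is applied, so the naive check reports it as unsafe even though the access is in bounds.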

Removes offset handling from remove_unreachable_instructions: since it appeared under runtime.is_acir(), it should never be set anyway.

Additional Context

I am not exactly sure why this pass exists, if I'm honest. I understand what it's doing, thanks to @asterite having added docs during the green lighting. However, during Brillig codegen we check if the instruction has an offset, and if the index is dynamic then we have to generate offsets anyway. So why don't we insert the necessary constants into the DFG during codegen (not Brillig opcodes to calculate them, just a new constant based on the Array/Slice type)? That would avoid having the offset: _ pattern throughout the codebase, and having to reason about whether using index without considering offset is correct or not.

The fact that it can manipulate indexes also sounds like a footgun based on the experience of setting the index to 0 in #9888 causing type inconsistencies.

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@aakoshh aakoshh requested a review from a team September 22, 2025 15:09
@TomAFrench
Member

Looks like this is for #9218

Contributor

@github-actions github-actions bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Execution Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite | Current: 5c47130 | Previous: f355e1d | Ratio
rollup-checkpoint-merge | 0.004 s | 0.003 s | 1.33

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@asterite
Collaborator

however during Brillig codegen we check if the instruction has offset and if the index is dynamic then we have to generate offsets anyway, so why don't we insert the necessary constants into the DFG during codegen (not Brillig opcodes to calculate them, just a new constant based on the Array/Slice type), and then avoid having to have the offset: _ pattern throughout the codebase, and having to reason about whether using index without considering offset is correct or not?

I think this could be done, though I didn't check where constants are initialized in brillig (or, well, it doesn't seem there's a single place where this happens).

I think the reason is that if you have this:

v0 = array_get v1, index u32 10 -> Field

brillig will generate a CONST op for 10, then refer to that memory address for the array get.

That u32 10 comes from a ValueId. Then a numeric constant is transformed into a const here:

Value::NumericConstant { constant, .. } => {

I guess what you are proposing is, when in Brillig we need to generate code for the above array_get, if we see a constant index and, say, it's an array and not a slice, transform that 10 into 11. But then it's not guaranteed that that 11 has a ValueId so we'd need to either generate a new ValueId (impossible at this point) or just do Brillig math with opcodes to reach that final value. Or, alternatively, which I guess is also what you propose here, is to do a pre-pass in Brillig to already have that 11 as a CONST somewhere, or as a ValueId, etc. But I think this is tricky to do, though it might be very well worth it to simplify things.
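
The 10 -> 11 rewrite can be sketched as below. This is a simplified model with hypothetical types (`Index`, `ArrayKind`, `lower_index`), not the real pass. The array shift of 1 comes from the example above; the slice shift of 3 is an assumption about the Brillig vector layout and should be checked against the compiler source.

```rust
/// Hypothetical stand-in for the offset recorded on an array instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ArrayOffset {
    None,
    Array,
    Slice,
}

impl ArrayOffset {
    fn to_u32(self) -> u32 {
        match self {
            ArrayOffset::None => 0,
            // Matches the 10 -> 11 example for arrays.
            ArrayOffset::Array => 1,
            // Assumption: slices carry more metadata ahead of their items.
            ArrayOffset::Slice => 3,
        }
    }
}

/// Hypothetical view of an index operand: a known constant or a runtime value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Index {
    Constant(u32),
    Dynamic,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ArrayKind {
    Array,
    Slice,
}

/// Pre-computes the shifted index for constant indices and records which
/// offset was applied. Dynamic indices are left alone, with no offset.
fn lower_index(index: Index, kind: ArrayKind) -> (Index, ArrayOffset) {
    match index {
        Index::Constant(value) => {
            let offset = match kind {
                ArrayKind::Array => ArrayOffset::Array,
                ArrayKind::Slice => ArrayOffset::Slice,
            };
            (Index::Constant(value + offset.to_u32()), offset)
        }
        Index::Dynamic => (Index::Dynamic, ArrayOffset::None),
    }
}
```

For instance, `lower_index(Index::Constant(10), ArrayKind::Array)` yields `(Index::Constant(11), ArrayOffset::Array)`, while a dynamic index comes back unchanged with `ArrayOffset::None`.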

(I'm not entirely sure what I said above is correct so someone please correct me)

@vezenovm
Contributor

if the index is dynamic then we have to generate offsets anyway

In theory we could. I originally did the optimization this way, but I cannot remember why I did not keep it that way.

But then it's not guaranteed that that 11 has a ValueId so we'd need to either generate a new ValueId (impossible at this point) or just do Brillig math with opcodes to reach that final value

Yes, we cannot simply have the constants as part of the DFG and use them in Brillig. They still need to be allocated to a register using the CONST opcode. The block in which they are allocated can have a significant effect on the bytecode executed (e.g., in a loop vs. in the pre-header).
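
A toy model of why the allocation block matters. The `Op` and `Loop` types and both constructors are hypothetical and only count executed opcodes. They are not Brillig's real opcode set or its constant allocation analysis.

```rust
/// Hypothetical, heavily simplified opcodes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Op {
    Const,
    ArrayGet,
}

/// A single loop: the pre-header runs once, the body runs once per iteration.
struct Loop {
    pre_header: Vec<Op>,
    body: Vec<Op>,
}

/// Counts how many opcodes are executed for a given trip count.
fn executed_ops(looped: &Loop, iterations: usize) -> usize {
    looped.pre_header.len() + looped.body.len() * iterations
}

/// The index constant is materialised inside the loop body.
fn const_in_body() -> Loop {
    Loop { pre_header: vec![], body: vec![Op::Const, Op::ArrayGet] }
}

/// The index constant is hoisted into the pre-header.
fn const_in_pre_header() -> Loop {
    Loop { pre_header: vec![Op::Const], body: vec![Op::ArrayGet] }
}
```

Under this model, 100 iterations execute 200 opcodes when the CONST sits in the body and 101 when it is hoisted.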

Or, alternatively, which I guess is also what you propose here, is to do a pre-pass in Brillig to already have that 11 as a CONST somewhere, or as a ValueId, etc. But I think this is tricky to do, though it might be very well worth it to simplify things.

So we do already have a pre-pass on Brillig where we declare the locations to allocate constants. We can see some of the regressions from when we attempted to move this logic back to Brillig gen in #8532. We could just move this pass to Ssa::to_brillig as a way to guarantee it is only a pre-Brillig pass and separate from our normal SSA pipeline. This was always meant to be the last pass to precede Brillig's constant allocation pass. I would rather keep the pass and move where it is run. It will have the same effect as the pre-pass @asterite mentioned.

@vezenovm
Contributor

I agree that the idea of the "offset" is confusing. But I would say this is largely due to ACIR/Brillig sharing instructions to perform array operations while their array semantics differ. In ACIR arrays are truly arrays representing a block of memory. In Brillig they are basically pointers. Thus, those pointers have their own layout and they must be appropriately offset. All this pass does is essentially pre-compute any offsets it can. So yes, this is unique to Brillig, but it also needs to occur before constant allocation. I would not want to clutter constant allocation with this logic as it is better to maintain separation of concerns.

@aakoshh
Contributor Author

aakoshh commented Sep 22, 2025

Or, alternatively, which I guess is also what you propose here, is to do a pre-pass in Brillig to already have that 11 as a CONST somewhere, or as a ValueId, etc. But I think this is tricky to do, though it might be very well worth it to simplify things.

I wonder if we could do a bit of a sleight of hand:

  1. In brillig_array_get_set we visit all the ArrayGet and ArraySet, we call compute_index_and_offset to lay the constant in the &mut DataFlowGraph while we can mutate it with make_constant, but do not update the index in the instruction. We just made some groundwork.
  2. In BrilligBlock::convert_ssa_instruction we recalculate the offset and call a new DataFlowGraph::get_constant method which looks up the previously prepared constants. It's an ICE if they are not there.
  3. Since these constants don't appear in the SSA, we call brillig_array_get_set as part of Brillig gen at an appropriate place.

This way the internal memory layout of Brillig remains its own concern, and we don't need the offset field any more. Since we are using make_constant, as long as they are not forgotten, they get deduplicated with any other use of the same value ID.
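
The three steps could look roughly like this. It is a sketch only: `Dfg`, `prepare_offset_constants` and `lookup_shifted_index` are hypothetical, `get_constant` is the new method proposed above and does not exist yet, and the real `make_constant` takes different arguments from the plain `u32` used here.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ValueId(usize);

/// Hypothetical miniature of a data flow graph that interns constants.
#[derive(Default)]
struct Dfg {
    constants: Vec<u32>,
    by_value: HashMap<u32, ValueId>,
}

impl Dfg {
    /// Interns a constant, returning the existing id if it is already known.
    fn make_constant(&mut self, value: u32) -> ValueId {
        if let Some(id) = self.by_value.get(&value) {
            return *id;
        }
        let id = ValueId(self.constants.len());
        self.constants.push(value);
        self.by_value.insert(value, id);
        id
    }

    /// Proposed read-only lookup for a constant that was prepared earlier.
    fn get_constant(&self, value: u32) -> Option<ValueId> {
        self.by_value.get(&value).copied()
    }
}

/// Step 1: while the graph is still mutable, lay down the shifted constants
/// without rewriting the index stored in the instructions.
fn prepare_offset_constants(dfg: &mut Dfg, constant_indices: &[u32], offset: u32) {
    for index in constant_indices {
        dfg.make_constant(index + offset);
    }
}

/// Step 2: during codegen, recompute the shifted value and look it up.
/// A missing constant is treated as an internal compiler error.
fn lookup_shifted_index(dfg: &Dfg, index: u32, offset: u32) -> ValueId {
    dfg.get_constant(index + offset)
        .expect("ICE: shifted index constant was not prepared")
}
```

Because the lookup goes through the same interning map, a shifted index that happens to equal an existing constant reuses that constant's id.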

Contributor

@github-actions github-actions bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Test Suite Duration'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite | Current: 5c47130 | Previous: f355e1d | Ratio
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_blob | 321 s | 253 s | 1.27
test_report_zkpassport_noir-ecdsa_ | 3 s | 2 s | 1.50

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@aakoshh
Contributor Author

aakoshh commented Sep 22, 2025

Although I do realise this doesn't mesh well with functions like variables_used_in_instruction, so maybe modifying the index is the best course of action, but perhaps if this pass was "internal" to the Brillig generation, then at least we wouldn't need to keep the offset on the instructions.

@vezenovm
Contributor

  • In BrilligBlock::convert_ssa_instruction we recalculate the offset and call a new DataFlowGraph::get_constant method which looks up the previously prepared constants. It's an ICE if they are not there.
  • Since these constants don't appear in the SSA, we call brillig_array_get_set as part of Brillig gen at an appropriate place.

This feels like added complexity for little gain imo (all we get is the removal of the offset field). A small nit-pick is also that if we are already inside of BrilligBlock we have skipped constant allocation and will see performance regressions (this will mostly affect array accesses in loops). Doing something like what you propose would require modifying the constant allocation analysis.

This way the internal memory layout of Brillig remains its own concern, and we don't need the offset field any more. Since we are using make_constant, as long as they are not forgotten, they get deduplicated with any other use of the same value ID.

The key thing is that they will only be deduplicated in the DFG. Not in Brillig memory. As you said Brillig is responsible for its own memory allocation and must determine how to allocate registers for constants as this is not a concept in our SSA. I contend it would be cleaner to declare constants while still in SSA form and lower them with a single strategy in Brillig gen. I agree though that this logic could be more clearly made Brillig only.

Although I do realise this doesn't mesh well with functions like variables_used_in_instruction, so maybe modifying the index is the best course of action, but perhaps if this pass was "internal" to the Brillig generation, then at least we wouldn't need to keep the offset on the instructions.

I think it may be cleanest to still just keep the offset while the two runtimes still share array instructions. We can add a restriction on SSA that ACIR array instructions must always have ArrayOffset::None. Another option would be to introduce a BrilligArrayGet and BrilligArraySet which have an offset and are only introduced after this pass. I don't really like this option though. I think the best middle ground would be moving this to be an internal Brillig SSA pass. In general, separate compilation pipelines for our runtime semantics that differ significantly would be ideal, so I think doing this would be a move in that direction.
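
The suggested SSA restriction could be checked along these lines. This is a sketch with hypothetical, simplified `RuntimeType` and `Instruction` enums and a made-up `validate_offsets` function, not the compiler's actual validation code.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RuntimeType {
    Acir,
    Brillig,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ArrayOffset {
    None,
    Array,
    Slice,
}

/// Hypothetical, stripped-down instruction set: only the offset field matters.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Instruction {
    ArrayGet { offset: ArrayOffset },
    ArraySet { offset: ArrayOffset },
    Other,
}

/// Rejects any ACIR array instruction whose offset is not `ArrayOffset::None`.
/// Brillig functions are not checked, since offsets are expected there.
fn validate_offsets(runtime: RuntimeType, instructions: &[Instruction]) -> Result<(), String> {
    if runtime != RuntimeType::Acir {
        return Ok(());
    }
    for (position, instruction) in instructions.iter().enumerate() {
        let offset = match instruction {
            Instruction::ArrayGet { offset } | Instruction::ArraySet { offset } => *offset,
            Instruction::Other => continue,
        };
        if offset != ArrayOffset::None {
            return Err(format!("ACIR instruction {position} has offset {offset:?}"));
        }
    }
    Ok(())
}
```

Returning a `Result` rather than panicking is a choice made for the example; an internal validation pass might equally assert.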

@aakoshh
Contributor Author

aakoshh commented Sep 22, 2025

Thanks, yeah, I see that it's all about the DFG. I'll open a PR soon that removes the offset from the instructions, while keeping this logic in the pass, and does its best to even present it to humans.

Here: #9956

@aakoshh
Contributor Author

aakoshh commented Sep 23, 2025

Closing in favour of #9956

@aakoshh aakoshh closed this Sep 23, 2025
4 participants