Skip to content

fix: Add a remove_unreachable_instructions SSA pass (zeroing out terminators)#8979

Closed
asterite wants to merge 46 commits intomasterfrom
ab/ssa-remove-unreachable-instructions
Closed

fix: Add a remove_unreachable_instructions SSA pass (zeroing out terminators)#8979
asterite wants to merge 46 commits intomasterfrom
ab/ssa-remove-unreachable-instructions

Conversation

@asterite
Copy link
Collaborator

@asterite asterite commented Jun 19, 2025

Description

Problem

Resolves #8774
Resolves #8994

Summary

Alternative to #8943, or maybe something to do in addition to that (unsure yet).

I think the main issue in #8774 is that it ends up generating a load instruction for a value that can never happen, because a previous constrain is always false, and it's trying to get a value from an empty array. The solution here is to remove any instruction that comes after an "always fails" constrain in a block, also replacing that block's terminator zeroed values.

Additional Context

Documentation

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@github-actions
Copy link
Contributor

github-actions bot commented Jun 19, 2025

Changes to Brillig bytecode sizes

Generated at commit: aaba70c381f94304973d3ee46d4cbe6964c7df2e, compared to commit: 5fbc2a4b975f0f8141f9385a4d65884944092056

🧾 Summary (10% most significant diffs)

Program Brillig opcodes (+/-) %
conditional_1_inliner_min -34 ✅ -6.58%
conditional_1_inliner_zero -34 ✅ -6.58%

Full diff report 👇
Program Brillig opcodes (+/-) %
conditional_1_inliner_max 523 (-34) -6.10%
conditional_1_inliner_min 483 (-34) -6.58%
conditional_1_inliner_zero 483 (-34) -6.58%

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Execution Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: c4407d0 Previous: 2cfc786 Ratio
rollup-merge 0.004 s 0.003 s 1.33

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@asterite
Copy link
Collaborator Author

This is trickier than what I initially thought.

It's easy to remove all instructions after we find a constrain that'll always fail. It's also easy to zero that block's terminator.

But then we need to consider that block's successors as unreachable too, but if they can only be reached from that block. And this is the part that's hard to implement.

@asterite asterite added the bench-show Display benchmark results on PR label Jun 19, 2025
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACVM Benchmarks

Details
Benchmark suite Current: 2c957c6 Previous: 5fbc2a4 Ratio
purely_sequential_opcodes 251838 ns/iter (± 440) 253077 ns/iter (± 261) 1.00
perfectly_parallel_opcodes 222385 ns/iter (± 3434) 222890 ns/iter (± 10024) 1.00
perfectly_parallel_batch_inversion_opcodes 2794248 ns/iter (± 3270) 2777457 ns/iter (± 6234) 1.01

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compilation Time

Details
Benchmark suite Current: 2c957c6 Previous: 5fbc2a4 Ratio
private-kernel-inner 2.424 s 2.354 s 1.03
private-kernel-reset 7.852 s 7.446 s 1.05
private-kernel-tail 1.088 s 1.086 s 1.00
rollup-base-private 16.04 s 15.56 s 1.03
rollup-base-public 14.3 s 12.98 s 1.10
rollup-block-root-empty 18.92 s 18.56 s 1.02
rollup-block-root-single-tx 184 s 187 s 0.98
rollup-block-root 188 s 185 s 1.02
rollup-merge 1.262 s 1.368 s 0.92
rollup-root 1.274 s 1.348 s 0.95
semaphore-depth-10 0.763 s 0.765 s 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Artifact Size

Details
Benchmark suite Current: 2c957c6 Previous: 5fbc2a4 Ratio
private-kernel-inner 1135.2 KB 1135.2 KB 1
private-kernel-reset 2072.4 KB 2072.4 KB 1
private-kernel-tail 589.1 KB 589.1 KB 1
rollup-base-private 4933.9 KB 4933.9 KB 1
rollup-base-public 3974.5 KB 3974.5 KB 1
rollup-block-root-empty 3876.9 KB 3876.9 KB 1
rollup-block-root-single-tx 32734 KB 32734 KB 1
rollup-block-root 32763.9 KB 32763.9 KB 1
rollup-merge 185.8 KB 185.8 KB 1
rollup-root 410.2 KB 410.2 KB 1
semaphore-depth-10 636.4 KB 636.4 KB 1

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execution Time

Details
Benchmark suite Current: 2c957c6 Previous: 5fbc2a4 Ratio
private-kernel-inner 0.029 s 0.027 s 1.07
private-kernel-reset 0.163 s 0.162 s 1.01
private-kernel-tail 0.011 s 0.011 s 1
rollup-base-private 0.288 s 0.293 s 0.98
rollup-base-public 0.192 s 0.185 s 1.04
rollup-block-root 13.3 s 13.4 s 0.99
rollup-merge 0.004 s 0.004 s 1
rollup-root 0.005 s 0.005 s 1
semaphore-depth-10 0.022 s 0.022 s 1

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test Suite Duration

Details
Benchmark suite Current: 2c957c6 Previous: 3838c69 Ratio
test_report_AztecProtocol_aztec-packages_noir-projects_aztec-nr 86 s 75 s 1.15
test_report_AztecProtocol_aztec-packages_noir-projects_noir-contracts 127 s 127 s 1
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_blob 147 s 141 s 1.04
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_private-kernel-lib 212 s 204 s 1.04
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_rollup-lib 564 s
test_report_AztecProtocol_aztec-packages_noir-projects_noir-protocol-circuits_crates_types 74 s 73 s 1.01
test_report_noir-lang_noir-bignum_ 388 s
test_report_noir-lang_noir_bigcurve_ 243 s
test_report_noir-lang_sha512_ 15 s 15 s 1
test_report_zkpassport_noir-ecdsa_ 106 s 115 s 0.92
test_report_zkpassport_noir_rsa_ 3 s 2 s 1.50

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compilation Memory

Details
Benchmark suite Current: 2c957c6 Previous: 5fbc2a4 Ratio
private-kernel-inner 293.27 MB 293.23 MB 1.00
private-kernel-reset 533.88 MB 534.34 MB 1.00
private-kernel-tail 191.87 MB 191.86 MB 1.00
rollup-base-private 1400 MB 1400 MB 1
rollup-base-public 1530 MB 1530 MB 1
rollup-block-root-empty 1090 MB 1090 MB 1
rollup-block-root-single-tx 9410 MB 9410 MB 1
rollup-block-root 9420 MB 9420 MB 1
rollup-merge 344.77 MB 344.77 MB 1
rollup-root 354.76 MB 354.76 MB 1
semaphore_depth_10 106.38 MB 106.44 MB 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execution Memory

Details
Benchmark suite Current: 2c957c6 Previous: 5fbc2a4 Ratio
private-kernel-inner 202.35 MB 202.35 MB 1
private-kernel-reset 226.1 MB 226.1 MB 1
private-kernel-tail 177.64 MB 177.64 MB 1
rollup-base-private 507.42 MB 507.42 MB 1
rollup-base-public 440.61 MB 440.61 MB 1
rollup-block-root 1510 MB 1510 MB 1
rollup-merge 329.7 MB 329.7 MB 1
rollup-root 331.8 MB 331.8 MB 1
semaphore_depth_10 71.01 MB 71.01 MB 1

This comment was automatically generated by workflow using github-action-benchmark.

@asterite asterite requested a review from a team June 19, 2025 21:22
@asterite
Copy link
Collaborator Author

This code seems to crash the compiler now:

fn main() {
    unsafe { func_2(true) };
}

unconstrained fn func_2(cond: bool) {
    loop {
        if cond {
            break
        }

        assert(false);

        loop {
            if cond {
                break
            }
        }
    }
}

We end up with this SSA after purity analysis, then it crashes:

acir(inline) predicate_pure fn main f0 {
  b0():
    call f1(u1 1)
    return
}
brillig(inline) predicate_pure fn func_2 f1 {
  b0(v0: u1):
    jmpif v0 then: b1, else: b2
  b1():
    return
  b2():
    constrain u1 0 == u1 1
    jmp b3()
  b3():
    jmp b3()
}

The error is:

The application panicked (crashed).
Message:  internal error: entered unreachable code: Post order for unreachable block is undefined
Location: compiler/noirc_evaluator/src/ssa/ir/dom.rs:79

Not sure why it errors... maybe because it keeps looping forever in b3?

I could try to fix it, but... I really feel like having an unreachable terminator is the more correct way to do this. I understand that it doesn't work if we run that pass first in the pipeline, but it doesn't work either without unreachable. Then, my idea with this pass was to run it right before acir/brillig so that those don't have to deal with undefined values (I think in LLVM these are poison values, not sure!). It leads to cleaner SSA and I think the only thing missing would be to handle these during inlining, but it could be done later on, if really needed.

Also, with unreachable we can determine that a function never returns, if it doesn't have any return. This could happen if a branch with a return has an instruction that always fails, then its terminator is replaced with unreachable. Then when inlining that function into other functions we can know that calls to it are unreachable and keep pruning code. If we use zeroed values we can't do that.

It also aligns with at least LLVM, which has an unreachable terminator: https://llvm.org/docs/LangRef.html#unreachable-instruction

So I feel like reverting back to 6e4e138

@aakoshh
Copy link
Contributor

aakoshh commented Jun 24, 2025

I'll have a look.

@aakoshh
Copy link
Contributor

aakoshh commented Jun 24, 2025

Right, because of the zeroed values, what starts out as this:

  b5(): 
    jmpif v0 then: b7, else: b8 // The 2nd  `if cond { break }`

is replaced with this:

  b5():
    jmpif u1 0 then: b7, else: b8

and a subsequent step optimises out the break and now we have a tight loop, which during LICM breaks the dominator tree.

I'll try if my earlier suggestion of using an undefined reference could prevent these optimisations, but this crash can be induced without your pass, so I'd consider it a standalone bug as well that needs to be investigated. I'll create a ticket about that.

@aakoshh
Copy link
Contributor

aakoshh commented Jun 24, 2025

I hear what you are saying about focusing on fixing the ACIR generation, and not leaving an SSA that defies expectations. Thanks for pointing out that LLVM has unreachable as well.

I do wonder though: if we just want to make sure ACIR generation doesn't go into undefined territory, whether the detection of always-failing constraints should be placed there, instead of preparing an SSA that is difficult for other passes to process. That is, to quit producing ACIR for a block after the constraint has been encountered.

@aakoshh aakoshh force-pushed the ab/ssa-remove-unreachable-instructions branch from ba520ed to 8abe8f8 Compare June 24, 2025 10:38
@aakoshh aakoshh force-pushed the ab/ssa-remove-unreachable-instructions branch from 6037ffa to 2e13bd6 Compare June 24, 2025 11:00
@aakoshh
Copy link
Contributor

aakoshh commented Jun 24, 2025

Your example program that crashed above works now:

After Remove Unreachable Instructions (1) (step 1):
acir(inline) fn main f0 {
  b0():
    call f1(u1 1)
    return
}
brillig(inline) fn func_2 f1 {
  b0(v0: u1):
    jmp b1()
  b1():
    jmpif v0 then: b3, else: b4
  b2():
    return
  b3():
    jmp b2()
  b4():
    constrain u1 0 == u1 1
    jmp b5()
  b5():
    v3 = allocate -> &mut u1
    v4 = load v3 -> u1
    jmpif v4 then: b7, else: b8
  b6():
    jmp b1()
  b7():
    jmp b6()
  b8():
    jmp b5()
}
...
After Dead Instruction Elimination - ACIR (1) (step 48):
acir(inline) predicate_pure fn main f0 {
  b0():
    return
}

@asterite
Copy link
Collaborator Author

That is, to quit producing ACIR for a block after the constraint has been encountered.

We could do that. But then if in Brillig we also have a case of not wanting some instructions past an always failing instruction we'd have to do the same logic. In my mind SSA should be prepared so that ACIR and brillig do the minimal thing.

I've been also thinking that with unreachable the logic of this pass is simpler than what I wrote: when we change a block's terminator to unreachable then automatically all dominated blocks become unreachable... because they won't be reached anymore! So we don't need to do this with the dom.

I think I'll open a separate PR with the unreachable logic, as I mentioned before, but I'll try again to make it work with inlining. It shouldn't be that hard: if we try to inline a call that has no returns then it's unreachable. Then we should stop inserting instructions past that call, change the terminator to unreachable... but then we'd need to avoid visiting dominated blocks, and I think here we'd need to use the Dom type. I'll check :-)

@aakoshh
Copy link
Contributor

aakoshh commented Jun 24, 2025

I've been also thinking that with unreachable the logic of this pass is simpler than what I wrote: when we change a block's terminator to unreachable then automatically all dominated blocks become unreachable... because they won't be reached anymore!

Having seen both, I don't know this pass will be that much simpler; it wasn't easy to spot the differences, as you remember my questions. But your fantastic test coverage provides examples that cover the whole range 👍

The flip side of any simplification here is potentially making other passes more complex, because we have to think about a loop that starts but never finishes because one of its blocks is has an unreachable terminator.

Definitely do explore this option! I just wanted to avoid yolo'ing into it without updating the other passes to handle it.

@aakoshh aakoshh self-requested a review June 24, 2025 11:42
@aakoshh aakoshh changed the title fix: add a remove_unreachable_instructions SSA pass fix: Add a remove_unreachable_instructions SSA pass Jun 24, 2025
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Execution Memory'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: d2cf2b7 Previous: 5fbc2a4 Ratio
rollup-base-private 1400 MB 507.42 MB 2.76

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@asterite asterite changed the title fix: Add a remove_unreachable_instructions SSA pass fix: Add a remove_unreachable_instructions SSA pass (zeroing out terminators) Jun 24, 2025
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Execution Memory'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: e141268 Previous: 5fbc2a4 Ratio
rollup-base-private 1400 MB 507.42 MB 2.76

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@asterite
Copy link
Collaborator Author

Definitely do explore this option! I just wanted to avoid yolo'ing into it without updating the other passes to handle it.

And you are still not in favor of introducing the unreachable terminator in a pass near the end, avoiding the need to handle it during inlining, and having that be done in a later PR?

@aakoshh
Copy link
Contributor

aakoshh commented Jun 24, 2025

Well, having worked on this for a while now, I came around to your position that having an explicit unreachable would be better than what I'm doing now, which is to try to construct an SSA that doesn't break anything, because it seems very tricky to do so.

I'd ask around whether everyone is on board with having it left panicky in earlier passes.

@aakoshh
Copy link
Contributor

aakoshh commented Jun 24, 2025

I found the instance where unrolling doesn't work.

Say we start with this:

  b13():
    constrain u1 0 == u1 1, "OSA"
    jmp b15(u32 1338103543)
  b15(v34: u32):
    v68 = lt v34, u32 1338103546
    jmpif v68 then: b16, else: b17

after this pass, it becomes:

  b13():
    constrain u1 0 == u1 1, "OSA"
    jmp b15(u32 1338103543)

  b15(v34: u32):
    v67 = allocate -> &mut u1
    v68 = load v67 -> u1
    jmpif v68 then: b16, else: b17

But unrolling expects to find an lt and look at its arguments to find the upper bound.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Test Suite Duration'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: 2c957c6 Previous: 3838c69 Ratio
test_report_zkpassport_noir_rsa_ 3 s 2 s 1.50

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@asterite
Copy link
Collaborator Author

Closed in favor of #9008

@asterite asterite closed this Jun 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bench-show Display benchmark results on PR

Projects

None yet

2 participants