Skip to content

fix: handle unconditional break in SSA gen in all cases#8849

Closed
asterite wants to merge 2 commits intomasterfrom
ab/unconditional-break-take-2
Closed

fix: handle unconditional break in SSA gen in all cases#8849
asterite wants to merge 2 commits intomasterfrom
ab/unconditional-break-take-2

Conversation

@asterite
Copy link
Collaborator

@asterite asterite commented Jun 9, 2025

Description

Problem

Resolves #8836

Summary

In the end I ended up implementing this in the way that I initially wanted to avoid, mainly because it involved a large change: changing codegen_expression to return an optional value instead of a value. No value is returned when a break or continue were found.

Because a code like { break; 1 } can happen in any expression, there's a change that any expression doesn't return a value. In the compiler that translates to codegen_expression returning an optional value too. Then we are forced to handle this missing value everywhere. Before this it was only handled in a few places but there was no way the compiler would force this handling like that.

Additional Context

Let me know if there's a Rust pattern to avoid writing so many let Some(...) = ... else { return Ok(None); }. The problem here is that it's an Option nested in a Result.

Documentation

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@asterite asterite requested a review from a team June 9, 2025 18:26
@vezenovm
Copy link
Contributor

vezenovm commented Jun 9, 2025

No value is returned when a break or continue were found.

Could we instead introduce a never/! type and return a value with a never of ! type like in Rust from a break? I noticed you mentioned it in the initial PR that fixes some unconditional break cases #8712. ! can then be coerced into any type.

@asterite
Copy link
Collaborator Author

asterite commented Jun 9, 2025

No value is returned when a break or continue were found.

Oh, I mean, None is returned when break or continue where found during codegen. So Ok(None) is returned in that case.

It would still be nice to eventually introduce !, but I don't think it would affect the codegen code.

@vezenovm
Copy link
Contributor

vezenovm commented Jun 9, 2025

It would still be nice to eventually introduce !, but I don't think it would affect the codegen code.

Ah you're right I think it would not affect that code gen still would not return any value. Perhaps we can avoid the Option threading though? In the other PR we implemented an open/closed block state, which I would prefer to use here as well. Why was using this state insufficient in this case? I'm assuming it was insufficient because we are returning a unit value which we call eval upon. Could we introduce a dummy value for this case?

Also, instead of Values being just a type alias for Tree<Value> we could expand it to a more general code gen state that tracks control flow divergence explicitly (e.g. Values becomes an enum). I find this clearer than current_block_closed as well we always know whether we are performing code gen and it is not the responsibility of the caller to fetch current_block_is_closed to determine whether to code gen. This can be done in later work though just something I noticed when looking at how we tracking the open/close block state.

@asterite
Copy link
Collaborator Author

asterite commented Jun 9, 2025

I'll try the enum approach.

In the other PR we implemented an open/closed block state, which I would prefer to use here as well. Why was using this state insufficient in this case?

The problem with that approach is that we had to explicitly check whether a block was closed before continuing generating code in some cases. For example the codegen for array generates the code for each expression, then calls insert_array. That function will insert the instruction and call .first() on the returned value, because it assumes it will return a single value. However, because the block is closed the instruction is removed, and .first() panics. Then we need to check if the block is closed before calling insert_array. So we need to remember to check that condition in many cases. With codegen_expression returning Option now, there's no way to call insert_array if we get None from that.

But I'll try the enum approach to see if it makes the code a bit better.

@vezenovm
Copy link
Contributor

vezenovm commented Jun 9, 2025

Then we need to check if the block is closed before calling insert_array. So we need to remember to check that condition in many cases. With codegen_expression returning Option now, there's no way to call insert_array if we get None from that.

Makes sense. Having to check the open/closed state in many places was part of the motivation of returning the control flow state explicitly. I think in this case it would be better to move an enum that tracks the control flow state and whether we have a divergence.

@vezenovm
Copy link
Contributor

vezenovm commented Jun 9, 2025

I also got the issue snippet compiling and executing with this change:

    fn codegen_non_tuple_expression(&mut self, expr: &Expression) -> Result<ValueId, RuntimeError> {
        let expression = self.codegen_expression(expr)?;
        if self.builder.current_block_is_closed() {
            let zero = self.builder.numeric_constant(0_u128, NumericType::NativeField);
            return Ok(zero);
        }
        Ok(expression.into_leaf().eval(self))
    }

@jfecher
Copy link
Contributor

jfecher commented Jun 9, 2025

Because a code like { break; 1 } can happen in any expression, there's a change that any expression doesn't return a value

Maybe we should change it to not return an Option then. I don't know about the enum approach - I think that will end up being more code as well. I haven't gotten to read the PR yet but why do we return Ok(None) instead of an Err? This should be invalid code that we want to error on so it seems simpler to fail faster.

@vezenovm
Copy link
Contributor

vezenovm commented Jun 9, 2025

I think that will end up being more code as well. I haven't gotten to read the PR yet but why do we return Ok(None) instead of an Err? This should be invalid code that we want to error on so it seems simpler to fail faster.

We could just error, however, in Rust the code in the issue snippet does compile and execute (although it does nothing aside break out of a loop).

@asterite
Copy link
Collaborator Author

asterite commented Jun 9, 2025

Right, in Rust it's a bit strange because the type of { break; true } is bool, not !. But the type is ! when break is the last statement in a block. Maybe they do this for simplicity, so a block's type is always that of the last statement?

@jfecher
Copy link
Contributor

jfecher commented Jun 9, 2025

Maybe it'd be best to try to filter break/continue in statements before SSA then? We already have a check for the warning. Maybe we should filter out statements after that point. E.g. still elaborate them but don't insert them to check for type errors but don't actually insert them into the block.

Then we should be done since our parser should only parse break/continue in statements and not arbitrary expressions.

Edit: I suppose we'd still need to handle {} at arbitrary points, e.g. foo(3, { break; }).

@jfecher
Copy link
Contributor

jfecher commented Jun 9, 2025

Maybe they do this for simplicity, so a block's type is always that of the last statement?

Yeah, it makes sense. It doesn't break any typing rules and composes better since break could also e.g. be behind an if { if foo { break; } 42 }. Why have an additional type rule for when break is directly a statement when it isn't needed?

@jfecher
Copy link
Contributor

jfecher commented Jun 9, 2025

We could just error, however, in Rust the code in the issue snippet does compile and execute (although it does nothing aside break out of a loop).

Good point. We could define a variant like RuntimeError::Break that is only locally an error and meant to be caught at other points. This avoids the Ok(None) handling. That or define our own type and implement Try for it (although this requires nightly still IIRC).

The first approach with RuntimeError::Break is what the comptime interpreter does for break and it seems to work rather well there.

@asterite
Copy link
Collaborator Author

asterite commented Jun 9, 2025

Edit: I suppose we'd still need to handle {} at arbitrary points, e.g. foo(3, { break; }).

Right, I was about to reply that I'll try the filtering approach (which was there some time ago before I made the typing like I commented above) but then in code like { break; 1 } + 2 we'd have to check whether the lhs stopped because of a break, etc., which is more or less what's done in the codegen here.

I'll still try to think of a way to avoid all of this let Some(...) = ... else { return Ok(None); } code.

@jfecher
Copy link
Contributor

jfecher commented Jun 9, 2025

I'll still try to think of a way to avoid all of this let Some(...) = ... else { return Ok(None); } code.

I'd recommend an error variant like RuntimeError::Break. It's a bit of a misuse of Result but I think it works and isn't too bad in practice. At least while Try is still unstable: https://doc.rust-lang.org/std/ops/trait.Try.html

@asterite
Copy link
Collaborator Author

asterite commented Jun 9, 2025

I'd recommend an error variant like RuntimeError::Break

Ah, that was how I initially tried to implement it 😅 I got confused because I wasn't sure where I'd need to handle breaks (I initially thought it was in codegen_block) so I abandoned that approach. After implementing the current approach now I understand why it didn't work (or at least why I think it wouldn't easily work).

The problem is that right now, in this PR's code, whenever we call codegen_expression we have two ways to handle it:

  1. Don't continue generating code in the current function
  2. Can't use the returned value so don't use it for something, but continue generating code

1 is the most common scenario.

An example of 2 is when we generate an if's "then" branch: if the "then" had a break, we don't try to jump after the body because it already jumped, but we still continue switching to the "else" branch and generating that. If that retuned RuntimeError::Break we'd incorrectly skip generating the else branch. I know we can specifically handle this case, but we'd have to remember to handle it without the compiler's help.

Another example is when generating a loop's body: it's fine if the body breaks, but we still want to call exit_loop, etc. With RuntimeError::Break we'd again have to remember to handle this scenario.

@jfecher
Copy link
Contributor

jfecher commented Jun 9, 2025

I got confused because I wasn't sure where I'd need to handle breaks (I initially thought it was in codegen_block)

We'd have to handle it wherever there is branching. We can think of break as terminating the current line of execution so we either percolate up to the current loop, or we hit an expression like If or Match where the current branch can end without sinking all the other branches.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Execution Time'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: 39bdeb9 Previous: bc6b68b Ratio
rollup-merge 0.004 s 0.003 s 1.33

This comment was automatically generated by workflow using github-action-benchmark.

CC: @TomAFrench

@asterite
Copy link
Collaborator Author

asterite commented Jun 9, 2025

I'll open another PR with that approach.

@asterite asterite closed this Jun 10, 2025
@asterite asterite deleted the ab/unconditional-break-take-2 branch June 10, 2025 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Compiler crash: into_leaf called on Tree::Branch

3 participants