Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add failing test for proper termination #99

Open
wants to merge 8 commits into
base: master
Choose a base branch
from
Open

Conversation

timholy
Copy link
Member

@timholy timholy commented Jan 21, 2024

This is related to recent work on handling control-flow properly. In brief, this PR contains an example for which our current approach fails and which seems to require a breaking change to fix. @aviatesk, since JET is the only other (besides Revise) direct dependent of LoweredCodeUtils, I thought we should coordinate about this before I take any steps to implement this.

Longer version:

Currently we use a heuristic to avoid marking the exit of the last basic block, expecting it to be a return and so that execution will cease whether we evaluate it or not. However, it is possible to write code for which the final statement is a GotoNode and in this case we can get incorrect answers if we fail to evaluate it.

This example illustrates a tricky point: return both terminates execution but may also return a value. If we don't require the value, we still might need to terminate execution. This example seems to illustrate that having isrequired[i] be either true or false may be insufficiently expressive; we might need it to be three states, :no, :yes, :exit. During marking, encountering :exit would not force one to evaluate the returned SSAValue.

@aviatesk
Copy link
Member

since JET is the only other (besides Revise) direct dependent of LoweredCodeUtils, I thought we should coordinate about this before I take any steps to implement this.

Thank you for the reminder. JET uses LoweredCodeUtils, but it has its own implementation for CFG tracking (https://github.com/aviatesk/JET.jl/blob/25d95fb6687e30aa377b1eef6d5c9cf85e6b9768/src/toplevel/virtualprocess.jl#L1174-L1240). I haven't considered which is superior in general cases, but when updating to [email protected], simply using LCU's add_control_flow! did not pass JET's test cases, so I implemented it myself.
So even if there are breaking changes in control flow tracking on the LCU side, it shouldn't cause significant issues for JET. Making isrequired a three-valued variable might be a good idea, but if the issue can be addressed by simply improving the algorithm of LCU.add_control_flow!, that would be preferable.

@timholy
Copy link
Member Author

timholy commented Jul 17, 2024

Thanks for the pointer! I will test JET's version and see if it's usable in LCU.

@timholy
Copy link
Member Author

timholy commented Jul 18, 2024

JET's version (unless I've copied it incorrectly) marks too much. For example, in this test:

ex = quote
flag2 = true
z2 = 15
if flag2
x2 = 5
a2 = 1
else
y2 = 7
a2 = 2
end
a2
end
frame = Frame(ModSelective, ex)
src = frame.framecode.src
edges = CodeEdges(ModSelective, src)
isrequired = lines_required(GlobalRef(ModSelective, :a2), src, edges)

the goal is to mark the lines needed for a2 but not x2 or y2, yet with the JET version I get

julia> LoweredCodeUtils.print_with_code(stdout, src, isrequired)
 1 t 1 ── %1  = Core.get_binding_type(Main.ModSelective, :flag2)
 2 t │          #s452 = true
 3 t │    %3  = #s452 isa %1
 4 t └───       goto #3 if not %3
 5 t 2 ──       goto #4
 6 t 3 ──       #s452 = Base.convert(%1, #s452)
 7 t 4 ┄─       flag2 = #s452
 8 t │    %8  = Core.get_binding_type(Main.ModSelective, :z2)
 9 t │          @_2 = 15
10 t │    %10 = @_2 isa %8
11 t └───       goto #6 if not %10
12 t 5 ──       goto #7
13 t 6 ──       @_2 = Base.convert(%8, @_2)
14 f 7 ┄─       z2 = @_2
15 t └───       goto #15 if not flag2
16 t 8 ── %16 = Core.get_binding_type(Main.ModSelective, :x2)
17 t │          @_3 = 5
18 t │    %18 = @_3 isa %16
19 t └───       goto #10 if not %18
20 t 9 ──       goto #11
21 t 10@_3 = Base.convert(%16, @_3)
22 f 11 ┄       x2 = @_3
23 t │    %23 = Core.get_binding_type(Main.ModSelective, :a2)
24 t │          @_4 = 1
25 t │    %25 = @_4 isa %23
26 t └───       goto #13 if not %25
27 t 12 ─       goto #14
28 t 13@_4 = Base.convert(%23, @_4)
29 t 14 ┄       a2 = @_4
30 f └───       goto #22
31 t 15%31 = Core.get_binding_type(Main.ModSelective, :y2)
32 t │          @_5 = 7
33 t │    %33 = @_5 isa %31
34 t └───       goto #17 if not %33
35 t 16 ─       goto #18
36 t 17@_5 = Base.convert(%31, @_5)
37 f 18 ┄       y2 = @_5
38 t │    %38 = Core.get_binding_type(Main.ModSelective, :a2)
39 t │          @_6 = 2
40 t │    %40 = @_6 isa %38
41 t └───       goto #20 if not %40
42 t 19 ─       goto #21
43 t 20@_6 = Base.convert(%38, @_6)
44 t 21 ┄       a2 = @_6
45 f 22return a2

With the exception of the line evaluating z2 (which is the easiest one to avoid), basically everything is marked. I'll try the converse and see what's missing in JET, and at the same time try to fix this issue. (Probably fix this issue first, then take a look at JET.)

Currently we use a heuristic to avoid marking the exit of the last
basic block, expecting it to be a `return` and so that execution
will cease whether we evaluate it or not. However, it is possible
to write code for which the final statement is a `GotoNode` and
in this case we can get incorrect answers if we fail to evaluate it.

This example illustrates a tricky point: `return` both terminates
execution but may also return a value. If we don't require the
value, we still might need to terminate execution. This example
seems to illustrate that having `isrequired[i]` be either `true` or
`false` may be insufficiently expressive; we might need it to be
three states, `:no`, `:yes`, `:exit`. During marking, encountering
`:exit` would not force one to evaluate the returned SSAValue.
@timholy
Copy link
Member Author

timholy commented Jul 21, 2024

Dang, I thought I had it, but it turns out that it's not robust to differences in lowering across Julia versions. This is a difficult problem, and I'm even beginning to wonder if it's a well-defined problem.

aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 5, 2024
Specifically, this commit aims to review the implementation of
`add_control_flow!` and improves its accuracy. Ideally, it should pass
JET's existing test cases as well as the newly added ones, including the
test cases from JuliaDebug/LoweredCodeUtils.jl#99. The goal is to share
the same high-precision CFG selection logic between LoweredCodeUtils
and JET.

The current implementation is based on
[this paper](https://www.cse.msu.edu/~cse870/Public/Homework/SS2003/HW5/p439-weiser.pdf),
and it has been modified to use an algorithm that checks for liveness in
the reachable blocks up to the nearest common postdominator of the
successors of a conditional terminator.
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 6, 2024
Specifically, this commit aims to review the implementation of
`add_control_flow!` and improves its accuracy. Ideally, it should pass
JET's existing test cases as well as the newly added ones, including the
test cases from JuliaDebug/LoweredCodeUtils.jl#99. The goal is to share
the same high-precision CFG selection logic between LoweredCodeUtils
and JET.

The current implementation is based on
[this paper](https://www.cse.msu.edu/~cse870/Public/Homework/SS2003/HW5/p439-weiser.pdf),
and it has been modified to use an algorithm that checks for liveness in
the reachable blocks up to the nearest common postdominator of the
successors of a conditional terminator.
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 7, 2024
Specifically, this commit aims to review the implementation of
`add_control_flow!` and improves its accuracy. Ideally, it should pass
JET's existing test cases as well as the newly added ones, including the
test cases from JuliaDebug/LoweredCodeUtils.jl#99. The goal is to share
the same high-precision CFG selection logic between LoweredCodeUtils
and JET.

The new algorithm is based on what was proposed in [^Wei84]. If there is
even one active block in the blocks reachable from a conditional branch
up to its successors' nearest common post-dominator (referred to as
**INFL** in the paper), it is necessary to follow that conditional
branch and execute the code. Otherwise, execution can be short-circuited
from the conditional branch to the nearest common post-dominator.

COMBAK: It is important to note that in Julia's IR (`CodeInfo`),
"short-circuiting" a specific code region is not a simple task. Simply
ignoring the path to the post-dominator does not guarantee fall-through
to the post-dominator. Therefore, a more careful implementation is
required for this aspect.

[Wei84]: M. Weiser, "Program Slicing," IEEE Transactions on Software Engineering, 10, pages 352-357, July 1984.
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 9, 2024
Specifically, this commit aims to review the implementation of
`add_control_flow!` and improves its accuracy. Ideally, it should pass
JET's existing test cases as well as the newly added ones, including the
test cases from JuliaDebug/LoweredCodeUtils.jl#99. The goal is to share
the same high-precision CFG selection logic between LoweredCodeUtils
and JET.

The new algorithm is based on what was proposed in [^Wei84]. If there is
even one active block in the blocks reachable from a conditional branch
up to its successors' nearest common post-dominator (referred to as
**INFL** in the paper), it is necessary to follow that conditional
branch and execute the code. Otherwise, execution can be short-circuited
from the conditional branch to the nearest common post-dominator.

COMBAK: It is important to note that in Julia's IR (`CodeInfo`),
"short-circuiting" a specific code region is not a simple task. Simply
ignoring the path to the post-dominator does not guarantee fall-through
to the post-dominator. Therefore, a more careful implementation is
required for this aspect.

[Wei84]: M. Weiser, "Program Slicing," IEEE Transactions on Software Engineering, 10, pages 352-357, July 1984.
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 9, 2024
Specifically, this commit aims to review the implementation of
`add_control_flow!` and improves its accuracy. Ideally, it should pass
JET's existing test cases as well as the newly added ones, including the
test cases from JuliaDebug/LoweredCodeUtils.jl#99. The goal is to share
the same high-precision CFG selection logic between LoweredCodeUtils
and JET.

The new algorithm is based on what was proposed in [^Wei84]. If there is
even one active block in the blocks reachable from a conditional branch
up to its successors' nearest common post-dominator (referred to as
**INFL** in the paper), it is necessary to follow that conditional
branch and execute the code. Otherwise, execution can be short-circuited
from the conditional branch to the nearest common post-dominator.

COMBAK: It is important to note that in Julia's IR (`CodeInfo`),
"short-circuiting" a specific code region is not a simple task. Simply
ignoring the path to the post-dominator does not guarantee fall-through
to the post-dominator. Therefore, a more careful implementation is
required for this aspect.

[Wei84]: M. Weiser, "Program Slicing," IEEE Transactions on Software Engineering, 10, pages 352-357, July 1984.
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 9, 2024
Specifically, this commit aims to review the implementation of
`add_control_flow!` and improves its accuracy. Ideally, it should pass
JET's existing test cases as well as the newly added ones, including the
test cases from JuliaDebug/LoweredCodeUtils.jl#99. The goal is to share
the same high-precision CFG selection logic between LoweredCodeUtils
and JET.

The new algorithm is based on what was proposed in [^Wei84]. If there is
even one active block in the blocks reachable from a conditional branch
up to its successors' nearest common post-dominator (referred to as
**INFL** in the paper), it is necessary to follow that conditional
branch and execute the code. Otherwise, execution can be short-circuited
from the conditional branch to the nearest common post-dominator.

COMBAK: It is important to note that in Julia's IR (`CodeInfo`),
"short-circuiting" a specific code region is not a simple task. Simply
ignoring the path to the post-dominator does not guarantee fall-through
to the post-dominator. Therefore, a more careful implementation is
required for this aspect.

[Wei84]: M. Weiser, "Program Slicing," IEEE Transactions on Software Engineering, 10, pages 352-357, July 1984.
aviatesk added a commit that referenced this pull request Sep 10, 2024
I am writing a solution for #99 by
overloading `next_or_nothing!`. This PR serves as a preparation for
that, while the basic usage of LCU remains unchanged.
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 10, 2024
Specifically, this commit aims to review the implementation of
`add_control_flow!` and improves its accuracy. Ideally, it should pass
JET's existing test cases as well as the newly added ones, including the
test cases from JuliaDebug/LoweredCodeUtils.jl#99. The goal is to share
the same high-precision CFG selection logic between LoweredCodeUtils
and JET.

The new algorithm is based on what was proposed in [^Wei84]. If there is
even one active block in the blocks reachable from a conditional branch
up to its successors' nearest common post-dominator (referred to as
**INFL** in the paper), it is necessary to follow that conditional
branch and execute the code. Otherwise, execution can be short-circuited
from the conditional branch to the nearest common post-dominator.

COMBAK: It is important to note that in Julia's IR (`CodeInfo`),
"short-circuiting" a specific code region is not a simple task. Simply
ignoring the path to the post-dominator does not guarantee fall-through
to the post-dominator. Therefore, a more careful implementation is
required for this aspect.

[Wei84]: M. Weiser, "Program Slicing," IEEE Transactions on Software Engineering, 10, pages 352-357, July 1984.
aviatesk added a commit that referenced this pull request Sep 10, 2024
I am writing a solution for #99 by
overloading `next_or_nothing!`. This PR serves as a preparation for
that, while the basic usage of LCU remains unchanged.
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 10, 2024
Specifically, this commit aims to review the implementation of
`add_control_flow!` and improves its accuracy. Ideally, it should pass
JET's existing test cases as well as the newly added ones, including the
test cases from JuliaDebug/LoweredCodeUtils.jl#99. The goal is to share
the same high-precision CFG selection logic between LoweredCodeUtils
and JET.

The new algorithm is based on what was proposed in [^Wei84]. If there is
even one active block in the blocks reachable from a conditional branch
up to its successors' nearest common post-dominator (referred to as
**INFL** in the paper), it is necessary to follow that conditional
branch and execute the code. Otherwise, execution can be short-circuited
from the conditional branch to the nearest common post-dominator.

COMBAK: It is important to note that in Julia's IR (`CodeInfo`),
"short-circuiting" a specific code region is not a simple task. Simply
ignoring the path to the post-dominator does not guarantee fall-through
to the post-dominator. Therefore, a more careful implementation is
required for this aspect.

[Wei84]: M. Weiser, "Program Slicing," IEEE Transactions on Software Engineering, 10, pages 352-357, July 1984.
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Sep 10, 2024
Specifically, this commit aims to review the implementation of
`add_control_flow!` and improves its accuracy. Ideally, it should pass
JET's existing test cases as well as the newly added ones, including the
test cases from JuliaDebug/LoweredCodeUtils.jl#99. The goal is to share
the same high-precision CFG selection logic between LoweredCodeUtils
and JET.

The new algorithm is based on what was proposed in [^Wei84]. If there is
even one active block in the blocks reachable from a conditional branch
up to its successors' nearest common post-dominator (referred to as
**INFL** in the paper), it is necessary to follow that conditional
branch and execute the code. Otherwise, execution can be short-circuited
from the conditional branch to the nearest common post-dominator.

COMBAK: It is important to note that in Julia's IR (`CodeInfo`),
"short-circuiting" a specific code region is not a simple task. Simply
ignoring the path to the post-dominator does not guarantee fall-through
to the post-dominator. Therefore, a more careful implementation is
required for this aspect.

[Wei84]: M. Weiser, "Program Slicing," IEEE Transactions on Software Engineering, 10, pages 352-357, July 1984.
@aviatesk
Copy link
Member

With the exception of the line evaluating z2 (which is the easiest one to avoid), basically everything is marked. I'll try the converse and see what's missing in JET, and at the same time try to fix this issue.

This issue was resolved by JET's new algorithm. The algorithm was ported in #116.

aviatesk added a commit that referenced this pull request Sep 10, 2024
This PR is an alternative to #99.
This is built on top of #116.

With this PR, the following test cases now pass correctly:
```julia
    # Final block is not a `return`: Need to use `config::SelectiveEvalRecurse` explicitly
    ex = quote
        x = 1
        yy = 7
        @Label loop
        x += 1
        x < 5 || return yy
        @goto loop
    end
    frame = Frame(ModSelective, ex)
    src = frame.framecode.src
    edges = CodeEdges(ModSelective, src)
    config = SelectiveEvalRecurse()
    isrequired = lines_required(GlobalRef(ModSelective, :x), src, edges, config)
    selective_eval_fromstart!(config, frame, isrequired, true)
    @test ModSelective.x == 5
    @test !isdefined(ModSelective, :yy)
```

The basic approach is overloading `JuliaInterpreter.step_expr!` and
`LoweredCodeUtils.next_or_nothing!` for the new `SelectiveEvalController`
type, as described below, to perform correct selective execution.

When `SelectiveEvalController` is passed as the `recurse` argument of
`selective_eval!`, the selective execution is adjusted as follows:

- **Implicit return**: In Julia's IR representation (`CodeInfo`), the
  final block does not necessarily return and may `goto` another block.
  And if the `return` statement is not included in the slice in such
  cases, it is necessary to terminate `selective_eval!` when execution
  reaches such implicit return statements. `controller.implicit_returns`
  records the PCs of such return statements, and `selective_eval!` will
  return when reaching those statements.
  This is the core part of the fix for the test cases in
  #99.

- **CFG short-cut**: When the successors of a conditional branch are
  inactive, and it is safe to move the program counter from the
  conditional branch to the nearest common post-dominator of those
  successors, this short-cut is taken. This short-cut is not merely an
  optimization but is actually essential for the correctness of the
  selective execution. This is because, in `CodeInfo`, even if we simply
  fall-through dead blocks (i.e., increment the program counter without
  executing the statements of those blocks), it does not necessarily
  lead to the nearest common post-dominator block.

And now [`lines_required`](@ref) or [`lines_required!`](@ref) will
update the `SelectiveEvalController` passed as their argument to be
appropriate for the program slice generated.

One thing to note is that currently, the `controller` is not be recursed.
That said, in Revise, which is the main consumer of LCU, there is no
need for recursive selective execution, and so `selective_eval!` does
not provide a system for inter-procedural selective evaluation.
Accordingly `SelectiveEvalController` does not recurse too, but this can
be left as a future extension.
aviatesk added a commit that referenced this pull request Sep 10, 2024
This PR is an alternative to #99.
This is built on top of #116.

With this PR, the following test cases now pass correctly:
```julia
    # Final block is not a `return`: Need to use `config::SelectiveEvalRecurse` explicitly
    ex = quote
        x = 1
        yy = 7
        @Label loop
        x += 1
        x < 5 || return yy
        @goto loop
    end
    frame = Frame(ModSelective, ex)
    src = frame.framecode.src
    edges = CodeEdges(ModSelective, src)
    config = SelectiveEvalRecurse()
    isrequired = lines_required(GlobalRef(ModSelective, :x), src, edges, config)
    selective_eval_fromstart!(config, frame, isrequired, true)
    @test ModSelective.x == 5
    @test !isdefined(ModSelective, :yy)
```

The basic approach is overloading `JuliaInterpreter.step_expr!` and
`LoweredCodeUtils.next_or_nothing!` for the new `SelectiveEvalController`
type, as described below, to perform correct selective execution.

When `SelectiveEvalController` is passed as the `recurse` argument of
`selective_eval!`, the selective execution is adjusted as follows:

- **Implicit return**: In Julia's IR representation (`CodeInfo`), the
  final block does not necessarily return and may `goto` another block.
  And if the `return` statement is not included in the slice in such
  cases, it is necessary to terminate `selective_eval!` when execution
  reaches such implicit return statements. `controller.implicit_returns`
  records the PCs of such return statements, and `selective_eval!` will
  return when reaching those statements.
  This is the core part of the fix for the test cases in
  #99.

- **CFG short-cut**: When the successors of a conditional branch are
  inactive, and it is safe to move the program counter from the
  conditional branch to the nearest common post-dominator of those
  successors, this short-cut is taken. This short-cut is not merely an
  optimization but is actually essential for the correctness of the
  selective execution. This is because, in `CodeInfo`, even if we simply
  fall-through dead blocks (i.e., increment the program counter without
  executing the statements of those blocks), it does not necessarily
  lead to the nearest common post-dominator block.

And now [`lines_required`](@ref) or [`lines_required!`](@ref) will
update the `SelectiveEvalController` passed as their argument to be
appropriate for the program slice generated.

One thing to note is that currently, the `controller` is not be recursed.
That said, in Revise, which is the main consumer of LCU, there is no
need for recursive selective execution, and so `selective_eval!` does
not provide a system for inter-procedural selective evaluation.
Accordingly `SelectiveEvalController` does not recurse too, but this can
be left as a future extension.
aviatesk added a commit that referenced this pull request Sep 10, 2024
This PR is an alternative to #99.
This is built on top of #116.

With this PR, the following test cases now pass correctly:
```julia
    # Final block is not a `return`: Need to use `config::SelectiveEvalRecurse` explicitly
    ex = quote
        x = 1
        yy = 7
        @Label loop
        x += 1
        x < 5 || return yy
        @goto loop
    end
    frame = Frame(ModSelective, ex)
    src = frame.framecode.src
    edges = CodeEdges(ModSelective, src)
    config = SelectiveEvalRecurse()
    isrequired = lines_required(GlobalRef(ModSelective, :x), src, edges, config)
    selective_eval_fromstart!(config, frame, isrequired, true)
    @test ModSelective.x == 5
    @test !isdefined(ModSelective, :yy)
```

The basic approach is overloading `JuliaInterpreter.step_expr!` and
`LoweredCodeUtils.next_or_nothing!` for the new `SelectiveEvalController`
type, as described below, to perform correct selective execution.

When `SelectiveEvalController` is passed as the `recurse` argument of
`selective_eval!`, the selective execution is adjusted as follows:

- **Implicit return**: In Julia's IR representation (`CodeInfo`), the
  final block does not necessarily return and may `goto` another block.
  And if the `return` statement is not included in the slice in such
  cases, it is necessary to terminate `selective_eval!` when execution
  reaches such implicit return statements. `controller.implicit_returns`
  records the PCs of such return statements, and `selective_eval!` will
  return when reaching those statements.
  This is the core part of the fix for the test cases in
  #99.

- **CFG short-cut**: When the successors of a conditional branch are
  inactive, and it is safe to move the program counter from the
  conditional branch to the nearest common post-dominator of those
  successors, this short-cut is taken. This short-cut is not merely an
  optimization but is actually essential for the correctness of the
  selective execution. This is because, in `CodeInfo`, even if we simply
  fall-through dead blocks (i.e., increment the program counter without
  executing the statements of those blocks), it does not necessarily
  lead to the nearest common post-dominator block.

And now [`lines_required`](@ref) or [`lines_required!`](@ref) will
update the `SelectiveEvalController` passed as their argument to be
appropriate for the program slice generated.

One thing to note is that currently, the `controller` is not be recursed.
That said, in Revise, which is the main consumer of LCU, there is no
need for recursive selective execution, and so `selective_eval!` does
not provide a system for inter-procedural selective evaluation.
Accordingly `SelectiveEvalController` does not recurse too, but this can
be left as a future extension.
aviatesk added a commit that referenced this pull request Sep 11, 2024
This PR is an alternative to #99.
This is built on top of #116.

With this PR, the following test cases now pass correctly:
```julia
    # Final block is not a `return`: Need to use `config::SelectiveEvalRecurse` explicitly
    ex = quote
        x = 1
        yy = 7
        @Label loop
        x += 1
        x < 5 || return yy
        @goto loop
    end
    frame = Frame(ModSelective, ex)
    src = frame.framecode.src
    edges = CodeEdges(ModSelective, src)
    config = SelectiveEvalRecurse()
    isrequired = lines_required(GlobalRef(ModSelective, :x), src, edges, config)
    selective_eval_fromstart!(config, frame, isrequired, true)
    @test ModSelective.x == 5
    @test !isdefined(ModSelective, :yy)
```

The basic approach is overloading `JuliaInterpreter.step_expr!` and
`LoweredCodeUtils.next_or_nothing!` for the new `SelectiveEvalController`
type, as described below, to perform correct selective execution.

When `SelectiveEvalController` is passed as the `recurse` argument of
`selective_eval!`, the selective execution is adjusted as follows:

- **Implicit return**: In Julia's IR representation (`CodeInfo`), the
  final block does not necessarily return and may `goto` another block.
  And if the `return` statement is not included in the slice in such
  cases, it is necessary to terminate `selective_eval!` when execution
  reaches such implicit return statements. `controller.implicit_returns`
  records the PCs of such return statements, and `selective_eval!` will
  return when reaching those statements.
  This is the core part of the fix for the test cases in
  #99.

- **CFG short-cut**: When the successors of a conditional branch are
  inactive, and it is safe to move the program counter from the
  conditional branch to the nearest common post-dominator of those
  successors, this short-cut is taken. This short-cut is not merely an
  optimization but is actually essential for the correctness of the
  selective execution. This is because, in `CodeInfo`, even if we simply
  fall-through dead blocks (i.e., increment the program counter without
  executing the statements of those blocks), it does not necessarily
  lead to the nearest common post-dominator block.

And now [`lines_required`](@ref) or [`lines_required!`](@ref) will
update the `SelectiveEvalController` passed as their argument to be
appropriate for the program slice generated.

One thing to note is that currently, the `controller` is not be recursed.
That said, in Revise, which is the main consumer of LCU, there is no
need for recursive selective execution, and so `selective_eval!` does
not provide a system for inter-procedural selective evaluation.
Accordingly `SelectiveEvalController` does not recurse too, but this can
be left as a future extension.
aviatesk added a commit that referenced this pull request Sep 11, 2024
This PR is an alternative to #99.
This is built on top of #116.

With this PR, the following test cases now pass correctly:
```julia
    # Final block is not a `return`: Need to use `config::SelectiveEvalRecurse` explicitly
    ex = quote
        x = 1
        yy = 7
        @Label loop
        x += 1
        x < 5 || return yy
        @goto loop
    end
    frame = Frame(ModSelective, ex)
    src = frame.framecode.src
    edges = CodeEdges(ModSelective, src)
    config = SelectiveEvalRecurse()
    isrequired = lines_required(GlobalRef(ModSelective, :x), src, edges, config)
    selective_eval_fromstart!(config, frame, isrequired, true)
    @test ModSelective.x == 5
    @test !isdefined(ModSelective, :yy)
```

The basic approach is overloading `JuliaInterpreter.step_expr!` and
`LoweredCodeUtils.next_or_nothing!` for the new `SelectiveEvalController`
type, as described below, to perform correct selective execution.

When `SelectiveEvalController` is passed as the `recurse` argument of
`selective_eval!`, the selective execution is adjusted as follows:

- **Implicit return**: In Julia's IR representation (`CodeInfo`), the
  final block does not necessarily return and may `goto` another block.
  And if the `return` statement is not included in the slice in such
  cases, it is necessary to terminate `selective_eval!` when execution
  reaches such implicit return statements. `controller.implicit_returns`
  records the PCs of such return statements, and `selective_eval!` will
  return when reaching those statements.
  This is the core part of the fix for the test cases in
  #99.

- **CFG short-cut**: When the successors of a conditional branch are
  inactive, and it is safe to move the program counter from the
  conditional branch to the nearest common post-dominator of those
  successors, this short-cut is taken. This short-cut is not merely an
  optimization but is actually essential for the correctness of the
  selective execution. This is because, in `CodeInfo`, even if we simply
  fall-through dead blocks (i.e., increment the program counter without
  executing the statements of those blocks), it does not necessarily
  lead to the nearest common post-dominator block.

And now [`lines_required`](@ref) or [`lines_required!`](@ref) will
update the `SelectiveEvalController` passed as their argument to be
appropriate for the program slice generated.

One thing to note is that currently, the `controller` is not be recursed.
That said, in Revise, which is the main consumer of LCU, there is no
need for recursive selective execution, and so `selective_eval!` does
not provide a system for inter-procedural selective evaluation.
Accordingly `SelectiveEvalController` does not recurse too, but this can
be left as a future extension.
aviatesk added a commit that referenced this pull request Sep 19, 2024
This PR is an alternative to #99.
This is built on top of #116.

With this PR, the following test cases now pass correctly:
```julia
    # Final block is not a `return`: Need to use `config::SelectiveEvalRecurse` explicitly
    ex = quote
        x = 1
        yy = 7
        @Label loop
        x += 1
        x < 5 || return yy
        @goto loop
    end
    frame = Frame(ModSelective, ex)
    src = frame.framecode.src
    edges = CodeEdges(ModSelective, src)
    config = SelectiveEvalRecurse()
    isrequired = lines_required(GlobalRef(ModSelective, :x), src, edges, config)
    selective_eval_fromstart!(config, frame, isrequired, true)
    @test ModSelective.x == 5
    @test !isdefined(ModSelective, :yy)
```

The basic approach is overloading `JuliaInterpreter.step_expr!` and
`LoweredCodeUtils.next_or_nothing!` for the new `SelectiveEvalController`
type, as described below, to perform correct selective execution.

When `SelectiveEvalController` is passed as the `recurse` argument of
`selective_eval!`, the selective execution is adjusted as follows:

- **Implicit return**: In Julia's IR representation (`CodeInfo`), the
  final block does not necessarily return and may `goto` another block.
  And if the `return` statement is not included in the slice in such
  cases, it is necessary to terminate `selective_eval!` when execution
  reaches such implicit return statements. `controller.implicit_returns`
  records the PCs of such return statements, and `selective_eval!` will
  return when reaching those statements.
  This is the core part of the fix for the test cases in
  #99.

- **CFG short-cut**: When the successors of a conditional branch are
  inactive, and it is safe to move the program counter from the
  conditional branch to the nearest common post-dominator of those
  successors, this short-cut is taken. This short-cut is not merely an
  optimization but is actually essential for the correctness of the
  selective execution. This is because, in `CodeInfo`, even if we simply
  fall-through dead blocks (i.e., increment the program counter without
  executing the statements of those blocks), it does not necessarily
  lead to the nearest common post-dominator block.

And now [`lines_required`](@ref) or [`lines_required!`](@ref) will
update the `SelectiveEvalController` passed as their argument to be
appropriate for the program slice generated.

One thing to note is that currently, the `controller` is not be recursed.
That said, in Revise, which is the main consumer of LCU, there is no
need for recursive selective execution, and so `selective_eval!` does
not provide a system for inter-procedural selective evaluation.
Accordingly `SelectiveEvalController` does not recurse too, but this can
be left as a future extension.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants