optimizer: enable SROA of mutable φ-nodes #43505

aviatesk · 2021-12-21T17:06:40Z

This commit enables elimination of mutable φ-nodes (and their predecessor mutable allocations).
As a contrived example, it allows mutable_ϕ_elim(::String, ::Vector{String})
to run without any allocations at all:

function mutable_ϕ_elim(x, xs)
    r = Ref(x)
    for x in xs
        r = Ref(x)
    end
    return r[]
end

let xs = String[string(gensym()) for _ in 1:100]
    mutable_ϕ_elim("init", xs)
    @test @allocated(mutable_ϕ_elim("init", xs)) == 0
end
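
Besides `@allocated`, one way to check whether the allocations were removed is to look for leftover `:new` expressions in the optimized IR. The following is only a minimal sketch: `count_new_exprs` is a hypothetical helper written for this illustration (it is not part of Julia or of this PR), and the count it reports depends on the Julia build, so no particular result is assumed here.

```julia
using InteractiveUtils  # provides `code_typed` outside the REPL

function mutable_ϕ_elim(x, xs)
    r = Ref(x)
    for x in xs
        r = Ref(x)
    end
    return r[]
end

# Hypothetical helper: count `:new` expressions (object constructions)
# that remain in the optimized, typed IR of `f` for `argtypes`.
function count_new_exprs(f, argtypes)
    ci, _ = only(code_typed(f, argtypes; optimize=true))
    return count(stmt -> Meta.isexpr(stmt, :new), ci.code)
end

n = count_new_exprs(mutable_ϕ_elim, (String, Vector{String}))
println("remaining :new expressions: ", n)
```

A count of zero would suggest that both `Ref` allocations were eliminated by the optimizer. A nonzero count does not necessarily mean a heap allocation at runtime, since later stages (e.g. LLVM) may still remove it.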

This mutable ϕ-node elimination is still limited, though.
Most notably, the current implementation doesn't work if a mutable
allocation forms multiple ϕ-nodes: we check whether an allocation can be
eliminated (i.e. whether it escapes) by counting its uses while walking the
def-chain, which makes it hard to reason about multiple ϕ-nodes at a time.
For example, the mutable allocations in cases like the ones below
are still not eliminated:

code_typed((Bool,String,String),) do cond, x, y
    if cond
        ϕ2 = ϕ1 = Ref(x)
    else
        ϕ2 = ϕ1 = Ref(y)
    end
    ϕ1[], ϕ2[]
end

# more realistic example
mutable struct Point{T}
    x::T
    y::T
end
add(a::Point, b::Point) = Point(a.x + b.x, a.y + b.y)
function compute(a::Point{ComplexF64}, b::Point{ComplexF64})
    for i in 0:(100000000-1)
        a = add(add(a, b), b)
    end
    a.x, a.y
end
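
To see whether a given Julia build still allocates in the loop of the second example, the allocation can be measured directly. This is a rough sketch under stated assumptions: `compute_n` is a hypothetical variant of `compute` that takes the iteration count as an argument so the measurement stays cheap, and the input values are made up for illustration. The number of bytes reported depends on the build, so none is assumed here.

```julia
mutable struct Point{T}
    x::T
    y::T
end
add(a::Point, b::Point) = Point(a.x + b.x, a.y + b.y)

# Hypothetical variant of `compute` with a configurable iteration count.
function compute_n(a::Point{ComplexF64}, b::Point{ComplexF64}, n::Int)
    for _ in 1:n
        a = add(add(a, b), b)
    end
    return a.x, a.y
end

let a = Point(1.0 + 0.0im, 2.0 + 0.0im), b = Point(0.5 + 0.0im, 0.5 + 0.0im)
    compute_n(a, b, 1)  # warm up so compilation is not measured
    bytes = @allocated compute_n(a, b, 1_000)
    println("bytes allocated over 1000 iterations: ", bytes)
end
```

If the `Point` allocations in the loop are not eliminated, the reported size should grow with the iteration count.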

I'd say this limitation should be addressed by first introducing a better
abstraction for reasoning about escape information. More specifically, I'd like
to introduce EscapeAnalysis.jl into Julia base first, and then gradually
adapt it to improve our SROA pass, since EA will let us reason more easily
about all the escape information imposed on any object and should
help us get rid of the complexities of our current SROA implementation.

For now, I'd like to get this enhancement in even though it has the
limitation described above, as long as this commit doesn't introduce a
latency problem (which I believe is unlikely).


fingolfin · 2024-07-24T20:49:05Z

@aviatesk what's the status of this PR? It's 3 years old and actually targets another branch of yours, avi/multisroa.

aviatesk added the "needs nanosoldier run" (This PR should have benchmarks run on it) and "compiler:optimizer" (Optimization passes, mostly in base/compiler/ssair/) labels Dec 21, 2021

aviatesk force-pushed the avi/multisroa branch from dec65e1 to 738df81, December 22, 2021 08:21

aviatesk force-pushed the avi/mutablephi branch from 73fbd3d to 1b8be01, December 22, 2021 08:21

aviatesk force-pushed the avi/multisroa branch from 738df81 to dd6d086, December 22, 2021 09:10

aviatesk force-pushed the avi/mutablephi branch from 1b8be01 to d2cf634, December 22, 2021 09:10

aviatesk force-pushed the avi/multisroa branch from dd6d086 to 250059f, December 23, 2021 05:57

aviatesk force-pushed the avi/mutablephi branch 2 times, most recently from c19c4e4 to 2ddf09f, December 23, 2021 09:07

aviatesk force-pushed the avi/multisroa branch 2 times, most recently from 0908ab4 to e1e502e, January 6, 2022 03:44

aviatesk force-pushed the avi/mutablephi branch from 2ddf09f to 29fd2ac, January 6, 2022 03:44

aviatesk force-pushed the avi/multisroa branch from e1e502e to a9c6daf, January 8, 2022 04:56

aviatesk force-pushed the avi/mutablephi branch from 29fd2ac to bf97c29, January 8, 2022 04:57

aviatesk mentioned this pull request Jan 21, 2022, in "optimizer: alias-aware SROA #43888" (draft)

fingolfin closed this Feb 10, 2024

fingolfin reopened this Feb 10, 2024
