Skip to content

Commit

Permalink
optimizer: EscapeAnalysis.jl-powered SROA
Browse files Browse the repository at this point in the history
Unleashes the potential for SROA of mutables using the novel Julia-level
escape analysis (on top of #43800):
1. alias-aware SROA, mutable ϕ-node elimination
2. `isdefined` check elimination
3. load-forwarding for non-eliminable but non-escaping mutables

---

1. alias-aware SROA, mutable ϕ-node elimination

EA's alias analysis allows this new SROA to handle nested mutables allocations
pretty well. Now we can eliminate the heap allocations completely from
this insanely nested examples by the single analysis/optimization pass:
```julia
julia> function refs(x)
           (Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref((x))))))))))))[][][][][][][][][][]
       end
refs (generic function with 1 method)

julia> refs("julia"); @allocated refs("julia")
0
```

EA can also analyze escape of ϕ-node as well as its aliasing.
Mutable ϕ-nodes would be eliminated even for a very tricky case as like:
```julia
julia> code_typed((Bool,String,)) do cond, x
           # these allocation form multiple ϕ-nodes
           if cond
               ϕ2 = ϕ1 = Ref{Any}("foo")
           else
               ϕ2 = ϕ1 = Ref{Any}("bar")
           end
           ϕ2[] = x
           y = ϕ1[] # => x
           return y
       end
1-element Vector{Any}:
 CodeInfo(
1 ─     goto #3 if not cond
2 ─     goto #4
3 ─     nothing::Nothing
4 ┄     return x
) => Any
```

Combined with the powerful alias analysis and ϕ-node handling,
the following realistic examples will be fully optimized:
```julia
julia> # demonstrate the power of our field / alias analysis with realistic end to end examples
       # adapted from http://wiki.luajit.org/Allocation-Sinking-Optimization#implementation%5B
       abstract type AbstractPoint{T} end

julia> struct Point{T} <: AbstractPoint{T}
           x::T
           y::T
       end

julia> mutable struct MPoint{T} <: AbstractPoint{T}
           x::T
           y::T
       end

julia> add(a::P, b::P) where P<:AbstractPoint = P(a.x + b.x, a.y + b.y)
add (generic function with 1 method)

julia> function compute_point(T, n, ax, ay, bx, by)
           a = T(ax, ay)
           b = T(bx, by)
           for i in 0:(n-1)
               a = add(add(a, b), b)
           end
           a.x, a.y
       end
compute_point (generic function with 2 methods)

julia> function compute_point(n, a, b)
           for i in 0:(n-1)
               a = add(add(a, b), b)
           end
           a.x, a.y
       end
compute_point (generic function with 2 methods)

julia> function compute_point!(n, a, b)
           for i in 0:(n-1)
               a′ = add(add(a, b), b)
               a.x = a′.x
               a.y = a′.y
           end
       end
compute_point! (generic function with 1 method)

julia> compute_point(MPoint, 10, 1+.5, 2+.5, 2+.25, 4+.75);

julia> compute_point(MPoint, 10, 1+.5im, 2+.5im, 2+.25im, 4+.75im);

julia> @test @allocated(compute_point(MPoint, 10000, 1+.5, 2+.5, 2+.25, 4+.75)) == 0
Test Passed

julia> @test @allocated(compute_point(MPoint, 10000, 1+.5im, 2+.5im, 2+.25im, 4+.75im)) == 0
Test Passed

julia> compute_point(10, MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75));

julia> compute_point(10, MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im));

julia> @test @allocated(compute_point(10000, MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75))) == 0
Test Passed

julia> @test @allocated(compute_point(10000, MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im))) == 0
Test Passed

julia> af, bf = MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75);

julia> ac, bc = MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im);

julia> compute_point!(10, af, bf);

julia> compute_point!(10, ac, bc);

julia> @test @allocated(compute_point!(10000, af, bf)) == 0
Test Passed

julia> @test @allocated(compute_point!(10000, ac, bc)) == 0
Test Passed
```

2. `isdefined` check elimination

This commit also implements a simple optimization to eliminate
`isdefined` call by checking load-fowardability.
This optimization may be especially useful to eliminate extra allocation
involved with a capturing closure, e.g.:
```julia
julia> callit(f, args...) = f(args...);

julia> function isdefined_elim()
           local arr::Vector{Any}
           callit() do
               arr = Any[]
           end
           return arr
       end;

julia> code_typed(isdefined_elim)
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
└──      goto #3 if not true
2 ─      goto #4
3 ─      $(Expr(:throw_undef_if_not, :arr, false))::Any
4 ┄      return %1
) => Vector{Any}
```

3. load-forwarding for non-eliminable but non-escaping mutables

EA also allows us to forward loads even when the mutable allocation
can't be eliminated but still its fields are known precisely.
The load forwarding might be useful since it may derive new type information
that succeeding optimization passes can use (or just because it allows
simpler code transformations down the load):
```julia
julia> code_typed((Bool,String,)) do c, s
           r = Ref{Any}(s)
           if c
               return r[]::String # adce_pass! will further eliminate this type assert call also
           else
               return r
           end
       end
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = %new(Base.RefValue{Any}, s)::Base.RefValue{Any}
└──      goto #3 if not c
2 ─      return s
3 ─      return %1
) => Union{Base.RefValue{Any}, String}
```

---

Please refer to the newly added test cases for more examples.
Also, EA's alias analysis already succeeds to reason about arrays, and
so this EA-based SROA will hopefully be generalized for array SROA as well.
  • Loading branch information
aviatesk committed Jan 21, 2022
1 parent eccc077 commit 351ef8a
Show file tree
Hide file tree
Showing 2 changed files with 1,023 additions and 398 deletions.
Loading

0 comments on commit 351ef8a

Please sign in to comment.