Dead code is causing memory allocations #48798

Closed
ufechner7 opened this issue Feb 26, 2023 · 7 comments

Comments

@ufechner7

ufechner7 commented Feb 26, 2023

Code that is disabled because it sits in an if false; ...; end branch still causes memory allocations.

This makes it difficult to select between variants of an algorithm with a constant while keeping the code fast.

julia> versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 1 on 8 virtual cores

MWE:

using StaticArrays, LinearAlgebra

const KVec3    = MVector{3, Float64}

Base.@kwdef mutable struct KPS3{S, T, P}
    v_apparent::T =       zeros(S, 3)
end

const kps3= KPS3{Float64, KVec3, 6+4+1}()
const KITE_PARTICLES = 4

const WINCH = false

function residual!(res, yd, y::MVector{S, Float64}, s::KPS3, time) where S
    if WINCH
        T = S-2 # T: three times the number of particles excluding the origin
        segments = div(T,6) - KITE_PARTICLES
        length,  v_reel_out  = y[end-1],  y[end]
        lengthd, v_reel_outd = yd[end-1], yd[end]
        # extract the data of the particles
        y_  = @view y[1:end-2]
        yd_ = @view yd[1:end-2]
        part  = reshape(SVector{T}(y_),  Size(3, div(T,6), 2))
        partd = reshape(SVector{T}(yd_), Size(3, div(T,6), 2))
        # if you comment the following line there are no allocations
        pos1 = part[:,:,1]
    else
        part = reshape(SVector{S}(y),  Size(3, div(S,6), 2))
        partd = reshape(SVector{S}(yd),  Size(3, div(S,6), 2))
        pos1, vel1 = part[:,:,1], part[:,:,2]
        pos = SVector{div(S,6)+1}(if i==1 SVector(0.0,0,0) else SVector(pos1[:,i-1]) end for i in 1:div(S,6)+1)
        vel = SVector{div(S,6)+1}(if i==1 SVector(0.0,0,0) else SVector(vel1[:,i-1]) end for i in 1:div(S,6)+1)
        posd1, veld1 = partd[:,:,1], partd[:,:,2]
        posd = SVector{div(S,6)+1}(if i==1 SVector(0.0,0,0) else SVector(posd1[:,i-1]) end for i in 1:div(S,6)+1)
        veld = SVector{div(S,6)+1}(if i==1 SVector(0.0,0,0) else SVector(veld1[:,i-1]) end for i in 1:div(S,6)+1)
    end
    nothing
end

function test_residual()
    if WINCH
        y0 = MVector{62, Float64}([13.970413450119487, 0.0, 21.238692070636343, 27.65581376097752, 0.0, 42.66213714321849, 40.976226230518435, 0.0, 64.314401166278, 53.87184032029182, 0.0, 86.22231803750196, 66.28915240374937, 0.0, 108.4048292516046, 78.17713830204762, 0.0, 130.87545423106485, 79.56930502428155, 0.0, 135.70836376062155, 80.90383289255747, 0.0, 137.7696816741141, 80.60126812407692, 2.4016533873456325, 135.3023287520457, 80.60126812407692, -2.4016533873456325, 135.3023287520457, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 150.0, 0.0])
        yd0= MVector{62, Float64}([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0])
    else
        y0 = MVector{60, Float64}([13.970413450119487, 0.0, 21.238692070636343, 27.65581376097752, 0.0, 42.66213714321849, 40.976226230518435, 0.0, 64.314401166278, 53.87184032029182, 0.0, 86.22231803750196, 66.28915240374937, 0.0, 108.4048292516046, 78.17713830204762, 0.0, 130.87545423106485, 79.56930502428155, 0.0, 135.70836376062155, 80.90383289255747, 0.0, 137.7696816741141, 80.60126812407692, 2.4016533873456325, 135.3023287520457, 80.60126812407692, -2.4016533873456325, 135.3023287520457, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
        yd0= MVector{60, Float64}([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81])
    end
    time = 0.1
    res1 = zeros(SVector{6, KVec3})
    res2 = deepcopy(res1)
    res = reduce(vcat, vcat(res1, res2))
    # call first time to compile
    residual!(res, yd0, y0, kps3, time)
    # measure allocations
    @allocated residual!(res, yd0, y0, kps3, time)
end

test_residual()

This reports 1728 bytes allocated, which disappear if I comment out the line pos1 = part[:,:,1].

@vchuravy
Sponsor Member

This is reminiscent of #15276

pos1 is captured by the closure in pos = SVector{div(S,6)+1}(if i==1 SVector(0.0,0,0) else SVector(pos1[:,i-1]) end for i in 1:div(S,6)+1), which creates a Box.

At least that's what code_warntype tells me:

residual!(res, yd, y::MVector{S, Float64}, s::KPS3, time) where S in Main at REPL[7]:1
Variables
  #self#::Core.Const(residual!)
  res::MVector{36, Float64}
  yd::MVector{60, Float64}
  y::MVector{60, Float64}
  s::KPS3{Float64, MVector{3, Float64}, 11}
  time::Float64
  #6::var"#6#10"{SMatrix{3, 10, Float64, 30}}
  #5::var"#5#9"{SMatrix{3, 10, Float64, 30}}
  #4::var"#4#8"{SMatrix{3, 10, Float64, 30}}
  #3::var"#3#7"
  veld::SVector{11, SVector{3, Float64}}
  posd::SVector{11, SVector{3, Float64}}
  veld1::SMatrix{3, 10, Float64, 30}
  posd1::SMatrix{3, 10, Float64, 30}
  vel::SVector{11, SVector{3, Float64}}
  pos::Any
  vel1::SMatrix{3, 10, Float64, 30}
  pos1::Core.Box
  partd::SArray{Tuple{3, 10, 2}, Float64, 3, 60}
  part::SArray{Tuple{3, 10, 2}, Float64, 3, 60}
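
The boxing described above can be reproduced without StaticArrays. The following is a minimal sketch, not the code from the issue: FLAG and captured_twice are hypothetical names, and x plays the role of pos1. It assumes, in line with #15276, that a local variable which is assigned in more than one place and also captured by a closure gets lowered to a Core.Box, and that this happens before the compiler can prove that one branch is dead.

```julia
# Hypothetical reduction of the pattern in residual!; `x` stands in for `pos1`.
const FLAG = false

function captured_twice()
    if FLAG
        x = 1.0                      # first assignment, in the dead branch
    else
        x = 2.0                      # second assignment to the same local
        g = (x + i for i in 1:3)     # the generator's closure captures `x`
        return sum(g)
    end
    return x
end

result = captured_twice()

# To inspect the lowering in a REPL, look for `Core.Box` in the output of
#   code_lowered(captured_twice)
# or for `x::Core.Box` in the output of
#   @code_warntype captured_twice()
```

The function still returns the right value; the cost is that every access to x goes through the box, so its type is no longer known to inference.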

@ufechner7
Author

But why is this NOT happening if I execute the code by setting const WINCH = true?

@Seelengrab
Contributor

Because there's no closure in the true branch and constant propagation seems to fix this here.

@ufechner7
Author

ufechner7 commented Feb 26, 2023

But that is not logical. Why does this code not create any allocations:

using StaticArrays, LinearAlgebra

const KVec3    = MVector{3, Float64}

Base.@kwdef mutable struct KPS3{S, T, P}
    v_apparent::T =       zeros(S, 3)
end

const kps3= KPS3{Float64, KVec3, 6+4+1}()
const KITE_PARTICLES = 4

const WINCH = false

function residual!(res, yd, y::MVector{S, Float64}, s::KPS3, time) where S
    part = reshape(SVector{S}(y),  Size(3, div(S,6), 2))
    partd = reshape(SVector{S}(yd),  Size(3, div(S,6), 2))
    pos1, vel1 = part[:,:,1], part[:,:,2]
    pos = SVector{div(S,6)+1}(if i==1 SVector(0.0,0,0) else SVector(pos1[:,i-1]) end for i in 1:div(S,6)+1)
    vel = SVector{div(S,6)+1}(if i==1 SVector(0.0,0,0) else SVector(vel1[:,i-1]) end for i in 1:div(S,6)+1)
    posd1, veld1 = partd[:,:,1], partd[:,:,2]
    posd = SVector{div(S,6)+1}(if i==1 SVector(0.0,0,0) else SVector(posd1[:,i-1]) end for i in 1:div(S,6)+1)
    veld = SVector{div(S,6)+1}(if i==1 SVector(0.0,0,0) else SVector(veld1[:,i-1]) end for i in 1:div(S,6)+1)
    nothing
end

function test_residual()
    if WINCH
        y0 = MVector{62, Float64}([13.970413450119487, 0.0, 21.238692070636343, 27.65581376097752, 0.0, 42.66213714321849, 40.976226230518435, 0.0, 64.314401166278, 53.87184032029182, 0.0, 86.22231803750196, 66.28915240374937, 0.0, 108.4048292516046, 78.17713830204762, 0.0, 130.87545423106485, 79.56930502428155, 0.0, 135.70836376062155, 80.90383289255747, 0.0, 137.7696816741141, 80.60126812407692, 2.4016533873456325, 135.3023287520457, 80.60126812407692, -2.4016533873456325, 135.3023287520457, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 150.0, 0.0])
        yd0= MVector{62, Float64}([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0])
    else
        y0 = MVector{60, Float64}([13.970413450119487, 0.0, 21.238692070636343, 27.65581376097752, 0.0, 42.66213714321849, 40.976226230518435, 0.0, 64.314401166278, 53.87184032029182, 0.0, 86.22231803750196, 66.28915240374937, 0.0, 108.4048292516046, 78.17713830204762, 0.0, 130.87545423106485, 79.56930502428155, 0.0, 135.70836376062155, 80.90383289255747, 0.0, 137.7696816741141, 80.60126812407692, 2.4016533873456325, 135.3023287520457, 80.60126812407692, -2.4016533873456325, 135.3023287520457, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
        yd0= MVector{60, Float64}([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81, 0.0, 0.0, -9.81])
    end
    time = 0.1
    res1 = zeros(SVector{6, KVec3})
    res2 = deepcopy(res1)
    res = reduce(vcat, vcat(res1, res2))
    # call first time to compile
    residual!(res, yd0, y0, kps3, time)
    # measure allocations
    @allocated residual!(res, yd0, y0, kps3, time)
end

test_residual()

It should contain the same closure as the original code...

@Seelengrab
Contributor

The "call twice to compile" trick is only a thing on the top level. In a function, the calls are already compiled.

Further, pos1 in this example is not captured in a way that triggers the closure boxing issue, because pos1 is always created and the type of the variable does not depend on any dynamic value.
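
Two common ways to get back to the single-assignment situation are sketched below. This is an illustration under assumptions, not a fix taken from this thread: FLAG, captured_once and captured_via_let are hypothetical names. The first variant gives the variable in the unused branch its own name. The second rebinds the variable with let just before the closure is created, which is the workaround the Julia performance tips describe for captured variables.

```julia
# Hypothetical sketch; `x` stands in for `pos1` from the issue.
const FLAG = false

# Variant 1: use a distinct name in the other branch, so the captured
# variable is assigned exactly once.
function captured_once()
    if FLAG
        x_winch = 1.0
        return x_winch
    else
        x = 2.0
        return sum(x + i for i in 1:3)
    end
end

# Variant 2: keep both assignments, but let the closure capture a fresh
# binding introduced by `let`, which is assigned only once.
function captured_via_let()
    if FLAG
        x = 1.0
    else
        x = 2.0
    end
    return let x = x
        sum(x + i for i in 1:3)
    end
end

a = captured_once()
b = captured_via_let()
```

Applied to the original residual!, the first variant would amount to renaming pos1 inside the if WINCH branch; whether that removes all 1728 bytes there has not been verified here.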

@KristofferC
Sponsor Member

Indeed, this is a duplicate of #15276.

@ufechner7
Author

Why is it closed as completed and not as duplicate?

This issue was closed.