Manually union split chunksize calculation #1536

Closed · wants to merge 4 commits

Conversation

Member

@ChrisRackauckas ChrisRackauckas commented Dec 9, 2021

using DifferentialEquations, SnoopCompile

function lorenz(du,u,p,t)
 du[1] = 10.0(u[2]-u[1])
 du[2] = u[1]*(28.0-u[3]) - u[2]
 du[3] = u[1]*u[2] - (8/3)*u[3]
end

u0 = [1.0;0.0;0.0]
tspan = (0.0,100.0)
prob = ODEProblem(lorenz,u0,tspan)
alg = Rodas5()
tinf = @snoopi_deep solve(prob,alg)

Before:

InferenceTimingNode: 1.524478/15.326828 on Core.Compiler.Timings.ROOT() with 4 direct children

julia> inference_triggers(tinf)
3-element Vector{InferenceTrigger}:
 Inference triggered to call (NamedTuple{(:chunk_size,)})(::Tuple{Val{3}}) from prepare_alg (C:\Users\accou\.julia\dev\OrdinaryDiffEq\src\alg_utils.jl:174) with specialization DiffEqBase.prepare_alg(::Rodas5{0, true, DefaultLinSolve, Val{:forward}}, ::Vector{Float64}, ::SciMLBase.NullParameters, ::ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, true, SciMLBase.NullParameters, ODEFunction{true, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem})
 Inference triggered to call DiffEqBase.solve_call(::ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, true, SciMLBase.NullParameters, ODEFunction{true, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, ::Rodas5{3, true, DefaultLinSolve, Val{:forward}}) from #solve_up#44 (C:\Users\accou\.julia\packages\DiffEqBase\b1nST\src\solve.jl:87) with specialization DiffEqBase.var"#solve_up#44"(::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, ::typeof(DiffEqBase.solve_up), ::ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, true, SciMLBase.NullParameters, ODEFunction{true, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, ::Nothing, ::Vector{Float64}, ::SciMLBase.NullParameters, ::Rodas5{0, true, DefaultLinSolve, Val{:forward}})
 Inference triggered to call OrdinaryDiffEq.jacobian2W!(::Matrix{Float64}, ::LinearAlgebra.UniformScaling{Bool}, ::Float64, ::Matrix{Float64}, ::Bool) called from toplevel

After:

InferenceTimingNode: 3.082193/16.376914 on Core.Compiler.Timings.ROOT() with 2 direct children

julia> inference_triggers(tinf)
1-element Vector{InferenceTrigger}:
 Inference triggered to call OrdinaryDiffEq.jacobian2W!(::Matrix{Float64}, ::LinearAlgebra.UniformScaling{Bool}, ::Float64, ::Matrix{Float64}, ::Bool) called from toplevel

That's without the static array handling branch.

@ChrisRackauckas changed the title from "WIP: manually union split chunksize calculation" to "Manually union split chunksize calculation" on Dec 9, 2021
@ChrisRackauckas
Member Author

Compile time is not improved, but having everything be non-dynamic in its behavior is nice for other reasons...

Contributor

@chriselrod chriselrod left a comment


I think it might be worth refactoring things a bit.

Ideally, the choice of chunk size would be made at a point where types converge again, so that you don't ever actually have to deal with these unions or worry about max_methods.
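The "split at a point where types converge" idea can be sketched with a function barrier. This is a minimal illustration, not the PR's actual code: `FakeAlg`, `solve_with_chunk`, and `_solve` are hypothetical names standing in for the algorithm struct and solve entry point.

```julia
# Hypothetical sketch: the chunk-size branch picks a Val, but the dynamic
# dispatch happens exactly once, at the _solve call. _solve is compiled
# separately per concrete Val{N}, so its body never sees a Union, and the
# caller of solve_with_chunk only sees the converged return type.
struct FakeAlg{CS}
    chunk_size::Val{CS}
end

# Stand-in for the real solve; specialized once per Val{N}.
_solve(alg::FakeAlg{CS}) where {CS} = CS

function solve_with_chunk(n::Int)
    if n <= 4
        _solve(FakeAlg(Val(4)))
    else
        _solve(FakeAlg(Val(8)))
    end
end
```

Because both branches' `_solve` calls return an `Int`, the types converge again at the barrier and no Union escapes to the caller.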

elseif chunk_size > 4
cs = Val{4}()
remake(alg,chunk_size=cs)
else
Contributor

@chriselrod chriselrod Dec 9, 2021


Is it worth adding branches for 2 and 3? It'd cost some compile time, but I'm worried about possible runtime regressions for small problems.

This comment is also conditional on max_methods.
Currently, you're getting a union of 3 return types.
That breaks when we set max_methods<3 (e.g. =1).

My proposal (5 return types) breaks currently. Hence, the above comment on performing this split at a point of return type convergence.
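To make the return-type counting concrete, here is a minimal illustration (`pick3` and `pick5` are assumed names, not code from this PR) of how each branch adds a member to the inferred return-type Union, which is what downstream union splitting, limited by settings like `max_methods`, has to cope with.

```julia
# Three branches → a three-member return-type Union.
pick3(n) = n == 1 ? Val(1) : (n <= 4 ? Val(4) : Val(8))

# Five branches → a five-member Union, past the default split limit
# that the comment above is worried about.
pick5(n) = n == 1 ? Val(1) :
           n == 2 ? Val(2) :
           n == 3 ? Val(3) :
           n == 4 ? Val(4) : Val(8)

# Base.return_types shows the inferred Union for each.
Base.return_types(pick3, (Int,))
```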

Member

@YingboMa YingboMa left a comment


This would break PreallocationTools.jl, because it just defaults to pickchunksize: https://github.com/SciML/PreallocationTools.jl/blob/master/src/PreallocationTools.jl#L26

@ChrisRackauckas
Member Author

It can handle resizing to smaller chunksizes though.

@YingboMa
Member

YingboMa commented Dec 9, 2021

Did you run the script on the same computer? It looks like the inference time regressed.

@ChrisRackauckas
Member Author

This comment is also conditional on max_methods.
Currently, you're getting a union of 3 return types.
That breaks when we set max_methods<3 (e.g. =1).
My proposal (5 return types) breaks currently. Hence, the above comment on performing this split at a point of return type convergence.

Yeah, 1, 2, 3, 4, 8 would be best. 1 or 5 are the best choices; 3 is a bad one 😅

@ChrisRackauckas
Member Author

@static if max_methods == 3... 🤣 🤣 🤣 🤣 🤣 🤣 🤣 🤣 🤣 🤣 🤣 🤣 🤣

Dear god the compiler team would hate me.

@ChrisRackauckas
Member Author

Did you run the script on the same computer? It looks like the inference time regressed.

Yeah, I mentioned that above.

Compile time is not improved, but having everything be non-dynamic in its behavior is nice for other reasons...

I don't know if there's a runtime hit too, so it's hard to tell whether this is a real PR or something mostly for testing.

@chriselrod
Contributor

I don't know if there's a runtime hit too, so it's hard to tell whether this is a real PR or something mostly for testing.

I think 4 and 8 will be good, especially with JuliaDiff/ForwardDiff.jl@d033d2a, which should guarantee that those chunk sizes SIMD well.
Although my testing suggests the explicit SIMD does not actually help the HCV Pumas model's runtime performance, while it hurts the compilation time.
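The intuition behind preferring 4 and 8 can be sketched in plain Julia. This is an illustration of the assumption that power-of-two partials tuples map cleanly onto SIMD lanes, not a claim about ForwardDiff's internals; `add_partials` is a made-up name.

```julia
# Elementwise add over fixed-length partials tuples. For N = 4 or 8 the
# compiler can lower this to full-width vector operations; an odd N like 3
# leaves a lane idle, which is the runtime concern for small problems.
add_partials(a::NTuple{N,Float64}, b::NTuple{N,Float64}) where {N} =
    ntuple(i -> a[i] + b[i], Val(N))

add_partials((1.0, 2.0, 3.0, 4.0), (0.5, 0.5, 0.5, 0.5))
```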

@YingboMa
Member

YingboMa commented Dec 9, 2021

Have you tried splitting on only 4 and 1?

src/alg_utils.jl (outdated review thread, resolved)
3 participants