Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Pkg's own precompilation to make Pkg.instantiate() in a script faster. #4079

Open
NHDaly opened this issue Nov 7, 2024 · 3 comments

Comments

@NHDaly
Copy link
Member

NHDaly commented Nov 7, 2024

We want to run a script locally, and we want to include an instantiate() step to make sure that the user has everything installed. The instantiate should be a noop most of the time, once things are installed. However, it takes almost 1 second the first time, vs a quarter of a second after that.

Can we better precompile Pkg.instantiate so this doesn't have any compile time?:

julia> @time @eval import Pkg
  0.000365 seconds (356 allocations: 35.180 KiB)

julia> @time @eval Pkg.instantiate()
  0.914451 seconds (3.33 M allocations: 267.919 MiB, 9.12% gc time, 41.20% compilation time)

julia> @time @eval Pkg.instantiate()
  0.265228 seconds (551.15 k allocations: 50.819 MiB, 1.63% gc time)

julia> @time @eval Pkg.instantiate()
  0.244608 seconds (551.15 k allocations: 50.817 MiB, 1.28% gc time)
@KristofferC
Copy link
Member

This is the trace output from running Pkg.instantiate (sorted by time):

❯ sort -g trace.jl        
#=    2.0 ms =# precompile(Tuple{Base.BottomRF{typeof(Base.:(+))}, Base._InitialValue, Int64}) # recompile
#=    2.0 ms =# precompile(Tuple{Type{Memory{String}}, UndefInitializer, Int64}) # recompile
#=    2.0 ms =# precompile(Tuple{Type{Pair{A, B} where B where A}, String, Dates.DateTime})
#=    2.0 ms =# precompile(Tuple{typeof(Base.Iterators.enumerate), Array{String, 1}}) # recompile
#=    2.0 ms =# precompile(Tuple{typeof(Base.string), Module}) # recompile
#=    2.1 ms =# precompile(Tuple{typeof(Base.write), Base.TTY, Array{UInt8, 1}}) # recompile
#=    2.1 ms =# precompile(Tuple{typeof(TOML.Internals.Printer.is_array_of_tables), Array{Base.Dict{String, Dates.DateTime}, 1}})
#=    2.2 ms =# precompile(Tuple{typeof(Base.get), Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Symbol, Nothing}) # recompile
#=    2.3 ms =# precompile(Tuple{typeof(Base.isopen), Base.GenericIOBuffer{Memory{UInt8}}})
#=    2.4 ms =# precompile(Tuple{typeof(Base.join), Base.GenericIOBuffer{Memory{UInt8}}, Tuple{UInt32}, Char})
#=    2.5 ms =# precompile(Tuple{typeof(Base.deepcopy_internal), Tuple{String}, Base.IdDict{Any, Any}})
#=    2.6 ms =# precompile(Tuple{typeof(Base.map), Function, Array{Base.Dict{String, Dates.DateTime}, 1}})
#=    2.6 ms =# precompile(Tuple{typeof(Base.setindex!), Array{String, 1}, String, Int64}) # recompile
#=    2.7 ms =# precompile(Tuple{typeof(Base.Filesystem.contractuser), String}) # recompile
#=    2.8 ms =# precompile(Tuple{Base.var"##invokelatest#1", Base.Pairs{Symbol, Any, NTuple{4, Symbol}, NamedTuple{(:indent, :sorted, :by, :inline_tables), Tuple{Int64, Bool, typeof(Base.identity), Base.IdSet{Base.Dict{String, V} where V}}}}, typeof(Base.invokelatest), Any, Any, Vararg{Any}}) # recompile
#=    2.8 ms =# precompile(Tuple{typeof(Base._prechecked_iterate), Base.OneTo{Int64}, Int64}) # recompile
#=    2.8 ms =# precompile(Tuple{typeof(Base.similar), Array{Any, 1}}) # recompile
#=    2.9 ms =# precompile(Tuple{typeof(Base.map), Function, Array{Any, 1}})
#=    3.1 ms =# precompile(Tuple{typeof(Base.similar), Array{String, 1}}) # recompile
#=    3.2 ms =# precompile(Tuple{Type{Array{Dates.DateTime, 1}}, UndefInitializer, Tuple{Int64}})
#=    3.3 ms =# precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:indent, :sorted, :by, :inline_tables), Tuple{Int64, Bool, typeof(Base.identity), Base.IdSet{Base.Dict{String, V} where V}}}, typeof(Base.invokelatest), Any, Any, Vararg{Any}})
#=    3.3 ms =# precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:indent, :sorted, :by, :inline_tables), Tuple{Int64, Bool, typeof(Base.identity), Base.IdSet{Base.Dict{String, V} where V}}}, typeof(TOML.Internals.Printer.print_table), Nothing, Base.IOStream, Base.Dict{String, Dates.DateTime}, Array{String, 1}})
#=    3.4 ms =# precompile(Tuple{typeof(Base.empty), Base.Dict{Any, Any}, Type{String}, Type{Base.UUID}})
#=    3.6 ms =# precompile(Tuple{Base.var"#691#692"{Base.Process}})
#=    3.6 ms =# precompile(Tuple{typeof(Base.uv_shutdowncb_task), Ptr{Nothing}, Int32}) # recompile
#=    4.9 ms =# precompile(Tuple{Type{Base.IOContext{IO_t} where IO_t<:IO}, Base.GenericIOBuffer{Memory{UInt8}}, Base.TTY}) # recompile
#=    9.7 ms =# precompile(Tuple{typeof(Base.CoreLogging.shouldlog), Base.CoreLogging.ConsoleLogger, Base.CoreLogging.LogLevel, Module, Symbol, Symbol})
#=   13.3 ms =# precompile(Tuple{typeof(Core.kwcall), NamedTuple{(:internal_call, :strict, :warn_loaded, :timing, :_from_loading, :configs, :manifest, :io), Tuple{Bool, Bool, Bool, Bool, Bool, Pair{Base.Cmd, Base.CacheFlags}, Bool, Base.TTY}}, typeof(Base.Precompilation.precompilepkgs), Array{String, 1}})
#=   13.8 ms =# precompile(Tuple{typeof(Base.deepcopy_internal), Array{String, 1}, Base.IdDict{Any, Any}})
#=   15.3 ms =# precompile(Tuple{typeof(Base.maximum), Array{Dates.DateTime, 1}})
#=   15.9 ms =# precompile(Tuple{typeof(Base.convert), Type{Base.Dict{String, Union{Array{String, 1}, String}}}, Base.Dict{String, Any}})
#=   15.9 ms =# precompile(Tuple{typeof(Base.deepcopy_internal), Base.Dict{String, Base.UUID}, Base.IdDict{Any, Any}})
#=   16.2 ms =# precompile(Tuple{typeof(Base.deepcopy_internal), Base.Dict{String, Array{String, 1}}, Base.IdDict{Any, Any}})
#=   19.0 ms =# precompile(Tuple{typeof(Base.readbytes!), Base.PipeEndpoint, Array{UInt8, 1}, Int64})
#=   22.0 ms =# precompile(Tuple{typeof(Base.CoreLogging.default_metafmt), Base.CoreLogging.LogLevel, Vararg{Any, 5}})
#=   22.1 ms =# precompile(Tuple{typeof(Base.deepcopy_internal), Base.Dict{String, Union{Array{String, 1}, String}}, Base.IdDict{Any, Any}})
#=   25.5 ms =# precompile(Tuple{typeof(Base.closewrite), Base.PipeEndpoint})
#=   32.5 ms =# precompile(Tuple{typeof(Base.deepcopy_internal), Base.Dict{String, Base.Dict{String, String}}, Base.IdDict{Any, Any}})
#=   33.8 ms =# precompile(Tuple{typeof(Base.unsafe_read), Base.PipeEndpoint, Ptr{UInt8}, UInt64})
#=   46.8 ms =# precompile(Tuple{Base.Compiler.var"#get_infer_result#typeinf_edge##0"{Method, Bool, Bool, Base.Compiler.InferenceResult, Base.Compiler.Future{Base.Compiler.MethodCallResult}}, Base.Compiler.NativeInterpreter, Base.Compiler.IRInterpretationState})
#=   59.8 ms =# precompile(Tuple{Base.Precompilation.var"#_precompilepkgs##31#_precompilepkgs##32"{Bool, Array{Pair{Base.Cmd, Base.CacheFlags}, 1}, Bool, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, String}, Base.Set{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}}, String, String, String, String, Base.Precompilation.var"#ansi_moveup#_precompilepkgs##19", Base.Event, Base.Event, Base.ReentrantLock, Array{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, 1}, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, String}, Array{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, 1}, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, Bool}, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, Bool}, Array{Base.PkgId, 1}, Base.Precompilation.var"#describe_pkg#_precompilepkgs##1"{Base.Dict{Base.PkgId, String}, Base.Precompilation.var"#color_string#_precompilepkgs##0"{Bool}, Int64}, Base.Dict{Base.PkgId, Array{Base.PkgId, 1}}, Base.Precompilation.var"#color_string#_precompilepkgs##0"{Bool}}}) # recompile
#=  128.3 ms =# precompile(Tuple{Base.Precompilation.var"#_precompilepkgs##45#_precompilepkgs##46"{Bool, Bool, Array{Task, 1}, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, String}, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, Base.GenericIOBuffer{Memory{UInt8}}}, Base.Event, Base.Event, Base.ReentrantLock, Array{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, 1}, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, String}, Array{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, 1}, Array{Base.PkgId, 1}, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, Bool}, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, Base.Event}, Base.Dict{Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}, Bool}, Array{Base.PkgId, 1}, Base.Precompilation.var"#describe_pkg#_precompilepkgs##1"{Base.Dict{Base.PkgId, String}, Base.Precompilation.var"#color_string#_precompilepkgs##0"{Bool}, Int64}, Base.Dict{Base.PkgId, Array{Base.PkgId, 1}}, Base.Dict{Base.PkgId, String}, Base.Dict{Base.PkgId, Array{String, 1}}, Base.Dict{Tuple{Base.PkgId, UInt128, String, String}, Bool}, Base.Precompilation.var"#color_string#_precompilepkgs##0"{Bool}, Bool, Base.Semaphore, Bool, String, Array{String, 1}, Array{Base.PkgId, 1}, Base.PkgId, Base.CacheFlags, Base.Cmd, Pair{Base.Cmd, Base.CacheFlags}, Tuple{Base.PkgId, Pair{Base.Cmd, Base.CacheFlags}}}}) # recompile
#=  301.6 ms =# precompile(Tuple{typeof(Base.CoreLogging.handle_message), Base.CoreLogging.ConsoleLogger, Base.CoreLogging.LogLevel, Vararg{Any, 6}})

So it seems most time is spent in:

  • Precompiling Base.Precompilation.precompilepkgs which is invalidated for some reason (60m + 128ms).
  • Compiling some logging code for Vararg{Any, 6} (300 ms).

So we need to figure out why the precompile code is invalidated and why the handle_message gets specialized on the number of arguments...

@martinholters
Copy link
Member

So we need to figure out [...] why the handle_message gets specialized on the number of arguments...

Because it has a fixed number of arguments:
https://github.com/JuliaLang/julia/blob/001c666086de77101b6937c3d31f0888a35325db/base/logging/ConsoleLogger.jl#L106-L108
It appears that the compilation is logged as

precompile(Tuple{typeof(Base.CoreLogging.handle_message), Base.CoreLogging.ConsoleLogger, Base.CoreLogging.LogLevel, Vararg{Any, 6}})

but the generated specialization is indeed

MethodInstance for Base.CoreLogging.handle_message(::Base.CoreLogging.ConsoleLogger, ::Base.CoreLogging.LogLevel, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any)

@KristofferC
Copy link
Member

Something related to that it is nospecialize combined with being a kwarg function?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants