Avoid unnecessary convert in setindex! for IdDict#38455
Avoid unnecessary convert in setindex! for IdDict#38455
convert in setindex! for IdDict#38455Conversation
The `@nospecialize` in `IdSet` and `IdDict` operations are necessary to reduce latency, but they introduce backedges that are at risk of invalidation. This PR reduces the impact for callers of `setindex!(::IdDict, val, key)` who know that `val` doesn't need conversion. One of my previous specializations of dict-related methods resulted in an excessive increase in the size of the sysimage, but this PR doesn't hurt: ```sh tim@diva:~/src/julia-master$ ls -lh usr/lib/julia/sys.so # pre -rwxr-xr-x 1 tim holy 141M Nov 16 03:11 usr/lib/julia/sys.so tim@diva:~/src/julia-master$ ls -lh usr/lib/julia/sys.so # post -rwxr-xr-x 1 tim holy 140M Nov 16 03:48 usr/lib/julia/sys.so ```
|
I was a little bit worried that having the type parameter inside the julia> using MethodAnalysis
julia> @time mis = methodinstances();
2.087641 seconds (3.59 M allocations: 214.998 MiB, 6.01% gc time, 24.31% compilation time)
julia> @time mis = methodinstances();
1.375807 seconds (2.30 M allocations: 139.877 MiB, 1.69% gc time)on this PR, vs julia> @time mis = methodinstances();
2.932984 seconds (3.92 M allocations: 209.983 MiB, 4.38% gc time)
julia> @time mis = methodinstances();
2.364180 seconds (2.84 M allocations: 153.206 MiB, 1.26% gc time)on Julia 1.5. I don't know whether this PR or something else is responsible for the improvement (it well could be this PR), but the important point is there is no regression. |
|
Test failures on mac and linux are weird (something fails after the test fails) but the Windows one seems real: |
|
Hmm. I don't have a Windows machine I can build on to test. Any ideas? |
|
If we need to revert #37193 to get tests working, by all means go for it. I don't really know how to diagnose the windows failure, it's a bit too bad it cropped up after this was initially submitted since seeing it appear might have given more clues. |
|
Putting a static parameter (or any type with free variables) on an argument will definitely undo |
|
To clarify, I expect some specialization, I was just worried about whether this would essentially break the julia> using MethodAnalysis
julia> d = IdDict{Function,Type}()
IdDict{Function, Type}()
julia> d[sin] = Float32
Float32
julia> d[sort] = Int
Int64
julia> uptrs = Set{Ptr{Nothing}}()
Set{Ptr{Nothing}}()
julia> visit(setindex!) do item
isa(item, Core.CodeInstance) || return true
Base.unwrap_unionall(item.def.specTypes).parameters[2] <: IdDict || return false
push!(uptrs, item.invoke)
return true
end
julia> uptrs
Set{Ptr{Nothing}} with 2 elements:
Ptr{Nothing} @0x0000000000000000
Ptr{Nothing} @0x00007fb109ef6fc0
and I think this (and many other pieces of evidence) indicate that it's not compiling this Conversely, on this branch I get julia> uptrs
Set{Ptr{Nothing}} with 7 elements:
Ptr{Nothing} @0x00007f5f48d40fc0
Ptr{Nothing} @0x00007f5f3af0b9f0
Ptr{Nothing} @0x00007f5f3af0c1c0
Ptr{Nothing} @0x00007f5f3af0be90
Ptr{Nothing} @0x0000000000000000
Ptr{Nothing} @0x00007f5f3af0c500
Ptr{Nothing} @0x00007f5f3af0bc30which is a lot more, but if I now add another: then the number doesn't grow: julia> visit(setindex!) do item
isa(item, Core.CodeInstance) || return true
Base.unwrap_unionall(item.def.specTypes).parameters[2] <: IdDict || return false
push!(uptrs, item.invoke)
return true
end
julia> uptrs
Set{Ptr{Nothing}} with 7 elements:
Ptr{Nothing} @0x00007f5f48d40fc0
Ptr{Nothing} @0x00007f5f3af0b9f0
Ptr{Nothing} @0x00007f5f3af0c1c0
Ptr{Nothing} @0x00007f5f3af0be90
Ptr{Nothing} @0x0000000000000000
Ptr{Nothing} @0x00007f5f3af0c500
Ptr{Nothing} @0x00007f5f3af0bc30So I think what this means is that the |
|
If you look at the |
|
Thanks for clarifying. To further probe whether this is a good idea, I exploited your observation that if this works as I hoped it would, there's no reason not to apply the same treatment to the julia> @time mis = methodinstances();
204.905677 seconds (318.25 M allocations: 18.999 GiB, 1.64% gc time, 97.27% compilation time)which if you compare it to the timings in #38455 (comment) is disastrous. To me this is a pretty compelling argument that we should not merge this PR, at least not without groundwork changing the way |
|
Yes, in theory that should be possible just by giving |
|
This (meaning, "this issue" and not the change in this PR, which should not be merged) has just gotten a bit more urgent. #39434 fixed a handful of invalidations but it turns out added a whole bunch in other circumstances (#39593 (comment)). The big culprit is: julia> methinvs = trees[end]
inserting convert(::Type{Nothing}, po::PyCall.PyObject) in PyCall at /home/tim/.julia/packages/PyCall/tqyST/src/conversions.jl:64 invalidated:
backedges: 1: superseding convert(::Type{Nothing}, x) in Base at some.jl:37 with MethodInstance for convert(::Type{Nothing}, ::Any) (1235 children)
julia> root = methinvs.backedges[end]
MethodInstance for convert(::Type{Nothing}, ::Any) at depth 0 with 1235 children
julia> ascend(root)
Choose a call for analysis (q to quit):
convert(::Type{Nothing}, ::Any)
setindex!(::IdDict{Any, Nothing}, ::Any, ::Any)
> push!(::Base.IdSet{Any}, ::Vector{_A} where _A)
+ parse_array(::Base.TOML.Parser)There are many below that; not all come through the TOML |
|
Ah, that's unfortunate, sorry about that! Feel free to revert that PR, if you think it overall makes things worse currently until we fix these other issues. |
|
It might make sense, but overall no worries. If you do more of this and want to avoid this fate, https://github.com/timholy/MethodAnalysis.jl/blob/d2b3157be307471697635d0bd6663e93dea1a504/demos/abstract.jl followed by I need to find a way to get back to having #37193 (which got reverted), but indeed this PR is one of the obstacles to having it! |
This addresses #37193 (comment)
The
@nospecializeinIdSetandIdDictoperations are necessaryto reduce latency, but they introduce backedges that are at risk of
invalidation. This PR reduces the impact for callers of
setindex!(::IdDict, val, key)who know thatvaldoesn't need conversion.
One of my previous specializations of dict-related methods resulted in
an excessive increase in the size of the sysimage, but this PR doesn't
hurt: