Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Truncated stacktraces on Windows for specific code #41750

Closed
timholy opened this issue Aug 1, 2021 · 5 comments · Fixed by #41813
Closed

Truncated stacktraces on Windows for specific code #41750

timholy opened this issue Aug 1, 2021 · 5 comments · Fixed by #41813
Assignees
Labels
bug Indicates an unexpected problem or unintended behavior system:windows Affects only Windows

Comments

@timholy
Copy link
Member

timholy commented Aug 1, 2021

SnoopCompile can now inventory all runtime dispatches that result in type-inference, but @ChrisRackauckas and I have found a case where this works on Linux but not on Windows.. Here's a demo that works on both platforms:

julia> using SnoopCompile

julia> f(x::Real) = sizeof(x)
f (generic function with 1 method)

julia> g(x) = f(Base.inferencebarrier(x))
g (generic function with 1 method)

julia> tinf = @snoopi_deep g(1)
InferenceTimingNode: 0.006272/0.006679 on Core.Compiler.Timings.ROOT() with 2 direct children

julia> itrigs = inference_triggers(tinf)
2-element Vector{InferenceTrigger}:
 Inference triggered to call g(::Int64) from eval (.\boot.jl:373) inlined into REPL.eval_user_input(::Any, ::REPL.REPLBackend) (C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.7\REPL\src\REPL.jl:150)
 Inference triggered to call f(::Int64) from g (.\REPL[3]:1) with specialization g(::Int64)

julia> stacktrace(itrigs[2])
19-element Vector{Base.StackTraces.StackFrame}:
 exit_current_timer at typeinfer.jl:164 [inlined]
 typeinf(interp::Core.Compiler.NativeInterpreter, frame::Core.Compiler.InferenceState) at typeinfer.jl:206
 typeinf_ext(interp::Core.Compiler.NativeInterpreter, mi::Core.MethodInstance) at typeinfer.jl:908
 typeinf_ext_toplevel(interp::Core.Compiler.NativeInterpreter, linfo::Core.MethodInstance) at typeinfer.jl:941
 typeinf_ext_toplevel(mi::Core.MethodInstance, world::UInt64) at typeinfer.jl:937
 g(x::Int64) at REPL[3]:1
 top-level scope at snoopi_deep.jl:53
 eval at boot.jl:373 [inlined]
 eval_user_input(ast::Any, backend::REPL.REPLBackend) at REPL.jl:150
 repl_backend_loop(backend::REPL.REPLBackend) at REPL.jl:241
 start_repl_backend(backend::REPL.REPLBackend, consumer::Any) at REPL.jl:226
 run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool) at REPL.jl:359
 run_repl(repl::REPL.AbstractREPL, consumer::Any) at REPL.jl:346
 (::Base.var"#919#921"{Bool, Bool, Bool})(REPL::Module) at client.jl:394
 #invokelatest#2 at essentials.jl:716 [inlined]
 invokelatest at essentials.jl:714 [inlined]
 run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set::Bool) at client.jl:379
 exec_options(opts::Base.JLOptions) at client.jl:309
 _start() at client.jl:495

The key line is the g(x::Int64) at REPL[3]:1 (6th one down), which is the caller of f. You can see the full stacktrace including C-frames with stacktrace(itrigs[2].node.bt, true).

So, this much works for both of us. However, this one does not (on 1.7.0-beta3):

(@v1.7) pkg> add DiffEqBase#cthulhu OrdinaryDiffEq#cthulhu SnoopCompile

using OrdinaryDiffEq, SnoopCompile
function lorenz(du,u,p,t)
 du[1] = 10.0(u[2]-u[1])
 du[2] = u[1]*(28.0-u[3]) - u[2]
 du[3] = u[1]*u[2] - (8/3)*u[3]
end
u0 = [1.0;0.0;0.0]
tspan = (0.0,100.0)
prob = ODEProblem(OrdinaryDiffEq.lorenz,u0,tspan)
alg = Tsit5()
tinf = @snoopi_deep solve(prob,alg)
itrigs = inference_triggers(tinf)

Now for all the itrigs the stacktrace terminates after compiler/typeinfer.jl (just the first 5 entries).

Full details: JuliaDebug/Cthulhu.jl#185

@timholy timholy added the system:windows Affects only Windows label Aug 1, 2021
@timholy
Copy link
Member Author

timholy commented Aug 1, 2021

@NHDaly may also be interested in this.

@ChrisRackauckas
Copy link
Member

I can confirm it seems to be generally a Windows thing since I got it setup and running on my laptop with the same error as my desktop. Both Julia v1.7-beta3.

@JeffBezanson JeffBezanson added the bug Indicates an unexpected problem or unintended behavior label Aug 4, 2021
@kimikage
Copy link
Contributor

kimikage commented Aug 4, 2021

Incidentally, this happens often enough that I don't even notice it's a bug. (It also happens on Julia 1.6.2.)

The stacktraces are terminated with unnamed IPs.
AFAIK, they are the backedges of jl_invoke at gf.c:2244 (Julia v1.6.2), but the addresses are not constant.
https://github.com/JuliaLang/julia/blob/v1.6.2/src/gf.c#L2244
or
https://github.com/JuliaLang/julia/blob/v1.7.0-beta3/src/gf.c#L2252

@kimikage
Copy link
Contributor

kimikage commented Aug 5, 2021

So, there is another example, which is independent of development branches:

julia> using SnoopCompile

julia> using Colors # v0.12.8

julia> tinf = @snoopi_deep colordiff(RGB(0.0, 0.0, 0.0), RGB(1.0, 1.0, 1.0))
InferenceTimingNode: 0.082452/0.135515 on InferenceFrameInfo for Core.Compiler.Timings.ROOT() with 7 direct children

julia> itrigs = inference_triggers(tinf)
7-element Vector{InferenceTrigger}:
 Inference triggered to call MethodInstance for RGB(::Float64, ::Float64, ::Float64) from eval (.\boot.jl:360) inlined into MethodInstance for eval_user_input(::Any, ::REPL.REPLBackend) (C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\REPL\src\REPL.jl:139)
 Inference triggered to call MethodInstance for colordiff(::RGB{Float64}, ::RGB{Float64}) from eval (.\boot.jl:360) inlined into MethodInstance for eval_user_input(::Any, ::REPL.REPLBackend) (C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\REPL\src\REPL.jl:139)
 Inference triggered to call MethodInstance for pow12_5(::Float64) called from toplevel
 Inference triggered to call MethodInstance for cbrt(::Float64) from fxyz2lab (C:\Users\username\.julia\packages\Colors\yDxFN\src\conversions.jl:415) inlined into MethodInstance for cnvt(::Type{Lab{Float64}}, ::XYZ{Float64}, ::XYZ{Float64}) (C:\Users\username\.julia\packages\Colors\yDxFN\src\conversions.jl:418)
 Inference triggered to call MethodInstance for atan(::Float64, ::Float64) from atand (.\special\trig.jl:1290) inlined into MethodInstance for _colordiff(::RGB{Float64}, ::RGB{Float64}, ::DE_2000) (C:\Users\username\.julia\packages\Colors\yDxFN\src\differences.jl:180)
 Inference triggered to call MethodInstance for sind(::Float64) from _colordiff (C:\Users\username\.julia\packages\Colors\yDxFN\src\differences.jl:193) with specialization MethodInstance for _colordiff(::RGB{Float64}, ::RGB{Float64}, ::DE_2000)
 Inference triggered to call MethodInstance for cosd(::Float64) from _colordiff (C:\Users\username\.julia\packages\Colors\yDxFN\src\differences.jl:213) with specialization MethodInstance for _colordiff(::RGB{Float64}, ::RGB{Float64}, ::DE_2000)

julia> stacktrace(itrigs[3].node.bt, true)
16-element Vector{Base.StackTraces.StackFrame}:
 exit_current_timer at typeinfer.jl:164 [inlined]
 typeinf(interp::Core.Compiler.NativeInterpreter, frame::Core.Compiler.InferenceState) at typeinfer.jl:206
 typeinf_ext(interp::Core.Compiler.NativeInterpreter, mi::Core.MethodInstance) at typeinfer.jl:892
 typeinf_ext_toplevel(interp::Core.Compiler.NativeInterpreter, linfo::Core.MethodInstance) at typeinfer.jl:925
 typeinf_ext_toplevel(mi::Core.MethodInstance, world::UInt64) at typeinfer.jl:921
 typeinf_ext_toplevel(mi::Core.MethodInstance, world::UInt64) at sys.dll:?
 _jl_invoke at gf.c:2237 [inlined]
 jl_apply_generic at gf.c:2419 [inlined]
 jl_apply at julia.h:1703 [inlined]
 jl_type_infer at gf.c:298
 jl_generate_fptr at jitlayers.cpp:340
 jl_compile_method_internal at gf.c:1970
 jl_compile_method_internal at gf.c:1924 [inlined]
 _jl_invoke at gf.c:2229 [inlined]
 jl_invoke at gf.c:2244
 ip:0x61135c09

julia> versioninfo()
Julia Version 1.6.2
Commit 1b93d53fc4 (2021-07-14 15:36 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, tigerlake)
Debian on WSL2
julia> foreach(println, stacktrace(itrigs[3].node.bt, true))
exit_current_timer at typeinfer.jl:164 [inlined]
typeinf(interp::Core.Compiler.NativeInterpreter, frame::Core.Compiler.InferenceState) at typeinfer.jl:206
typeinf_ext(interp::Core.Compiler.NativeInterpreter, mi::Core.MethodInstance) at typeinfer.jl:892
typeinf_ext_toplevel(interp::Core.Compiler.NativeInterpreter, linfo::Core.MethodInstance) at typeinfer.jl:925
typeinf_ext_toplevel(mi::Core.MethodInstance, world::UInt64) at typeinfer.jl:921
typeinf_ext_toplevel(mi::Core.MethodInstance, world::UInt64) at sys.so:?
_jl_invoke at gf.c:2237 [inlined]
jl_apply_generic at gf.c:2419
jl_apply at julia.h:1703 [inlined]
jl_type_infer at gf.c:298
jl_generate_fptr at jitlayers.cpp:340
jl_compile_method_internal at gf.c:1970
jl_compile_method_internal at gf.c:2236 [inlined]
_jl_invoke at gf.c:2229 [inlined]
jl_invoke at gf.c:2244
ip:0x7f0d6801afa6
ip:0x7f0d6801af35
invert_srgb_compand at conversions.jl:283 [inlined]
cnvt(#unused#::Type{XYZ{Float64}}, c::RGB{Float64}) at conversions.jl:309
_convert at conversions.jl:51 [inlined]
cconvert at conversions.jl:76 [inlined]
convert at conversions.jl:73 [inlined]
cnvt at conversions.jl:493 [inlined]
_convert at conversions.jl:51 [inlined]
cconvert at conversions.jl:76 [inlined]
convert at conversions.jl:73 [inlined]
_colordiff(ai::RGB{Float64}, bi::RGB{Float64}, m::DE_2000) at differences.jl:173
#colordiff#22 at differences.jl:462 [inlined]
colordiff(ai::RGB{Float64}, bi::RGB{Float64}) at differences.jl:462
colordiff(ai::RGB{Float64}, bi::RGB{Float64})
_jl_invoke at gf.c:2237 [inlined]
jl_apply_generic at gf.c:2419
jl_apply at julia.h:1703 [inlined]
do_call at interpreter.c:115
eval_value at interpreter.c:204
eval_stmt_value at interpreter.c:155 [inlined]
eval_body at interpreter.c:562
eval_body at interpreter.c:490
jl_interpret_toplevel_thunk at interpreter.c:670
top-level scope at snoopi_deep.jl:53
jl_toplevel_eval_flex at toplevel.c:877
jl_toplevel_eval_flex at toplevel.c:825
jl_toplevel_eval_in at toplevel.c:929
eval at boot.jl:360 [inlined]
eval_user_input(ast::Any, backend::REPL.REPLBackend) at REPL.jl:139
repl_backend_loop(backend::REPL.REPLBackend) at REPL.jl:200
start_repl_backend(backend::REPL.REPLBackend, consumer::Any) at REPL.jl:185
run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool) at REPL.jl:317
run_repl(repl::REPL.AbstractREPL, consumer::Any) at REPL.jl:305
_jl_invoke at gf.c:2237 [inlined]
jl_apply_generic at gf.c:2419
(::Base.var"#874#876"{Bool, Bool, Bool})(REPL::Module) at client.jl:387
(::Base.var"#874#876"{Bool, Bool, Bool})(REPL::Module) at sys.so:?
_jl_invoke at gf.c:2237 [inlined]
jl_apply_generic at gf.c:2419
jl_apply at julia.h:1703 [inlined]
jl_f__call_latest at builtins.c:714
#invokelatest#2 at essentials.jl:708 [inlined]
invokelatest at essentials.jl:706 [inlined]
run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set::Bool) at client.jl:372
exec_options(opts::Base.JLOptions) at client.jl:302
_start() at client.jl:485
_start() at sys.so:?
_jl_invoke at gf.c:2237 [inlined]
jl_apply_generic at gf.c:2419
jl_apply at julia.h:1703 [inlined]
true_main at jlapi.c:560
repl_entrypoint at jlapi.c:702
main at loader_exe.c:51
__libc_start_main at libc.so.6:?
_start at julia:?

@vtjnash
Copy link
Member

vtjnash commented Aug 5, 2021

Ah, this feels ironic, since Win64 is the only platform where we promise that it will work. This turns out to be one risk of using text-based IR builders for LLVM, as the bitcode has backwards-compatible guarantees, but API calls and llvmcall do not. Seems like it should be a trivial fix, since we just need to repair a deprecation.

@vtjnash vtjnash self-assigned this Aug 6, 2021
vtjnash added a commit that referenced this issue Aug 7, 2021
We depend on taking the address of these functions to support unwinding
properly, particularly on Win64.

Fixes #41750
vtjnash added a commit that referenced this issue Aug 10, 2021
We depend on taking the address of these functions to support unwinding
properly, particularly on Win64.

Fixes #41750
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior system:windows Affects only Windows
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants