Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another unnecessary GC root #17342

Closed
JeffBezanson opened this issue Jul 8, 2016 · 3 comments · Fixed by #17379
Closed

Another unnecessary GC root #17342

JeffBezanson opened this issue Jul 8, 2016 · 3 comments · Fixed by #17379
Labels
compiler:codegen Generation of LLVM IR and native code performance Must go faster

Comments

@JeffBezanson
Copy link
Member

JeffBezanson commented Jul 8, 2016

#15402 was and still is fixed, but I found another similar case. This is hurting the performance of #16622.

julia> g = Base.Generator(x->x, rand(5))
Base.Generator{Array{Float64,1},##3#4}(#3,[0.428471,0.0120129,0.899264,0.103639,0.700651])

julia> function h(g)
        s = 0.0
        for x in g
          s += x
        end
       s
       end 
h (generic function with 1 method)

julia> @code_llvm h(g)

Inner loop:

idxend:                                           ; preds = %if
  %25 = load double*, double** %20, align 8
  %26 = getelementptr double, double* %25, i64 %22
  %27 = load double, double* %26, align 8
  %28 = add i64 %"#temp#.08", 1
  %29 = fadd double %s.09, %27
  %30 = load %jl_value_t*, %jl_value_t** %9, align 8
  store %jl_value_t* %30, %jl_value_t** %2, align 8
  %31 = getelementptr inbounds %jl_value_t, %jl_value_t* %30, i64 1
  %32 = bitcast %jl_value_t* %31 to i64*
  %33 = load i64, i64* %32, align 8
  %34 = icmp eq i64 %"#temp#.08", %33
  br i1 %34, label %L3.loopexit, label %if

There's that most-unwelcome store again.

@JeffBezanson JeffBezanson added performance Must go faster compiler:codegen Generation of LLVM IR and native code labels Jul 8, 2016
@yuyichao
Copy link
Contributor

yuyichao commented Jul 8, 2016

We've attached enough metadata for LLVM to hoist the store and opt does that in -O1 so maybe we could get llvm optimize it out by adding/tweaking LLVM passes. The better solution would still be to fix #15369 instead of pushing LLVM harder....

@JeffBezanson
Copy link
Member Author

I would accept either approach, as this is quite bad. Might be related to #17321 as well.

@yuyichao
Copy link
Contributor

yuyichao commented Jul 9, 2016

Oh, and just to be clear by opt does that in -O1 I mean it does that on our optimized IR. opt -O3 doesn't optimize that on our unoptimized IR. Another interesting observations is that the bound check is removed by LLVM automatically in this case but it can't hoist the store at the same time (in a single pass).

vtjnash added a commit that referenced this issue Jul 11, 2016
when loading a value from a struct that didn't need a local gcroot,
the loaded value also doesn't require a gcroot if the struct was immutable

fix #17342
mfasi pushed a commit to mfasi/julia that referenced this issue Sep 5, 2016
when loading a value from a struct that didn't need a local gcroot,
the loaded value also doesn't require a gcroot if the struct was immutable

fix JuliaLang#17342
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code performance Must go faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants