tbaa_gcframe #13463

yuyichao · 2015-10-06T02:01:32Z

Add tbaa_gcframe and use it to decorate all stores and loads to/from the GC frame.
This doesn't rely on the codegen rewrite, so it should be easy to backport (probably after 0.4.0).

This is also yet another way to fix the performance issue in #13459 and #13461.

Close #13301 (Load from gc frame is still bad but it doesn't prevent vectorization anymore)

@vtjnash @simonster .

@carnaval , you also mentioned this during JuliaCon.

yuyichao · 2015-10-06T02:36:00Z

..... hopefully compatible with LLVM 3.3 this time....

Edit: OK, that failed, so there's basically no way around ifdef's....

yuyichao · 2015-10-06T04:31:15Z

Finally made it compatible with LLVM 3.3, and all CI passes...

KristofferC · 2015-10-06T09:08:22Z

Can confirm that, for example, fill!(A, 0.) and A[:] = 0. have the same performance on this branch.

vtjnash · 2015-10-06T15:36:32Z

I'm concerned this can manage to walk too far through a gcframe and mark a load of a field of a variable. We don't usually emit code of that form currently, but I don't think such a scenario is too unlikely. For the PR, I think marking the load/store in the emit_var / emit_assignment sections should be sufficient. Uses of the temporary variable slots will always closely precede a function call anyway.

For testing against multiple llvm versions locally, what I do is make configure O=build-llvm37 && cd build-llvm37 && echo 'LLVM_VER = 3.7.0' >> Make.user && make. It'll inherit any settings from your global Make.user file, then override those with the settings in the subfolder.

yuyichao · 2015-10-06T17:17:19Z

> I'm concerned this can manage to walk too far through a gcframe and mark a load of a field of a variable.

Will this matter? If we end up storing random structures (as opposed to just the pointers to them) in the GC frame, those loads are still going to be from the GC frame, since AFAIK the instructions I'm following (getelementptr and bitcast) only do pointer arithmetic and don't load (i.e. gcframe[offset]->field won't be marked).
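A toy sketch of the distinction drawn here, in plain Python rather than LLVM's API. The instruction tuples and the name `mark_gcframe_accesses` are made up for illustration and are not taken from the patch; the only assumption carried over from the comment is that addresses are followed through getelementptr and bitcast and never through a load.

```python
# Toy IR: each instruction is (name, opcode, operand).
# Hypothetical model for illustration only; this is not LLVM's API.
POINTER_ONLY_OPS = {"getelementptr", "bitcast"}


def mark_gcframe_accesses(instructions, gcframe="gcframe"):
    """Return the names of loads/stores that address the GC frame directly.

    Addresses are followed only through getelementptr and bitcast, which
    compute a new pointer without reading memory. A pointer produced by a
    load refers to a different object, so accesses through it stay unmarked.
    """
    derived = {gcframe}  # values known to point into the GC frame
    marked = set()       # accesses that would receive the tbaa_gcframe tag
    for name, opcode, operand in instructions:
        if opcode in POINTER_ONLY_OPS and operand in derived:
            derived.add(name)
        elif opcode in ("load", "store") and operand in derived:
            marked.add(name)
    return marked


# gcframe[offset]->field: the slot load is marked, the field load is not.
program = [
    ("slot",  "getelementptr", "gcframe"),  # &gcframe[offset]
    ("obj",   "load",          "slot"),     # gcframe[offset], pointer to the object
    ("fptr",  "getelementptr", "obj"),      # &obj->field
    ("field", "load",          "fptr"),     # obj->field
]
marked = mark_gcframe_accesses(program)
```

In this model only `obj` ends up in `marked`: the walk stops at the load, so the later access through `fptr` is never treated as a GC frame access.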

vtjnash · 2015-10-06T17:38:00Z

It matters for alias analysis, since you could have another load/store to the same location that doesn't get marked the same way.
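One way to read this concern, as a toy model in plain Python. It assumes the usual type-tree rule for TBAA (two tagged accesses may alias only if one tag is an ancestor of, or equal to, the other; an untagged access may alias anything). The tag names other than tbaa_gcframe are hypothetical placeholders, not Julia's actual tree.

```python
# Toy TBAA type tree: child tag -> parent tag.
# "tbaa_other" and "root" are hypothetical names used only for this sketch.
PARENT = {"tbaa_gcframe": "root", "tbaa_other": "root", "root": None}


def ancestors(tag):
    """Return the tag followed by all of its ancestors in the type tree."""
    chain = []
    while tag is not None:
        chain.append(tag)
        tag = PARENT[tag]
    return chain


def may_alias(tag_a, tag_b):
    """Untagged accesses (None) may alias anything; tagged accesses may
    alias only if one tag is an ancestor of (or equal to) the other."""
    if tag_a is None or tag_b is None:
        return True
    return tag_a in ancestors(tag_b) or tag_b in ancestors(tag_a)


# Same slot, consistently tagged: the accesses are kept in order.
consistent = may_alias("tbaa_gcframe", "tbaa_gcframe")
# Same slot, but one access carries a sibling tag: the optimizer is told the
# two cannot overlap and may reorder them, which would be a miscompile.
mismatched = may_alias("tbaa_gcframe", "tbaa_other")
# An access with no tag at all is treated conservatively.
untagged = may_alias("tbaa_gcframe", None)
```

Under this rule the risky case is the mismatched one: tagging too many accesses is only safe if every other access to the same location carries a compatible tag.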

yuyichao · 2016-04-01T01:51:06Z

I rebased this on top of current master. It still traces the GEPs for some instructions but should be finer-grained than before. Depending on how we add stack-allocated objects, in the worst case we can add a special GC root allocation intrinsic that is ignored by this marking (if we simply allocate them on the stack with a tag, we might not need to do anything special).

With this patch, the issue in #15717 still exists (it seems there are too many stores for LLVM to naively hoist out of the loop), but SIMD finally works with -O3.

@vtjnash

yuyichao · 2016-04-01T13:49:58Z

runbenchmarks("simd", vs = "JuliaLang/julia:master")

nanosoldier · 2016-04-01T19:22:13Z

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

yuyichao added the performance (Must go faster) and compiler:codegen (Generation of LLVM IR and native code) labels Oct 6, 2015

yuyichao force-pushed the yyc/gc/tbaa branch from 1d7fb92 to 925e78b October 6, 2015 02:35

yuyichao force-pushed the yyc/gc/tbaa branch from 925e78b to 5854308 October 6, 2015 02:53

KristofferC mentioned this pull request Oct 6, 2015

make A[:] = x::Number fall back to fill! #13459

Closed

yuyichao mentioned this pull request Oct 6, 2015

Avoid introducing local variable (and GC frame store) in unsafe_setindex! #13461

Merged

yuyichao force-pushed the yyc/gc/tbaa branch from 5854308 to b772a46 October 14, 2015 16:19

yuyichao mentioned this pull request Oct 20, 2015

SIMD performance regression tests #13686

Closed

yuyichao force-pushed the yyc/gc/tbaa branch from b772a46 to 7124144 January 31, 2016 15:18

yuyichao added 2 commits March 31, 2016 21:43

Fix const cast warning

4d5f00a

Create tbaa_gcframe and decorate all stores and loads from the GC frame with it Close #13301

0051c47

yuyichao force-pushed the yyc/gc/tbaa branch from 7124144 to 0051c47 April 1, 2016 01:44

yuyichao merged commit 3fae299 into master Apr 2, 2016

yuyichao deleted the yyc/gc/tbaa branch April 2, 2016 06:05

yuyichao mentioned this pull request Apr 2, 2016

Store to GC frame preventing vectorization #15717

Closed

yuyichao mentioned this pull request May 5, 2016

Unnecessary GC root for getfield of SSA immutable object #15402

Closed
