-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tbaa_gcframe #13463
tbaa_gcframe #13463
Conversation
..... hopefully compatible with LLVM 3.3 this time.... Edit: OK, that failed, so there's basically no way around |
Finally make it compatible with LLVM 3.3 and all CI passes... |
Can confirm that for example |
I'm concerned this can manage to walk too far through a gcframe and mark a load of a field of a variable. We don't usually emit code of that form currently, but I don't think such a scenario is too unlikely. For the PR, I think marking the load/store in the emit_var / emit_assignment sections should be sufficient. The temporary variables slot usages will always be closely preceding a function call anyways. For testing against multiple llvm versions locally, what I do is |
Will this matter? If we endup storing random structures (as oppose to just the pointer to them) to the GC frame, those load are still going to be from the GC frame since AFAIK the instructions I'm following |
it matters for alias analysis, since you could have another load/store to the same location that doesn't get marked the same way |
I rebased this on top of current master. It is still tracing the gep's for some instructions but should be more fine grain than before. Depending on how we add stack allocated object, in the worst case we can add a special gc root allocation intrinsics that is ignored by this marking (if we simply allocate it on the stack with a tag we might not need to do anything special). With this patch the issue in #15717 still exists (seems that there's too many stores for llvm to naively hoist out of the loop) but SIMD finally works with |
|
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
Add
tbaa_gcframe
and use it to decorate all stores and loads to/from the GC frame.This doesn't rely on the the codegen rewrite so it should be easily back ported (probably after 0.4.0).
This is also yet another way to fix the performance issue in #13459 and #13461 .
Close #13301 (Load from gc frame is still bad but it doesn't prevent vectorization anymore)
@vtjnash @simonster .
@carnaval , you also mentioned this during JuliaCon.