-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC/WIP: Try to run gcroot lowering after some LLVM optimizations #17363
Conversation
dad001b
to
1455659
Compare
} | ||
// Mark GC use before **and** after the llvmcall to make sure the arguments | ||
// are alive during the llvmcall even if the llvmcall has `unreachable`. | ||
// If the llvmcall generates GC safepoint, it might need to emit it's own |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its own
1455659
to
8e4a732
Compare
typedef unsigned id; | ||
enum { | ||
// an assignment to a gcroot exists in the basic-block | ||
// (potentially no live-in from the predecessor basic-blocks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does "live-in" mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value in the slot when getting into this basic block is live at that point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could that be explained in a handful more words? having confusing jargon in the comments sort of defeats the explanatory purpose of them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just reading some doc about liveness analysis and I've seen a few use of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a reference link would work too
Does this fix #17342? |
Unfortunately no (This PR is only about code rearrangement and fixing incompatibility rather than improving the optimizations). With the load forwarding done before the gc frame lowering, it should be much easier to fix that issue though.... |
@@ -52,6 +54,9 @@ static void addOptimizationPasses(T *PM) | |||
#ifndef INSTCOMBINE_BUG | |||
PM->add(createInstructionCombiningPass()); // Cleanup for scalarrepl. | |||
#endif | |||
// Let the InstCombine pass remove the unnecessary load of | |||
// safepoint address first | |||
PM->add(createLowerPTLSPass(imaging_mode, tbaa_const)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a module pass, but addOptimizationPasses
was templated to accept the FunctionPassManager too. I think we only need to make module pass managers now (which might also make the LowerGCFramePass easier, since it can walk the uses list instead of walking the Function Instructions), so it might be good to update the function signature.
fixed upstream
8a53e5f
to
1c114f5
Compare
|
||
// This file defines two entry points: | ||
// global function annotateSimdLoop: mark a loop as a SIMD loop. | ||
// createLowerSimdLoopPass: construct LLVM for lowering a marked loop later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these comments don't seem accurate
// DLLImport only needs to be set for the shadow module | ||
// it just gets annoying in the JIT | ||
proto->setDLLStorageClass(GlobalValue::DefaultStorageClass); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can go away (it isn't needed unless you call addComdat)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but if we bring addComdat back we'd need to add this back?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it's permanently gone. besides, we don't need comdat on variables or dllimports.
but that does bring up the point that this declaration will be wrong for a no-threads build on windows, since at the end of the pass, it'll be missing the DLLImport attribute (usually would have been fixed up by the merge-module pass).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what needs to be fixed here? I have no idea how DLLImport works on windows....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You just have to mark AoT which symbols will be statically linked into the binary and which will be found via the dynamic linker (dllimport)
lgtm. can you rebase & update, to make sure there aren't logical merge conflicts with the codegen / jitlayers split, & merge. |
265e955
to
2e26033
Compare
proto->setDLLStorageClass(GlobalValue::DLLImportStorageClass); | ||
#else | ||
proto->setLinkage(GlobalValue::DLLImportLinkage); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vtjnash like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
2e26033
to
4a8932b
Compare
* Make the pass more robust against dead code elimination * Remove module level setup code (so that the the GC frame lowering can be a `FunctionPass`) * Create PTLS lowering (module) pass since it needs to be run after the GC frame lowering * Set DLLImport attribute for `jl_tls_states` on windows, since it is not handled by module merger anymore
4a8932b
to
c9b5d9d
Compare
I believe some constant propagation (mostly store to load forwarding) would help some GC frame optimization to identify possible duplicate roots. Instead of making our own for special cases, it would be nice if we could let llvm do that. This requires running some LLVM passes before we do the GC frame lowering.
This PR solves the easy part of it, i.e. making the lowering a true LLVM pass and making sure we have enough information in the optimized IR for liveness analysis. (mostly to make sure that the
julia.gc_kill
won't be deleted due to dead branch).The goal is to at least be able to run the lowering after mem2reg (and the instcombine after it). A lot of issues still need to be solve.
If this approach looks good and tests passes, we can probably merge this first and improve the algorithm later.
Based on #17352 to reduce conflicts.
c.c. @vtjnash