-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable hot/cold splitting of EH funclets #71236
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsThis PR enables hot/cold splitting for functions containing exception handling. Currently, EH funclets are placed after the function's main body in memory. Via the heuristic that EH funclets are infrequently run, this enables a simple splitting implementation:
Being able to split each funclet independently would likely be more performant, but this simpler approach is beneficial for various reasons:
|
/azp run runtime-jit-experimental |
Azure Pipelines successfully started running 1 pipeline(s). |
@@ -8473,7 +8470,7 @@ void emitter::emitIns_J(instruction ins, BasicBlock* dst, int instrCount) | |||
|
|||
id->idIns(ins); | |||
id->idInsFmt(fmt); | |||
id->idjShort = idjShort; | |||
id->idjShort = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When branching to a finally
funclet on ARM64, we emit an INS_bl_local
instruction -- previously, we could guarantee this jump would stay within the hot section, and thus would set idjShort = true
. Now that EH funclets can be moved to the cold section, we can no longer assume this, and must set idjKeepLong
based on if we're crossing hot/cold sections. Without changing this logic, we'll hit asserts in emitOutputLJ
due to the branch being too short.
@AndyAyersMS Since Bruce is OOF, could you PTAL? Thank you! |
src/coreclr/jit/flowgraph.cpp
Outdated
{ | ||
for (BasicBlock* block = fgFirstFuncletBB; block != nullptr; block = block->bbNext) | ||
{ | ||
#if HANDLER_ENTRY_MUST_BE_IN_HOT_SECTION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we always define HANDLER_ENTRY_MUST_BE_IN_HOT_SECTION
to be 1
.
Does that mean we're actually never splitting off EH any regions, or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't notice HANDLER_ENTRY_MUST_BE_IN_HOT_SECTION
is always 1. During testing, I brute-forced splitting at fgFirstFuncletBB
via JitStressProcedureSplitting
, so I didn't notice this condition was blocking splitting. I've removed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not be that simple. Presumably HANDLER_ENTRY_MUST_BE_IN_HOT_SECTION
exists for some reason. It would be good to know what that reason is.
Looking at the git blame for jit.h this define has been like this for years now. Perhaps it is no longer applicable, or perhaps it reflects some limitation in the way the jit must report EH information back to the runtime.
@jkotas do you know offhand why the jit has this constraint? If not, I can do some digging.
If it is indeed OK to change this, I'd recommend changing the define instead, so that the rest of the jit is freed from this constraint as well, and do that as a prerequisite, stand-alone change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not know the reason behind this. The EE side is using GetCodeAddressForRelOffset
method to resolve the handler offset to IP. This method deals with handler entrypoints in the cold section correctly.
src/coreclr/jit/flowgraph.cpp
Outdated
* section is cold. If any of the funclets are hot, then it may not be | ||
* beneficial to split at fgFirstFuncletBB, moving all funclets to the | ||
* cold section. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the "new style" function comments? They look like
//------------------------------------------------------------------------------
// methodName: short description
//
// Notes:
// (text you have above)
//
bool Compiler::fgFuncletsAreCold()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing; fixed.
/azp run runtime-jit-experimental |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-jit-experimental |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Can you show an asm diff example of a small test case before/after with EH and stress splitting enabled?
Rebasing on top of my change disabling |
/azp run runtime-jit-experimental |
Azure Pipelines successfully started running 1 pipeline(s). |
Sure thing. Below is a JIT disasm (on Win x64) for a short function that catches a
Note the lack of hot/cold splitting. Now, with EH splitting:
|
@BruceForstall Would you also like to see an asmdiff with/without stress splitting? That implementation still splits after the first basic block, so I'm not sure how well it will demonstrate EH funclet splitting. |
I guess I was wondering if there is a case where you can see visible asm diff or unwind/EH info diffs, but I guess with funclets there won't be any hot/cold branches that change, because there are no direct branches to/from the funclets, except for BBJ_CALLFINALLY to a finally block. For a catch block that doesn't end in a |
The relocation isn't present in the above example because the |
This PR enables hot/cold splitting for functions containing exception handling. Currently, EH funclets are placed after the function's main body in memory. Via the heuristic that EH funclets are infrequently run, this enables a simple splitting implementation:
fgFirstFuncletBB
, so that all EH funclets are placed in the cold section.Being able to split each funclet independently would likely be more performant, but this simpler approach is beneficial for various reasons: