-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SimdAsHWIntrinsic improvements and cleanup #80134
Conversation
…lpers where one path was already
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak Issue DetailsThis does some general cleanup to the As part of that, this finishes the cleanup to use
|
d4d8dac
to
e8516fa
Compare
/azp run runtime-coreclr jitstress-isas-x86, runtime-coreclr jitstress-isas-arm, runtime-coreclr outerloop |
Azure Pipelines successfully started running 3 pipeline(s). |
CC. @dotnet/jit-contrib. Some pretty substantial wins here, reducing the size by 40k bytes on Arm64 and 90-102k bytes on x64. There is a small gain for Arm64 minopts and a small regression for x64 minopts, but also some nice throughput gains (-0.14% for libraries.pmi on x64) for full opts and no TP change for minopts. |
The few regressions are cases where we are spilling an operand ( The vast majority of these are simply the For |
The mono changes look ok. |
case NI_VectorT256_op_Division: | ||
#endif // TARGET_XARCH | ||
{ | ||
return gtNewSimdBinOpNode(GT_DIV, retType, op1, op2, simdBaseJitType, simdSize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume eventually we might need to implement the magic division optimization if op2 is CNS_VEC? 🙂 (or pdivp is fast enough as is?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, there are some optimization opportunities here that we can more easily enable/centralize in the future with this change.
Nice diffs |
…t on downlevel hardware
/azp run runtime-coreclr jitstress-isas-x86, runtime-coreclr jitstress-isas-arm, runtime-coreclr outerloop |
Azure Pipelines successfully started running 3 pipeline(s). |
Merged slightly early (only mono llvmfullaot legs left). CI had already been passing except for two |
This does some general cleanup to the
simdashwintrinsic
code by sharing code paths where possible and using thegtNewSimd*Node
helpers where they exist.As part of that, this resolves an issue with the operand order used for a
gtNewSimdWithElementNode
call. I additionally attempted to finishe the cleanup to usefgMakeMultiUse
rather thanimpCloneExpr
, however it hit various asserts and needs more followup (I plan on doing this in a separate PR: #80242).