-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose the FusedMultiplyAdd and MultiplyAddEstimate APIs on relevant vector and scalar types #102181
Conversation
Note regarding the
|
CC. @dotnet/jit-contrib for the JIT side. This is one of the few remaining cases that should need JIT side work, most of the remaining APIs in #93513 will be implemented purely in managed code. The |
Should we rather introduce a new attribute to prevent methods from being pre-jitted (and inlined into pre-jitted code)? |
// AdvSimd.FusedMultiplyAdd expects (addend, left, right), while the APIs take (left, right, addend) | ||
// We expect op1 and op2 to have already been spilled | ||
|
||
std::swap(op1, op3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am just curious - who's responsible to spill them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, nvm, I see
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically it's the importer code, prior to calling this method. Latter phases that might call such an API (like morph) are responsible for ensuring the swap is safe in the face of potential side effects.
There's, unfortunately, not really a way for us to do such a swap safely in the API itself (at least that I know of) nor to know if the caller has actually done it. So the typical approach has been to do the swap and comment that callers should be doing the validation.
impSpillSideEffect(true, verCurrentState.esStackDepth - | ||
3 DEBUGARG("Spilling op1 side effects for MultiplyAddEstimate")); | ||
|
||
impSpillSideEffect(true, verCurrentState.esStackDepth - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't better/simpler to use impSpillSideEffects(true, CHECK_SPILL_ALL DEBUGARG("spilling side-effects"));
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not something we've been doing in other scenarios.
AFAIR it comes down to not spilling values on the stack that aren't impacted. For example, we take 3 args, but the stack could have 4+ on it (spilling these is unnecessary since they are still processed in order with respect to our own op1
/op2
/op3
; otherwise everyone who pops the stack would need to consider the need to spill such additional entries) and we don't need to ever spill the stack top (op3
).
So we're doing this to ensure only the minimum number of items that need to be spilled are spilled.
We technically have one already, it's I'm also not sure the number of -- I do want to get to a point, someday, where we can move a lot of complexity out of the JIT and into the BCL instead. Ideally just the platform specific intrinsics and core xplat APIs which were exposed instead of platform specific APIs (like |
CC. @stephentoub for the libraries side. The one thing to note is that given how |
I've logged #102275 to track prototyping this work and seeing if we can identify some of the places that are still preventing it from happening. This seems like a good thing to try and do around the time .NET 9 RC1 snaps and we have a bit of a small break before .NET 10 work really gets going. |
…vector and scalar types (dotnet#102181) * Expose FusedMultiplyAdd and MultiplyAddEstimate on the scalar and vector types * Adding tests covering FusedMultiplyAdd and MultiplyAddEstimate for the vector types * Ensure TensorPrimitives uses the xplat APIs on .NET 9+ * Apply formatting patch * Fix an accidental change to GenericVectorTests * Ensure Arm64 passes fma operands in the correct order * Apply formatting patch * Ensure all the Arm64 code paths are spilling and swapping operands * Apply formatting patch * Don't pop the stack value twice
This resolves #98053 and makes progress towards #93513