[AArch64] Fix fmaxnm/fminnm SNAN handling#176625
[AArch64] Fix fmaxnm/fminnm SNAN handling#176625youknowone wants to merge 1 commit intollvm:mainfrom
Conversation
AArch64's fmaxnm/fminnm convert SNAN to QNAN, unlike fcmp+select. Extend isProfitableToCombineMinNumMaxNum hook to prevent combine unless SNAN is proven absent or nnan flag is set.
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
davemgreen
left a comment
There was a problem hiding this comment.
This is incorrect. AArch64 and Arm have always used the IEEE 754 2008 definition for snan of minnum/maxnum. Considering where this comes from this was correct - the frontend clang uses them for fmax clib calls with under gcc have ieee 754 2008 meaning, and vmaxq intrinsics that are obviously meant to mirror the instructions (or midend optimizations that are usually nnan so don't matter which is picked).
Unfortunately the midend was documented incorrectly and each backend had their own different meaning. The documentation was fixed up in #112852, but unfortunately in the meantime rust had picked the wrong intrinsic for min, and really wanted minumumnum. The latest is in https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006 and #172012. My understanding is that rust intends to move to llvm.minimumnum once some issues are worked out with support of llvm.minimumnum.
LLVM's handling of snan is always flakey though, and would require at least strict-fp and/or a lot of work to make it correct (for more than just min/max).
|
@davemgreen Thank you so much! |
I had read the bug report wrong, and a lot of what I said was about a different issue :(. I've updated #176624 with some more details. |
You can test this locally with the following command:git-clang-format --diff origin/main HEAD --extensions h,cpp -- llvm/include/llvm/CodeGen/TargetLowering.h llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp llvm/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/lib/Target/AArch64/AArch64ISelLowering.h --diff_from_common_commit
View the diff from clang-format here.diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index ca34c3001..e51de4dcc 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -2523,9 +2523,9 @@ public:
}
virtual bool isProfitableToCombineMinNumMaxNum(EVT VT, SDValue LHS,
- SDValue RHS,
- const SDNodeFlags &Flags,
- SelectionDAG &DAG) const {
+ SDValue RHS,
+ const SDNodeFlags &Flags,
+ SelectionDAG &DAG) const {
return true;
}
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 2c2ab6a32..30fd1beaa 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -221,8 +221,8 @@ public:
bool isProfitableToHoist(Instruction *I) const override;
bool isProfitableToCombineMinNumMaxNum(EVT VT, SDValue LHS, SDValue RHS,
- const SDNodeFlags &Flags,
- SelectionDAG &DAG) const override;
+ const SDNodeFlags &Flags,
+ SelectionDAG &DAG) const override;
bool isZExtFree(Type *Ty1, Type *Ty2) const override;
bool isZExtFree(EVT VT1, EVT VT2) const override;
|
Fix #176624
I’ve confirmed that this patch addresses the issue, but I’m very new to the LLVM project and I’m not at all confident that this is the correct fix. Please treat this patch only as a reference to help illustrate the issue in more detail.
If you’re willing to provide guidance on how it should be improved, I’d be more than happy to revise it accordingly.