Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ARM] Fix VBICimm and VORRimm generation under Big endian. #107813

Merged
merged 1 commit into from
Sep 13, 2024

Conversation

davemgreen
Copy link
Collaborator

This is a smaller follow on to #105519 that fixes VBICimm and VORRimm too. The logic behind lowering vector immediates under big endian Neon/MVE is to treat them in natural lane ordering (same as little endian), and VECTOR_REG_CAST them to the correct type (as opposed to creating the constants in big endian form and bitcasting them). This makes sure that is done when creating VORRIMM and VBICIMM.

This is a smaller follow on to llvm#105519 that fixes VBICimm and VORRimm too. The
logic behind lowering vector immediates under big endian Neon/MVE is to treat
them in natural lane ordering (same as little endian), and VECTOR_REG_CAST them
to the correct type (as opposed to creating the constants in big endian form
and bitcasting them). This makes sure that is done when creating VORRIMM and
VBICIMM.
@llvmbot
Copy link
Member

llvmbot commented Sep 9, 2024

@llvm/pr-subscribers-backend-arm

Author: David Green (davemgreen)

Changes

This is a smaller follow on to #105519 that fixes VBICimm and VORRimm too. The logic behind lowering vector immediates under big endian Neon/MVE is to treat them in natural lane ordering (same as little endian), and VECTOR_REG_CAST them to the correct type (as opposed to creating the constants in big endian form and bitcasting them). This makes sure that is done when creating VORRIMM and VBICIMM.


Full diff: https://github.com/llvm/llvm-project/pull/107813.diff

3 Files Affected:

  • (modified) llvm/lib/Target/ARM/ARMISelLowering.cpp (+4-4)
  • (modified) llvm/test/CodeGen/ARM/big-endian-vmov.ll (+4-6)
  • (modified) llvm/test/CodeGen/Thumb2/mve-vmovimm.ll (+4-6)
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 8c82161bd15c65..67653a17512d2c 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -14443,9 +14443,9 @@ static SDValue PerformANDCombine(SDNode *N,
                                       DAG, dl, VbicVT, VT, OtherModImm);
       if (Val.getNode()) {
         SDValue Input =
-          DAG.getNode(ISD::BITCAST, dl, VbicVT, N->getOperand(0));
+            DAG.getNode(ARMISD::VECTOR_REG_CAST, dl, VbicVT, N->getOperand(0));
         SDValue Vbic = DAG.getNode(ARMISD::VBICIMM, dl, VbicVT, Input, Val);
-        return DAG.getNode(ISD::BITCAST, dl, VT, Vbic);
+        return DAG.getNode(ARMISD::VECTOR_REG_CAST, dl, VT, Vbic);
       }
     }
   }
@@ -14739,9 +14739,9 @@ static SDValue PerformORCombine(SDNode *N,
                             SplatBitSize, DAG, dl, VorrVT, VT, OtherModImm);
       if (Val.getNode()) {
         SDValue Input =
-          DAG.getNode(ISD::BITCAST, dl, VorrVT, N->getOperand(0));
+            DAG.getNode(ARMISD::VECTOR_REG_CAST, dl, VorrVT, N->getOperand(0));
         SDValue Vorr = DAG.getNode(ARMISD::VORRIMM, dl, VorrVT, Input, Val);
-        return DAG.getNode(ISD::BITCAST, dl, VT, Vorr);
+        return DAG.getNode(ARMISD::VECTOR_REG_CAST, dl, VT, Vorr);
       }
     }
   }
diff --git a/llvm/test/CodeGen/ARM/big-endian-vmov.ll b/llvm/test/CodeGen/ARM/big-endian-vmov.ll
index 327d4334ad83a1..e3647cf6234c82 100644
--- a/llvm/test/CodeGen/ARM/big-endian-vmov.ll
+++ b/llvm/test/CodeGen/ARM/big-endian-vmov.ll
@@ -219,7 +219,6 @@ define arm_aapcs_vfpcc <8 x i16> @vmvn_v16i8_m1() {
   ret <8 x i16> <i16 65535, i16 65534, i16 65535, i16 65534, i16 65535, i16 65534, i16 65535, i16 65534>
 }
 
-; FIXME: This is incorrect for BE
 define arm_aapcs_vfpcc <8 x i16> @and_v8i16_m1(<8 x i16> %a) {
 ; CHECK-LE-LABEL: and_v8i16_m1:
 ; CHECK-LE:       @ %bb.0:
@@ -228,15 +227,14 @@ define arm_aapcs_vfpcc <8 x i16> @and_v8i16_m1(<8 x i16> %a) {
 ;
 ; CHECK-BE-LABEL: and_v8i16_m1:
 ; CHECK-BE:       @ %bb.0:
-; CHECK-BE-NEXT:    vrev64.32 q8, q0
+; CHECK-BE-NEXT:    vrev64.16 q8, q0
 ; CHECK-BE-NEXT:    vbic.i32 q8, #0x10000
-; CHECK-BE-NEXT:    vrev64.32 q0, q8
+; CHECK-BE-NEXT:    vrev64.16 q0, q8
 ; CHECK-BE-NEXT:    bx lr
   %b = and <8 x i16> %a, <i16 65535, i16 65534, i16 65535, i16 65534, i16 65535, i16 65534, i16 65535, i16 65534>
   ret <8 x i16> %b
 }
 
-; FIXME: This is incorrect for BE
 define arm_aapcs_vfpcc <8 x i16> @or_v8i16_1(<8 x i16> %a) {
 ; CHECK-LE-LABEL: or_v8i16_1:
 ; CHECK-LE:       @ %bb.0:
@@ -245,9 +243,9 @@ define arm_aapcs_vfpcc <8 x i16> @or_v8i16_1(<8 x i16> %a) {
 ;
 ; CHECK-BE-LABEL: or_v8i16_1:
 ; CHECK-BE:       @ %bb.0:
-; CHECK-BE-NEXT:    vrev64.32 q8, q0
+; CHECK-BE-NEXT:    vrev64.16 q8, q0
 ; CHECK-BE-NEXT:    vorr.i32 q8, #0x10000
-; CHECK-BE-NEXT:    vrev64.32 q0, q8
+; CHECK-BE-NEXT:    vrev64.16 q0, q8
 ; CHECK-BE-NEXT:    bx lr
   %b = or <8 x i16> %a, <i16 0, i16 1, i16 0, i16 1, i16 0, i16 1, i16 0, i16 1>
   ret <8 x i16> %b
diff --git a/llvm/test/CodeGen/Thumb2/mve-vmovimm.ll b/llvm/test/CodeGen/Thumb2/mve-vmovimm.ll
index 9cf92663e3b052..1970ff35f183fa 100644
--- a/llvm/test/CodeGen/Thumb2/mve-vmovimm.ll
+++ b/llvm/test/CodeGen/Thumb2/mve-vmovimm.ll
@@ -1331,7 +1331,6 @@ entry:
   ret <2 x i64> %s
 }
 
-; FIXME: This is incorrect for BE
 define arm_aapcs_vfpcc <8 x i16> @and_v8i16_m1(<8 x i16> %a) {
 ; CHECKLE-LABEL: and_v8i16_m1:
 ; CHECKLE:       @ %bb.0:
@@ -1340,15 +1339,14 @@ define arm_aapcs_vfpcc <8 x i16> @and_v8i16_m1(<8 x i16> %a) {
 ;
 ; CHECKBE-LABEL: and_v8i16_m1:
 ; CHECKBE:       @ %bb.0:
-; CHECKBE-NEXT:    vrev64.32 q1, q0
+; CHECKBE-NEXT:    vrev64.16 q1, q0
 ; CHECKBE-NEXT:    vbic.i32 q1, #0x10000
-; CHECKBE-NEXT:    vrev64.32 q0, q1
+; CHECKBE-NEXT:    vrev64.16 q0, q1
 ; CHECKBE-NEXT:    bx lr
   %b = and <8 x i16> %a, <i16 65535, i16 65534, i16 65535, i16 65534, i16 65535, i16 65534, i16 65535, i16 65534>
   ret <8 x i16> %b
 }
 
-; FIXME: This is incorrect for BE
 define arm_aapcs_vfpcc <8 x i16> @or_v8i16_1(<8 x i16> %a) {
 ; CHECKLE-LABEL: or_v8i16_1:
 ; CHECKLE:       @ %bb.0:
@@ -1357,9 +1355,9 @@ define arm_aapcs_vfpcc <8 x i16> @or_v8i16_1(<8 x i16> %a) {
 ;
 ; CHECKBE-LABEL: or_v8i16_1:
 ; CHECKBE:       @ %bb.0:
-; CHECKBE-NEXT:    vrev64.32 q1, q0
+; CHECKBE-NEXT:    vrev64.16 q1, q0
 ; CHECKBE-NEXT:    vorr.i32 q1, #0x10000
-; CHECKBE-NEXT:    vrev64.32 q0, q1
+; CHECKBE-NEXT:    vrev64.16 q0, q1
 ; CHECKBE-NEXT:    bx lr
   %b = or <8 x i16> %a, <i16 0, i16 1, i16 0, i16 1, i16 0, i16 1, i16 0, i16 1>
   ret <8 x i16> %b

@davemgreen
Copy link
Collaborator Author

@Zhenhang1213 FYI. I couldn't add you as a reviewer, but let me know if you have any comments.

@Zhenhang1213
Copy link
Contributor

@Zhenhang1213 FYI. I couldn't add you as a reviewer, but let me know if you have any comments.

this case I think right, and I have a question,Is there other similarities that needs to be modified? like:

SDValue NewVal = isVMOVModifiedImm(iVal & 0xffffffffU, 0, 32, DAG, SDLoc(Op),
VMovVT, VT, VMOVModImm);
if (NewVal != SDValue()) {
SDLoc DL(Op);
SDValue VecConstant = DAG.getNode(ARMISD::VMOVIMM, DL, VMovVT,
NewVal);
if (IsDouble)
return DAG.getNode(ISD::BITCAST, DL, MVT::f64, VecConstant);
// It's a float: cast and extract a vector element.

@davemgreen
Copy link
Collaborator Author

You might well be right if it is casting i32->f64. I hadn't looked at that bit of code, just the code that calls isConstantSplat. Are you able to create a double value that produces incorrect results?

@Zhenhang1213
Copy link
Contributor

You might well be right if it is casting i32->f64. I hadn't looked at that bit of code, just the code that calls isConstantSplat. Are you able to create a double value that produces incorrect results?

OK, I will try to create it then,if I make any progress,I'll tell you. This case is right

@davemgreen
Copy link
Collaborator Author

Thanks. Just to check - do you mean that this patch now looks correct, or the original code was correct?

@Zhenhang1213
Copy link
Contributor

Thanks. Just to check - do you mean that this patch now looks correct, or the original code was correct?

OK, I think this patch correct.

@davemgreen
Copy link
Collaborator Author

Thanks. I'll count that as a review as the code changes here are relatively minor.

@davemgreen davemgreen merged commit 637aa61 into llvm:main Sep 13, 2024
10 checks passed
@davemgreen davemgreen deleted the gh-be-vbicvorr branch September 13, 2024 10:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants