[RISCV] Combine vslide{up,down} x, poison -> x #169013
The motivation for this is that it would be useful to express a vslideup/vslidedown in a target-independent way, e.g. from the loop vectorizer.
We can do this today with @llvm.vector.splice by setting one operand to poison:
- A slide down can be achieved with @llvm.vector.splice(%x, poison, slideamt)
- A slide up can be done by @llvm.vector.splice(poison, %x, -slideamt)
E.g.:
splice(<a,b,c,d>, poison, 3) = <d,poison,poison,poison>
splice(poison, <a,b,c,d>, -3) = <poison,poison,poison,a>
These splices get lowered to a vslideup + vslidedown pair with one of the vs2s being poison. We can optimize this away so that we are just left with a single slideup/slidedown.
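As an illustration of the splice semantics above, here is a minimal host-side C++ sketch for fixed-length vectors. It is not LLVM code: Lane, Vec, splice and poison are hypothetical names, poison lanes are modelled as std::nullopt, and the immediate range check (-VL <= imm < VL) follows our reading of the LangRef.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// A lane is either a known value or poison (modelled as std::nullopt).
using Lane = std::optional<char>;
using Vec = std::vector<Lane>;

// Reference model of splice(a, b, imm) for fixed-length vectors:
// concatenate a and b, then take a.size() lanes starting at index imm
// (imm >= 0) or at index a.size() + imm (imm < 0).
Vec splice(const Vec &a, const Vec &b, int imm) {
  assert(a.size() == b.size());
  const int n = static_cast<int>(a.size());
  assert(imm >= -n && imm < n); // assumed range: -VL <= imm < VL
  Vec concat(a);
  concat.insert(concat.end(), b.begin(), b.end());
  const int start = imm >= 0 ? imm : n + imm;
  return Vec(concat.begin() + start, concat.begin() + start + n);
}

// Helper that builds an all-poison vector of n lanes.
Vec poison(int n) { return Vec(n, std::nullopt); }
```

With x = <a,b,c,d>, splice(x, poison(4), 3) gives <d,poison,poison,poison> and splice(poison(4), x, -3) gives <poison,poison,poison,a>, matching the two examples in the description.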
@llvm/pr-subscribers-backend-risc-v
Author: Luke Lau (lukel97)
Full diff: https://github.com/llvm/llvm-project/pull/169013.diff
3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 2d6bb06d689c3..209e2969046c9 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -21834,6 +21834,11 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
       return N->getOperand(0);
     break;
   }
+  case RISCVISD::VSLIDEDOWN_VL:
+  case RISCVISD::VSLIDEUP_VL:
+    if (N->getOperand(1)->isUndef())
+      return N->getOperand(0);
+    break;
   case RISCVISD::VSLIDE1UP_VL:
   case RISCVISD::VFSLIDE1UP_VL: {
     using namespace SDPatternMatch;
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll
new file mode 100644
index 0000000000000..42d1c0321cfd1
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc -mtriple riscv32 -mattr=+v < %s | FileCheck %s
+; RUN: llc -mtriple riscv64 -mattr=+v < %s | FileCheck %s
+; RUN: llc -mtriple riscv32 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s
+; RUN: llc -mtriple riscv64 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s
+
+define <4 x i32> @splice_v4i32_slidedown(<4 x i32> %a, <4 x i32> %b) #0 {
+; CHECK-LABEL: splice_v4i32_slidedown:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT: vrgather.vi v9, v8, 3
+; CHECK-NEXT: vmv.v.v v8, v9
+; CHECK-NEXT: ret
+ %res = call <4 x i32> @llvm.vector.splice(<4 x i32> %a, <4 x i32> poison, i32 3)
+ ret <4 x i32> %res
+}
+
+define <4 x i32> @splice_4i32_slideup(<4 x i32> %a) #0 {
+; CHECK-LABEL: splice_4i32_slideup:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT: vrgather.vi v9, v8, 0
+; CHECK-NEXT: vmv.v.v v8, v9
+; CHECK-NEXT: ret
+ %res = call <4 x i32> @llvm.vector.splice(<4 x i32> poison, <4 x i32> %a, i32 -3)
+ ret <4 x i32> %res
+}
+
+define <8 x i32> @splice_v8i32_slidedown(<8 x i32> %a, <8 x i32> %b) #0 {
+; CHECK-LABEL: splice_v8i32_slidedown:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT: vslidedown.vi v8, v8, 3
+; CHECK-NEXT: ret
+ %res = call <8 x i32> @llvm.vector.splice(<8 x i32> %a, <8 x i32> poison, i32 3)
+ ret <8 x i32> %res
+}
+
+define <8 x i32> @splice_v8i32_slideup(<8 x i32> %a) #0 {
+; CHECK-LABEL: splice_v8i32_slideup:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT: vslideup.vi v10, v8, 3
+; CHECK-NEXT: vmv.v.v v8, v10
+; CHECK-NEXT: ret
+ %res = call <8 x i32> @llvm.vector.splice(<8 x i32> poison, <8 x i32> %a, i32 -3)
+ ret <8 x i32> %res
+}
+
diff --git a/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll b/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
index acc2a97dd5d1f..31936d3a084b2 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
@@ -4247,4 +4247,34 @@ define <vscale x 8 x double> @splice_nxv8f64_offset_max(<vscale x 8 x double> %a
ret <vscale x 8 x double> %res
}
+define <vscale x 2 x i32> @splice_nxv2i32_slidedown(<vscale x 2 x i32> %a) #0 {
+; NOVLDEP-LABEL: splice_nxv2i32_slidedown:
+; NOVLDEP: # %bb.0:
+; NOVLDEP-NEXT: vsetvli a0, zero, e32, m1, ta, ma
+; NOVLDEP-NEXT: vslidedown.vi v8, v8, 3
+; NOVLDEP-NEXT: ret
+;
+; VLDEP-LABEL: splice_nxv2i32_slidedown:
+; VLDEP: # %bb.0:
+; VLDEP-NEXT: csrr a0, vlenb
+; VLDEP-NEXT: srli a0, a0, 2
+; VLDEP-NEXT: addi a0, a0, -3
+; VLDEP-NEXT: vsetvli zero, a0, e32, m1, ta, ma
+; VLDEP-NEXT: vslidedown.vi v8, v8, 3
+; VLDEP-NEXT: ret
+ %res = call <vscale x 2 x i32> @llvm.vector.splice(<vscale x 2 x i32> %a, <vscale x 2 x i32> poison, i32 3)
+ ret <vscale x 2 x i32> %res
+}
+
+define <vscale x 2 x i32> @splice_nxv2i32_slideup(<vscale x 2 x i32> %a) #0 {
+; CHECK-LABEL: splice_nxv2i32_slideup:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetvli a0, zero, e32, m1, ta, ma
+; CHECK-NEXT: vslideup.vi v9, v8, 3
+; CHECK-NEXT: vmv.v.v v8, v9
+; CHECK-NEXT: ret
+ %res = call <vscale x 2 x i32> @llvm.vector.splice(<vscale x 2 x i32> poison, <vscale x 2 x i32> %a, i32 -3)
+ ret <vscale x 2 x i32> %res
+}
+
attributes #0 = { vscale_range(2,0) }
; CHECK-LABEL: splice_v4i32_slidedown:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetivli zero, 4, e32, m1, ta, ma
; CHECK-NEXT: vrgather.vi v9, v8, 3
OT: We should probably prefer vslidedown.vi for the case where the resulting shuffle mask has a single element. vrgather.vi forces the use of an extra register where vslidedown.vi doesn't. (This only works for vslidedown; vslideup does have the register constraint.)
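A rough host-side C++ model of why either instruction is acceptable for a single-element mask. It assumes integer lanes and that out-of-range reads return 0, as we understand the RVV spec; gatherSplat, slideDownImm and matches are hypothetical names, and the register overlap constraint itself is not modelled.

```cpp
#include <cassert>
#include <optional>
#include <vector>

using Lane = std::optional<int>; // std::nullopt models a don't-care (poison) lane
using Vec = std::vector<Lane>;

// Model of vrgather.vi: every result lane is src[idx], or 0 if idx is out of range.
Vec gatherSplat(const Vec &src, int idx) {
  assert(idx >= 0);
  const int n = static_cast<int>(src.size());
  return Vec(src.size(), idx < n ? src[idx] : Lane(0));
}

// Model of vslidedown.vi: lane i is src[i + k], or 0 when i + k is out of range.
Vec slideDownImm(const Vec &src, int k) {
  assert(k >= 0);
  const int n = static_cast<int>(src.size());
  Vec out(src.size());
  for (int i = 0; i < n; ++i)
    out[i] = i + k < n ? src[i + k] : Lane(0);
  return out;
}

// True if `got` agrees with `want` on every lane that `want` defines.
bool matches(const Vec &got, const Vec &want) {
  if (got.size() != want.size())
    return false;
  for (std::size_t i = 0; i < want.size(); ++i)
    if (want[i] && got[i] != want[i])
      return false;
  return true;
}
```

For the wanted result <d,poison,poison,poison>, both gatherSplat(x, 3) and slideDownImm(x, 3) match even though they differ in the don't-care lanes, so the choice between them can be made on register pressure alone.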
  }
  case RISCVISD::VSLIDEDOWN_VL:
  case RISCVISD::VSLIDEUP_VL:
    if (N->getOperand(1)->isUndef())
Does this work with isPoison? isUndef returns true for poison and for undef, but we only tested poison.
Independent of what we test, I think this transformation still holds for undef, right?
I think it's valid for undef.
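To make the argument for the fold concrete, here is a small host-side C++ sketch. It assumes operand 0 of the slide node is the passthru and operand 1 is the vector being slid (the vs2 mentioned in the description). It ignores masks, treats the tail as undisturbed, only models in-bounds reads for the slide down, and does not distinguish undef from poison. slideDown, slideUp and refines are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <optional>
#include <vector>

using Lane = std::optional<int>; // std::nullopt models a poison lane
using Vec = std::vector<Lane>;

// Simplified slide down with a passthru operand. Only in-bounds reads are
// modelled, so the sketch requires offset + vl <= number of lanes.
Vec slideDown(const Vec &passthru, const Vec &src, int offset, int vl) {
  assert(passthru.size() == src.size());
  assert(offset >= 0 && vl >= 0 &&
         offset + vl <= static_cast<int>(src.size()));
  Vec out(passthru); // lanes >= vl keep the passthru value
  for (int i = 0; i < vl; ++i)
    out[i] = src[i + offset];
  return out;
}

// Simplified slide up: lanes below offset and lanes >= vl keep the passthru.
Vec slideUp(const Vec &passthru, const Vec &src, int offset, int vl) {
  assert(passthru.size() == src.size());
  assert(offset >= 0 && vl <= static_cast<int>(src.size()));
  Vec out(passthru);
  for (int i = offset; i < vl; ++i)
    out[i] = src[i - offset];
  return out;
}

// `candidate` may replace `spec` if it agrees on every non-poison lane of spec.
bool refines(const Vec &candidate, const Vec &spec) {
  if (candidate.size() != spec.size())
    return false;
  for (std::size_t i = 0; i < spec.size(); ++i)
    if (spec[i] && candidate[i] != spec[i])
      return false;
  return true;
}
```

With an all-poison source p, every lane the slide writes is poison and every other lane is the passthru, so refines(x, slideUp(x, p, 1, 4)) and refines(x, slideDown(x, p, 3, 1)) both hold. With a defined source the check fails, which is why the combine is guarded on operand 1.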
If we add an undef test then the undef detector linter will complain.
For the use case in the loop vectorizer, just handling poison is fine if we want to avoid isUndef. Are we moving away from it in SelectionDAG?
SelectionDAG support for poison is still in an early stage I think. It probably converts ISD::POISON to ISD::UNDEF in many cases still. I wouldn't be surprised if the vector was ISD::UNDEF by the time it gets here. Especially for the fixed vector case.
OK, just to check: should we add a test case with undef anyway?
Yes, please ignore the linter.
; RUN: llc -mtriple riscv32 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s
; RUN: llc -mtriple riscv64 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s

define <4 x i32> @splice_v4i32_slidedown(<4 x i32> %a, <4 x i32> %b) #0 {
nit: #0 is a dangling attribute, you can probably remove it.
You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef([^a-zA-Z0-9_-]|$)|UndefValue::get)' 'HEAD~1' HEAD llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
The following files introduce new uses of undef:
Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. In tests, avoid using undef. For example, this is considered a bad practice:
define void @fn() {
  ...
  br i1 undef, ...
}
Please use the following instead:
define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}
Please refer to the Undefined Behavior Manual for more information.
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/45/builds/18477
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/18454
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/42472