[LV] Restrict scalable vectorization to targets with power-of-2 vscale#183065
[LV] Restrict scalable vectorization to targets with power-of-2 vscale#183065
Conversation
|
@llvm/pr-subscribers-vectorizers @llvm/pr-subscribers-llvm-transforms Author: Luke Lau (lukel97) ChangesIn #145098 we are proposing to restrict vscale to always be a power of 2. This PR proposes to go ahead and remove support for non-power-of-2 vscales in the loop vectorizer independently of the LangRef. Given we don't have any targets with scalable vector support with non-power-of-2 vscales, this is essentially NFC. The main benefit of this is that it means the IV can't overflow with tail folding anymore. This remove a lot of checks, both at runtime and statically, and simplifies other areas:
I plan to post patches stacked on top of this for these to show how they can be removed. This patch just restricts it for now to show how no tests are affected. Full diff: https://github.com/llvm/llvm-project/pull/183065.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 77be8cc95b6da..1726963f43e32 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -25717,9 +25717,6 @@ bool RISCVTargetLowering::isVScaleKnownToBeAPowerOfTwo() const {
// We define vscale to be VLEN/RVVBitsPerBlock. VLEN is always a power
// of two >= 64, and RVVBitsPerBlock is 64. Thus, vscale must be
// a power of two as well.
- // FIXME: This doesn't work for zve32, but that's already broken
- // elsewhere for the same reason.
- assert(Subtarget.getRealMinVLen() >= 64 && "zve32* unsupported");
static_assert(RISCV::RVVBitsPerBlock == 64,
"RVVBitsPerBlock changed, audit needed");
return true;
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index b28c3d949c96a..073a6ccef4c93 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -296,9 +296,9 @@ cl::opt<unsigned> llvm::ForceTargetInstructionCost(
static cl::opt<bool> ForceTargetSupportsScalableVectors(
"force-target-supports-scalable-vectors", cl::init(false), cl::Hidden,
- cl::desc(
- "Pretend that scalable vectors are supported, even if the target does "
- "not support them. This flag should only be used for testing."));
+ cl::desc("Pretend that scalable vectors are supported and vscale is a "
+ "power of two, even if the target does "
+ "not support them. This flag should only be used for testing."));
static cl::opt<unsigned> SmallLoopCost(
"small-loop-cost", cl::init(20), cl::Hidden,
@@ -3387,6 +3387,10 @@ bool LoopVectorizationCostModel::isScalableVectorizationAllowed() {
if (!TTI.supportsScalableVectors() && !ForceTargetSupportsScalableVectors)
return false;
+ if (!TTI.isVScaleKnownToBeAPowerOfTwo() &&
+ !ForceTargetSupportsScalableVectors)
+ return false;
+
if (Hints->isScalableVectorizationDisabled()) {
reportVectorizationInfo("Scalable vectorization is explicitly disabled",
"ScalableVectorizationDisabled", ORE, TheLoop);
|
|
@llvm/pr-subscribers-backend-risc-v Author: Luke Lau (lukel97) ChangesIn #145098 we are proposing to restrict vscale to always be a power of 2. This PR proposes to go ahead and remove support for non-power-of-2 vscales in the loop vectorizer independently of the LangRef. Given we don't have any targets with scalable vector support with non-power-of-2 vscales, this is essentially NFC. The main benefit of this is that it means the IV can't overflow with tail folding anymore. This remove a lot of checks, both at runtime and statically, and simplifies other areas:
I plan to post patches stacked on top of this for these to show how they can be removed. This patch just restricts it for now to show how no tests are affected. Full diff: https://github.com/llvm/llvm-project/pull/183065.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 77be8cc95b6da..1726963f43e32 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -25717,9 +25717,6 @@ bool RISCVTargetLowering::isVScaleKnownToBeAPowerOfTwo() const {
// We define vscale to be VLEN/RVVBitsPerBlock. VLEN is always a power
// of two >= 64, and RVVBitsPerBlock is 64. Thus, vscale must be
// a power of two as well.
- // FIXME: This doesn't work for zve32, but that's already broken
- // elsewhere for the same reason.
- assert(Subtarget.getRealMinVLen() >= 64 && "zve32* unsupported");
static_assert(RISCV::RVVBitsPerBlock == 64,
"RVVBitsPerBlock changed, audit needed");
return true;
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index b28c3d949c96a..073a6ccef4c93 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -296,9 +296,9 @@ cl::opt<unsigned> llvm::ForceTargetInstructionCost(
static cl::opt<bool> ForceTargetSupportsScalableVectors(
"force-target-supports-scalable-vectors", cl::init(false), cl::Hidden,
- cl::desc(
- "Pretend that scalable vectors are supported, even if the target does "
- "not support them. This flag should only be used for testing."));
+ cl::desc("Pretend that scalable vectors are supported and vscale is a "
+ "power of two, even if the target does "
+ "not support them. This flag should only be used for testing."));
static cl::opt<unsigned> SmallLoopCost(
"small-loop-cost", cl::init(20), cl::Hidden,
@@ -3387,6 +3387,10 @@ bool LoopVectorizationCostModel::isScalableVectorizationAllowed() {
if (!TTI.supportsScalableVectors() && !ForceTargetSupportsScalableVectors)
return false;
+ if (!TTI.isVScaleKnownToBeAPowerOfTwo() &&
+ !ForceTargetSupportsScalableVectors)
+ return false;
+
if (Hints->isScalableVectorizationDisabled()) {
reportVectorizationInfo("Scalable vectorization is explicitly disabled",
"ScalableVectorizationDisabled", ORE, TheLoop);
|
| @@ -25717,9 +25717,6 @@ bool RISCVTargetLowering::isVScaleKnownToBeAPowerOfTwo() const { | |||
| // We define vscale to be VLEN/RVVBitsPerBlock. VLEN is always a power | |||
| // of two >= 64, and RVVBitsPerBlock is 64. Thus, vscale must be | |||
| // a power of two as well. | |||
| // FIXME: This doesn't work for zve32, but that's already broken | |||
| // elsewhere for the same reason. | |||
| assert(Subtarget.getRealMinVLen() >= 64 && "zve32* unsupported"); | |||
There was a problem hiding this comment.
llvm/test/Transforms/LoopVectorize/RISCV/zvl32b.ll triggers this assert because we query isVScaleKnownToBeAPowerOfTwo earlier.
| "Pretend that scalable vectors are supported, even if the target does " | ||
| "not support them. This flag should only be used for testing.")); | ||
| cl::desc("Pretend that scalable vectors are supported and vscale is a " | ||
| "power of two, even if the target does " |
There was a problem hiding this comment.
This adjusts the wording so that we can assume power-of-2 vscale in the target independent tests
There was a problem hiding this comment.
I'd probably revert this specific hunk, since the LangRef change has landed.
|
Why do this independently of the LangRef? Why not land the LangRef change first given it has broad acceptance? |
I reverse pinged the LangRef PR two weeks ago but there hasn't been any response yet. This splits off the loop vectorizer part to unblock the cleanup work. One specific thing I'd like to start fixing without being blocked on that PR is preserving the NUW flag on tail folded canonical IVs. We have to work around an assumption in VPlanVerifier otherwise in #182254, see the comment here: #182254 (comment) |
I looked at #145098 but the update was not free so I figured it best to try changing the LangRef in isolation and created #183080. |
artagnon
left a comment
There was a problem hiding this comment.
This LGTM, kindly wait on other reviewers.
| "Pretend that scalable vectors are supported, even if the target does " | ||
| "not support them. This flag should only be used for testing.")); | ||
| cl::desc("Pretend that scalable vectors are supported and vscale is a " | ||
| "power of two, even if the target does " |
There was a problem hiding this comment.
I'd probably revert this specific hunk, since the LangRef change has landed.
| if (!TTI.supportsScalableVectors() && !ForceTargetSupportsScalableVectors) | ||
| return false; | ||
|
|
||
| if (!TTI.isVScaleKnownToBeAPowerOfTwo() && |
There was a problem hiding this comment.
I think it probably makes sense to now just remove this hook completely, right?
There was a problem hiding this comment.
I am working on PR to do this. I just wanted to land the LangRef change first to unblock other related work.
|
Does this PR have any value after the LangRef change? I saw it more as a show of intent. Presumably we can now move straight to the functional changes? |
Yup this was just to show the intent, thanks for landing the langref change. Closing now |
In #145098 we are proposing to restrict vscale to always be a power of 2.
This PR proposes to go ahead and remove support for non-power-of-2 vscales in the loop vectorizer independently of the LangRef. Given we don't have any targets with scalable vector support with non-power-of-2 vscales, this is essentially NFC.
The main benefit of this is that it means the IV can't overflow with tail folding anymore. This remove a lot of checks, both at runtime and statically, and simplifies other areas:
I plan to post patches stacked on top of this for these to show how they can be removed. This patch just restricts it for now to show how no tests are affected.