[VPlan] Simplify ExplicitVectorLength(%AVL) -> %AVL when AVL <= VF by lukel97 · Pull Request #167647 · llvm/llvm-project

lukel97 · 2025-11-12T06:58:16Z

llvm.experimental.get.vector.length has the property that if the AVL (%cnt) is less than or equal to VF (%max_lanes) then the return value is just AVL.

This patch uses SCEV to simplify this in optimizeForVFAndUF, and adds ExplicitVectorLength to VPInstruction::opcodeMayReadOrWriteFromMemory so it gets removed once dead.

This has no effect for now from what I can tell but is needed if we ever want to extend narrowInterleaveGroups to handle EVL tail folded loops.

diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 80cd112..488470d 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1259,6 +1259,7 @@ bool VPInstruction::opcodeMayReadOrWriteFromMemory() const {
   case VPInstruction::ExtractLastLanePerPart:
   case VPInstruction::ExtractPenultimateElement:
   case VPInstruction::ActiveLaneMask:
+  case VPInstruction::ExplicitVectorLength:
   case VPInstruction::FirstActiveLane:
   case VPInstruction::FirstOrderRecurrenceSplice:
   case VPInstruction::LogicalAnd:
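The property the description relies on (EVL equals AVL whenever AVL <= VF) can be illustrated with a small standalone model. This is only a sketch, not the VPlan code from the patch: `modelGetVectorLength` and `canReplaceEVLWithAVL` are hypothetical names, and clamping to `MaxLanes` for larger counts is an assumption made for illustration, since a target may legally pick a different value in that case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy model of llvm.experimental.get.vector.length (hypothetical helper).
// Guaranteed by the property quoted in the description: if Cnt <= MaxLanes,
// the result is Cnt. ASSUMPTION: for Cnt > MaxLanes this model simply clamps
// to MaxLanes; real targets may return something else there.
uint32_t modelGetVectorLength(uint64_t Cnt, uint32_t MaxLanes) {
  return static_cast<uint32_t>(std::min<uint64_t>(Cnt, MaxLanes));
}

// Mirrors the rewrite in the PR title (hypothetical helper): when an upper
// bound on the AVL is known not to exceed VF, ExplicitVectorLength(AVL) can
// be replaced by AVL itself.
bool canReplaceEVLWithAVL(uint64_t AVLUpperBound, uint32_t VF) {
  return AVLUpperBound <= VF;
}

// Checks that the replacement agrees with the model whenever it is allowed.
bool replacementIsSound(uint64_t AVL, uint64_t AVLUpperBound, uint32_t VF) {
  assert(AVL <= AVLUpperBound && "bound must actually cover the AVL");
  if (!canReplaceEVLWithAVL(AVLUpperBound, VF))
    return true; // nothing is rewritten, so nothing can go wrong
  return modelGetVectorLength(AVL, VF) == AVL;
}
```

In the patch itself the bound comes from SCEV rather than from an explicit parameter; the model only captures why the rewrite is safe once such a bound is known.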

llvmbot · 2025-11-12T06:58:53Z

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-llvm-transforms

Author: Luke Lau (lukel97)

Changes

This has no effect for now from what I can tell but is needed if we ever want to extend narrowInterleaveGroups to handle EVL tail folded loops.

Full diff: https://github.com/llvm/llvm-project/pull/167647.diff

1 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+1)

diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 80cd112dbcd8a..488470d247968 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1259,6 +1259,7 @@ bool VPInstruction::opcodeMayReadOrWriteFromMemory() const {
   case VPInstruction::ExtractLastLanePerPart:
   case VPInstruction::ExtractPenultimateElement:
   case VPInstruction::ActiveLaneMask:
+  case VPInstruction::ExplicitVectorLength:
   case VPInstruction::FirstActiveLane:
   case VPInstruction::FirstOrderRecurrenceSplice:
   case VPInstruction::LogicalAnd:

llvmbot · 2025-11-12T06:58:54Z

@llvm/pr-subscribers-vectorizers

artagnon

Oh, I didn't do this myself due to a missing test case!

lukel97 · 2025-11-12T10:56:57Z

> Oh, I didn't do this myself due to a missing test case!

I'm not sure if there's any functional change today given that an ExplicitVectorLength isn't a candidate for hoisting/sinking etc., but I didn't want to mark it as NFC since it's not really a refactoring.

I split the change off anyway to show that there's no test diff.

artagnon

Okay, don't feel strongly about adding something for the future. Weak LGTM, thanks!

fhahn

Are there cases where we could/should simplify evl, then we could remove it, for example if we remove the backedge https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/LoopVectorize/RISCV/vector-loop-backedge-elimination-with-evl.ll ?

If so, would be good to combine this with the update here.

lukel97 · 2025-11-12T13:03:07Z

> Are there cases where we could/should simplify evl, then we could remove it, for example if we remove the backedge https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/LoopVectorize/RISCV/vector-loop-backedge-elimination-with-evl.ll ?
>
> If so, would be good to combine this with the update here.

I've reworked this PR to simplify the EVL when it's known from AVL <= VF in 6bb2fe0. This was a nice catch: I checked, and this seems to remove a small handful of vsetvlis in SPEC CPU 2017 that RISCVInsertVSETVLI can't handle on its own.

artagnon

The EVL simplification looks good!

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

artagnon

I think the updated patch LGTM, thanks!

fhahn · 2025-11-12T15:48:01Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

  bool MadeChange = tryToReplaceALMWithWideALM(Plan, BestVF, BestUF);
  MadeChange |= simplifyBranchConditionForVFAndUF(Plan, BestVF, BestUF, PSE);
  MadeChange |= optimizeVectorInductionWidthForTCAndVFUF(Plan, BestVF, BestUF);
+  MadeChange |= simplifyKnownEVL(Plan, BestVF, PSE);


If we move this to the start, would it be sufficient to do a shallow traversal starting at the region entry?

I think this needs to run after simplifyBranchConditionForVFAndUF so that the AVL PHI feeding into ExplicitVectorLength is replaced with its singular incoming value, and it looks like the region is removed there

Ah, that's unfortunate, thanks for checking.

fhahn · 2025-11-12T15:48:39Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

+      if (!match(&R, m_EVL(m_VPValue(AVL))))
+        continue;
+
+      const SCEV *AVLSCEV = vputils::getSCEVExprForVPValue(AVL, *PSE.getSE());


Can put PSE.getSE() into a variable to avoid repeated lookups.

Thanks, done in 027c462

fhahn

LGTM, thanks

fhahn · 2025-11-13T12:35:43Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

  bool MadeChange = tryToReplaceALMWithWideALM(Plan, BestVF, BestUF);
  MadeChange |= simplifyBranchConditionForVFAndUF(Plan, BestVF, BestUF, PSE);
  MadeChange |= optimizeVectorInductionWidthForTCAndVFUF(Plan, BestVF, BestUF);
+  MadeChange |= simplifyKnownEVL(Plan, BestVF, PSE);


Ah, that's unfortunate, thanks for checking.

…ax_lanes

On RISC-V, some loops that the loop vectorizer vectorizes pre-LTO may turn out to have the exact trip count exposed after LTO, see llvm#164762. If the trip count is small enough we can fold away the @llvm.experimental.get.vector.length intrinsic based on this corollary from the LangRef:

> If %cnt is less than or equal to %max_lanes, the return value is equal to %cnt.

This on its own doesn't remove the @llvm.experimental.get.vector.length in llvm#164762 since we also need to teach computeKnownBits about @llvm.experimental.get.vector.length and the sub recurrence, but this PR is a starting point.

I've added this in InstCombine rather than InstSimplify since we may need to insert a truncation (@llvm.experimental.get.vector.length can take an i64 %cnt argument, but always truncates the result to i32).

Note that there was something similar done in VPlan in llvm#167647 for when the loop vectorizer knows the trip count.
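The need for a truncation in that fold (an i64 %cnt, an i32 result) can be sketched with plain integers. This is a standalone illustration rather than the InstCombine code: `foldGetVectorLengthModel` is a hypothetical name, the known upper bound is passed in explicitly instead of being derived from value tracking, and `MaxLanes` is assumed to be a fixed (non-scalable) lane count.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of the fold: returns the replacement value when the
// intrinsic can be folded away, or std::nullopt when it has to stay.
// ASSUMPTION: CntUpperBound is a proven upper bound on Cnt, supplied by the
// caller; MaxLanes is a fixed lane count.
std::optional<uint32_t> foldGetVectorLengthModel(uint64_t Cnt,
                                                 uint64_t CntUpperBound,
                                                 uint32_t MaxLanes) {
  assert(Cnt <= CntUpperBound && "bound must actually cover Cnt");
  // Only fold when every possible Cnt is <= MaxLanes.
  if (CntUpperBound > MaxLanes)
    return std::nullopt;
  // Cnt is 64 bits wide but the result is 32 bits wide, so the replacement
  // is a truncation of Cnt. It is lossless here because Cnt <= MaxLanes,
  // and MaxLanes itself fits in 32 bits.
  return static_cast<uint32_t>(Cnt);
}
```

The truncation is what creates a new value in the fold, which matches the stated reason for placing it in InstCombine rather than InstSimplify.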

lukel97 requested review from ElvisWang123, Mel-Chen, arcbbb, artagnon and fhahn November 12, 2025 06:58

llvmbot added vectorizers llvm:transforms labels Nov 12, 2025

artagnon reviewed Nov 12, 2025

artagnon approved these changes Nov 12, 2025

fhahn reviewed Nov 12, 2025

Simplify known EVLs

6bb2fe0

llvmbot added the backend:RISC-V label Nov 12, 2025

lukel97 changed the title ~~[VPlan] Handle ExplicitVectorLength in opcodeMayReadOrWriteFromMemory~~ [VPlan] Simplify ExplicitVectorLength(%AVL) -> %AVL when AVL <= VF Nov 12, 2025

Rename ZExt -> Trunc to reflect that we're casting from i64 -> i32

a4aa19e

artagnon reviewed Nov 12, 2025

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

artagnon approved these changes Nov 12, 2025

fhahn reviewed Nov 12, 2025

lukel97 added 2 commits November 13, 2025 00:17

Store SE in var

027c462

Remove single use builder var

fa1844c

fhahn approved these changes Nov 13, 2025

Merge branch 'main' into loop-vectorize/evl-opcodemayreadorwritefromm…

8580c7d

…emory

lukel97 enabled auto-merge (squash) November 13, 2025 12:47

lukel97 merged commit c0f7d51 into llvm:main Nov 13, 2025
9 of 10 checks passed

lukel97 mentioned this pull request Nov 24, 2025

[LoopVectorizer] Pre-LTO LoopVectorization (as discussed in 10/21/25 Vectorization meeting) #164762

Open

lukel97 mentioned this pull request Nov 24, 2025

[InstCombine] Fold @llvm.experimental.get.vector.length when cnt <= max_lanes #169293

Merged
