[VPlan] Split out optimizeEVLMasks. NFC #174925

lukel97 · 2026-01-08T08:30:27Z

Addresses part of #153144 and splits off part of #166164

There are two parts to the EVL transform:

Convert the loop so the number of elements processed each iteration is EVL, not VF. The IV and header mask are replaced with EVL-based variants.
Optimize users of the EVL based header mask to VP intrinsic based recipes.

(1) changes the semantics of the vector loop region, whereas (2) needs to preserve them. This splits (2) out so we don't mix the two up, and allows us to move (1) earlier in the pipeline in a future PR.

Addresses part of llvm#153144 and splits off part of llvm#166164 There are two parts to the EVL transform: 1) Convert the loop so the number of elements processed each iteration is EVL, not VF. The IV and header mask are replaced with EVL-based variants. 2) Optimize users of the EVL based header mask to VP intrinsic based recipes. 1) changes the semantics of the vector loop region, whereas the 2) needs to preserve it. This splits 2) out so we don't mix the two up, and allows us to move 1) earlier in the pipeline in a future PR.

llvmbot · 2026-01-08T08:30:59Z

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Luke Lau (lukel97)

Changes

Addresses part of #153144 and splits off part of #166164

There are two parts to the EVL transform:

Convert the loop so the number of elements processed each iteration is EVL, not VF. The IV and header mask are replaced with EVL-based variants.
Optimize users of the EVL based header mask to VP intrinsic based recipes.
changes the semantics of the vector loop region, whereas 2) needs to preserve them. This splits 2) out so we don't mix the two up, and allows us to move 1) earlier in the pipeline in a future PR.

Full diff: https://github.com/llvm/llvm-project/pull/174925.diff

3 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+4-2)
(modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+37-45)
(modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.h (+8)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 31c44fc46c973..903fb7b25ad6a 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -8328,10 +8328,12 @@ void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF,
       VPlanTransforms::runPass(VPlanTransforms::truncateToMinimalBitwidths,
                                *Plan, CM.getMinimalBitwidths());
       VPlanTransforms::runPass(VPlanTransforms::optimize, *Plan);
-      // TODO: try to put it close to addActiveLaneMask().
-      if (CM.foldTailWithEVL())
+      // TODO: try to put addExplicitVectorLength close to addActiveLaneMask
+      if (CM.foldTailWithEVL()) {
         VPlanTransforms::runPass(VPlanTransforms::addExplicitVectorLength,
                                  *Plan, CM.getMaxSafeElements());
+        VPlanTransforms::runPass(VPlanTransforms::optimizeEVLMasks, *Plan);
+      }
       assert(verifyVPlanIsValid(*Plan) && "VPlan is invalid");
       VPlans.push_back(std::move(Plan));
     }
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index b1880517a4199..e1dba4cf4a064 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -2979,8 +2979,42 @@ static VPRecipeBase *optimizeMaskToEVL(VPValue *HeaderMask,
   return nullptr;
 }
 
-/// Replace recipes with their EVL variants.
-static void transformRecipestoEVLRecipes(VPlan &Plan, VPValue &EVL) {
+/// Optimize away any EVL-based header masks to VP intrinsic based recipes.
+/// The transforms here need to preserve the original semantics.
+void VPlanTransforms::optimizeEVLMasks(VPlan &Plan) {
+  // Find the EVL-based header mask if it exists: icmp ult step-vector, EVL
+  VPInstruction *HeaderMask = nullptr;
+  for (VPRecipeBase &R : *Plan.getVectorLoopRegion()->getEntryBasicBlock()) {
+    if (match(&R, m_ICmp(m_VPInstruction<VPInstruction::StepVector>(),
+                         m_EVL(m_VPValue())))) {
+      HeaderMask = cast<VPInstruction>(&R);
+      break;
+    }
+  }
+  if (!HeaderMask)
+    return;
+
+  VPTypeAnalysis TypeInfo(Plan);
+  VPValue *EVL = HeaderMask->getOperand(1);
+
+  for (VPUser *U : collectUsersRecursively(HeaderMask)) {
+    VPRecipeBase *R = cast<VPRecipeBase>(U);
+    if (auto *NewR = optimizeMaskToEVL(HeaderMask, *R, TypeInfo, *EVL)) {
+      NewR->insertBefore(R);
+      for (auto [Old, New] :
+           zip_equal(R->definedValues(), NewR->definedValues()))
+        Old->replaceAllUsesWith(New);
+      // Erase dead stores, the rest will be removed by removeDeadRecipes.
+      if (R->getNumDefinedValues() == 0)
+        R->eraseFromParent();
+    }
+  }
+  removeDeadRecipes(Plan);
+}
+
+/// After replacing the IV with a EVL-based IV, fixup recipes that use VF to use
+/// the EVL instead to avoid incorrect updates on the penultimate iteration.
+static void fixupVFUsersForEVL(VPlan &Plan, VPValue &EVL) {
   VPTypeAnalysis TypeInfo(Plan);
   VPRegionBlock *LoopRegion = Plan.getVectorLoopRegion();
   VPBasicBlock *Header = LoopRegion->getEntryBasicBlock();
@@ -3008,10 +3042,6 @@ static void transformRecipestoEVLRecipes(VPlan &Plan, VPValue &EVL) {
     return isa<VPWidenPointerInductionRecipe>(U);
   });
 
-  // Defer erasing recipes till the end so that we don't invalidate the
-  // VPTypeAnalysis cache.
-  SmallVector<VPRecipeBase *> ToErase;
-
   // Create a scalar phi to track the previous EVL if fixed-order recurrence is
   // contained.
   bool ContainsFORs =
@@ -3046,7 +3076,6 @@ static void transformRecipestoEVLRecipes(VPlan &Plan, VPValue &EVL) {
             R.getDebugLoc());
         VPSplice->insertBefore(&R);
         R.getVPSingleValue()->replaceAllUsesWith(VPSplice);
-        ToErase.push_back(&R);
       }
     }
   }
@@ -3067,43 +3096,6 @@ static void transformRecipestoEVLRecipes(VPlan &Plan, VPValue &EVL) {
       CmpInst::ICMP_ULT,
       Builder.createNaryOp(VPInstruction::StepVector, {}, EVLType), &EVL);
   HeaderMask->replaceAllUsesWith(EVLMask);
-  ToErase.push_back(HeaderMask->getDefiningRecipe());
-
-  // Try to optimize header mask recipes away to their EVL variants.
-  // TODO: Split optimizeMaskToEVL out and move into
-  // VPlanTransforms::optimize. transformRecipestoEVLRecipes should be run in
-  // tryToBuildVPlanWithVPRecipes beforehand.
-  for (VPUser *U : collectUsersRecursively(EVLMask)) {
-    auto *CurRecipe = cast<VPRecipeBase>(U);
-    VPRecipeBase *EVLRecipe =
-        optimizeMaskToEVL(EVLMask, *CurRecipe, TypeInfo, EVL);
-    if (!EVLRecipe)
-      continue;
-
-    unsigned NumDefVal = EVLRecipe->getNumDefinedValues();
-    assert(NumDefVal == CurRecipe->getNumDefinedValues() &&
-           "New recipe must define the same number of values as the "
-           "original.");
-    EVLRecipe->insertBefore(CurRecipe);
-    if (isa<VPSingleDefRecipe, VPWidenLoadEVLRecipe, VPInterleaveEVLRecipe>(
-            EVLRecipe)) {
-      for (unsigned I = 0; I < NumDefVal; ++I) {
-        VPValue *CurVPV = CurRecipe->getVPValue(I);
-        CurVPV->replaceAllUsesWith(EVLRecipe->getVPValue(I));
-      }
-    }
-    ToErase.push_back(CurRecipe);
-  }
-  // Remove dead EVL mask.
-  if (EVLMask->getNumUsers() == 0)
-    ToErase.push_back(EVLMask->getDefiningRecipe());
-
-  for (VPRecipeBase *R : reverse(ToErase)) {
-    SmallVector<VPValue *> PossiblyDead(R->operands());
-    R->eraseFromParent();
-    for (VPValue *Op : PossiblyDead)
-      recursivelyDeleteDeadRecipes(Op);
-  }
 }
 
 /// Add a VPEVLBasedIVPHIRecipe and related recipes to \p Plan and
@@ -3201,7 +3193,7 @@ void VPlanTransforms::addExplicitVectorLength(
       DebugLoc::getCompilerGenerated(), "avl.next");
   AVLPhi->addOperand(NextAVL);
 
-  transformRecipestoEVLRecipes(Plan, *VPEVL);
+  fixupVFUsersForEVL(Plan, *VPEVL);
 
   // Replace all uses of VPCanonicalIVPHIRecipe by
   // VPEVLBasedIVPHIRecipe except for the canonical IV increment.
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
index f4ecee4e7f04f..35af0a76019dc 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
@@ -266,6 +266,14 @@ struct VPlanTransforms {
   addExplicitVectorLength(VPlan &Plan,
                           const std::optional<unsigned> &MaxEVLSafeElements);
 
+  /// Optimize recipes which use an EVL based header mask to a VP intrinsic:
+  ///
+  /// %mask = icmp step-vector, EVL
+  /// %load = load %ptr, %mask
+  /// -->
+  /// %load = vp.load %ptr, EVL
+  static void optimizeEVLMasks(VPlan &Plan);
+
   // For each Interleave Group in \p InterleaveGroups replace the Recipes
   // widening its memory instructions with a single VPInterleaveRecipe at its
   // insertion point.

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

fhahn · 2026-01-08T19:57:41Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

+/// Optimize away any EVL-based header masks to VP intrinsic based recipes.
+/// The transforms here need to preserve the original semantics.
+void VPlanTransforms::optimizeEVLMasks(VPlan &Plan) {
+  // Find the EVL-based header mask if it exists: icmp ult step-vector, EVL


Can the code use m_SpecificICmp to look for icmp ult?

Done in e5dbe66, but I had to add a specific VPRecipeBase overload for SpecificCmp_match so we can directly call match on VPRecipeBase. Regular Cmp_match does the same thing

Grea,t hat useful in any cae

fhahn · 2026-01-08T19:58:27Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.h


+  /// Optimize recipes which use an EVL based header mask to a VP intrinsic:
+  ///
+  /// %mask = icmp step-vector, EVL


Suggested change

/// %mask = icmp step-vector, EVL

/// %mask = icmp ult step-vector, EVL

?

Mel-Chen

LGTM, thanks

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

artagnon

LGTM, thanks!

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

fhahn · 2026-01-10T21:02:45Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.h

  addExplicitVectorLength(VPlan &Plan,
                          const std::optional<unsigned> &MaxEVLSafeElements);

+  /// Optimize recipes which use an EVL based header mask to a VP intrinsic:


Suggested change

/// Optimize recipes which use an EVL based header mask to a VP intrinsic:

/// Optimize recipes which use an EVL-based header mask to VP intrinsics, for example

fhahn · 2026-01-10T21:04:00Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

+  removeDeadRecipes(Plan);
+}
+
+/// After replacing the IV with a EVL-based IV, fixup recipes that use VF to use


Suggested change

/// After replacing the IV with a EVL-based IV, fixup recipes that use VF to use

/// After replacing the canonical IV with a EVL-based IV, fixup recipes that use VF to use

fhahn · 2026-01-10T21:07:14Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

+  for (VPRecipeBase *OldR : OldRecipes)
+    OldR->eraseFromParent();
+  removeDeadRecipes(Plan);


is there a reason to run full removeDeadRecipes insead of the original code? If so, would be good to add a quick commen explaining why both is needed? (removeDeadRecieps won't remove stores)

Calling removeDeadRecipes was just simpler than having to worry about erasing the old recipes in reverse order, passing through the old recipe's operands etc. Not strongly opinionated about it though and can restore it if you prefer.

Right, I guess it would be better to just move the existing code over, otherwise it may not be NFC as we may remove more recipes than previously earlier on possibly causing other changes elsewhere

fhahn · 2026-01-10T21:14:19Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.h

@@ -266,6 +266,14 @@ struct VPlanTransforms {
  addExplicitVectorLength(VPlan &Plan,


migh be a good opporunity to clarfiy the documentation here, perhaps worth mentioning that this replaces the header mask.

/// replaces all uses except the canonical IV increment of
/// VPCanonicalIVPHIRecipe with a VPEVLBasedIVPHIRecipe.

it may also have to introduce a new phi to track the EVL of the previous iteration

Clarified the documentation in e550c42

fhahn

LGTM, with a few more minor suggestions, thanks

fhahn · 2026-01-13T21:19:17Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

+/// Optimize away any EVL-based header masks to VP intrinsic based recipes.
+/// The transforms here need to preserve the original semantics.
+void VPlanTransforms::optimizeEVLMasks(VPlan &Plan) {
+  // Find the EVL-based header mask if it exists: icmp ult step-vector, EVL


Grea,t hat useful in any cae

fhahn · 2026-01-13T21:20:24Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

+  for (VPRecipeBase *OldR : OldRecipes)
+    OldR->eraseFromParent();
+  removeDeadRecipes(Plan);


Right, I guess it would be better to just move the existing code over, otherwise it may not be NFC as we may remove more recipes than previously earlier on possibly causing other changes elsewhere

fhahn · 2026-01-13T21:21:04Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

+/// - Add a VPEVLBasedIVPHIRecipe and related recipes to \p Plan and
 /// replaces all uses except the canonical IV increment of
 /// VPCanonicalIVPHIRecipe with a VPEVLBasedIVPHIRecipe. VPCanonicalIVPHIRecipe
 /// is used only for loop iterations counting after this transformation.


nit: consistent alignment

Suggested change

/// - Add a VPEVLBasedIVPHIRecipe and related recipes to \p Plan and

/// replaces all uses except the canonical IV increment of

/// VPCanonicalIVPHIRecipe with a VPEVLBasedIVPHIRecipe. VPCanonicalIVPHIRecipe

/// is used only for loop iterations counting after this transformation.

/// - Add a VPEVLBasedIVPHIRecipe and related recipes to \p Plan and

/// replaces all uses except the canonical IV increment of

/// VPCanonicalIVPHIRecipe with a VPEVLBasedIVPHIRecipe. VPCanonicalIVPHIRecipe

/// is used only for loop iterations counting after this transformation.

fhahn · 2026-01-13T21:21:35Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

+/// - Plans with FORs have a new phi added to keep track of the EVL of the
+/// previous iteration, and VPFirstOrderRecurrencePHIRecipes are replaced with
+/// @llvm.vp.splice.


nit: consistent alignment

Suggested change

/// - Plans with FORs have a new phi added to keep track of the EVL of the

/// previous iteration, and VPFirstOrderRecurrencePHIRecipes are replaced with

/// @llvm.vp.splice.

/// - Plans with FORs have a new phi added to keep track of the EVL of the

/// previous iteration, and VPFirstOrderRecurrencePHIRecipes are replaced with

/// @llvm.vp.splice.

fhahn · 2026-01-13T21:22:32Z

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

 }

-/// Add a VPEVLBasedIVPHIRecipe and related recipes to \p Plan and
+/// Converts a tail folded vector loop region to step by @llvm.get.vector.length


might be good to mention the VPINstruction here too

Addresses part of llvm#153144 and splits off part of llvm#166164 There are two parts to the EVL transform: 1) Convert the loop so the number of elements processed each iteration is EVL, not VF. The IV and header mask are replaced with EVL-based variants. 2) Optimize users of the EVL based header mask to VP intrinsic based recipes. (1) changes the semantics of the vector loop region, whereas (2) needs to preserve them. This splits (2) out so we don't mix the two up, and allows us to move (1) earlier in the pipeline in a future PR.

lukel97 requested review from ElvisWang123, Mel-Chen, arcbbb and fhahn January 8, 2026 08:30

llvmbot added vectorizers llvm:transforms labels Jan 8, 2026

Remove dead recipes in addExplicitVectorLength too

abbb939

artagnon reviewed Jan 8, 2026

View reviewed changes

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp Outdated Show resolved Hide resolved

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp Outdated Show resolved Hide resolved

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp Show resolved Hide resolved

fhahn reviewed Jan 8, 2026

View reviewed changes

lukel97 added 2 commits January 10, 2026 15:41

Use m_SpecificICmp, use m_StepVector, pattern match EVL

e5dbe66

Erase recipes at the end to avoid invalidating TypeInfo

5a2b815

Mel-Chen approved these changes Jan 10, 2026

View reviewed changes

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp Show resolved Hide resolved

Fix stray change

acf669c

artagnon reviewed Jan 10, 2026

View reviewed changes

Address comments

3a6438e

artagnon approved these changes Jan 10, 2026

View reviewed changes

llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp Show resolved Hide resolved

fhahn reviewed Jan 10, 2026

View reviewed changes

Update comments

e550c42

fhahn approved these changes Jan 13, 2026

View reviewed changes

lukel97 added 2 commits January 14, 2026 14:21

Use recursivelyDeleteDeadRecipes

d307310

Update comments

3d5108c

lukel97 enabled auto-merge (squash) January 14, 2026 06:24

lukel97 merged commit 0ae23ca into llvm:main Jan 14, 2026
10 of 11 checks passed

	/// %mask = icmp step-vector, EVL
	/// %mask = icmp ult step-vector, EVL

	/// Optimize recipes which use an EVL based header mask to a VP intrinsic:
	/// Optimize recipes which use an EVL-based header mask to VP intrinsics, for example

	/// After replacing the IV with a EVL-based IV, fixup recipes that use VF to use
	/// After replacing the canonical IV with a EVL-based IV, fixup recipes that use VF to use

		@@ -266,6 +266,14 @@ struct VPlanTransforms {
		addExplicitVectorLength(VPlan &Plan,

[VPlan] Split out optimizeEVLMasks. NFC #174925

[VPlan] Split out optimizeEVLMasks. NFC #174925

Uh oh!

Conversation

lukel97 commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Mel-Chen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

artagnon left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

fhahn left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

lukel97 commented Jan 8, 2026 •

edited

Loading

llvmbot commented Jan 8, 2026 •

edited

Loading