[LLVM][TTI] Remove the isVScaleKnownToBeAPowerOfTwo hook. #183292

Merged
paulwalker-arm merged 2 commits into llvm:main from paulwalker-arm:remove-vscale-power-of-two-tti-hook
Feb 25, 2026

paulwalker-arm (Collaborator) commented Feb 25, 2026

After #183080 this is no longer a configurable property.

NOTE: No test changes expected beyond llvm/test/Transforms/LoopVectorize/scalable-predication.ll, which has been removed because it only existed to verify the now-unsupported functionality.
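
For illustration, the effect of the SelectionDAG hunk in this PR can be modelled outside LLVM. The sketch below is not LLVM code: `isPowerOfTwo`, `vscaleNodeValue`, and `vscaleNodeKnownPowerOfTwo` are hypothetical helpers, and it assumes, as this PR does after #183080, that vscale itself is always a power of two. An ISD::VSCALE node carries a constant multiplier operand, so its value is a power of two whenever that multiplier is one (wraparound is ignored here).

```cpp
#include <cstdint>

// True when exactly one bit of x is set.
bool isPowerOfTwo(std::uint64_t x) { return x != 0 && (x & (x - 1)) == 0; }

// Hypothetical stand-in for the value of a VSCALE node:
// the runtime vscale multiplied by the node's constant operand.
std::uint64_t vscaleNodeValue(std::uint64_t vscale, std::uint64_t multiplier) {
  return vscale * multiplier;
}

// Model of the simplified rule: with vscale assumed to be a power of two,
// only the multiplier needs to be inspected; no target hook is consulted.
bool vscaleNodeKnownPowerOfTwo(std::uint64_t multiplier) {
  return isPowerOfTwo(multiplier);
}
```

With vscale = 2, a multiplier of 16 gives 32 (a power of two), while a multiplier of 12 gives 24 (not a power of two), matching what the multiplier-only rule predicts.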

llvmbot added the backend:AArch64, backend:RISC-V, vectorizers, llvm:SelectionDAG, llvm:analysis, and llvm:transforms labels Feb 25, 2026

llvmbot (Member) commented Feb 25, 2026

@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-llvm-analysis

Author: Paul Walker (paulwalker-arm)

Changes

After #183080 this is no longer a configurable property.

NOTE: No test changes expected beyond llvm/test/Transforms/LoopVectorize/scalable-predication.ll which has been removed because it validated the now unsupported functionality.


Full diff: https://github.com/llvm/llvm-project/pull/183292.diff

13 Files Affected:

  • (modified) llvm/include/llvm/Analysis/TargetTransformInfo.h (-3)
  • (modified) llvm/include/llvm/Analysis/TargetTransformInfoImpl.h (-1)
  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (-1)
  • (modified) llvm/include/llvm/CodeGen/TargetLowering.h (-3)
  • (modified) llvm/lib/Analysis/TargetTransformInfo.cpp (-4)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+3-5)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (-2)
  • (modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h (-2)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (-12)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.h (-2)
  • (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h (-4)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+3-24)
  • (removed) llvm/test/Transforms/LoopVectorize/scalable-predication.ll (-114)
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index a7fb0efedadde..18ae6a005d972 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -1358,9 +1358,6 @@ class TargetTransformInfo {
   /// \return the value of vscale to tune the cost model for.
   LLVM_ABI std::optional<unsigned> getVScaleForTuning() const;
 
-  /// \return true if vscale is known to be a power of 2
-  LLVM_ABI bool isVScaleKnownToBeAPowerOfTwo() const;
-
   /// \return True if the vectorization factor should be chosen to
   /// make the vector of the smallest element type match the size of a
   /// vector register. For wider element types, this could result in
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
index 454be56aed6cc..e062b70be6b59 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -644,7 +644,6 @@ class TargetTransformInfoImplBase {
   virtual std::optional<unsigned> getVScaleForTuning() const {
     return std::nullopt;
   }
-  virtual bool isVScaleKnownToBeAPowerOfTwo() const { return false; }
 
   virtual bool
   shouldMaximizeVectorBandwidth(TargetTransformInfo::RegisterKind K) const {
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index 68874c59be4b8..6dcb6f0062a08 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -889,7 +889,6 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
   std::optional<unsigned> getVScaleForTuning() const override {
     return std::nullopt;
   }
-  bool isVScaleKnownToBeAPowerOfTwo() const override { return false; }
 
   /// Estimate the overhead of scalarizing an instruction. Insert and Extract
   /// are set if the demanded result elements need to be inserted and/or
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index 7964bfd81d704..4b60c3f905120 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -623,9 +623,6 @@ class LLVM_ABI TargetLoweringBase {
     return BypassSlowDivWidths;
   }
 
-  /// Return true only if vscale must be a power of two.
-  virtual bool isVScaleKnownToBeAPowerOfTwo() const { return false; }
-
   /// Return true if Flow Control is an expensive operation that should be
   /// avoided.
   bool isJumpExpensive() const { return JumpIsExpensive; }
diff --git a/llvm/lib/Analysis/TargetTransformInfo.cpp b/llvm/lib/Analysis/TargetTransformInfo.cpp
index 0e745a978656b..0f97edc424d7e 100644
--- a/llvm/lib/Analysis/TargetTransformInfo.cpp
+++ b/llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -837,10 +837,6 @@ std::optional<unsigned> TargetTransformInfo::getVScaleForTuning() const {
   return TTIImpl->getVScaleForTuning();
 }
 
-bool TargetTransformInfo::isVScaleKnownToBeAPowerOfTwo() const {
-  return TTIImpl->isVScaleKnownToBeAPowerOfTwo();
-}
-
 bool TargetTransformInfo::shouldMaximizeVectorBandwidth(
     TargetTransformInfo::RegisterKind K) const {
   return TTIImpl->shouldMaximizeVectorBandwidth(K);
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 3affb4de2d4b4..a58c08bd00041 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -4757,11 +4757,9 @@ bool SelectionDAG::isKnownToBeAPowerOfTwo(SDValue Val,
                                   Depth + 1);
 
   case ISD::VSCALE:
-    // vscale(power-of-two) is a power-of-two for some targets
-    if (getTargetLoweringInfo().isVScaleKnownToBeAPowerOfTwo() &&
-        isKnownToBeAPowerOfTwo(Val.getOperand(0), /*OrZero=*/false, Depth + 1))
-      return true;
-    break;
+    // vscale(power-of-two) is a power-of-two
+    return isKnownToBeAPowerOfTwo(Val.getOperand(0), /*OrZero=*/false,
+                                  Depth + 1);
   }
 
   // More could be done here, though the above checks are enough
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 6ecea4f6e2d5e..b1df977d43fcf 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -563,8 +563,6 @@ class AArch64TargetLowering : public TargetLowering {
                               SDValue Chain, SDValue InGlue, unsigned Condition,
                               bool InsertVectorLengthCheck = false) const;
 
-  bool isVScaleKnownToBeAPowerOfTwo() const override { return true; }
-
   /// Returns true if \p RdxOp should be lowered to a SVE reduction. If a SVE2
   /// pairwise operation can be used for the reduction \p PairwiseOpIID is set
   /// to its intrinsic ID.
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
index e166e0cfdaafd..f247e9e49e23f 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
@@ -165,8 +165,6 @@ class AArch64TTIImpl final : public BasicTTIImplBase<AArch64TTIImpl> {
     return ST->getVScaleForTuning();
   }
 
-  bool isVScaleKnownToBeAPowerOfTwo() const override { return true; }
-
   bool shouldMaximizeVectorBandwidth(
       TargetTransformInfo::RegisterKind K) const override;
 
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 77512b609fba8..227abc9e80579 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -25714,18 +25714,6 @@ const MCExpr *RISCVTargetLowering::LowerCustomJumpTableEntry(
   return MCSymbolRefExpr::create(MBB->getSymbol(), Ctx);
 }
 
-bool RISCVTargetLowering::isVScaleKnownToBeAPowerOfTwo() const {
-  // We define vscale to be VLEN/RVVBitsPerBlock.  VLEN is always a power
-  // of two >= 64, and RVVBitsPerBlock is 64.  Thus, vscale must be
-  // a power of two as well.
-  // FIXME: This doesn't work for zve32, but that's already broken
-  // elsewhere for the same reason.
-  assert(Subtarget.getRealMinVLen() >= 64 && "zve32* unsupported");
-  static_assert(RISCV::RVVBitsPerBlock == 64,
-                "RVVBitsPerBlock changed, audit needed");
-  return true;
-}
-
 bool RISCVTargetLowering::getIndexedAddressParts(SDNode *Op, SDValue &Base,
                                                  SDValue &Offset,
                                                  ISD::MemIndexedMode &AM,
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index c4bb32802ec05..8d88aeb7ae3fc 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -392,8 +392,6 @@ class RISCVTargetLowering : public TargetLowering {
                                           unsigned uid,
                                           MCContext &Ctx) const override;
 
-  bool isVScaleKnownToBeAPowerOfTwo() const override;
-
   bool getIndexedAddressParts(SDNode *Op, SDValue &Base, SDValue &Offset,
                               ISD::MemIndexedMode &AM, SelectionDAG &DAG) const;
   bool getPreIndexedAddressParts(SDNode *N, SDValue &Base, SDValue &Offset,
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 9e9277f050e01..424f9fe52c59e 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -358,10 +358,6 @@ class RISCVTTIImpl final : public BasicTTIImplBase<RISCVTTIImpl> {
 
   bool isLegalMaskedCompressStore(Type *DataTy, Align Alignment) const override;
 
-  bool isVScaleKnownToBeAPowerOfTwo() const override {
-    return TLI->isVScaleKnownToBeAPowerOfTwo();
-  }
-
   /// \returns How the target needs this vector-predicated operation to be
   /// transformed.
   TargetTransformInfo::VPLegalization
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 2342c8bfa502e..0fd425c23c7aa 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -2383,21 +2383,8 @@ Value *EpilogueVectorizerMainLoop::createIterationCountCheck(
       // check is known to be true, or known to be false.
       CheckMinIters = Builder.CreateICmp(P, Count, Step, "min.iters.check");
     } // else step known to be < trip count, use CheckMinIters preset to false.
-  } else if (VF.isScalable() && !TTI->isVScaleKnownToBeAPowerOfTwo() &&
-             !isIndvarOverflowCheckKnownFalse(Cost, VF, UF) &&
-             Style != TailFoldingStyle::DataAndControlFlowWithoutRuntimeCheck) {
-    // vscale is not necessarily a power-of-2, which means we cannot guarantee
-    // an overflow to zero when updating induction variables and so an
-    // additional overflow check is required before entering the vector loop.
-
-    // Get the maximum unsigned value for the type.
-    Value *MaxUIntTripCount =
-        ConstantInt::get(CountTy, cast<IntegerType>(CountTy)->getMask());
-    Value *LHS = Builder.CreateSub(MaxUIntTripCount, Count);
-
-    // Don't execute the vector loop if (UMax - n) < (VF * UF).
-    CheckMinIters = Builder.CreateICmp(ICmpInst::ICMP_ULT, LHS, CreateStep());
   }
+
   return CheckMinIters;
 }
 
@@ -3663,7 +3650,7 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
       MaxFactors.FixedVF.getFixedValue();
   if (MaxFactors.ScalableVF) {
     std::optional<unsigned> MaxVScale = getMaxVScale(*TheFunction, TTI);
-    if (MaxVScale && TTI.isVScaleKnownToBeAPowerOfTwo()) {
+    if (MaxVScale) {
       MaxPowerOf2RuntimeVF = std::max<unsigned>(
           *MaxPowerOf2RuntimeVF,
           *MaxVScale * MaxFactors.ScalableVF.getKnownMinValue());
@@ -8692,14 +8679,6 @@ void LoopVectorizationPlanner::attachRuntimeChecks(
 void LoopVectorizationPlanner::addMinimumIterationCheck(
     VPlan &Plan, ElementCount VF, unsigned UF,
     ElementCount MinProfitableTripCount) const {
-  // vscale is not necessarily a power-of-2, which means we cannot guarantee
-  // an overflow to zero when updating induction variables and so an
-  // additional overflow check is required before entering the vector loop.
-  bool IsIndvarOverflowCheckNeededForVF =
-      VF.isScalable() && !TTI.isVScaleKnownToBeAPowerOfTwo() &&
-      !isIndvarOverflowCheckKnownFalse(&CM, VF, UF) &&
-      CM.getTailFoldingStyle() !=
-          TailFoldingStyle::DataAndControlFlowWithoutRuntimeCheck;
   const uint32_t *BranchWeigths =
       hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())
           ? &MinItersBypassWeights[0]
@@ -8707,7 +8686,7 @@ void LoopVectorizationPlanner::addMinimumIterationCheck(
   VPlanTransforms::addMinimumIterationCheck(
       Plan, VF, UF, MinProfitableTripCount,
       CM.requiresScalarEpilogue(VF.isVector()), CM.foldTailByMasking(),
-      IsIndvarOverflowCheckNeededForVF, OrigLoop, BranchWeigths,
+      /*CheckNeededWithTailFolding=*/false, OrigLoop, BranchWeigths,
       OrigLoop->getLoopPredecessor()->getTerminator()->getDebugLoc(), PSE);
 }
 
diff --git a/llvm/test/Transforms/LoopVectorize/scalable-predication.ll b/llvm/test/Transforms/LoopVectorize/scalable-predication.ll
deleted file mode 100644
index 65d3e7e7cbdf4..0000000000000
--- a/llvm/test/Transforms/LoopVectorize/scalable-predication.ll
+++ /dev/null
@@ -1,114 +0,0 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -passes=loop-vectorize -force-tail-folding-style=data -prefer-predicate-over-epilogue=predicate-dont-vectorize -force-target-supports-scalable-vectors -S < %s | FileCheck %s
-
-; vscale is not guaranteed to be a power of two, so this test (which
-; deliberately doesn't correspond to an in-tree backend since those
-; *do* have vscale as power-of-two) exercises the code required for the
-; minimum iteration check in the non-power-of-two case.
-
-define void @foo(i32 %val, ptr dereferenceable(1024) %ptr) {
-; CHECK-LABEL: @foo(
-; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP6:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT:    [[TMP7:%.*]] = shl nuw i64 [[TMP6]], 2
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp ult i64 -257, [[TMP7]]
-; CHECK-NEXT:    br i1 [[TMP8]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
-; CHECK:       vector.ph:
-; CHECK-NEXT:    [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT:    [[TMP1:%.*]] = shl nuw i64 [[TMP0]], 2
-; CHECK-NEXT:    [[TMP2:%.*]] = sub i64 [[TMP1]], 1
-; CHECK-NEXT:    [[N_RND_UP:%.*]] = add i64 256, [[TMP2]]
-; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[N_RND_UP]], [[TMP1]]
-; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[N_RND_UP]], [[N_MOD_VF]]
-; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
-; CHECK:       vector.body:
-; CHECK-NEXT:    [[INDEX1:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT2:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT:    [[INDEX_NEXT2]] = add i64 [[INDEX1]], [[TMP1]]
-; CHECK-NEXT:    [[TMP5:%.*]] = icmp eq i64 [[INDEX_NEXT2]], [[N_VEC]]
-; CHECK-NEXT:    br i1 [[TMP5]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
-; CHECK:       middle.block:
-; CHECK-NEXT:    br label [[WHILE_END_LOOPEXIT:%.*]]
-; CHECK:       scalar.ph:
-; CHECK-NEXT:    br label [[WHILE_BODY:%.*]]
-; CHECK:       while.body:
-; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ [[INDEX_NEXT:%.*]], [[WHILE_BODY]] ], [ 0, [[SCALAR_PH]] ]
-; CHECK-NEXT:    [[GEP:%.*]] = getelementptr i32, ptr [[PTR:%.*]], i64 [[INDEX]]
-; CHECK-NEXT:    [[LD1:%.*]] = load i32, ptr [[GEP]], align 4
-; CHECK-NEXT:    [[INDEX_NEXT]] = add nsw i64 [[INDEX]], 1
-; CHECK-NEXT:    [[CMP10:%.*]] = icmp ult i64 [[INDEX_NEXT]], 256
-; CHECK-NEXT:    br i1 [[CMP10]], label [[WHILE_BODY]], label [[WHILE_END_LOOPEXIT]], !llvm.loop [[LOOP3:![0-9]+]]
-; CHECK:       while.end.loopexit:
-; CHECK-NEXT:    ret void
-;
-entry:
-  br label %while.body
-
-while.body:                                       ; preds = %while.body, %entry
-  %index = phi i64 [ %index.next, %while.body ], [ 0, %entry ]
-  %gep = getelementptr i32, ptr %ptr, i64 %index
-  %ld1 = load i32, ptr %gep, align 4
-  %index.next = add nsw i64 %index, 1
-  %cmp10 = icmp ult i64 %index.next, 256
-  br i1 %cmp10, label %while.body, label %while.end.loopexit, !llvm.loop !0
-
-while.end.loopexit:                               ; preds = %while.body
-  ret void
-}
-
-; Same as @foo, but with variable trip count.
-define void @foo2(i32 %val, ptr dereferenceable(1024) %ptr, i64 %n) {
-; CHECK-LABEL: @foo2(
-; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[UMAX:%.*]] = call i64 @llvm.umax.i64(i64 [[N:%.*]], i64 1)
-; CHECK-NEXT:    [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT:    [[TMP1:%.*]] = shl nuw i64 [[TMP0]], 2
-; CHECK-NEXT:    [[TMP2:%.*]] = sub i64 -1, [[UMAX]]
-; CHECK-NEXT:    [[TMP3:%.*]] = icmp ult i64 [[TMP2]], [[TMP1]]
-; CHECK-NEXT:    br i1 [[TMP3]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
-; CHECK:       vector.ph:
-; CHECK-NEXT:    [[TMP4:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT:    [[TMP5:%.*]] = shl nuw i64 [[TMP4]], 2
-; CHECK-NEXT:    [[TMP6:%.*]] = sub i64 [[TMP5]], 1
-; CHECK-NEXT:    [[N_RND_UP:%.*]] = add i64 [[UMAX]], [[TMP6]]
-; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[N_RND_UP]], [[TMP5]]
-; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[N_RND_UP]], [[N_MOD_VF]]
-; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
-; CHECK:       vector.body:
-; CHECK-NEXT:    [[INDEX1:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT2:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT:    [[INDEX_NEXT2]] = add i64 [[INDEX1]], [[TMP5]]
-; CHECK-NEXT:    [[TMP7:%.*]] = icmp eq i64 [[INDEX_NEXT2]], [[N_VEC]]
-; CHECK-NEXT:    br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
-; CHECK:       middle.block:
-; CHECK-NEXT:    br label [[WHILE_END_LOOPEXIT:%.*]]
-; CHECK:       scalar.ph:
-; CHECK-NEXT:    br label [[WHILE_BODY:%.*]]
-; CHECK:       while.body:
-; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ [[INDEX_NEXT:%.*]], [[WHILE_BODY]] ], [ 0, [[SCALAR_PH]] ]
-; CHECK-NEXT:    [[GEP:%.*]] = getelementptr i32, ptr [[PTR:%.*]], i64 [[INDEX]]
-; CHECK-NEXT:    [[LD1:%.*]] = load i32, ptr [[GEP]], align 4
-; CHECK-NEXT:    [[INDEX_NEXT]] = add nsw i64 [[INDEX]], 1
-; CHECK-NEXT:    [[CMP10:%.*]] = icmp ult i64 [[INDEX_NEXT]], [[N]]
-; CHECK-NEXT:    br i1 [[CMP10]], label [[WHILE_BODY]], label [[WHILE_END_LOOPEXIT]], !llvm.loop [[LOOP5:![0-9]+]]
-; CHECK:       while.end.loopexit:
-; CHECK-NEXT:    ret void
-;
-entry:
-  br label %while.body
-
-while.body:                                       ; preds = %while.body, %entry
-  %index = phi i64 [ %index.next, %while.body ], [ 0, %entry ]
-  %gep = getelementptr i32, ptr %ptr, i64 %index
-  %ld1 = load i32, ptr %gep, align 4
-  %index.next = add nsw i64 %index, 1
-  %cmp10 = icmp ult i64 %index.next, %n
-  br i1 %cmp10, label %while.body, label %while.end.loopexit, !llvm.loop !0
-
-while.end.loopexit:                               ; preds = %while.body
-  ret void
-}
-
-!0 = distinct !{!0, !1, !2, !3, !4}
-!1 = !{!"llvm.loop.vectorize.predicate.enable", i1 true}
-!2 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}
-!3 = !{!"llvm.loop.interleave.count", i32 1}
-!4 = !{!"llvm.loop.vectorize.width", i32 4}

paulwalker-arm (Collaborator, Author) commented on llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:

       Plan, VF, UF, MinProfitableTripCount,
       CM.requiresScalarEpilogue(VF.isVector()), CM.foldTailByMasking(),
-      IsIndvarOverflowCheckNeededForVF, OrigLoop, BranchWeigths,
+      /*CheckNeededWithTailFolding=*/false, OrigLoop, BranchWeigths,

I suspect the CheckNeededWithTailFolding parameter can be removed, but I'd rather do that as a separate PR if that's agreeable.

Reply from a Contributor:

I think it can, #183066 should remove it

lukel97 (Contributor) left a comment:

LGTM

david-arm (Contributor) left a comment:

LGTM!

fhahn (Contributor) left a comment:

LGTM, thanks

paulwalker-arm merged commit ab360b1 into llvm:main Feb 25, 2026
17 checks passed
paulwalker-arm deleted the remove-vscale-power-of-two-tti-hook branch February 25, 2026 14:09

llvm-ci commented Feb 25, 2026

LLVM Buildbot has detected a new failure on builder lldb-x86_64-debian running on lldb-x86_64-debian while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/41830

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
6.526 [1/1/51] Linking CXX executable bin/lldb-test
6.527 [0/1/51] Running lldb lit test suite
llvm-lit: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/llvm/utils/lit/lit/llvm/config.py:561: note: using clang: /home/worker/2.0.1/lldb-x86_64-debian/build/bin/clang
llvm-lit: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/llvm/utils/lit/lit/llvm/config.py:561: note: using ld.lld: /home/worker/2.0.1/lldb-x86_64-debian/build/bin/ld.lld
llvm-lit: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/llvm/utils/lit/lit/llvm/config.py:561: note: using lld-link: /home/worker/2.0.1/lldb-x86_64-debian/build/bin/lld-link
llvm-lit: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/llvm/utils/lit/lit/llvm/config.py:561: note: using ld64.lld: /home/worker/2.0.1/lldb-x86_64-debian/build/bin/ld64.lld
llvm-lit: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/llvm/utils/lit/lit/llvm/config.py:561: note: using wasm-ld: /home/worker/2.0.1/lldb-x86_64-debian/build/bin/wasm-ld
llvm-lit: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/Shell/lit.cfg.py:115: note: Deleting module cache at /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-clang/lldb-shell.
-- Testing: 3395 tests, 72 workers --
UNRESOLVED: lldb-api :: commands/gui/spawn-threads/TestGuiSpawnThreads.py (1 of 3395)
******************** TEST 'lldb-api :: commands/gui/spawn-threads/TestGuiSpawnThreads.py' FAILED ********************
Script:
--
/usr/bin/python3 /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./lib --env LLVM_INCLUDE_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/include --env LLVM_TOOLS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./bin --arch x86_64 --build-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex --lldb-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/lldb --compiler /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/clang --dsymutil /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./bin --lldb-obj-root /home/worker/2.0.1/lldb-x86_64-debian/build/tools/lldb --lldb-libs-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./lib --cmake-build-type Release -t /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/commands/gui/spawn-threads -p TestGuiSpawnThreads.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 23.0.0git (https://github.com/llvm/llvm-project.git revision ab360b1e7ef2bcc1c98443427da3c13e98e1331b)
  clang revision ab360b1e7ef2bcc1c98443427da3c13e98e1331b
  llvm revision ab360b1e7ef2bcc1c98443427da3c13e98e1331b

(lldb) settings clear --all
(lldb) settings set symbols.enable-external-lookup false
(lldb) settings set target.inherit-tcc true
(lldb) settings set target.disable-aslr false
(lldb) settings set target.detach-on-error false
(lldb) settings set target.auto-apply-fixits false
(lldb) settings set plugin.process.gdb-remote.packet-timeout 60
(lldb) settings set symbols.clang-modules-cache-path "/home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api"
(lldb) settings set use-color false
(lldb) settings set show-statusline false
(lldb) target create "/home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/commands/gui/spawn-threads/TestGuiSpawnThreads/a.out"
Current executable set to '/home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/commands/gui/spawn-threads/TestGuiSpawnThreads/a.out' (x86_64).

(lldb) breakpoint set -f main.cpp -p "break here"
breakpoint set -f main.cpp -p "break here"
Breakpoint 1: where = a.out`test_thread() + 33 at main.cpp:14:20, address = 0x0000000000001291

(lldb) breakpoint set -f main.cpp -p "before join"
breakpoint set -f main.cpp -p "before join"
Breakpoint 2: where = a.out`test_thread() + 129 at main.cpp:16:3, address = 0x00000000000012f1

(lldb) run
run
Process 498231 launched: '/home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/commands/gui/spawn-threads/TestGuiSpawnThreads/a.out' (x86_64)
Process 498231 stopped

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Feb 27, 2026
…heck

Previously, the canonical IV increment may have overflowed to a non-zero value due to vscale being a non power-of-two. So we used to emit a runtime check for this.

If you didn't want the runtime check, DataAndControlFlowWithoutRuntimeCheck skipped it and instead tweaked the trip count so it wouldn't overflow.

However llvm#144963 stopped the check from ever being emitted (and in llvm#183292 the code to emit the check was removed), but we never restored the trip count back to normal now that it was no longer needed.

This PR restores the trip count since we don't need to adjust it. A follow up NFC can then remove DataAndControlFlowWithoutRuntimeCheck.
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Feb 27, 2026
Stacked on llvm#183729

After llvm#144963 and llvm#183292 we never emit the runtime check, so DataAndControlFlowWithoutRuntimeCheck is equivalent to DataAndControlFlow.

With that we only need to store one tail folding style instead of two, because we don't need to distinguish whether or not the IV update overflows (to a non-zero value)
lukel97 added a commit that referenced this pull request Feb 28, 2026
…heck (#183729)

Previously, the canonical IV increment may have overflowed to a non-zero
value due to vscale being a non power-of-two. So we used to emit a
runtime check for this.

If you didn't want the runtime check,
DataAndControlFlowWithoutRuntimeCheck skipped it and instead tweaked the
trip count so it wouldn't overflow.

However #144963 stopped the check from ever being emitted because vscale
is always a power-of-two on AArch64 and RISC-V, so it never overflowed
to a non-zero value. And in #183292 the code to emit the check was
removed. But we never restored the trip count back to normal when the
target's vscale was a power-of-two.

Now that vscale is always a power-of-two, this PR avoids adjusting it. A
follow up NFC can then remove DataAndControlFlowWithoutRuntimeCheck.
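
To make the "overflowed to a non-zero value" case concrete, here is a small model, in 8-bit arithmetic, of the tail-folded loop shape seen in the removed scalable-predication.ll test (n.rnd.up, n.mod.vf, n.vec, and a latch that exits when the IV equals n.vec). It is a sketch rather than LLVM code: the function names are hypothetical, uint8_t stands in for the IV type so that wraparound is easy to reach, and n is assumed to be at least 1.

```cpp
#include <cassert>
#include <cstdint>

// Counts the vector iterations a tail-folded loop would execute when the
// trip count and IV live in an 8-bit type (hypothetical model).
unsigned vectorIterations8(std::uint8_t n, std::uint8_t step) {
  assert(step != 0 && "step (VF * UF) must be non-zero");
  // n.rnd.up = n + (step - 1); this may wrap for n close to 255.
  const std::uint8_t rndUp = static_cast<std::uint8_t>(n + (step - 1));
  // n.vec = n.rnd.up - (n.rnd.up % step)
  const std::uint8_t nVec = static_cast<std::uint8_t>(rndUp - rndUp % step);
  std::uint8_t iv = 0;
  unsigned iters = 0;
  do {
    iv = static_cast<std::uint8_t>(iv + step); // index.next, wraps mod 256
    ++iters;
  } while (iv != nVec); // latch: exit once index.next == n.vec
  return iters;
}

// Iterations actually required: ceil(n / step), computed without wraparound.
unsigned expectedIterations(unsigned n, unsigned step) {
  return (n + step - 1) / step;
}
```

With n = 254, a step of 4 runs the expected 64 iterations because the IV wraps to exactly 0, which is also the wrapped n.vec. A step of 6 would run 128 iterations instead of 43, because the IV wraps from 252 to 2 and only returns to 0 much later.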
lukel97 added a commit that referenced this pull request Mar 2, 2026
After #144963 and #183292 we never emit the runtime check, so
DataAndControlFlowWithoutRuntimeCheck is equivalent to
DataAndControlFlow.

With that we only need to store one tail folding style instead of two,
because we don't need to distinguish whether or not the IV update
overflows (to a non-zero value)
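
For reference, the runtime check mentioned here had the form described by the comment deleted in #183292: do not execute the vector loop if (UMax - n) < (VF * UF). A minimal sketch of that predicate follows, with a hypothetical function name and a 64-bit trip count assumed.

```cpp
#include <cstdint>
#include <limits>

// Predicate of the removed check (hypothetical name). Returns true when
// adding `step` (VF * UF) to a trip count of `tripCount` could wrap, in
// which case the old code branched to the scalar loop.
bool removedIndvarOverflowCheck(std::uint64_t tripCount, std::uint64_t step) {
  const std::uint64_t maxUInt = std::numeric_limits<std::uint64_t>::max();
  return (maxUInt - tripCount) < step;
}
```

For the constant trip count of 256 in the removed test, UMax - 256 is the -257 that appears in the old CHECK lines, so the predicate is false for any realistic step.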
6 participants