Skip to content

Conversation

@boomanaiden154
Copy link
Contributor

a83c894 caused assertion failures here as if we have a single bit induction variable and two lanes (0 and 1), then the second lane index (1) will be out of bounds of what a signed 1-bit integer can hold. Lane indices are always >0 according to VPlanHelpers.h:125, and the lane representation in this code is also unsigned.

The test case come from tensorflow/XLA.

@llvmbot
Copy link
Member

llvmbot commented Jan 9, 2026

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Aiden Grossman (boomanaiden154)

Changes

a83c894 caused assertion failures here as if we have a single bit induction variable and two lanes (0 and 1), then the second lane index (1) will be out of bounds of what a signed 1-bit integer can hold. Lane indices are always >0 according to VPlanHelpers.h:125, and the lane representation in this code is also unsigned.

The test case come from tensorflow/XLA.


Full diff: https://github.com/llvm/llvm-project/pull/175231.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+3-4)
  • (added) llvm/test/Transforms/LoopVectorize/X86/vplan-single-bit-ind-var.ll (+47)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index b9cd322d9ec69..e31e24fce5316 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -2346,9 +2346,8 @@ InstructionCost VPHeaderPHIRecipe::computeCost(ElementCount VF,
 
 /// A helper function that returns an integer or floating-point constant with
 /// value C.
-static Constant *getSignedIntOrFpConstant(Type *Ty, int64_t C) {
-  return Ty->isIntegerTy() ? ConstantInt::getSigned(Ty, C)
-                           : ConstantFP::get(Ty, C);
+static Constant *getUnsignedIntOrFpConstant(Type *Ty, int64_t C) {
+  return Ty->isIntegerTy() ? ConstantInt::get(Ty, C) : ConstantFP::get(Ty, C);
 }
 
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
@@ -2453,7 +2452,7 @@ void VPScalarIVStepsRecipe::execute(VPTransformState &State) {
 
   for (unsigned Lane = StartLane; Lane < EndLane; ++Lane) {
     Value *StartIdx = Builder.CreateBinOp(
-        AddOp, StartIdx0, getSignedIntOrFpConstant(BaseIVTy, Lane));
+        AddOp, StartIdx0, getUnsignedIntOrFpConstant(BaseIVTy, Lane));
     // The step returned by `createStepForVF` is a runtime-evaluated value
     // when VF is scalable. Otherwise, it should be folded into a Constant.
     assert((State.VF.isScalable() || isa<Constant>(StartIdx)) &&
diff --git a/llvm/test/Transforms/LoopVectorize/X86/vplan-single-bit-ind-var.ll b/llvm/test/Transforms/LoopVectorize/X86/vplan-single-bit-ind-var.ll
new file mode 100644
index 0000000000000..d501a485ac4db
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/X86/vplan-single-bit-ind-var.ll
@@ -0,0 +1,47 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt -passes=loop-vectorize -mcpu=znver3 -S %s 2>&1 | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-grtev4-linux-gnu"
+
+define ptr @copy_bitcast_fusion() {
+; CHECK-LABEL: define ptr @copy_bitcast_fusion(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  [[_PREHEADER7:.*:]]
+; CHECK-NEXT:    br label %[[VECTOR_PH:.*]]
+; CHECK:       [[VECTOR_PH]]:
+; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
+; CHECK:       [[VECTOR_BODY]]:
+; CHECK-NEXT:    [[TMP0:%.*]] = select i1 false, i64 1, i64 0
+; CHECK-NEXT:    [[TMP1:%.*]] = select i1 true, i64 1, i64 0
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr float, ptr null, i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr float, ptr null, i64 [[TMP1]]
+; CHECK-NEXT:    [[TMP4:%.*]] = load float, ptr [[TMP2]], align 4
+; CHECK-NEXT:    [[TMP5:%.*]] = load float, ptr [[TMP3]], align 4
+; CHECK-NEXT:    [[TMP6:%.*]] = insertelement <2 x float> poison, float [[TMP4]], i32 0
+; CHECK-NEXT:    [[TMP7:%.*]] = insertelement <2 x float> [[TMP6]], float [[TMP5]], i32 1
+; CHECK-NEXT:    store <2 x float> [[TMP7]], ptr null, align 4
+; CHECK-NEXT:    br label %[[MIDDLE_BLOCK:.*]]
+; CHECK:       [[MIDDLE_BLOCK]]:
+; CHECK-NEXT:    br label %[[BB8:.*]]
+; CHECK:       [[BB8]]:
+; CHECK-NEXT:    ret ptr null
+;
+.preheader7:
+  br label %0
+
+0:                                                ; preds = %0, %.preheader7
+  %1 = phi i64 [ 0, %.preheader7 ], [ %7, %0 ]
+  %2 = trunc i64 %1 to i1
+  %3 = select i1 %2, i64 1, i64 0
+  %4 = getelementptr float, ptr null, i64 %3
+  %5 = load float, ptr %4, align 4
+  %6 = getelementptr float, ptr null, i64 %1
+  store float %5, ptr %6, align 4
+  %7 = add i64 %1, 1
+  %exitcond.not = icmp eq i64 %1, 1
+  br i1 %exitcond.not, label %8, label %0
+
+8:                                                ; preds = %0
+  ret ptr null
+}

@rupprecht rupprecht requested a review from nikic January 9, 2026 20:20
a83c894 caused assertion failures here
as if we have a single bit induction variable and two lanes (0 and 1),
then the second lane index (1) will be out of bounds of what a signed
1-bit integer can hold. Lane indices are always >0 according to
VPlanHelpers.h:125, and the lane representation in this code is also
unsigned.

The test case come from tensorflow/XLA.
@boomanaiden154 boomanaiden154 force-pushed the vplan-1-bit-ind-var-1-9-25 branch from ae8ddd4 to c1e7621 Compare January 9, 2026 21:59
@boomanaiden154 boomanaiden154 requested a review from nikic January 9, 2026 21:59
@boomanaiden154
Copy link
Contributor Author

(sorry for the force push. I was amending, forgetting I wasn't using spr)

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@boomanaiden154
Copy link
Contributor Author

Thank you very much for the fast review! Looks you already got the last issue we were seeing (1c07ba6).

@boomanaiden154 boomanaiden154 merged commit acb78bd into llvm:main Jan 9, 2026
9 of 10 checks passed
@boomanaiden154 boomanaiden154 deleted the vplan-1-bit-ind-var-1-9-25 branch January 9, 2026 22:28
Priyanshu3820 pushed a commit to Priyanshu3820/llvm-project that referenced this pull request Jan 18, 2026
a83c894 caused assertion failures here
as if we have a single bit induction variable and two lanes (0 and 1),
then the second lane index (1) will be out of bounds of what a signed
1-bit integer can hold. Lane indices are always >0 according to
VPlanHelpers.h:125, and the lane representation in this code is also
unsigned.

The test case come from tensorflow/XLA.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants