[IA] Fix crash when dealing with deinterleave(interleave) #150713
Conversation
Having a sequence of `deinterleave2(interleave2)` causes a crash due to a recent change that expects `getMaskOperand` to only work with load or store intrinsics. This change relaxes that and moves the `llvm_unreachable` into the lowering of interleaved loads and stores.
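For reference, the helper that crashed only knew how to find the mask operand of the masked/VP load and store intrinsics and asserted on anything else; when the source of the `deinterleave2` is the `interleave2` call rather than a load, that assertion fired. A rough sketch of the pre-patch helper (operand layouts follow the LLVM intrinsic definitions; this is an illustration, not the verbatim upstream code):

  static Value *getMaskOperand(IntrinsicInst *II) {
    switch (II->getIntrinsicID()) {
    default:
      llvm_unreachable("Unexpected intrinsic"); // fires for interleave2 -> crash
    case Intrinsic::vp_load:
      return II->getOperand(1); // llvm.vp.load(ptr, mask, evl)
    case Intrinsic::masked_load:
      return II->getOperand(2); // llvm.masked.load(ptr, align, mask, passthru)
    case Intrinsic::vp_store:
      return II->getOperand(2); // llvm.vp.store(value, ptr, mask, evl)
    case Intrinsic::masked_store:
      return II->getOperand(3); // llvm.masked.store(value, ptr, align, mask)
    }
  }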
@llvm/pr-subscribers-backend-x86
Author: Nikolay Panchenko (npanchen)
Changes: Having a sequence of `deinterleave2(interleave2)` causes a crash due to a recent change that expects `getMaskOperand` to only work with load or store intrinsics. The change relaxes this and moves the `llvm_unreachable` into the lowering of interleaved loads or stores.
Full diff: https://github.com/llvm/llvm-project/pull/150713.diff
2 Files Affected:
diff --git a/llvm/lib/CodeGen/InterleavedAccessPass.cpp b/llvm/lib/CodeGen/InterleavedAccessPass.cpp
index c2839d4c60680..de6f5add74dbe 100644
--- a/llvm/lib/CodeGen/InterleavedAccessPass.cpp
+++ b/llvm/lib/CodeGen/InterleavedAccessPass.cpp
@@ -256,7 +256,7 @@ static bool isReInterleaveMask(ShuffleVectorInst *SVI, unsigned &Factor,
static Value *getMaskOperand(IntrinsicInst *II) {
switch (II->getIntrinsicID()) {
default:
- llvm_unreachable("Unexpected intrinsic");
+ return nullptr;
case Intrinsic::vp_load:
return II->getOperand(1);
case Intrinsic::masked_load:
@@ -382,8 +382,11 @@ bool InterleavedAccessImpl::lowerInterleavedLoad(
if (LI) {
LLVM_DEBUG(dbgs() << "IA: Found an interleaved load: " << *Load << "\n");
} else {
+ Value *MaskOperand = getMaskOperand(II);
+ if (!MaskOperand)
+ llvm_unreachable("unsupported intrinsic");
// Check mask operand. Handle both all-true/false and interleaved mask.
- Mask = getMask(getMaskOperand(II), Factor, VecTy);
+ Mask = getMask(MaskOperand, Factor, VecTy);
if (!Mask)
return false;
@@ -534,10 +537,12 @@ bool InterleavedAccessImpl::lowerInterleavedStore(
if (SI) {
LLVM_DEBUG(dbgs() << "IA: Found an interleaved store: " << *Store << "\n");
} else {
+ Value *MaskOperand = getMaskOperand(II);
+ if (!MaskOperand)
+ llvm_unreachable("unsupported intrinsic");
// Check mask operand. Handle both all-true/false and interleaved mask.
unsigned LaneMaskLen = NumStoredElements / Factor;
- Mask = getMask(getMaskOperand(II), Factor,
- ElementCount::getFixed(LaneMaskLen));
+ Mask = getMask(MaskOperand, Factor, ElementCount::getFixed(LaneMaskLen));
if (!Mask)
return false;
@@ -634,9 +639,12 @@ bool InterleavedAccessImpl::lowerDeinterleaveIntrinsic(
<< " and factor = " << Factor << "\n");
} else {
assert(II);
+ Value *MaskOperand = getMaskOperand(II);
+ if (!MaskOperand)
+ return false;
// Check mask operand. Handle both all-true/false and interleaved mask.
- Mask = getMask(getMaskOperand(II), Factor, getDeinterleavedVectorType(DI));
+ Mask = getMask(MaskOperand, Factor, getDeinterleavedVectorType(DI));
if (!Mask)
return false;
@@ -673,8 +681,11 @@ bool InterleavedAccessImpl::lowerInterleaveIntrinsic(
Value *Mask = nullptr;
if (II) {
+ Value *MaskOperand = getMaskOperand(II);
+ if (!MaskOperand)
+ return false;
// Check mask operand. Handle both all-true/false and interleaved mask.
- Mask = getMask(getMaskOperand(II), Factor,
+ Mask = getMask(MaskOperand, Factor,
cast<VectorType>(InterleaveValues[0]->getType()));
if (!Mask)
return false;
diff --git a/llvm/test/CodeGen/X86/x86-interleaved-access.ll b/llvm/test/CodeGen/X86/x86-interleaved-access.ll
index 7cddebdca5cca..de6b18c134464 100644
--- a/llvm/test/CodeGen/X86/x86-interleaved-access.ll
+++ b/llvm/test/CodeGen/X86/x86-interleaved-access.ll
@@ -1897,3 +1897,15 @@ define <2 x i64> @PR37616(ptr %a0) nounwind {
%shuffle = shufflevector <16 x i64> %load, <16 x i64> undef, <2 x i32> <i32 2, i32 6>
ret <2 x i64> %shuffle
}
+
+define { <8 x float>, <8 x float> } @interleave_deinterleave2() {
+; AVX-LABEL: interleave_deinterleave2:
+; AVX: # %bb.0: # %.entry
+; AVX-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; AVX-NEXT: vxorps %xmm1, %xmm1, %xmm1
+; AVX-NEXT: retq
+.entry:
+ %0 = call <16 x float> @llvm.vector.interleave2.v16f32(<8 x float> zeroinitializer, <8 x float> zeroinitializer)
+ %1 = call { <8 x float>, <8 x float> } @llvm.vector.deinterleave2.v16f32(<16 x float> %0)
+ ret { <8 x float>, <8 x float> } %1
+}
That's what I would do.
If I understood you right, you suggest adding a check
    LLVM_DEBUG(dbgs() << "IA: Found an interleaved store: " << *Store << "\n");
  } else {
    Value *MaskOperand = getMaskOperand(II);
    if (!MaskOperand)
If we do it this way, I would use an assert here.
  } else {
    Value *MaskOperand = getMaskOperand(II);
    if (!MaskOperand)
      llvm_unreachable("unsupported intrinsic");
If we do it this way, I would use an assert here.
From an interface design perspective, it feels a little weird to me that the ...
preames left a comment:
There should be no code changes needed in lowerInterleavedLoad or lowerInterleavedStore; the condition is checked by the caller. We do need an update to lowerInterleaveIntrinsic and lowerDeinterleaveIntrinsic, as it looks like I dropped the check in the refactor which introduced getMaskOperand (as you note).
The simplest fix is probably to add:
  if (II && II->getIntrinsicID() != Intrinsic::masked_load &&
      II->getIntrinsicID() != Intrinsic::vp_load)
    return false;
just after the II variable is created (next to where we have the LI->isSimple check).
If you want to do that, please feel free. Otherwise, I'll post a patch Monday morning.
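Placed in context, that early exit might sit roughly here (a sketch only; the variable names and surrounding checks are assumptions based on the diff above, not the verbatim code of lowerDeinterleaveIntrinsic or of the fix that eventually landed):

  Value *LoadedVal = DI->getOperand(0);
  auto *LI = dyn_cast<LoadInst>(LoadedVal);
  auto *II = dyn_cast<IntrinsicInst>(LoadedVal);
  // Bail out when the deinterleave's source is an intrinsic that getMaskOperand
  // does not understand, e.g. the interleave2 in deinterleave2(interleave2).
  if (II && II->getIntrinsicID() != Intrinsic::masked_load &&
      II->getIntrinsicID() != Intrinsic::vp_load)
    return false;
  if (LI && !LI->isSimple())
    return false;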
I went ahead and pushed a direct fix for this in f65b329. I used your test case and added a couple of others.

Looking at the change, I think this could become a reasonable cleanup. I'd be tempted to fold getMaskOperand into getMask (by replacing the mask operand with the LD/ST intrinsic). If we did that, we could reuse the existing exit paths for unsupported intrinsics. Do you want to reframe this patch to do that, or would you like me to follow up?

P.S. Sorry for the breakage. In a case like this, feel free to drop a comment on the original review or file an issue; I generally try to jump on such oversights pretty quickly.
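For illustration, that cleanup might look something like the sketch below. The signature is hypothetical: the existing getMask takes the mask value itself, and its all-true/false and interleaved-mask handling is elided here.

  // Take the candidate load/store intrinsic instead of its mask operand, so
  // unsupported intrinsics fall out through the callers' existing
  // "return false" paths rather than hitting llvm_unreachable.
  static Value *getMask(IntrinsicInst *II, unsigned Factor, ElementCount LeafVF) {
    Value *MaskOperand;
    switch (II->getIntrinsicID()) {
    default:
      return nullptr; // unsupported intrinsic; callers simply bail out
    case Intrinsic::vp_load:
      MaskOperand = II->getOperand(1); // llvm.vp.load(ptr, mask, evl)
      break;
    case Intrinsic::masked_load:
      MaskOperand = II->getOperand(2); // llvm.masked.load(ptr, align, mask, passthru)
      break;
    case Intrinsic::vp_store:
      MaskOperand = II->getOperand(2); // llvm.vp.store(value, ptr, mask, evl)
      break;
    case Intrinsic::masked_store:
      MaskOperand = II->getOperand(3); // llvm.masked.store(value, ptr, align, mask)
      break;
    }
    // ... existing all-true/false and interleaved-mask handling of MaskOperand,
    // using Factor and LeafVF, would go here ...
    return MaskOperand;
  }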