[LV] Vectorize conditional scalar assignments #158088
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -58,6 +58,8 @@ bool RecurrenceDescriptor::isIntegerRecurrenceKind(RecurKind Kind) { | |
| case RecurKind::FindFirstIVUMin: | ||
| case RecurKind::FindLastIVSMax: | ||
| case RecurKind::FindLastIVUMax: | ||
| // TODO: Make type-agnostic. | ||
|
Collaborator
If this is not type-agnostic, should this be reflected in the name of the recurrence kind?
Collaborator
Author
It isn't for the other FindFirst/FindLast (though that might be inferred by U/S Min/Max) or AnyOf. I think it's just the fp-based reduction types that are prefixed with an extra I did experiment with treating FindLast separately in
||
| case RecurKind::FindLast: | ||
| return true; | ||
| } | ||
| return false; | ||
|
|
@@ -721,9 +723,15 @@ RecurrenceDescriptor::isAnyOfPattern(Loop *Loop, PHINode *OrigPhi, | |
| // if (src[i] > 3) | ||
| // r = i; | ||
| // } | ||
| // or like this: | ||
| // int r = 0; | ||
| // for (int i = 0; i < n; i++) { | ||
| // if (src[i] > 3) | ||
| // r = <loop-varying value>; | ||
| // } | ||
| // The reduction value (r) is derived from either the values of an induction | ||
| // variable (i) sequence, or from the start value (0). The LLVM IR generated for | ||
| // such loops would be as follows: | ||
| // variable (i) sequence, an arbitrary loop-varying value, or from the start | ||
| // value (0). The LLVM IR generated for such loops would be as follows: | ||
| // for.body: | ||
| // %r = phi i32 [ %spec.select, %for.body ], [ 0, %entry ] | ||
| // %i = phi i32 [ %inc, %for.body ], [ 0, %entry ] | ||
|
|
@@ -732,23 +740,27 @@ RecurrenceDescriptor::isAnyOfPattern(Loop *Loop, PHINode *OrigPhi, | |
| // %spec.select = select i1 %cmp, i32 %i, i32 %r | ||
| // %inc = add nsw i32 %i, 1 | ||
| // ... | ||
| // Since 'i' is an induction variable, the reduction value after the loop will | ||
| // be the maximum (increasing induction) or minimum (decreasing induction) value | ||
| // of 'i' that the condition (src[i] > 3) is satisfied, or the start value (0 in | ||
| // the example above). When the start value of the induction variable 'i' is | ||
| // greater than the minimum (increasing induction) or maximum (decreasing | ||
| // induction) value of the data type, we can use the minimum (increasing | ||
| // induction) or maximum (decreasing induction) value of the data type as a | ||
| // sentinel value to replace the start value. This allows us to perform a single | ||
| // reduction max (increasing induction) or min (decreasing induction) operation | ||
| // to obtain the final reduction result. | ||
| // When searching for an induction variable (i), the reduction value after the | ||
| // loop will be the maximum (increasing induction) or minimum (decreasing | ||
| // induction) value of 'i' that the condition (src[i] > 3) is satisfied, or the | ||
| // start value (0 in the example above). When the start value of the induction | ||
| // variable 'i' is greater than the minimum (increasing induction) or maximum | ||
| // (decreasing induction) value of the data type, we can use the minimum | ||
| // (increasing induction) or maximum (decreasing induction) value of the data | ||
| // type as a sentinel value to replace the start value. This allows us to | ||
| // perform a single reduction max (increasing induction) or min (decreasing | ||
| // induction) operation to obtain the final reduction result. | ||
|
Comment on lines +743 to +752
Collaborator
I know this isn't your writing, but this is very difficult to follow.
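A scalar sketch may make the sentinel idea in the comment above easier to follow. It is not taken from the patch: the function names are hypothetical, the condition `src[i] > 3` is borrowed from the comment's example, and only the increasing-induction case is shown. Because the induction variable starts at 0, which is greater than the minimum value of `int32_t`, that minimum can stand in for "condition never met", and a single max reduction recovers the last matching index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Reference semantics: the conditional scalar assignment from the comment.
int32_t findLastIndexScalar(const std::vector<int32_t> &src, int32_t start) {
  int32_t r = start;
  for (int32_t i = 0; i < static_cast<int32_t>(src.size()); ++i)
    if (src[i] > 3)
      r = i;
  return r;
}

// Sentinel formulation (hypothetical helper, increasing induction only).
// Iterations that do not match contribute the sentinel, so the loop body
// becomes a plain max reduction with no loop-carried select on 'start'.
int32_t findLastIndexViaMax(const std::vector<int32_t> &src, int32_t start) {
  const int32_t sentinel = std::numeric_limits<int32_t>::min();
  int32_t rdx = sentinel;
  for (int32_t i = 0; i < static_cast<int32_t>(src.size()); ++i)
    rdx = std::max(rdx, src[i] > 3 ? i : sentinel);
  // If the reduction still holds the sentinel, the condition never fired,
  // so the original start value is the result.
  return rdx == sentinel ? start : rdx;
}
```

For a decreasing induction the same reasoning applies with a min reduction and the maximum value of the type as the sentinel. The range check in the patch exists because the trick only works when the induction variable can never take the sentinel value itself.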
||
| // TODO: It is possible to solve the case where the start value is the minimum | ||
| // value of the data type or a non-constant value by using mask and multiple | ||
| // reduction operations. | ||
| // | ||
| // When searching for an arbitrary loop-varying value, the reduction value will | ||
| // either be the initial value (0) if the condition was never met, or the value | ||
| // of the loop-varying value in the most recent loop iteration where the | ||
| // condition was met. | ||
| RecurrenceDescriptor::InstDesc | ||
| RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop, | ||
| PHINode *OrigPhi, Instruction *I, | ||
| ScalarEvolution &SE) { | ||
| RecurrenceDescriptor::isFindPattern(Loop *TheLoop, PHINode *OrigPhi, | ||
| Instruction *I, ScalarEvolution &SE) { | ||
| // TODO: Support the vectorization of FindLastIV when the reduction phi is | ||
| // used by more than one select instruction. This vectorization is only | ||
| // performed when the SCEV of each increasing induction variable used by the | ||
|
|
@@ -757,8 +769,10 @@ RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop, | |
| return InstDesc(false, I); | ||
|
|
||
| // We are looking for selects of the form: | ||
| // select(cmp(), phi, loop_induction) or | ||
| // select(cmp(), loop_induction, phi) | ||
| // select(cmp(), phi, value) or | ||
| // select(cmp(), value, phi) | ||
| // where 'value' must be a loop induction variable | ||
| // (for FindFirstIV/FindLastIV) or an arbitrary value (for FindLast). | ||
| // TODO: Match selects with multi-use cmp conditions. | ||
| Value *NonRdxPhi = nullptr; | ||
| if (!match(I, m_CombineOr(m_Select(m_OneUse(m_Cmp()), m_Value(NonRdxPhi), | ||
|
|
@@ -769,7 +783,7 @@ RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop, | |
|
|
||
| // Returns either FindFirstIV/FindLastIV, if such a pattern is found, or | ||
| // std::nullopt. | ||
| auto GetRecurKind = [&](Value *V) -> std::optional<RecurKind> { | ||
| auto GetFindFirstLastIVRecurKind = [&](Value *V) -> std::optional<RecurKind> { | ||
| Type *Ty = V->getType(); | ||
| if (!SE.isSCEVable(Ty)) | ||
| return std::nullopt; | ||
|
|
@@ -780,8 +794,9 @@ RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop, | |
| m_SpecificLoop(TheLoop)))) | ||
| return std::nullopt; | ||
|
|
||
| if ((isFindFirstIVRecurrenceKind(Kind) && !SE.isKnownNegative(Step)) || | ||
| (isFindLastIVRecurrenceKind(Kind) && !SE.isKnownPositive(Step))) | ||
| // We must have a known positive or negative step for FindIV | ||
| const bool PositiveStep = SE.isKnownPositive(Step); | ||
| if (!SE.isKnownNonZero(Step)) | ||
| return std::nullopt; | ||
|
|
||
| // Check if the minimum (FindLast) or maximum (FindFirst) value of the | ||
|
|
@@ -797,7 +812,7 @@ RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop, | |
| IsSigned ? SE.getSignedRange(AR) : SE.getUnsignedRange(AR); | ||
| unsigned NumBits = Ty->getIntegerBitWidth(); | ||
| ConstantRange ValidRange = ConstantRange::getEmpty(NumBits); | ||
| if (isFindLastIVRecurrenceKind(Kind)) { | ||
| if (PositiveStep) { | ||
| APInt Sentinel = IsSigned ? APInt::getSignedMinValue(NumBits) | ||
| : APInt::getMinValue(NumBits); | ||
| ValidRange = ConstantRange::getNonEmpty(Sentinel + 1, Sentinel); | ||
|
|
@@ -811,26 +826,22 @@ RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop, | |
| APInt::getMinValue(NumBits), APInt::getMaxValue(NumBits) - 1); | ||
| } | ||
|
|
||
| LLVM_DEBUG(dbgs() << "LV: " | ||
| << (isFindLastIVRecurrenceKind(Kind) ? "FindLastIV" | ||
| : "FindFirstIV") | ||
| << " valid range is " << ValidRange | ||
| << ", and the range of " << *AR << " is " << IVRange | ||
| << "\n"); | ||
| LLVM_DEBUG( | ||
| dbgs() << "LV: " << (PositiveStep ? "FindLastIV" : "FindFirstIV") | ||
| << " valid range is " << ValidRange << ", and the range of " | ||
| << *AR << " is " << IVRange << "\n"); | ||
|
|
||
| // Ensure the induction variable does not wrap around by verifying that | ||
| // its range is fully contained within the valid range. | ||
| return ValidRange.contains(IVRange); | ||
| }; | ||
| if (isFindLastIVRecurrenceKind(Kind)) { | ||
| if (PositiveStep) { | ||
| if (CheckRange(true)) | ||
| return RecurKind::FindLastIVSMax; | ||
| if (CheckRange(false)) | ||
| return RecurKind::FindLastIVUMax; | ||
| return std::nullopt; | ||
| } | ||
| assert(isFindFirstIVRecurrenceKind(Kind) && | ||
| "Kind must either be a FindLastIV or FindFirstIV"); | ||
|
|
||
| if (CheckRange(true)) | ||
| return RecurKind::FindFirstIVSMin; | ||
|
|
@@ -839,10 +850,11 @@ RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop, | |
| return std::nullopt; | ||
| }; | ||
|
|
||
| if (auto RK = GetRecurKind(NonRdxPhi)) | ||
| if (auto RK = GetFindFirstLastIVRecurKind(NonRdxPhi)) | ||
| return InstDesc(I, *RK); | ||
|
|
||
| return InstDesc(false, I); | ||
| // If the recurrence is not specific to an IV, return a generic FindLast. | ||
| return InstDesc(I, RecurKind::FindLast); | ||
| } | ||
|
|
||
| RecurrenceDescriptor::InstDesc | ||
|
|
@@ -976,8 +988,8 @@ RecurrenceDescriptor::InstDesc RecurrenceDescriptor::isRecurrenceInstr( | |
| Kind == RecurKind::Add || Kind == RecurKind::Mul || | ||
| Kind == RecurKind::Sub || Kind == RecurKind::AddChainWithSubs) | ||
| return isConditionalRdxPattern(I); | ||
| if (isFindIVRecurrenceKind(Kind) && SE) | ||
| return isFindIVPattern(Kind, L, OrigPhi, I, *SE); | ||
| if (isFindRecurrenceKind(Kind) && SE) | ||
| return isFindPattern(L, OrigPhi, I, *SE); | ||
| [[fallthrough]]; | ||
| case Instruction::FCmp: | ||
| case Instruction::ICmp: | ||
|
|
@@ -1117,14 +1129,9 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop, | |
| << "\n"); | ||
| return true; | ||
| } | ||
| if (AddReductionVar(Phi, RecurKind::FindLastIVSMax, TheLoop, FMF, RedDes, DB, | ||
| AC, DT, SE)) { | ||
| LLVM_DEBUG(dbgs() << "Found a FindLastIV reduction PHI." << *Phi << "\n"); | ||
| return true; | ||
| } | ||
| if (AddReductionVar(Phi, RecurKind::FindFirstIVSMin, TheLoop, FMF, RedDes, DB, | ||
| AC, DT, SE)) { | ||
| LLVM_DEBUG(dbgs() << "Found a FindFirstIV reduction PHI." << *Phi << "\n"); | ||
| if (AddReductionVar(Phi, RecurKind::FindLast, TheLoop, FMF, RedDes, DB, AC, | ||
| DT, SE)) { | ||
| LLVM_DEBUG(dbgs() << "Found a Find reduction PHI." << *Phi << "\n"); | ||
| return true; | ||
| } | ||
| if (AddReductionVar(Phi, RecurKind::FMul, TheLoop, FMF, RedDes, DB, AC, DT, | ||
|
|
@@ -1174,7 +1181,6 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop, | |
| << "\n"); | ||
| return true; | ||
| } | ||
|
|
||
| // Not a reduction of known type. | ||
| return false; | ||
| } | ||
|
|
@@ -1299,6 +1305,7 @@ unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) { | |
| case RecurKind::FMaximumNum: | ||
| case RecurKind::FMinimumNum: | ||
| return Instruction::FCmp; | ||
| case RecurKind::FindLast: | ||
| case RecurKind::AnyOf: | ||
| case RecurKind::FindFirstIVSMin: | ||
| case RecurKind::FindFirstIVUMin: | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4604,6 +4604,13 @@ X86TTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA, | |
| break; | ||
| } | ||
|
|
||
| // FIXME: There's a bug in SelectionDAG legalization for some data types | ||
|
Contributor
Would be great if you could file an issue for that.
Collaborator
Author
Raised as #171831
Contributor
Great, thanks! Do we still need another check to also avoid forced vectorization if the cost of the intrinsic is invalid?
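For readers following this thread, here is a rough scalar emulation of how a FindLast reduction can be computed with a data vector, a mask vector, and an extract-last-active step. It is a sketch under assumptions, not the code the vectorizer emits: the fixed `VF` of 4, the function names, and the scalar tail loop are all hypothetical, and `extractLastActive` only models the intent of `experimental_vector_extract_last_active` (the value in the highest active lane, or a passthru value when no lane is active).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // hypothetical fixed vectorization factor

// Models extract-last-active: value of the highest active lane, else passthru.
template <typename T>
T extractLastActive(const std::array<T, VF> &data,
                    const std::array<bool, VF> &mask, T passthru) {
  for (std::size_t lane = VF; lane-- > 0;)
    if (mask[lane])
      return data[lane];
  return passthru;
}

// Reference semantics: r keeps the value from the last iteration that matched.
template <typename T>
T findLastScalar(const std::vector<int> &cond, const std::vector<T> &vals,
                 T start) {
  T r = start;
  for (std::size_t i = 0; i < cond.size(); ++i)
    if (cond[i] > 3)
      r = vals[i];
  return r;
}

// Chunked emulation: both "phis" are replaced as a whole whenever any lane
// of the current chunk matches (a whole-vector select), otherwise kept.
template <typename T>
T findLastChunked(const std::vector<int> &cond, const std::vector<T> &vals,
                  T start) {
  std::array<T, VF> dataPhi{};
  std::array<bool, VF> maskPhi{}; // all lanes inactive initially
  std::size_t i = 0;
  for (; i + VF <= cond.size(); i += VF) {
    std::array<T, VF> newData{};
    std::array<bool, VF> newMask{};
    bool anyActive = false;
    for (std::size_t lane = 0; lane < VF; ++lane) {
      newData[lane] = vals[i + lane];
      newMask[lane] = cond[i + lane] > 3;
      anyActive = anyActive || newMask[lane];
    }
    if (anyActive) {
      dataPhi = newData;
      maskPhi = newMask;
    }
  }
  // After the vector loop, a single extract yields the reduction result.
  T r = extractLastActive(dataPhi, maskPhi, start);
  // Scalar remainder for the elements that did not fill a whole chunk.
  for (; i < cond.size(); ++i)
    if (cond[i] > 3)
      r = vals[i];
  return r;
}
```

Nothing here relies on ordering or arithmetic on `T`, which is why the same scheme can in principle serve integer, floating-point, and pointer values.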
||
| // between a select and an extract when using this intrinsic for X86 without | ||
| // AVX512. | ||
| if (IID == Intrinsic::experimental_vector_extract_last_active && | ||
| !ST->hasAVX512()) | ||
| return InstructionCost::getInvalid(); | ||
|
|
||
| if (ISD != ISD::DELETED_NODE) { | ||
| auto adjustTableCost = [&](int ISD, unsigned Cost, | ||
| std::pair<InstructionCost, MVT> LT, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4326,11 +4326,15 @@ bool LoopVectorizationPlanner::isCandidateForEpilogueVectorization( | |
| ElementCount VF) const { | ||
| // Cross iteration phis such as fixed-order recurrences and FMaxNum/FMinNum | ||
| // reductions need special handling and are currently unsupported. | ||
| // FindLast reductions also require special handling for the synthesized | ||
| // mask PHI. | ||
| if (any_of(OrigLoop->getHeader()->phis(), [&](PHINode &Phi) { | ||
| if (!Legal->isReductionVariable(&Phi)) | ||
| return Legal->isFixedOrderRecurrence(&Phi); | ||
| return RecurrenceDescriptor::isFPMinMaxNumRecurrenceKind( | ||
| Legal->getRecurrenceDescriptor(&Phi).getRecurrenceKind()); | ||
| RecurKind Kind = | ||
| Legal->getRecurrenceDescriptor(&Phi).getRecurrenceKind(); | ||
| return RecurrenceDescriptor::isFindLastRecurrenceKind(Kind) || | ||
| RecurrenceDescriptor::isFPMinMaxNumRecurrenceKind(Kind); | ||
| })) | ||
| return false; | ||
|
|
||
|
|
@@ -4636,6 +4640,14 @@ LoopVectorizationPlanner::selectInterleaveCount(VPlan &Plan, ElementCount VF, | |
| any_of(Plan.getVectorLoopRegion()->getEntryBasicBlock()->phis(), | ||
| IsaPred<VPReductionPHIRecipe>); | ||
|
|
||
| // FIXME: implement interleaving for FindLast transform correctly. | ||
| if (any_of(make_second_range(Legal->getReductionVars()), | ||
| [](const RecurrenceDescriptor &RdxDesc) { | ||
| return RecurrenceDescriptor::isFindLastRecurrenceKind( | ||
| RdxDesc.getRecurrenceKind()); | ||
| })) | ||
| return 1; | ||
|
Comment on lines +4643 to +4649
Contributor
I think we need to move this check earlier (see the check a bit before the call to selectInterleaveCount that handles isSafeForAnyVectorWidth). We take the max of UserIC (set via flag or pragma) and the IC returned here, so we may still end up interleaving if the user requests it. It would be good to add a test with forced interleaving.
Collaborator
Author
There's a second check in Trying to override UserIC early on results in an ICE for the assert with !LVL.isSafeForAnyVectorWidth() after the cost model has been checked. I have added a test with an interleaving hint to both the AArch64 and X86 test files.
Contributor
Oh, right.
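The interaction discussed in this thread can be summarised with a small model of the `processLoop` hunk in this diff. This is a simplification under assumptions: the struct, the function name, and its parameters are hypothetical, and only the two steps visible in the diff are modelled (the user-provided count overriding the cost-model count, then the FindLast check forcing the count back to 1).

```cpp
#include <cassert>
#include <string>

// Hypothetical result type; the real pass keeps these as local variables.
struct InterleaveDecision {
  unsigned IC;
  bool InterleaveLoop;
  std::string Remark;
};

// Simplified model of the ordering shown in the processLoop hunk.
InterleaveDecision decideInterleave(unsigned UserIC, unsigned CostModelIC,
                                    bool InterleaveLoop,
                                    bool HasFindLastReduction) {
  // A user-provided interleave count overrides the cost-model choice.
  unsigned IC = UserIC > 0 ? UserIC : CostModelIC;
  std::string Remark;
  // The FindLast check runs after the override, so a forced count is
  // also reset to 1 instead of reaching the transform.
  if (HasFindLastReduction) {
    Remark = "Unable to interleave due to FindLast reduction.";
    InterleaveLoop = false;
    IC = 1;
  }
  return {IC, InterleaveLoop, Remark};
}
```

Placing the check after the override is what lets a loop with an interleaving hint still be vectorized with an interleave count of 1.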
||
|
|
||
| // If we did not calculate the cost for VF (because the user selected the VF) | ||
| // then we calculate the cost of VF here. | ||
| if (LoopCost == 0) { | ||
|
|
@@ -8623,6 +8635,11 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes( | |
| *Plan)) | ||
| return nullptr; | ||
|
|
||
| // Create whole-vector selects for find-last recurrences. | ||
| if (!VPlanTransforms::runPass(VPlanTransforms::handleFindLastReductions, | ||
| *Plan)) | ||
| return nullptr; | ||
|
|
||
| // Transform recipes to abstract recipes if it is legal and beneficial and | ||
| // clamp the range for better cost estimation. | ||
| // TODO: Enable following transform when the EVL-version of extended-reduction | ||
|
|
@@ -8744,10 +8761,11 @@ void LoopVectorizationPlanner::adjustRecipesForReductions( | |
| continue; | ||
|
|
||
| RecurKind Kind = PhiR->getRecurrenceKind(); | ||
| assert( | ||
| !RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) && | ||
| !RecurrenceDescriptor::isFindIVRecurrenceKind(Kind) && | ||
| "AnyOf and FindIV reductions are not allowed for in-loop reductions"); | ||
| assert(!RecurrenceDescriptor::isFindLastRecurrenceKind(Kind) && | ||
| !RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) && | ||
| !RecurrenceDescriptor::isFindIVRecurrenceKind(Kind) && | ||
| "AnyOf, FindIV, and FindLast reductions are not allowed for in-loop " | ||
| "reductions"); | ||
|
|
||
| bool IsFPRecurrence = | ||
| RecurrenceDescriptor::isFloatingPointRecurrenceKind(Kind); | ||
|
|
@@ -9059,7 +9077,8 @@ void LoopVectorizationPlanner::adjustRecipesForReductions( | |
| RecurKind RK = RdxDesc.getRecurrenceKind(); | ||
| if ((!RecurrenceDescriptor::isAnyOfRecurrenceKind(RK) && | ||
| !RecurrenceDescriptor::isFindIVRecurrenceKind(RK) && | ||
| !RecurrenceDescriptor::isMinMaxRecurrenceKind(RK))) { | ||
| !RecurrenceDescriptor::isMinMaxRecurrenceKind(RK) && | ||
| !RecurrenceDescriptor::isFindLastRecurrenceKind(RK))) { | ||
| VPBuilder PHBuilder(Plan->getVectorPreheader()); | ||
| VPValue *Iden = Plan->getOrAddLiveIn( | ||
| getRecurrenceIdentity(RK, PhiTy, RdxDesc.getFastMathFlags())); | ||
|
|
@@ -10194,6 +10213,18 @@ bool LoopVectorizePass::processLoop(Loop *L) { | |
| // Override IC if user provided an interleave count. | ||
| IC = UserIC > 0 ? UserIC : IC; | ||
|
|
||
| // FIXME: Enable interleaving for FindLast reductions. | ||
| if (any_of(LVL.getReductionVars().values(), [](auto &RdxDesc) { | ||
| return RecurrenceDescriptor::isFindLastRecurrenceKind( | ||
| RdxDesc.getRecurrenceKind()); | ||
| })) { | ||
| LLVM_DEBUG(dbgs() << "LV: Not interleaving due to FindLast reduction.\n"); | ||
| IntDiagMsg = {"FindLastPreventsScalarInterleaving", | ||
| "Unable to interleave due to FindLast reduction."}; | ||
| InterleaveLoop = false; | ||
| IC = 1; | ||
| } | ||
|
|
||
| // Emit diagnostic messages, if any. | ||
| const char *VAPassName = Hints.vectorizeAnalysisPassName(); | ||
| if (!VectorizeLoop && !InterleaveLoop) { | ||
|
|
||
I think at least loop unrolling also needs to be taught about the new kind; I'm currently seeing crashes when building the test suite. To reproduce, I think you can just add a loop with CAS to https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/LoopUnroll/partial-unroll-reductions.ll.
Done. I think I left it alone originally since I made FindLast a generic RecurKind that could handle int, float, and pointer types (and it therefore didn't appear in the isIntegerRecurrenceKind list). I figured that could be a follow-up PR (and could potentially convert AnyOf at the same time).