Skip to content

Conversation

@fhahn
Copy link
Contributor

@fhahn fhahn commented Dec 8, 2025

ExtractLastLane is a no-op for scalar VFs. Update simplifyRecipe to remove them. This also requires adjusting the code in VPlanUnroll.cpp to split off handling of ExtractLastLane/ExtractPenultimateElement for scalar VFs, which now needs to match ExtractLastPart.

ExtractLastLane is a no-op  for scalar VFs. Update simplifyRecipe to
remove them. This also requires adjusting the code in VPlanUnroll.cpp
to split off handling of ExtractLastLane/ExtractPenultimateElement for
scalar VFs, which now needs to match ExtractLastPart.
@llvmbot
Copy link
Member

llvmbot commented Dec 8, 2025

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

ExtractLastLane is a no-op for scalar VFs. Update simplifyRecipe to remove them. This also requires adjusting the code in VPlanUnroll.cpp to split off handling of ExtractLastLane/ExtractPenultimateElement for scalar VFs, which now needs to match ExtractLastPart.


Full diff: https://github.com/llvm/llvm-project/pull/171145.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+10-6)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp (+7-5)
  • (modified) llvm/test/Transforms/LoopVectorize/interleave-and-scalarize-only.ll (+1-2)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 320baeb454d46..4ad098d748568 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -1385,12 +1385,16 @@ static void simplifyRecipe(VPSingleDefRecipe *Def, VPTypeAnalysis &TypeInfo) {
     return;
   }
 
-  // Look through ExtractLastLane (BuildVector ....).
-  if (match(Def, m_ExtractLastLane(m_BuildVector()))) {
-    auto *BuildVector = cast<VPInstruction>(Def->getOperand(0));
-    Def->replaceAllUsesWith(
-        BuildVector->getOperand(BuildVector->getNumOperands() - 1));
-    return;
+  // Look through ExtractLastLane.
+  if (match(Def, m_ExtractLastLane(m_VPValue(A)))) {
+    if (match(A, m_BuildVector())) {
+      auto *BuildVector = cast<VPInstruction>(A);
+      Def->replaceAllUsesWith(
+          BuildVector->getOperand(BuildVector->getNumOperands() - 1));
+      return;
+    }
+    if (Plan->hasScalarVFOnly())
+      return Def->replaceAllUsesWith(A);
   }
 
   // Look through ExtractPenultimateElement (BuildVector ....).
diff --git a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
index 6fb706ea7d64b..7b4c524712d9a 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
@@ -371,10 +371,9 @@ void UnrollState::unrollBlock(VPBlockBase *VPB) {
       continue;
     }
 
-    if (match(&R, m_ExtractLastLaneOfLastPart(m_VPValue(Op0))) ||
-        match(&R, m_ExtractPenultimateElement(m_VPValue(Op0)))) {
-      addUniformForAllParts(cast<VPSingleDefRecipe>(&R));
-      if (Plan.hasScalarVFOnly()) {
+    if (Plan.hasScalarVFOnly()) {
+      if (match(&R, m_ExtractLastPart(m_VPValue(Op0))) ||
+          match(&R, m_ExtractPenultimateElement(m_VPValue(Op0)))) {
         auto *I = cast<VPInstruction>(&R);
         bool IsPenultimatePart =
             I->getOpcode() == VPInstruction::ExtractPenultimateElement;
@@ -383,7 +382,10 @@ void UnrollState::unrollBlock(VPBlockBase *VPB) {
         I->replaceAllUsesWith(getValueForPart(Op0, PartIdx));
         continue;
       }
-      // For vector VF, always extract from the last part.
+    }
+    if (match(&R, m_ExtractLastLaneOfLastPart(m_VPValue(Op0))) ||
+        match(&R, m_ExtractPenultimateElement(m_VPValue(Op0)))) {
+      addUniformForAllParts(cast<VPSingleDefRecipe>(&R));
       R.setOperand(0, getValueForPart(Op0, UF - 1));
       continue;
     }
diff --git a/llvm/test/Transforms/LoopVectorize/interleave-and-scalarize-only.ll b/llvm/test/Transforms/LoopVectorize/interleave-and-scalarize-only.ll
index bbd596a772c53..c77afa870e2c1 100644
--- a/llvm/test/Transforms/LoopVectorize/interleave-and-scalarize-only.ll
+++ b/llvm/test/Transforms/LoopVectorize/interleave-and-scalarize-only.ll
@@ -220,7 +220,6 @@ exit:
 ; DBG-EMPTY:
 ; DBG-NEXT: middle.block:
 ; DBG-NEXT:   EMIT vp<[[RESUME_1_PART:%.+]]> = extract-last-part vp<[[SCALAR_STEPS]]>
-; DBG-NEXT:   EMIT vp<[[RESUME_1:%.+]]> = extract-last-lane vp<[[RESUME_1_PART]]>
 ; DBG-NEXT:   EMIT vp<[[CMP:%.+]]> = icmp eq vp<[[TC]]>, vp<[[VEC_TC]]>
 ; DBG-NEXT:   EMIT branch-on-cond vp<[[CMP]]>
 ; DBG-NEXT: Successor(s): ir-bb<exit>, scalar.ph
@@ -230,7 +229,7 @@ exit:
 ; DBG-EMPTY:
 ; DBG-NEXT: scalar.ph:
 ; DBG-NEXT:  EMIT-SCALAR vp<[[RESUME_IV:%.+]]> = phi [ vp<[[VTC]]>, middle.block ], [ ir<0>, ir-bb<entry> ]
-; DBG-NEXT:  EMIT-SCALAR vp<[[RESUME_P:%.*]]> = phi [ vp<[[RESUME_1]]>, middle.block ], [ ir<0>, ir-bb<entry> ]
+; DBG-NEXT:  EMIT-SCALAR vp<[[RESUME_P:%.*]]> = phi [ vp<[[RESUME_1_PART]]>, middle.block ], [ ir<0>, ir-bb<entry> ]
 ; DBG-NEXT: Successor(s): ir-bb<loop>
 ; DBG-EMPTY:
 ; DBG-NEXT: ir-bb<loop>:

Copy link
Contributor

@lukel97 lukel97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@Mel-Chen Mel-Chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

@fhahn fhahn enabled auto-merge (squash) December 9, 2025 11:41
@fhahn fhahn merged commit 0768068 into llvm:main Dec 9, 2025
9 of 10 checks passed
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Dec 9, 2025
…(#171145)

ExtractLastLane is a no-op for scalar VFs. Update simplifyRecipe to
remove them. This also requires adjusting the code in VPlanUnroll.cpp to
split off handling of ExtractLastLane/ExtractPenultimateElement for
scalar VFs, which now needs to match ExtractLastPart.

PR: llvm/llvm-project#171145
@fhahn fhahn deleted the vplan-remove-extract-scalar-vf branch December 9, 2025 13:29
@llvm-ci
Copy link
Collaborator

llvm-ci commented Dec 10, 2025

LLVM Buildbot has detected a new failure on builder clang-ppc64le-linux-test-suite running on ppc64le-clang-test-suite while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/95/builds/19379

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'SanitizerCommon-tsan-powerpc64le-Linux :: Linux/getpwnam_r_invalid_user.cpp' FAILED ********************
Exit Code: -6

Command Output (stdout):
--
# RUN: at line 2
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/build/./bin/clang  --driver-mode=g++ -gline-tables-only -fsanitize=thread  -m64 -fno-function-sections -funwind-tables  -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/llvm-project/compiler-rt/test -ldl -O0 -g /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Linux/getpwnam_r_invalid_user.cpp -o /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/tsan-powerpc64le-Linux/Linux/Output/getpwnam_r_invalid_user.cpp.tmp &&  /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/tsan-powerpc64le-Linux/Linux/Output/getpwnam_r_invalid_user.cpp.tmp
# executed command: /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/build/./bin/clang --driver-mode=g++ -gline-tables-only -fsanitize=thread -m64 -fno-function-sections -funwind-tables -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/llvm-project/compiler-rt/test -ldl -O0 -g /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Linux/getpwnam_r_invalid_user.cpp -o /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/tsan-powerpc64le-Linux/Linux/Output/getpwnam_r_invalid_user.cpp.tmp
# executed command: /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/tsan-powerpc64le-Linux/Linux/Output/getpwnam_r_invalid_user.cpp.tmp
# .---command stderr------------
# | Result: 110
# | getpwnam_r_invalid_user.cpp.tmp: /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-test-suite/clang-ppc64le-test-suite/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Linux/getpwnam_r_invalid_user.cpp:19: int main(): Assertion `res == 0 || res == ENOENT' failed.
# `-----------------------------
# error: command failed with exit status: -6

--

********************


Copy link
Collaborator

@ayalz ayalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of post-commits nits.

return;
// Look through ExtractLastLane.
if (match(Def, m_ExtractLastLane(m_VPValue(A)))) {
if (match(A, m_BuildVector())) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such BuildVector operands were visited and processed earlier by the code below folding them into Broadcast if all their operands are equal. Does that folding hold for VF=1 too; if so should such redundant Broadcasts be eliminated too? Worth looking for extracts from broadcast too/instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like currently there are no extracts of broadcasts that could be folded in the tests. For VF=1, the broadcasts itself should be removed, possibly before hitting this pattern, but I might be missing something.

Comment on lines +375 to +376
if (match(&R, m_ExtractLastPart(m_VPValue(Op0))) ||
match(&R, m_ExtractPenultimateElement(m_VPValue(Op0)))) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be simpler handle each match separately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure, the only difference is the index to extract from?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but folding that difference seems to complicate two trivial cases, the first which should hold for any VF, scalar or not:

    if (match(&R, m_ExtractLastPart(m_VPValue(Op0)))) {
      cast<VPInstruction>(&R)->replaceAllUsesWith(getValueForPart(Op0, UF - 1));
      continue;
    }

and the second which deals with scalar VF only:

    if (Plan.hasScalarVFOnly() && match(&R, m_ExtractPenultimateElement(m_VPValue(Op0)))) {
      // For scalar VF, use the penultimate scalar part w/o extracting a lane.
      cast<VPInstruction>(&R)->replaceAllUsesWith(getValueForPart(Op0, UF - 2));
      continue;
    }

?

Are both missing addUniformForAllParts(cast<VPSingleDefRecipe>(&R));?

Can also hoist from below the auto *SingleDef = dyn_cast<VPSingleDefRecipe>(&R) to be used for the RAUW's here and addUniformForAllParts() elsewhere, instead of having to cast<>(&R) after various matches. Perhaps also handle the !SingleDef cases first, to freely access SingleDef afterwards.

I->replaceAllUsesWith(getValueForPart(Op0, PartIdx));
continue;
}
// For vector VF, always extract from the last part.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth retaining the comment, perhaps emphasizing that
// For vector VF, the penultimate element is always extracted from the last part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, re-added in e6e3f94

fhahn added a commit that referenced this pull request Dec 12, 2025
Re-add and emphasize comment regarding extracting from the last part, as
suggested post-commit in #171145.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Dec 12, 2025
…t. (NFC)

Re-add and emphasize comment regarding extracting from the last part, as
suggested post-commit in llvm/llvm-project#171145.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants