
[mlir][linalg] Fix vectorizer generating invalid vector.gather for 0-D tensor.extract#187085

Merged
banach-space merged 2 commits into llvm:main from edg-l:edgl/fix-vectorizer-0d-gather
Mar 19, 2026
Conversation

@edg-l
Contributor

@edg-l edg-l commented Mar 17, 2026

Vectorizing a rank-0 linalg.generic whose body contains tensor.extract with data-dependent indices hits the Gather classification in getTensorExtractMemoryAccessPattern because isOutput1DVector returns false for a 0-D result. This produces an invalid vector.gather where operand #2 must be a vector of index values but gets a scalar index instead.

Fix classifies a 0-D result as ScalarBroadcast rather than Gather, and skips mask generation for 0-D in that path.

@llvmbot
Member

llvmbot commented Mar 17, 2026

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-linalg

Author: Edgar (edg-l)

Changes

When vectorizing a rank-0 linalg.generic whose body contains tensor.extract with data-dependent indices, getTensorExtractMemoryAccessPattern fell through to the Gather classification because isOutput1DVector returns false for a 0-D result. This produced an invalid vector.gather where operand #2 must be a vector of index values but got a scalar index.

Error seen in practice:

'vector.gather' op operand #2 must be vector of integer or index values, but got 'index'

Fix in Vectorization.cpp:

  1. Early-return ScalarBroadcast in getTensorExtractMemoryAccessPattern when resType.getRank() == 0, before the isOutput1DVector check.
  2. Skip the vector<1xi1> masking step in the ScalarBroadcast handler when dstRank == 0, since 0-D vectors don't support masking.

Reproducer: ONNX Gather ops lowered to linalg.generic + tensor.extract on a rank-0 iteration space, as seen during GPT-2 model compilation via ONNX-MLIR.
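A minimal reproducer along these lines (a sketch; the function name and exact shapes are illustrative, not taken from the PR's test files) is a rank-0 `linalg.generic` whose body extracts at an index read from an input tensor:

```mlir
// Rank-0 iteration space: no loops, identity () -> () indexing maps.
// The body reads %src at a data-dependent index, which previously fell
// through to the Gather classification and produced an invalid
// vector.gather with a scalar index operand.
#map = affine_map<() -> ()>
func.func @extract_0d(%src: tensor<8xf32>, %idx: tensor<index>) -> tensor<f32> {
  %init = tensor.empty() : tensor<f32>
  %res = linalg.generic
      {indexing_maps = [#map, #map], iterator_types = []}
      ins(%idx : tensor<index>) outs(%init : tensor<f32>) {
  ^bb0(%i: index, %out: f32):
    %v = tensor.extract %src[%i] : tensor<8xf32>
    linalg.yield %v : f32
  } -> tensor<f32>
  return %res : tensor<f32>
}
```

With the patch, the vectorizer should classify this extract as ScalarBroadcast and emit a scalar `vector.transfer_read` (broadcast to the 0-D result) instead of an invalid `vector.gather`.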


Full diff: https://github.com/llvm/llvm-project/pull/187085.diff

1 file affected:

  • (modified) mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp (+18-11)
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 0477815f329bf..d2439ef1f2bf4 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1093,6 +1093,11 @@ getTensorExtractMemoryAccessPattern(tensor::ExtractOp extractOp,
   if (inputShape.getShape().empty())
     return VectorMemoryAccessKind::ScalarBroadcast;
 
+  // 0a. Is the result a 0-D vector? If yes, there are no iteration dimensions
+  // so the tensor.extract is a single scalar load regardless of the index.
+  if (resType.getRank() == 0)
+    return VectorMemoryAccessKind::ScalarBroadcast;
+
   // True for vectors that are effectively 1D, e.g. `vector<1x4x1xi32>`, false
   // otherwise.
   bool isOutput1DVector =
@@ -1254,19 +1259,21 @@ vectorizeTensorExtract(RewriterBase &rewriter, VectorizationState &state,
         rewriter, loc, resultType, extractOp.getTensor(), transferReadIdxs,
         /*padding=*/std::nullopt, permutationMap, inBounds);
 
-    // Mask this broadcasting xfer_read here rather than relying on the generic
-    // path (the generic path assumes identity masking map, which wouldn't be
-    // valid here).
-    SmallVector<int64_t> readMaskShape = {1};
-    auto readMaskType = VectorType::get(readMaskShape, rewriter.getI1Type());
-    auto allTrue = vector::ConstantMaskOp::create(
-        rewriter, loc, readMaskType, vector::ConstantMaskKind::AllTrue);
-    auto *maskedReadOp =
-        mlir::vector::maskOperation(rewriter, transferReadOp, allTrue);
+    Operation *resultOp = transferReadOp;
+    if (dstRank > 0) {
+      // Mask this broadcasting xfer_read here rather than relying on the
+      // generic path (the generic path assumes identity masking map, which
+      // wouldn't be valid here).
+      SmallVector<int64_t> readMaskShape = {1};
+      auto readMaskType = VectorType::get(readMaskShape, rewriter.getI1Type());
+      auto allTrue = vector::ConstantMaskOp::create(
+          rewriter, loc, readMaskType, vector::ConstantMaskKind::AllTrue);
+      resultOp =
+          mlir::vector::maskOperation(rewriter, transferReadOp, allTrue);
+    }
 
     LDBG() << "Vectorised as scalar broadcast load: " << extractOp;
-    return VectorizationHookResult{VectorizationHookStatus::NewOp,
-                                   maskedReadOp};
+    return VectorizationHookResult{VectorizationHookStatus::NewOp, resultOp};
   }
 
   // 2b. Handle contiguous access.

@github-actions

github-actions bot commented Mar 17, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

@banach-space
Contributor

banach-space commented Mar 17, 2026

Thanks for the fix! Please add tests :)

@edg-l edg-l force-pushed the edgl/fix-vectorizer-0d-gather branch from 97daa08 to 2fccbf9 on March 17, 2026 at 18:45
[mlir][linalg] Fix vectorizer generating invalid vector.gather for 0-D tensor.extract

When vectorizing a rank-0 linalg.generic whose body contains
tensor.extract with data-dependent indices, the vectorizer incorrectly
classified the access as a Gather (since the 0-D result vector has no
dimension > 1). This produced an invalid vector.gather with a scalar
index operand where a vector of indices is required.

Fix by classifying 0-D result vectors as ScalarBroadcast in
getTensorExtractMemoryAccessPattern, and skipping the masking logic
in the ScalarBroadcast path when the result rank is 0 (0-D vectors
don't support masking).
@edg-l edg-l force-pushed the edgl/fix-vectorizer-0d-gather branch from 2fccbf9 to a18ee37 on March 17, 2026 at 19:00
@edg-l
Contributor Author

edg-l commented Mar 18, 2026

> Thanks for the fix! Please add tests :)

added
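The kind of lit/FileCheck-style check the added tests would express is roughly the following (a hypothetical sketch; the function names and CHECK lines in the actual test may differ):

```mlir
// Sketch: vectorizing a rank-0 generic containing tensor.extract should
// yield a scalar transfer_read broadcast, and must not emit vector.gather.
// CHECK-LABEL: func @extract_0d
// CHECK-NOT:     vector.gather
// CHECK:         vector.transfer_read
// CHECK-NOT:     vector.gather
```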

Contributor

@banach-space banach-space left a comment


LGTM % nits

Thanks!

@banach-space
Contributor

@edg-l , do you have commit access to land this?

@edg-l
Contributor Author

edg-l commented Mar 18, 2026

> @edg-l , do you have commit access to land this?

No, I don't have any permissions.

@edg-l edg-l force-pushed the edgl/fix-vectorizer-0d-gather branch from 8d9649e to b68108e on March 19, 2026 at 19:02
@banach-space banach-space merged commit ae6fbd0 into llvm:main Mar 19, 2026
10 checks passed
ambergorzynski pushed a commit to ambergorzynski/llvm-project that referenced this pull request Mar 27, 2026
[mlir][linalg] Fix vectorizer generating invalid vector.gather for 0-D tensor.extract (llvm#187085)

Vectorizing a rank-0 `linalg.generic` whose body contains
`tensor.extract` with data-dependent indices hits the Gather
classification in `getTensorExtractMemoryAccessPattern` because
`isOutput1DVector` returns false for a 0-D result. This produces an
invalid `vector.gather` where operand #2 must be a vector of index
values but gets a scalar `index` instead.

Fix classifies a 0-D result as ScalarBroadcast rather than Gather, and
skips mask generation for 0-D in that path.
albertbolt1 pushed a commit to albertbolt1/llvm-project that referenced this pull request Mar 28, 2026
[mlir][linalg] Fix vectorizer generating invalid vector.gather for 0-D tensor.extract (llvm#187085)


3 participants