[mlir][linalg] Fix vectorizer generating invalid vector.gather for 0-D tensor.extract (#187085)
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-linalg

Author: Edgar (edg-l)

Changes

When vectorizing a rank-0 `linalg.generic` whose body contains a `tensor.extract` with data-dependent indices, the vectorizer incorrectly classified the access as a Gather (the 0-D result vector has no dimension > 1). This produced an invalid `vector.gather` with a scalar index operand where a vector of indices is required.

Error seen in practice: `vector.gather` fails verification because operand #2 must be a vector of index values but gets a scalar `index` instead.

Fix in `getTensorExtractMemoryAccessPattern`: classify a 0-D result vector as ScalarBroadcast, and skip the masking logic in the ScalarBroadcast path when the result rank is 0 (0-D vectors don't support masking).
Reproducer: ONNX Gather ops lowered to …

Full diff: https://github.com/llvm/llvm-project/pull/187085.diff

1 File Affected:
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 0477815f329bf..d2439ef1f2bf4 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1093,6 +1093,11 @@ getTensorExtractMemoryAccessPattern(tensor::ExtractOp extractOp,
if (inputShape.getShape().empty())
return VectorMemoryAccessKind::ScalarBroadcast;
+ // 0a. Is the result a 0-D vector? If yes, there are no iteration dimensions
+ // so the tensor.extract is a single scalar load regardless of the index.
+ if (resType.getRank() == 0)
+ return VectorMemoryAccessKind::ScalarBroadcast;
+
// True for vectors that are effectively 1D, e.g. `vector<1x4x1xi32>`, false
// otherwise.
bool isOutput1DVector =
@@ -1254,19 +1259,21 @@ vectorizeTensorExtract(RewriterBase &rewriter, VectorizationState &state,
rewriter, loc, resultType, extractOp.getTensor(), transferReadIdxs,
/*padding=*/std::nullopt, permutationMap, inBounds);
- // Mask this broadcasting xfer_read here rather than relying on the generic
- // path (the generic path assumes identity masking map, which wouldn't be
- // valid here).
- SmallVector<int64_t> readMaskShape = {1};
- auto readMaskType = VectorType::get(readMaskShape, rewriter.getI1Type());
- auto allTrue = vector::ConstantMaskOp::create(
- rewriter, loc, readMaskType, vector::ConstantMaskKind::AllTrue);
- auto *maskedReadOp =
- mlir::vector::maskOperation(rewriter, transferReadOp, allTrue);
+ Operation *resultOp = transferReadOp;
+ if (dstRank > 0) {
+ // Mask this broadcasting xfer_read here rather than relying on the
+ // generic path (the generic path assumes identity masking map, which
+ // wouldn't be valid here).
+ SmallVector<int64_t> readMaskShape = {1};
+ auto readMaskType = VectorType::get(readMaskShape, rewriter.getI1Type());
+ auto allTrue = vector::ConstantMaskOp::create(
+ rewriter, loc, readMaskType, vector::ConstantMaskKind::AllTrue);
+ resultOp =
+ mlir::vector::maskOperation(rewriter, transferReadOp, allTrue);
+ }
LDBG() << "Vectorised as scalar broadcast load: " << extractOp;
- return VectorizationHookResult{VectorizationHookStatus::NewOp,
- maskedReadOp};
+ return VectorizationHookResult{VectorizationHookStatus::NewOp, resultOp};
}
// 2b. Handle contiguous access.
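
For context, a minimal reproducer of the bug looks roughly like the following (a sketch; the function, operand names, and shapes here are assumed for illustration, not taken from the PR). The key ingredients are a rank-0 `linalg.generic` (empty `iterator_types`) whose body performs a `tensor.extract` at a data-dependent index:

```mlir
// Hypothetical rank-0 generic: no loop dimensions, so the result vector
// after vectorization is 0-D. The extract index comes from an input
// tensor, i.e. it is data-dependent.
func.func @extract_0d(%src: tensor<8xf32>, %idx: tensor<i32>) -> tensor<f32> {
  %init = tensor.empty() : tensor<f32>
  %res = linalg.generic
      {indexing_maps = [affine_map<() -> ()>, affine_map<() -> ()>],
       iterator_types = []}
      ins(%idx : tensor<i32>) outs(%init : tensor<f32>) {
  ^bb0(%i: i32, %out: f32):
    %c = arith.index_cast %i : i32 to index
    %e = tensor.extract %src[%c] : tensor<8xf32>
    linalg.yield %e : f32
  } -> tensor<f32>
  return %res : tensor<f32>
}
```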
✅ With the latest revision this PR passed the C/C++ code formatter.

Thanks for the fix! Please add tests :)
Force-pushed 97daa08 to 2fccbf9.
…D tensor.extract

When vectorizing a rank-0 `linalg.generic` whose body contains `tensor.extract` with data-dependent indices, the vectorizer incorrectly classified the access as a Gather (since the 0-D result vector has no dimension > 1). This produced an invalid `vector.gather` with a scalar index operand where a vector of indices is required. Fix by classifying 0-D result vectors as ScalarBroadcast in `getTensorExtractMemoryAccessPattern`, and skipping the masking logic in the ScalarBroadcast path when the result rank is 0 (0-D vectors don't support masking).
Force-pushed 2fccbf9 to a18ee37.
banach-space left a comment
LGTM % nits
Thanks!
@edg-l, do you have commit access to land this?
No, I don't have any permission.
Force-pushed 8d9649e to b68108e.
…D tensor.extract (llvm#187085)

Vectorizing a rank-0 `linalg.generic` whose body contains `tensor.extract` with data-dependent indices hits the Gather classification in `getTensorExtractMemoryAccessPattern` because `isOutput1DVector` returns false for a 0-D result. This produces an invalid `vector.gather` where operand #2 must be a vector of index values but gets a scalar `index` instead. Fix classifies a 0-D result as ScalarBroadcast rather than Gather, and skips mask generation for 0-D in that path.
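
To make the failure mode concrete, here is a hedged sketch of the vectorized IR before and after the fix (operand names and types are assumed for illustration, not taken from the PR):

```mlir
// Before the fix (rejected by the verifier): the access was classified as
// Gather, so the vectorizer emitted something along the lines of
//   %g = vector.gather %src[%c0] [%idx], %mask, %pass : ...
// where the index operand %idx is a scalar `index`, but vector.gather
// requires a vector of index values (operand #2).
//
// After the fix the access is classified as ScalarBroadcast, yielding a
// plain, unmasked 0-D transfer_read -- unmasked because 0-D vectors do
// not support masking:
%r = vector.transfer_read %src[%idx], %pad
       {in_bounds = [], permutation_map = affine_map<(d0) -> ()>}
       : tensor<8xf32>, vector<f32>
```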