[mlir][linalg] set inbounds on xfer_read/writes for assumeDynamicDimsMatchVecSizes
#160839
Conversation
@llvm/pr-subscribers-mlir-linalg

Author: Ege Beysel (egebeysel)

Changes

The idea from #146531 was to introduce the flag `assumeDynamicDimsMatchVecSizes`, to signal the vectorizer that the access should not be masked and is in-bounds. Though the masking part is handled, `xfer_read/write` ops are created without explicitly setting the inbounds attribute, which defaults to all-false.

In the presence of scalable tile sizes, subsequent patterns tend to overwrite the inbounds attribute and introduce masks further down when lowered to loads and stores. This PR explicitly sets the inbounds attribute to all-true for `xfer_read/write` ops if the `assumeDynamicDimsMatchVecSizes` flag is set.

Full diff: https://github.com/llvm/llvm-project/pull/160839.diff

2 Files Affected:
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 15c467b21c81e..4db6057519da5 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -524,6 +524,23 @@ VectorizationState::maskOperation(RewriterBase &rewriter, Operation *opToMask,
if (!mask) {
LDBG() << "No mask required";
+ if (assumeDynamicDimsMatchVecSizes) {
+ LDBG() << "Assuming dynamic dimensions match vector sizes!";
+ // Set inbounds to all-true.
+ llvm::TypeSwitch<Operation *>(opToMask)
+ .Case<vector::TransferReadOp, vector::TransferWriteOp>(
+ [&](auto xferOp) {
+ SmallVector<bool> inBoundsMap(xferOp.getInBounds().size(),
+ true);
+ rewriter.modifyOpInPlace(xferOp, [&]() {
+ xferOp.setInBoundsAttr(
+ rewriter.getBoolArrayAttr(inBoundsMap));
+ });
+ })
+ .Default([](Operation *op) {
+ // No-op if the operation is not an xfer read or write.
+ });
+ }
return opToMask;
}
diff --git a/mlir/test/Dialect/Linalg/vectorization/linalg-ops.mlir b/mlir/test/Dialect/Linalg/vectorization/linalg-ops.mlir
index 62bf1f55c9af2..7fe707902072a 100644
--- a/mlir/test/Dialect/Linalg/vectorization/linalg-ops.mlir
+++ b/mlir/test/Dialect/Linalg/vectorization/linalg-ops.mlir
@@ -918,12 +918,16 @@ func.func @mmt4d_scalable_with_assume(%A: memref<16x16x8x1xf32>, %B: memref<16x1
// CHECK-SAME: %[[B:.*]]: memref<16x16x?x1xf32>,
// CHECK-SAME: %[[C_IN:.*]]: memref<16x16x8x?xf32>) {
// CHECK-NOT: mask
-// CHECK: %[[VEC_A:.*]] = vector.transfer_read %[[A]]{{.*}} : memref<16x16x8x1xf32>, vector<16x16x16x8x[4]x1xf32>
-// CHECK: %[[VEC_B:.*]] = vector.transfer_read %[[B]]{{.*}} : memref<16x16x?x1xf32>, vector<16x16x16x8x[4]x1xf32>
-// CHECK: %[[VEC_C:.*]] = vector.transfer_read %[[C_IN]]{{.*}} : memref<16x16x8x?xf32>, vector<16x16x8x[4]xf32>
+// CHECK: %[[VEC_A:.*]] = vector.transfer_read %[[A]]
+// CHECK-SAME: in_bounds = [true, true, true, true, true, true]{{.*}} : memref<16x16x8x1xf32>, vector<16x16x16x8x[4]x1xf32>
+// CHECK: %[[VEC_B:.*]] = vector.transfer_read %[[B]]
+// CHECK-SAME: in_bounds = [true, true, true, true, true, true]{{.*}} : memref<16x16x?x1xf32>, vector<16x16x16x8x[4]x1xf32>
+// CHECK: %[[VEC_C:.*]] = vector.transfer_read %[[C_IN]]
+// CHECK-SAME: in_bounds = [true, true, true, true]{{.*}} : memref<16x16x8x?xf32>, vector<16x16x8x[4]xf32>
// CHECK: %[[MUL:.*]] = arith.mulf %[[VEC_A]], %[[VEC_B]] : vector<16x16x16x8x[4]x1xf32>
// CHECK: %[[RED:.*]] = vector.multi_reduction <add>, %[[MUL]], %[[VEC_C]] [2, 5] : vector<16x16x16x8x[4]x1xf32> to vector<16x16x8x[4]xf32>
-// CHECK: vector.transfer_write %[[RED]], %[[C_IN]]{{.*}} : vector<16x16x8x[4]xf32>, memref<16x16x8x?xf32>
+// CHECK: vector.transfer_write %[[RED]], %[[C_IN]]
+// CHECK-SAME: in_bounds = [true, true, true, true]{{.*}} : vector<16x16x8x[4]xf32>, memref<16x16x8x?xf32>
module attributes {transform.with_named_sequence} {
transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
@@ -1011,12 +1015,16 @@ func.func @batch_mmt4d_scalable_with_assume(%A: memref<2x16x16x8x1xf32>, %B: mem
// CHECK-SAME: %[[B:.*]]: memref<2x16x16x?x1xf32>,
// CHECK-SAME: %[[C_IN:.*]]: memref<2x16x16x8x?xf32>) {
// CHECK-NOT: mask
-// CHECK: %[[VEC_A:.*]] = vector.transfer_read %[[A]]{{.*}} : memref<2x16x16x8x1xf32>, vector<2x16x16x16x8x[4]x1xf32>
-// CHECK: %[[VEC_B:.*]] = vector.transfer_read %[[B]]{{.*}} : memref<2x16x16x?x1xf32>, vector<2x16x16x16x8x[4]x1xf32>
-// CHECK: %[[VEC_C:.*]] = vector.transfer_read %[[C_IN]]{{.*}} : memref<2x16x16x8x?xf32>, vector<2x16x16x8x[4]xf32>
+// CHECK: %[[VEC_A:.*]] = vector.transfer_read %[[A]]
+// CHECK-SAME: in_bounds = [true, true, true, true, true, true, true]{{.*}} : memref<2x16x16x8x1xf32>, vector<2x16x16x16x8x[4]x1xf32>
+// CHECK: %[[VEC_B:.*]] = vector.transfer_read %[[B]]
+// CHECK-SAME: in_bounds = [true, true, true, true, true, true, true]{{.*}} : memref<2x16x16x?x1xf32>, vector<2x16x16x16x8x[4]x1xf32>
+// CHECK: %[[VEC_C:.*]] = vector.transfer_read %[[C_IN]]
+// CHECK-SAME: in_bounds = [true, true, true, true, true]{{.*}} : memref<2x16x16x8x?xf32>, vector<2x16x16x8x[4]xf32>
// CHECK: %[[MUL:.*]] = arith.mulf %[[VEC_A]], %[[VEC_B]] : vector<2x16x16x16x8x[4]x1xf32>
// CHECK: %[[RED:.*]] = vector.multi_reduction <add>, %[[MUL]], %[[VEC_C]] [3, 6] : vector<2x16x16x16x8x[4]x1xf32> to vector<2x16x16x8x[4]xf32>
-// CHECK: vector.transfer_write %[[RED]], %[[C_IN]]{{.*}} : vector<2x16x16x8x[4]xf32>, memref<2x16x16x8x?xf32>
+// CHECK: vector.transfer_write %[[RED]], %[[C_IN]]
+// CHECK-SAME: in_bounds = [true, true, true, true, true]{{.*}} : vector<2x16x16x8x[4]xf32>, memref<2x16x16x8x?xf32>
module attributes {transform.with_named_sequence} {
transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
banach-space
left a comment
Overall looks good % minor comments. I am away next week, so approving as is to unblock you.
Thanks for fixing this 🙏🏻
    LDBG() << "No mask required";
    if (assumeDynamicDimsMatchVecSizes) {
      LDBG() << "Assuming dynamic dimensions match vector sizes!";
      // Set inbounds to all-true.
This comment tells me what is happening, but that's also what the code says :) Could you instead clarify why we need this extra step? I would also update the LDBG message.

Basically,
- xfer_read/xfer_write are special, hence this special treatment (this does not concern any other ops)
- We need to set `in_bounds` explicitly, as otherwise things further down the line will assume "out-of-bounds".
Suggested change:
- // Set inbounds to all-true.
+ // For vector.transfer_read and vector.transfer_write, there is also the `in_bounds` attribute that we need to set explicitly. Otherwise, "out-of-bounds" access will be assumed and masks will be generated.
    if (!mask) {
      LDBG() << "No mask required";
      if (assumeDynamicDimsMatchVecSizes) {
        LDBG() << "Assuming dynamic dimensions match vector sizes!";
I would move this inside the lambda. Otherwise it will be printed for every Op, which could be noise, no?
Suggested change:
- LDBG() << "Assuming dynamic dimensions match vector sizes!";
+ LDBG() << "Assuming dynamic dimensions match vector sizes, set in_bounds to true!";
    SmallVector<bool> inBoundsMap(xferOp.getInBounds().size(),
                                  true);
Note, assumeDynamicDimsMatchVecSizes only applies to dynamic sizes ;-) Static should remain "out-of-bounds". IIRC, something else will later infer that it's in fact "in bounds" 🤞🏻
Oh, completely missed that 🥲 Thanks!
Updated the logic to only set the attribute to true for dynamic dims, tested downstream with IREE that something else later infers that it's in fact "in-bounds" :) WDYT?
Signed-off-by: Ege Beysel <[email protected]>
banach-space
left a comment
Thanks, lgtm!
…imsMatchVecSizes ` (#160839) The idea from #146531 was to introduce the flag `assumeDynamicDimsMatchVecSizes`, to signal the vectorizer that the access should not be masked and is in-bounds. Though the masking part is handled, `xfer_read/write` ops are created without explicitly setting the inbounds attribute, which defaults to all-false. In the existence of scalable tile sizes, subsequent patterns tend to overwrite the inbounds attribute and introduce masks further down when lowered to loads and stores. This PR explicitly sets the inbounds attribute to all-true for `xfer_read/write` ops if the `assumeDynamicDimsMatchVecSizes` flag is set. --------- Signed-off-by: Ege Beysel <[email protected]>