-
Notifications
You must be signed in to change notification settings - Fork 16k
[mlir][SCF] Fix region branch op interfaces for scf.forall and its terminator
#174221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mlir][SCF] Fix region branch op interfaces for scf.forall and its terminator
#174221
Conversation
|
@llvm/pr-subscribers-mlir-scf Author: Matthias Springer (matthias-springer) Changes
Incomplete interface implementation is a problem for transformations that try to understand the data flow by querying the Full diff: https://github.com/llvm/llvm-project/pull/174221.diff 3 Files Affected:
diff --git a/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td b/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td
index 8bdf3e0b566ef..8c6c5e56e3645 100644
--- a/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td
+++ b/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td
@@ -326,7 +326,6 @@ def ForallOp : SCF_Op<"forall", [
"promoteIfSingleIteration", "yieldTiledValuesAndReplace"]>,
RecursiveMemoryEffects,
SingleBlockImplicitTerminator<"scf::InParallelOp">,
- DeclareOpInterfaceMethods<RegionBranchOpInterface>,
DestinationStyleOpInterface,
HasParallelRegion
]> {
diff --git a/mlir/lib/Dialect/SCF/IR/SCF.cpp b/mlir/lib/Dialect/SCF/IR/SCF.cpp
index 8803a6d136f7a..0c6d539be0db4 100644
--- a/mlir/lib/Dialect/SCF/IR/SCF.cpp
+++ b/mlir/lib/Dialect/SCF/IR/SCF.cpp
@@ -2013,23 +2013,6 @@ void ForallOp::getCanonicalizationPatterns(RewritePatternSet &results,
ForallOpReplaceConstantInductionVar>(context);
}
-/// Given the region at `index`, or the parent operation if `index` is None,
-/// return the successor regions. These are the regions that may be selected
-/// during the flow of control. `operands` is a set of optional attributes that
-/// correspond to a constant value for each operand, or null if that operand is
-/// not a constant.
-void ForallOp::getSuccessorRegions(RegionBranchPoint point,
- SmallVectorImpl<RegionSuccessor> ®ions) {
- // In accordance with the semantics of forall, its body is executed in
- // parallel by multiple threads. We should not expect to branch back into
- // the forall body after the region's execution is complete.
- if (point.isParent())
- regions.push_back(RegionSuccessor(&getRegion(), getRegionIterArgs()));
- else
- regions.push_back(
- RegionSuccessor(getOperation(), getOperation()->getResults()));
-}
-
//===----------------------------------------------------------------------===//
// InParallelOp
//===----------------------------------------------------------------------===//
diff --git a/mlir/test/Analysis/DataFlow/test-liveness-analysis.mlir b/mlir/test/Analysis/DataFlow/test-liveness-analysis.mlir
index 171a35fdeafb9..5c277f31dd02e 100644
--- a/mlir/test/Analysis/DataFlow/test-liveness-analysis.mlir
+++ b/mlir/test/Analysis/DataFlow/test-liveness-analysis.mlir
@@ -346,7 +346,7 @@ func.func @affine_loop_no_use_iv() {
// CHECK-LABEL: test_tag: forall:
// CHECK-NEXT: operand #0: live
// CHECK-NEXT: region: #0:
-// CHECK-NEXT: argument: #0: live
+// CHECK-NEXT: argument: #0: not live
func.func @forall_no_use_iv_has_side_effect_op(%idx1: index, %idx2: index) {
scf.parallel (%i) = (%idx1) to (%idx2) step (%idx2) {
|
|
@llvm/pr-subscribers-mlir Author: Matthias Springer (matthias-springer) Changes
Incomplete interface implementation is a problem for transformations that try to understand the data flow by querying the Full diff: https://github.com/llvm/llvm-project/pull/174221.diff 3 Files Affected:
diff --git a/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td b/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td
index 8bdf3e0b566ef..8c6c5e56e3645 100644
--- a/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td
+++ b/mlir/include/mlir/Dialect/SCF/IR/SCFOps.td
@@ -326,7 +326,6 @@ def ForallOp : SCF_Op<"forall", [
"promoteIfSingleIteration", "yieldTiledValuesAndReplace"]>,
RecursiveMemoryEffects,
SingleBlockImplicitTerminator<"scf::InParallelOp">,
- DeclareOpInterfaceMethods<RegionBranchOpInterface>,
DestinationStyleOpInterface,
HasParallelRegion
]> {
diff --git a/mlir/lib/Dialect/SCF/IR/SCF.cpp b/mlir/lib/Dialect/SCF/IR/SCF.cpp
index 8803a6d136f7a..0c6d539be0db4 100644
--- a/mlir/lib/Dialect/SCF/IR/SCF.cpp
+++ b/mlir/lib/Dialect/SCF/IR/SCF.cpp
@@ -2013,23 +2013,6 @@ void ForallOp::getCanonicalizationPatterns(RewritePatternSet &results,
ForallOpReplaceConstantInductionVar>(context);
}
-/// Given the region at `index`, or the parent operation if `index` is None,
-/// return the successor regions. These are the regions that may be selected
-/// during the flow of control. `operands` is a set of optional attributes that
-/// correspond to a constant value for each operand, or null if that operand is
-/// not a constant.
-void ForallOp::getSuccessorRegions(RegionBranchPoint point,
- SmallVectorImpl<RegionSuccessor> ®ions) {
- // In accordance with the semantics of forall, its body is executed in
- // parallel by multiple threads. We should not expect to branch back into
- // the forall body after the region's execution is complete.
- if (point.isParent())
- regions.push_back(RegionSuccessor(&getRegion(), getRegionIterArgs()));
- else
- regions.push_back(
- RegionSuccessor(getOperation(), getOperation()->getResults()));
-}
-
//===----------------------------------------------------------------------===//
// InParallelOp
//===----------------------------------------------------------------------===//
diff --git a/mlir/test/Analysis/DataFlow/test-liveness-analysis.mlir b/mlir/test/Analysis/DataFlow/test-liveness-analysis.mlir
index 171a35fdeafb9..5c277f31dd02e 100644
--- a/mlir/test/Analysis/DataFlow/test-liveness-analysis.mlir
+++ b/mlir/test/Analysis/DataFlow/test-liveness-analysis.mlir
@@ -346,7 +346,7 @@ func.func @affine_loop_no_use_iv() {
// CHECK-LABEL: test_tag: forall:
// CHECK-NEXT: operand #0: live
// CHECK-NEXT: region: #0:
-// CHECK-NEXT: argument: #0: live
+// CHECK-NEXT: argument: #0: not live
func.func @forall_no_use_iv_has_side_effect_op(%idx1: index, %idx2: index) {
scf.parallel (%i) = (%idx1) to (%idx2) step (%idx2) {
|
|
I am not sure about the "interface implementation completeness": do we require somewhere that terminators in operations implementing I'm also not opposed to the change, but I think we need to nail it down this time. |
|
Outside of the requirement (or not) for |
Not all terminators are necessarily If you have a transformation that analyzes I see two options:
For an op that does forward any values, we could still require that the
Yes, that the problem that I wanted to address first. It's unclear to me why the op even implements the interface. If the answer is "we need it" for some reason, then Option 1 above is not feasible with the current design. @cxy-1993 please take a look when you get a chance. |
|
I found additional code that misbehaves when terminators model region control flow but do not implement the The above function may return "false". To be fair, I wrote that function (and the places where it is used) some time ago. |
I never said that all terminators should implement it, only
which is exactly
Currently we don't seem to require it. I have written such transformations and in a conservative way: |
I think we're still talking about different things. There are legitimate cases where even a terminator inside of a
I'd like to see if we can rule out case 2 to simplify the design. Here's a problem with case 2 that I hinted to above: While the If we relax this API, I agree that we could probably safely allow case 2 and even support conservative value propagation as you described above. But that means that we also have to check/update all existing analyses/transformations that query control flow from the edit: Even the
I just looked into this. It's a bit different from |
|
The discussion here sidetracked a bit. I created a new PR to make |
f14c7c1 to
9cad651
Compare
RegionBranchOpInterface from scf.forallRegionBranchOpInterface to scf.forall.in_parallel
|
After clarifying the semantics of Basically what @ftynse said above:
As @joker-eph mentioned above, the main issue seems to be with the special merge logic of the op. This commit drops that part from the interface implementation. |
| Pure, | ||
| Terminator, | ||
| DeclareOpInterfaceMethods<InParallelOpInterface>, | ||
| ReturnLike, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This is a shortcut for RegionBranchTerminatorOpInterface.
|
|
||
| /// RegionBranchOpInterface | ||
|
|
||
| OperandRange getEntrySuccessorOperands(RegionSuccessor successor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: This now falls back to the default implementation of getEntrySuccessorOperands: an empty operand range.
7787a1d to
8e04b6b
Compare
9cad651 to
007d2dc
Compare
RegionBranchOpInterface to scf.forall.in_parallelRegionBranchTerminatorOpInterface to scf.forall.in_parallel
RegionBranchTerminatorOpInterface to scf.forall.in_parallelscf.forall and its terminator
007d2dc to
dda3109
Compare
dda3109 to
0ab6ffd
Compare
| // When first entering the forall op, the control flow typically branches | ||
| // into the forall body. (In parallel for multiple threads.) | ||
| regions.push_back(RegionSuccessor(&getRegion())); | ||
| // However, when there are 0 threads, the control flow may branch back to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we test the 0 thread case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getSuccessorRegions implementation doesn't actually check the number of iterations of the loop. I.e., doesn't matter if 0 or more iterations. It just (conservatively) populates both region successors. (This could be improved across the entire SCF dialect. E.g., the implementation for scf.for also doesn't look at the number of iterations...)
I updated the test case to use an SSA value for the number of threads. In that case, we must populate both region successors because we do not statically know the branching behavior. Is that good enough? (That test case used to produce incorrect results before.)
|
This is ready to be merged. Could you take another look? |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/129/builds/36454 Here is the relevant piece of the build log for the reference |
…terminator (llvm#174221) `scf.forall` does not completely implement the `RegionBranchOpInterface`: `scf.forall.in_parallel` does not implement the `RegionBranchTerminatorOpInterface`. Incomplete interface implementation is a problem for transformations that try to understand the control flow by querying the `RegionBranchOpInterface`. Detailed explanation of what is wrong with the current implementation. - There is exactly one region branch point: "parent". `in_parallel` is not a region branch point because it does not implement the `RegionBranchTerminatorOpInterface`. (Clarified in llvm#174978.) - `ForallOp::getSuccessorRegions(parent)` returns one region successors: the region of the `scf.forall` op. - Since there is no region branch point in the region, there is no way to leave the region. This means: once you enter the region, you are stuck in it indefinitely. (It is unspecified what happens once you are in the region, but we can be sure that you cannot leave it.) This commit adds the `RegionBranchTerminatorOpInterface` (via `ReturnLike`) to `scf.forall.in_parallel` to indicate the after leaving the region, the control flow returns to the parent. (Note: Only block terminators in directly nested regions can be region branch terminators, so `in_parallel` is the only possible op. I.e., `parallel_insert_slice` cannot be a region branch terminator.) This commit also removes all successor operands / inputs from the implementation. The number of successor operands and successor inputs must match, but `scf.forall.in_parallel` has no operands. Therefore, the region must also have 0 successor inputs. Therefore, the `scf.forall` op must also have 0 successor operands. This commit also adds a missing control flow edge from "parent" to "parent": in case of 0 threads, the region is not entered. Depends on llvm#174978.
scf.foralldoes not completely implement theRegionBranchOpInterface:scf.forall.in_paralleldoes not implement theRegionBranchTerminatorOpInterface.Incomplete interface implementation is a problem for transformations that try to understand the control flow by querying the
RegionBranchOpInterface.Detailed explanation of what is wrong with the current implementation.
in_parallelis not a region branch point because it does not implement theRegionBranchTerminatorOpInterface. (Clarified in [mlir][Interfaces][NFC] Document thatRegionBranchTerminatorOpInterfaceis mandatory #174978.)ForallOp::getSuccessorRegions(parent)returns one region successors: the region of thescf.forallop.This commit adds the
RegionBranchTerminatorOpInterface(viaReturnLike) toscf.forall.in_parallelto indicate the after leaving the region, the control flow returns to the parent. (Note: Only block terminators in directly nested regions can be region branch terminators, soin_parallelis the only possible op. I.e.,parallel_insert_slicecannot be a region branch terminator.)This commit also removes all successor operands / inputs from the implementation. The number of successor operands and successor inputs must match, but
scf.forall.in_parallelhas no operands. Therefore, the region must also have 0 successor inputs. Therefore, thescf.forallop must also have 0 successor operands.This commit also adds a missing control flow edge from "parent" to "parent": in case of 0 threads, the region is not entered.
Depends on #174978.