-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[SECV] Try to push the op into ZExt: A + zext (-A + B) -> zext (B) #151227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesTry to push the constant operand into a ZExt: The actual code supports ZExts with arbitrary number of arguments, hence the getAddExpr in the return. This helps SCEV reasoning in some cases, commonly when adding an offset to a zero-extended SCEV that subtracts the same offset. Note that this is restricted to cases where we can fold away an operand of the inner Add. This is needed to avoid bad interactions with patterns when forming ZExts, which try to push to ZExt to add operands. https://alive2.llvm.org/ce/z/uuYGC3k Full diff: https://github.com/llvm/llvm-project/pull/151227.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 0990a0daac80c..8e587d8cbced6 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -2682,6 +2682,21 @@ const SCEV *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
return getAddExpr(NewOps, PreservedFlags);
}
}
+
+ // Try to push the constant operand into a ZExt: A + zext (-A + B) -> zext
+ // (B), if trunc (A) + -A + B does not unsigned-wrap.
+ if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(Ops[1])) {
+ const SCEV *B = ZExt->getOperand(0);
+ const SCEV *NarrowA = getTruncateExpr(A, B->getType());
+ if (isa<SCEVAddExpr>(B) &&
+ NarrowA == getNegativeSCEV(cast<SCEVAddExpr>(B)->getOperand(0)) &&
+ getZeroExtendExpr(NarrowA, ZExt->getType()) == A &&
+ hasFlags(
+ StrengthenNoWrapFlags(this, scAddExpr, {NarrowA, B}, OrigFlags),
+ SCEV::FlagNUW)) {
+ return getZeroExtendExpr(getAddExpr(NarrowA, B), ZExt->getType());
+ }
+ }
}
// Canonicalize (-1 * urem X, Y) + X --> (Y * X/Y)
diff --git a/llvm/test/Transforms/IndVarSimplify/AArch64/fold-ext-add.ll b/llvm/test/Transforms/IndVarSimplify/AArch64/fold-ext-add.ll
index 48b92e905b8ce..640c910be4a82 100644
--- a/llvm/test/Transforms/IndVarSimplify/AArch64/fold-ext-add.ll
+++ b/llvm/test/Transforms/IndVarSimplify/AArch64/fold-ext-add.ll
@@ -10,21 +10,21 @@ define void @pred_mip_12(ptr %dst, ptr %src, i32 %n, i64 %offset) {
; CHECK-SAME: ptr [[DST:%.*]], ptr [[SRC:%.*]], i32 [[N:%.*]], i64 [[OFFSET:%.*]]) {
; CHECK-NEXT: [[ENTRY:.*]]:
; CHECK-NEXT: [[SMAX:%.*]] = call i32 @llvm.smax.i32(i32 [[N]], i32 1)
+; CHECK-NEXT: [[TMP0:%.*]] = zext nneg i32 [[SMAX]] to i64
+; CHECK-NEXT: [[TMP1:%.*]] = mul i64 [[OFFSET]], [[TMP0]]
+; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[SRC]], i64 [[TMP1]]
; CHECK-NEXT: br label %[[OUTER_LOOP:.*]]
; CHECK: [[OUTER_LOOP_LOOPEXIT:.*]]:
-; CHECK-NEXT: [[PTR_IV_NEXT_LCSSA:%.*]] = phi ptr [ [[PTR_IV_NEXT:%.*]], %[[INNER_LOOP:.*]] ]
; CHECK-NEXT: br label %[[OUTER_LOOP]]
; CHECK: [[OUTER_LOOP]]:
-; CHECK-NEXT: [[OUTER_PTR:%.*]] = phi ptr [ [[SRC]], %[[ENTRY]] ], [ [[PTR_IV_NEXT_LCSSA]], %[[OUTER_LOOP_LOOPEXIT]] ]
+; CHECK-NEXT: [[OUTER_PTR:%.*]] = phi ptr [ [[SRC]], %[[ENTRY]] ], [ [[SCEVGEP]], %[[OUTER_LOOP_LOOPEXIT]] ]
; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
; CHECK-NEXT: br i1 [[C]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[EXIT:.*]]
; CHECK: [[INNER_LOOP_PREHEADER]]:
-; CHECK-NEXT: br label %[[INNER_LOOP]]
+; CHECK-NEXT: br label %[[INNER_LOOP:.*]]
; CHECK: [[INNER_LOOP]]:
; CHECK-NEXT: [[IV:%.*]] = phi i32 [ [[IV_NEXT:%.*]], %[[INNER_LOOP]] ], [ 0, %[[INNER_LOOP_PREHEADER]] ]
-; CHECK-NEXT: [[PTR_IV:%.*]] = phi ptr [ [[PTR_IV_NEXT]], %[[INNER_LOOP]] ], [ [[SRC]], %[[INNER_LOOP_PREHEADER]] ]
; CHECK-NEXT: [[L:%.*]] = load i8, ptr [[OUTER_PTR]], align 1
-; CHECK-NEXT: [[PTR_IV_NEXT]] = getelementptr i8, ptr [[PTR_IV]], i64 [[OFFSET]]
; CHECK-NEXT: store i8 [[L]], ptr [[DST]], align 2
; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i32 [[IV]], 1
; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[SMAX]]
diff --git a/llvm/test/Transforms/IndVarSimplify/zext-nuw.ll b/llvm/test/Transforms/IndVarSimplify/zext-nuw.ll
index d24f9a4e40e38..17921afc5ff06 100644
--- a/llvm/test/Transforms/IndVarSimplify/zext-nuw.ll
+++ b/llvm/test/Transforms/IndVarSimplify/zext-nuw.ll
@@ -15,11 +15,9 @@ define void @_Z3fn1v() {
; CHECK-NEXT: [[J_SROA_0_0_COPYLOAD:%.*]] = load i8, ptr [[X5]], align 1
; CHECK-NEXT: br label [[DOTPREHEADER4_LR_PH:%.*]]
; CHECK: .preheader4.lr.ph:
-; CHECK-NEXT: [[TMP1:%.*]] = add nsw i32 [[X4]], -1
-; CHECK-NEXT: [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
-; CHECK-NEXT: [[TMP3:%.*]] = add nuw nsw i64 [[TMP2]], 1
; CHECK-NEXT: [[TMP4:%.*]] = sext i8 [[J_SROA_0_0_COPYLOAD]] to i64
-; CHECK-NEXT: [[TMP5:%.*]] = mul i64 [[TMP3]], [[TMP4]]
+; CHECK-NEXT: [[TMP2:%.*]] = zext nneg i32 [[X4]] to i64
+; CHECK-NEXT: [[TMP5:%.*]] = mul i64 [[TMP4]], [[TMP2]]
; CHECK-NEXT: br label [[DOTPREHEADER4:%.*]]
; CHECK: .preheader4:
; CHECK-NEXT: [[K_09:%.*]] = phi ptr [ undef, [[DOTPREHEADER4_LR_PH]] ], [ [[X25:%.*]], [[X22:%.*]] ]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OrigFlags is the flags of the outer add. I can't see any way it's correct to use those flags for a narrower add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Even if both the outer and inner additions are nsw/nuw, that isn't enough for this transform. See https://alive2.llvm.org/ce/z/DFLN-5 .)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it should not pass the original flags. Updated to pass AnyWrapFlags.
This checks adding the truncated first operand and the inner expression of the ZExt does not wrap. Updated proof: https://alive2.llvm.org/ce/z/q7d303
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the indvars tests, can you add a scalar-evolution analysis test that shows the flags in a dump?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, added an analysis only test, thanks!
Adds a SCEV-only tests for #151227.
Try to push the constant operand into a ZExt: A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not unsigned-wrap. The actual code supports ZExts with arbitrary number of arguments, hence the getAddExpr in the return. This helps SCEV reasoning in some cases, commonly when adding an offset to a zero-extended SCEV that subtracts the same offset. Note that this is restricted to cases where we can fold away an operand of the inner Add. This is needed to avoid bad interactions with patterns when forming ZExts, which try to push to ZExt to add operands. https://alive2.llvm.org/ce/z/uuYGC3k
df16937 to
e834442
Compare
Adds a SCEV-only tests for llvm/llvm-project#151227.
efriedma-quic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…zext (B) (#151227) Try to push the constant operand into a ZExt: A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not unsigned-wrap. The actual code supports ZExts with arbitrary number of arguments, hence the getAddExpr in the return. This helps SCEV reasoning in some cases, commonly when adding an offset to a zero-extended SCEV that subtracts the same offset. Note that this is restricted to cases where we can fold away an operand of the inner Add. This is needed to avoid bad interactions with patterns when forming ZExts, which try to push to ZExt to add operands. https://alive2.llvm.org/ce/z/q7d303 PR: llvm/llvm-project#151227
| // (B), if trunc (A) + -A + B does not unsigned-wrap. | ||
| if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(Ops[1])) { | ||
| const SCEV *B = ZExt->getOperand(0); | ||
| const SCEV *NarrowA = getTruncateExpr(A, B->getType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check isa<SCEVAddExpr>(B) before computing getTruncateExpr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, should be done in ab9b23c by introducing a new m_scec_Add() which matches and captures SCEVAddExpr with an abitrary number of operands
Follow-up to #151227 (review) to check the inner expression is an Add before calling getTruncateExpr. Adds a new matcher that just matches and captures SCEVAddExpr, to support matching a SCEVAddExpr with arbitrary number of operands.
Follow-up to llvm/llvm-project#151227 (review) to check the inner expression is an Add before calling getTruncateExpr. Adds a new matcher that just matches and captures SCEVAddExpr, to support matching a SCEVAddExpr with arbitrary number of operands.
Try to push constant multiply operand into a ZExt containing an add, if possible. In general we are trying to push down ops through ZExt if possible. This is similar to llvm#151227 which did the same for additions. For now this is restricted to adds with a constant operand, which is similar to some of the logic above. This enables some additional simplifications. Alive2 Proof: https://alive2.llvm.org/ce/z/97pbSL
…#155300) Try to push constant multiply operand into a ZExt containing an add, if possible. In general we are trying to push down ops through ZExt if possible. This is similar to #151227 which did the same for additions. For now this is restricted to adds with a constant operand, which is similar to some of the logic above. This enables some additional simplifications. Alive2 Proof: https://alive2.llvm.org/ce/z/97pbSL PR: #155300
…(A*C + B*C) (#155300) Try to push constant multiply operand into a ZExt containing an add, if possible. In general we are trying to push down ops through ZExt if possible. This is similar to llvm/llvm-project#151227 which did the same for additions. For now this is restricted to adds with a constant operand, which is similar to some of the logic above. This enables some additional simplifications. Alive2 Proof: https://alive2.llvm.org/ce/z/97pbSL PR: llvm/llvm-project#155300
Adds a SCEV-only tests for llvm#151227. (cherry picked from commit c6f7fa7)
…lvm#151227) Try to push the constant operand into a ZExt: A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not unsigned-wrap. The actual code supports ZExts with arbitrary number of arguments, hence the getAddExpr in the return. This helps SCEV reasoning in some cases, commonly when adding an offset to a zero-extended SCEV that subtracts the same offset. Note that this is restricted to cases where we can fold away an operand of the inner Add. This is needed to avoid bad interactions with patterns when forming ZExts, which try to push to ZExt to add operands. https://alive2.llvm.org/ce/z/q7d303 PR: llvm#151227 (cherry picked from commit d74d841)
Follow-up to llvm#151227 (review) to check the inner expression is an Add before calling getTruncateExpr. Adds a new matcher that just matches and captures SCEVAddExpr, to support matching a SCEVAddExpr with arbitrary number of operands. (cherry picked from commit ab9b23c)
…llvm#155300) Try to push constant multiply operand into a ZExt containing an add, if possible. In general we are trying to push down ops through ZExt if possible. This is similar to llvm#151227 which did the same for additions. For now this is restricted to adds with a constant operand, which is similar to some of the logic above. This enables some additional simplifications. Alive2 Proof: https://alive2.llvm.org/ce/z/97pbSL PR: llvm#155300 (cherry-picked from 9143746)
Try to push the constant operand into a ZExt:
A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not unsigned-wrap.
The actual code supports ZExts with arbitrary number of arguments, hence the getAddExpr in the return.
This helps SCEV reasoning in some cases, commonly when adding an offset to a zero-extended SCEV that subtracts the same offset.
Note that this is restricted to cases where we can fold away an operand of the inner Add. This is needed to avoid bad interactions with patterns when forming ZExts, which try to push to ZExt to add operands.
https://alive2.llvm.org/ce/z/q7d303