Skip to content

Conversation

@fhahn
Copy link
Contributor

@fhahn fhahn commented Jul 29, 2025

Try to push the constant operand into a ZExt:
A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not unsigned-wrap.

The actual code supports ZExts with arbitrary number of arguments, hence the getAddExpr in the return.

This helps SCEV reasoning in some cases, commonly when adding an offset to a zero-extended SCEV that subtracts the same offset.

Note that this is restricted to cases where we can fold away an operand of the inner Add. This is needed to avoid bad interactions with patterns when forming ZExts, which try to push to ZExt to add operands.

https://alive2.llvm.org/ce/z/q7d303

@fhahn fhahn requested review from efriedma-quic and preames July 29, 2025 20:47
@fhahn fhahn requested a review from nikic as a code owner July 29, 2025 20:47
@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Jul 29, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 29, 2025

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

Try to push the constant operand into a ZExt:
A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not unsigned-wrap.

The actual code supports ZExts with arbitrary number of arguments, hence the getAddExpr in the return.

This helps SCEV reasoning in some cases, commonly when adding an offset to a zero-extended SCEV that subtracts the same offset.

Note that this is restricted to cases where we can fold away an operand of the inner Add. This is needed to avoid bad interactions with patterns when forming ZExts, which try to push to ZExt to add operands.

https://alive2.llvm.org/ce/z/uuYGC3k


Full diff: https://github.com/llvm/llvm-project/pull/151227.diff

3 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+15)
  • (modified) llvm/test/Transforms/IndVarSimplify/AArch64/fold-ext-add.ll (+5-5)
  • (modified) llvm/test/Transforms/IndVarSimplify/zext-nuw.ll (+2-4)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 0990a0daac80c..8e587d8cbced6 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -2682,6 +2682,21 @@ const SCEV *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
         return getAddExpr(NewOps, PreservedFlags);
       }
     }
+
+    // Try to push the constant operand into a ZExt: A + zext (-A + B) -> zext
+    // (B), if trunc (A) + -A + B  does not unsigned-wrap.
+    if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(Ops[1])) {
+      const SCEV *B = ZExt->getOperand(0);
+      const SCEV *NarrowA = getTruncateExpr(A, B->getType());
+      if (isa<SCEVAddExpr>(B) &&
+          NarrowA == getNegativeSCEV(cast<SCEVAddExpr>(B)->getOperand(0)) &&
+          getZeroExtendExpr(NarrowA, ZExt->getType()) == A &&
+          hasFlags(
+              StrengthenNoWrapFlags(this, scAddExpr, {NarrowA, B}, OrigFlags),
+              SCEV::FlagNUW)) {
+        return getZeroExtendExpr(getAddExpr(NarrowA, B), ZExt->getType());
+      }
+    }
   }
 
   // Canonicalize (-1 * urem X, Y) + X --> (Y * X/Y)
diff --git a/llvm/test/Transforms/IndVarSimplify/AArch64/fold-ext-add.ll b/llvm/test/Transforms/IndVarSimplify/AArch64/fold-ext-add.ll
index 48b92e905b8ce..640c910be4a82 100644
--- a/llvm/test/Transforms/IndVarSimplify/AArch64/fold-ext-add.ll
+++ b/llvm/test/Transforms/IndVarSimplify/AArch64/fold-ext-add.ll
@@ -10,21 +10,21 @@ define void @pred_mip_12(ptr %dst, ptr %src, i32 %n, i64 %offset) {
 ; CHECK-SAME: ptr [[DST:%.*]], ptr [[SRC:%.*]], i32 [[N:%.*]], i64 [[OFFSET:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
 ; CHECK-NEXT:    [[SMAX:%.*]] = call i32 @llvm.smax.i32(i32 [[N]], i32 1)
+; CHECK-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[SMAX]] to i64
+; CHECK-NEXT:    [[TMP1:%.*]] = mul i64 [[OFFSET]], [[TMP0]]
+; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i8, ptr [[SRC]], i64 [[TMP1]]
 ; CHECK-NEXT:    br label %[[OUTER_LOOP:.*]]
 ; CHECK:       [[OUTER_LOOP_LOOPEXIT:.*]]:
-; CHECK-NEXT:    [[PTR_IV_NEXT_LCSSA:%.*]] = phi ptr [ [[PTR_IV_NEXT:%.*]], %[[INNER_LOOP:.*]] ]
 ; CHECK-NEXT:    br label %[[OUTER_LOOP]]
 ; CHECK:       [[OUTER_LOOP]]:
-; CHECK-NEXT:    [[OUTER_PTR:%.*]] = phi ptr [ [[SRC]], %[[ENTRY]] ], [ [[PTR_IV_NEXT_LCSSA]], %[[OUTER_LOOP_LOOPEXIT]] ]
+; CHECK-NEXT:    [[OUTER_PTR:%.*]] = phi ptr [ [[SRC]], %[[ENTRY]] ], [ [[SCEVGEP]], %[[OUTER_LOOP_LOOPEXIT]] ]
 ; CHECK-NEXT:    [[C:%.*]] = call i1 @cond()
 ; CHECK-NEXT:    br i1 [[C]], label %[[INNER_LOOP_PREHEADER:.*]], label %[[EXIT:.*]]
 ; CHECK:       [[INNER_LOOP_PREHEADER]]:
-; CHECK-NEXT:    br label %[[INNER_LOOP]]
+; CHECK-NEXT:    br label %[[INNER_LOOP:.*]]
 ; CHECK:       [[INNER_LOOP]]:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ [[IV_NEXT:%.*]], %[[INNER_LOOP]] ], [ 0, %[[INNER_LOOP_PREHEADER]] ]
-; CHECK-NEXT:    [[PTR_IV:%.*]] = phi ptr [ [[PTR_IV_NEXT]], %[[INNER_LOOP]] ], [ [[SRC]], %[[INNER_LOOP_PREHEADER]] ]
 ; CHECK-NEXT:    [[L:%.*]] = load i8, ptr [[OUTER_PTR]], align 1
-; CHECK-NEXT:    [[PTR_IV_NEXT]] = getelementptr i8, ptr [[PTR_IV]], i64 [[OFFSET]]
 ; CHECK-NEXT:    store i8 [[L]], ptr [[DST]], align 2
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i32 [[IV]], 1
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[IV_NEXT]], [[SMAX]]
diff --git a/llvm/test/Transforms/IndVarSimplify/zext-nuw.ll b/llvm/test/Transforms/IndVarSimplify/zext-nuw.ll
index d24f9a4e40e38..17921afc5ff06 100644
--- a/llvm/test/Transforms/IndVarSimplify/zext-nuw.ll
+++ b/llvm/test/Transforms/IndVarSimplify/zext-nuw.ll
@@ -15,11 +15,9 @@ define void @_Z3fn1v() {
 ; CHECK-NEXT:    [[J_SROA_0_0_COPYLOAD:%.*]] = load i8, ptr [[X5]], align 1
 ; CHECK-NEXT:    br label [[DOTPREHEADER4_LR_PH:%.*]]
 ; CHECK:       .preheader4.lr.ph:
-; CHECK-NEXT:    [[TMP1:%.*]] = add nsw i32 [[X4]], -1
-; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[TMP1]] to i64
-; CHECK-NEXT:    [[TMP3:%.*]] = add nuw nsw i64 [[TMP2]], 1
 ; CHECK-NEXT:    [[TMP4:%.*]] = sext i8 [[J_SROA_0_0_COPYLOAD]] to i64
-; CHECK-NEXT:    [[TMP5:%.*]] = mul i64 [[TMP3]], [[TMP4]]
+; CHECK-NEXT:    [[TMP2:%.*]] = zext nneg i32 [[X4]] to i64
+; CHECK-NEXT:    [[TMP5:%.*]] = mul i64 [[TMP4]], [[TMP2]]
 ; CHECK-NEXT:    br label [[DOTPREHEADER4:%.*]]
 ; CHECK:       .preheader4:
 ; CHECK-NEXT:    [[K_09:%.*]] = phi ptr [ undef, [[DOTPREHEADER4_LR_PH]] ], [ [[X25:%.*]], [[X22:%.*]] ]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OrigFlags is the flags of the outer add. I can't see any way it's correct to use those flags for a narrower add.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Even if both the outer and inner additions are nsw/nuw, that isn't enough for this transform. See https://alive2.llvm.org/ce/z/DFLN-5 .)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it should not pass the original flags. Updated to pass AnyWrapFlags.

This checks adding the truncated first operand and the inner expression of the ZExt does not wrap. Updated proof: https://alive2.llvm.org/ce/z/q7d303

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the indvars tests, can you add a scalar-evolution analysis test that shows the flags in a dump?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, added an analysis only test, thanks!

fhahn added a commit that referenced this pull request Jul 30, 2025
fhahn added 2 commits July 30, 2025 10:05
Try to push the constant operand into a ZExt:
  A + zext (-A + B) -> zext (B), if trunc (A) + -A + B  does not unsigned-wrap.

The actual code supports ZExts with arbitrary number of arguments, hence
the getAddExpr in the return.

This helps SCEV reasoning in some cases, commonly when adding an offset
to a zero-extended SCEV that subtracts the same offset.

Note that this is restricted to cases where we can fold away an operand
of the inner Add. This is needed to avoid bad interactions with patterns
when forming ZExts, which try to push to ZExt to add operands.

https://alive2.llvm.org/ce/z/uuYGC3k
@fhahn fhahn force-pushed the perf/scev-push-const-op-into-zext branch from df16937 to e834442 Compare July 30, 2025 09:05
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 30, 2025
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fhahn fhahn merged commit d74d841 into llvm:main Jul 30, 2025
9 checks passed
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 30, 2025
…zext (B) (#151227)

Try to push the constant operand into a ZExt:
A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not
unsigned-wrap.

The actual code supports ZExts with arbitrary number of arguments, hence
the getAddExpr in the return.

This helps SCEV reasoning in some cases, commonly when adding an offset
to a zero-extended SCEV that subtracts the same offset.

Note that this is restricted to cases where we can fold away an operand
of the inner Add. This is needed to avoid bad interactions with patterns
when forming ZExts, which try to push to ZExt to add operands.

https://alive2.llvm.org/ce/z/q7d303

PR: llvm/llvm-project#151227
// (B), if trunc (A) + -A + B does not unsigned-wrap.
if (auto *ZExt = dyn_cast<SCEVZeroExtendExpr>(Ops[1])) {
const SCEV *B = ZExt->getOperand(0);
const SCEV *NarrowA = getTruncateExpr(A, B->getType());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check isa<SCEVAddExpr>(B) before computing getTruncateExpr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, should be done in ab9b23c by introducing a new m_scec_Add() which matches and captures SCEVAddExpr with an abitrary number of operands

fhahn added a commit that referenced this pull request Jul 31, 2025
Follow-up to
#151227 (review)
to check the inner expression is an Add before calling getTruncateExpr.

Adds a new matcher that just matches and captures SCEVAddExpr, to
support matching a SCEVAddExpr with arbitrary number of operands.
@fhahn fhahn deleted the perf/scev-push-const-op-into-zext branch July 31, 2025 11:48
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 31, 2025
Follow-up to
llvm/llvm-project#151227 (review)
to check the inner expression is an Add before calling getTruncateExpr.

Adds a new matcher that just matches and captures SCEVAddExpr, to
support matching a SCEVAddExpr with arbitrary number of operands.
fhahn added a commit to fhahn/llvm-project that referenced this pull request Aug 26, 2025
Try to push constant multiply operand into a ZExt containing an add, if
possible. In general we are trying to push down ops through ZExt if
possible. This is similar to llvm#151227
which did the same for additions.

For now this is restricted to adds with a constant operand, which is
similar to some of the logic above.

This enables some additional simplifications.

Alive2 Proof: https://alive2.llvm.org/ce/z/97pbSL
fhahn added a commit that referenced this pull request Aug 26, 2025
…#155300)

Try to push constant multiply operand into a ZExt containing an add, if
possible. In general we are trying to push down ops through ZExt if
possible. This is similar to
#151227 which did the same for
additions.

For now this is restricted to adds with a constant operand, which is
similar to some of the logic above.

This enables some additional simplifications.

Alive2 Proof: https://alive2.llvm.org/ce/z/97pbSL

PR: #155300
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Aug 26, 2025
…(A*C + B*C) (#155300)

Try to push constant multiply operand into a ZExt containing an add, if
possible. In general we are trying to push down ops through ZExt if
possible. This is similar to
llvm/llvm-project#151227 which did the same for
additions.

For now this is restricted to adds with a constant operand, which is
similar to some of the logic above.

This enables some additional simplifications.

Alive2 Proof: https://alive2.llvm.org/ce/z/97pbSL

PR: llvm/llvm-project#155300
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 4, 2025
Adds a SCEV-only tests for
llvm#151227.

(cherry picked from commit c6f7fa7)
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 4, 2025
…lvm#151227)

Try to push the constant operand into a ZExt:
A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not
unsigned-wrap.

The actual code supports ZExts with arbitrary number of arguments, hence
the getAddExpr in the return.

This helps SCEV reasoning in some cases, commonly when adding an offset
to a zero-extended SCEV that subtracts the same offset.

Note that this is restricted to cases where we can fold away an operand
of the inner Add. This is needed to avoid bad interactions with patterns
when forming ZExts, which try to push to ZExt to add operands.

https://alive2.llvm.org/ce/z/q7d303

PR: llvm#151227
(cherry picked from commit d74d841)
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 4, 2025
Follow-up to
llvm#151227 (review)
to check the inner expression is an Add before calling getTruncateExpr.

Adds a new matcher that just matches and captures SCEVAddExpr, to
support matching a SCEVAddExpr with arbitrary number of operands.

(cherry picked from commit ab9b23c)
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 4, 2025
…llvm#155300)

Try to push constant multiply operand into a ZExt containing an add, if
possible. In general we are trying to push down ops through ZExt if
possible. This is similar to
llvm#151227 which did the same for
additions.

For now this is restricted to adds with a constant operand, which is
similar to some of the logic above.

This enables some additional simplifications.

Alive2 Proof: https://alive2.llvm.org/ce/z/97pbSL

PR: llvm#155300

(cherry-picked from 9143746)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants