Draft
22 changes: 19 additions & 3 deletions llvm/lib/Analysis/ScalarEvolutionDivision.cpp
@@ -141,10 +141,26 @@ void SCEVDivision::visitAddRecExpr(const SCEVAddRecExpr *Numerator) {
if (Ty != StartQ->getType() || Ty != StartR->getType() ||
Ty != StepQ->getType() || Ty != StepR->getType())
return cannotDivide(Numerator);

+ // Infer no-wrap flags for Remainder.
+ // TODO: Catch more cases.
+ SCEV::NoWrapFlags NumFlags = Numerator->getNoWrapFlags();
+ SCEV::NoWrapFlags RemFlags = SCEV::NoWrapFlags::FlagAnyWrap;
+ const SCEV *StepNumAbs =
+ SE.getAbsExpr(Numerator->getStepRecurrence(SE), /*IsNSW=*/false);
+ const SCEV *StepRAbs = SE.getAbsExpr(StepR, /*IsNSW=*/false);
+ const Loop *L = Numerator->getLoop();
+
+ // If abs(StepR) <=u abs(StepNumAbs) and both are loop invariant, propagate
Member

@Meinersbur Meinersbur Aug 21, 2025

Suggested change
- // If abs(StepR) <=u abs(StepNumAbs) and both are loop invariant, propagate
+ // If abs(StepR) <=u abs(StepNum) and both are loop invariant, propagate

StepNumAbs is already the absolute value.

Add reasoning here? For example: "Since the denominator cannot be zero, abs(Denom) >= 1, so the range of the SCEVAddRec can only shrink. That is, if the range did not exceed the size of the integer type's domain (i.e. no self-wrap) before the division, it will not self-wrap after it."

I think the deduction is more general: it only needs the numerator to be NW and the denominator to be loop-invariant.
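
The range argument above can be sketched with a toy model. This is standalone C++ rather than LLVM code; the helper names `isNoSelfWrap8` and `remainderInheritsNW` are hypothetical, and the 8-bit "total distance below 2^8" test is a simplifying assumption of this sketch, not ScalarEvolution's actual definition of `<nw>`.

```cpp
// Simplified model (an assumption of this sketch, not LLVM's definition):
// an 8-bit add recurrence {Start,+,Step} that takes `Iters` backedges is
// "no self-wrap" when the total distance |Step| * Iters stays below 2^8,
// i.e. it never travels a full lap around the integer domain.
bool isNoSelfWrap8(int Step, unsigned Iters) {
  unsigned AbsStep = Step < 0 ? -Step : Step;
  return AbsStep * Iters < 256u;
}

// Mirrors the shape of the condition in the patch: the numerator is
// no-self-wrap and |StepR| <= |StepNum|. Under the model above the remainder
// then travels no farther than the numerator, so it cannot self-wrap either.
bool remainderInheritsNW(int StepNum, int StepR, unsigned Iters) {
  unsigned AbsNum = StepNum < 0 ? -StepNum : StepNum;
  unsigned AbsR = StepR < 0 ? -StepR : StepR;
  return isNoSelfWrap8(StepNum, Iters) && AbsR <= AbsNum;
}
```

For instance, with a numerator step of 5 and 50 backedges the model reports no self-wrap, and a remainder step of 2 inherits the property while a remainder step of 7 does not. The model ignores loop invariance, which the patch checks separately.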

Contributor Author

Thanks for the comments.

I think the deduction is more general: it only needs the numerator to be NW and the denominator to be loop-invariant.

This seems correct, but I realized that there are no checks to ensure that Denominator is non-zero in the first place. Anyway, it looks like I should fix the other parts first, so I’ll start with those.

Member

Division by zero must either be ruled out by the caller, or the code being processed has undefined behaviour. In either case, there is not much SCEVDivision can do when asked to represent a division by zero.

Contributor Author

This is currently used in Delinearization and does not correspond to actual div instructions in LLVM IR. That is, given a multiplication like %m * %n, an invocation like divide(%m * %n, %m) can happen without a non-zero check. Anyway, in such a case the division seems ill-defined.
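
To make the zero-denominator concern concrete, here is a small standalone sketch. It is plain C++, not LLVM API; the helper `wrapsUnsigned8` is hypothetical and the values are scaled down to 8 bits for readability. It models a numerator whose step is `2 * d` with `d == 0`, and the quotient `{0,+,2}` that a symbolic division by `d` would produce.

```cpp
#include <cstdint>

// Returns true if the 8-bit recurrence {0,+,Step} leaves [0, 255], i.e. wraps
// in the unsigned sense, at some point during `Iters` backedge-taken
// iterations.
bool wrapsUnsigned8(uint8_t Step, unsigned Iters) {
  unsigned Value = 0; // wide accumulator so the overflow stays observable
  for (unsigned I = 0; I < Iters; ++I) {
    Value += Step;
    if (Value > 255u)
      return true;
  }
  return false;
}
```

With 254 backedges (the 8-bit analogue of iterating up to the maximum unsigned value), the numerator step `2 * 0 = 0` never wraps, whereas the quotient step `2` wraps once 128 iterations have run. Copying the numerator's flags onto the quotient would therefore be unsound in this situation.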

Contributor

Something we need to be careful about is that SCEV flags are global, not use-site specific. So even if we know that division by zero is not possible in context, we may not be able to assume it for the purpose of flag propagation. (I haven't checked whether this is a problem for the delinearization case or not.)

+ // the <NW> from Numerator to Remainder.
+ if (ScalarEvolution::hasFlags(NumFlags, SCEV::NoWrapFlags::FlagNW) &&
+ SE.isLoopInvariant(StepNumAbs, L) && SE.isLoopInvariant(StepRAbs, L) &&
+ SE.isKnownPredicate(ICmpInst::ICMP_ULE, StepRAbs, StepNumAbs))
+ RemFlags = ScalarEvolution::setFlags(RemFlags, SCEV::NoWrapFlags::FlagNW);

Quotient = SE.getAddRecExpr(StartQ, StepQ, Numerator->getLoop(),
- Numerator->getNoWrapFlags());
- Remainder = SE.getAddRecExpr(StartR, StepR, Numerator->getLoop(),
- Numerator->getNoWrapFlags());
+ SCEV::NoWrapFlags::FlagAnyWrap);
+ Remainder = SE.getAddRecExpr(StartR, StepR, Numerator->getLoop(), RemFlags);
}

void SCEVDivision::visitAddExpr(const SCEVAddExpr *Numerator) {
2 changes: 1 addition & 1 deletion llvm/test/Analysis/Delinearization/fixed_size_array.ll
@@ -163,7 +163,7 @@ exit:
; CHECK: Delinearization on function a_i_2j1_k:
; CHECK: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][4][64] with elements of 4 bytes.
- ; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><%for.j.header>][{32,+,1}<nw><%for.k>]
+ ; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<%for.j.header>][{32,+,1}<%for.k>]
define void @a_i_2j1_k(ptr %a) {
entry:
br label %for.i.header
130 changes: 130 additions & 0 deletions llvm/test/Analysis/Delinearization/wraps.ll
@@ -0,0 +1,130 @@
; RUN: opt < %s -passes='print<delinearization>' -disable-output 2>&1 | FileCheck %s

; In the following case, we don't know the concrete value of `m`, so we cannot
; deduce no-wrap behavior for the quotient/remainder divided by `m`. However,
; we can infer `{0,+,1}<%loop>` is nuw and nsw from the induction variable.
;
; for (int i = 0; i < btc; i++)
; a[i * (m + 42)] = 0;

; CHECK: AccessFunction: {0,+,(42 + %m)}<nuw><nsw><%loop>
; CHECK-NEXT: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][%m] with elements of 1 bytes.
; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%loop>][{0,+,42}<%loop>]
define void @divide_by_m0(ptr %a, i64 %m, i64 %btc) {
entry:
%stride = add nsw nuw i64 %m, 42
br label %loop

loop:
%i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
%offset = phi i64 [ 0, %entry ], [ %offset.next, %loop ]
%idx = getelementptr inbounds i8, ptr %a, i64 %offset
store i8 0, ptr %idx
%i.next = add nsw nuw i64 %i, 1
%offset.next = add nsw nuw i64 %offset, %stride
%cond = icmp eq i64 %i.next, %btc
br i1 %cond, label %exit, label %loop

exit:
ret void
}

; In the following case, we don't know the concrete value of `m`, so we cannot
; deduce no-wrap behavior for the quotient/remainder divided by `m`. Also, we
; don't infer nsw/nuw from the induction variable in this case.
;
; for (int i = 0; i < btc; i++)
; a[i * (2 * m + 42)] = 0;

; CHECK: AccessFunction: {0,+,(42 + (2 * %m))}<nuw><nsw><%loop>
; CHECK-NEXT: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][%m] with elements of 1 bytes.
; CHECK-NEXT: ArrayRef[{0,+,2}<%loop>][{0,+,42}<%loop>]
define void @divide_by_m2(ptr %a, i64 %m, i64 %btc) {
entry:
%m2 = add nsw nuw i64 %m, %m
%stride = add nsw nuw i64 %m2, 42
br label %loop

loop:
%i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
%offset = phi i64 [ 0, %entry ], [ %offset.next, %loop ]
%idx = getelementptr inbounds i8, ptr %a, i64 %offset
store i8 0, ptr %idx
%i.next = add nsw nuw i64 %i, 1
%offset.next = add nsw nuw i64 %offset, %stride
%cond = icmp eq i64 %i.next, %btc
br i1 %cond, label %exit, label %loop

exit:
ret void
}

; In the following case, the `i * 2 * d` is always zero, so it's nsw and nuw.
; However, the quotient divided by `d` is neither nsw nor nuw.
;
; if (d == 0)
; for (unsigned long long i = 0; i != UINT64_MAX; i++)
; a[i * 2 * d] = 42;

; CHECK: AccessFunction: {0,+,(2 * %d)}<nuw><nsw><%loop>
; CHECK-NEXT: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][%d] with elements of 1 bytes.
; CHECK-NEXT: ArrayRef[{0,+,2}<%loop>][0]
define void @divide_by_zero(ptr %a, i64 %d) {
entry:
%guard = icmp eq i64 %d, 0
br i1 %guard, label %loop.preheader, label %exit

loop.preheader:
%stride = mul nsw nuw i64 %d, 2 ; since %d is 0, %stride is also 0
br label %loop

loop:
%i = phi i64 [ 0, %loop.preheader ], [ %i.next, %loop ]
%offset = phi i64 [ 0, %loop.preheader ], [ %offset.next, %loop ]
%idx = getelementptr inbounds i8, ptr %a, i64 %offset
store i8 42, ptr %idx
%i.next = add nuw i64 %i, 1
%offset.next = add nsw nuw i64 %offset, %stride
%cond = icmp eq i64 %i.next, -1
br i1 %cond, label %exit, label %loop

exit:
ret void
}

; In the following case, the `i * (d + 1)` is always zero, so it's nsw and nuw.
; However, the quotient/remainder divided by `d` is not nsw.
;
; if (d == UINT64_MAX)
; for (unsigned long long i = 0; i != d; i++)
; a[i * (d + 1)] = 42;

; CHECK: AccessFunction: {0,+,(1 + %d)}<nuw><nsw><%loop>
; CHECK-NEXT: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][%d] with elements of 1 bytes.
; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><%loop>][{0,+,1}<nuw><%loop>]
define void @divide_by_minus_one(ptr %a, i64 %d) {
entry:
%guard = icmp eq i64 %d, -1
br i1 %guard, label %loop.preheader, label %exit

loop.preheader:
%stride = add nsw i64 %d, 1 ; since %d is -1, %stride is 0
br label %loop

loop:
%i = phi i64 [ 0, %loop.preheader ], [ %i.next, %loop ]
%offset = phi i64 [ 0, %loop.preheader ], [ %offset.next, %loop ]
%idx = getelementptr inbounds i8, ptr %a, i64 %offset
store i8 42, ptr %idx
%i.next = add nuw i64 %i, 1
%offset.next = add nsw nuw i64 %offset, %stride
%cond = icmp eq i64 %i.next, %d
br i1 %cond, label %exit, label %loop

exit:
ret void
}
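
The `divide_by_minus_one` expectation (`<nuw>` kept, `<nsw>` dropped) can be checked by hand with a scaled-down model. The sketch below is standalone C++, not LLVM code; `WrapInfo` and `classifyWrap8` are hypothetical names, and 8-bit arithmetic stands in for `i64`, so `d == 255` plays the role of `UINT64_MAX`.

```cpp
#include <cstdint>

struct WrapInfo {
  bool UnsignedWrap; // value left [0, 255]
  bool SignedWrap;   // value left [0, 127]
};

// Walks the 8-bit recurrence {0,+,Step} for `Iters` backedges and records
// which kind of wrap it hits. Assumes a small non-negative Step.
WrapInfo classifyWrap8(uint8_t Step, unsigned Iters) {
  WrapInfo Info{false, false};
  unsigned Value = 0; // wide accumulator so the overflow stays observable
  for (unsigned I = 0; I < Iters; ++I) {
    Value += Step;
    if (Value > 255u)
      Info.UnsignedWrap = true;
    if (Value > 127u)
      Info.SignedWrap = true;
  }
  return Info;
}
```

With `d == 255` the loop takes 254 backedges. The numerator step `d + 1` is 0 modulo 256, so it never wraps, while the step-1 quotient and remainder reach 254: inside the unsigned range but beyond the signed maximum of 127.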
6 changes: 4 additions & 2 deletions llvm/test/Analysis/DependenceAnalysis/DADelin.ll
@@ -479,14 +479,16 @@ for.cond.cleanup: ; preds = %for.cond.cleanup3,
;; for (int k = 1; k < o; k++)
;; = A[i*m*o + j*o + k]
;; A[i*m*o + j*o + k - 1] =
+ ;;
+ ;; FIXME: Currently fails to infer nsw for the SCEV `{0,+,1}<for.body8>`
define void @t8(i32 %n, i32 %m, i32 %o, ptr nocapture %A) {
; CHECK-LABEL: 't8'
; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx, align 4 --> Dst: %0 = load i32, ptr %arrayidx, align 4
; CHECK-NEXT: da analyze - none!
; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx, align 4 --> Dst: store i32 %add12, ptr %arrayidx2, align 4
- ; CHECK-NEXT: da analyze - consistent anti [0 0 1]!
+ ; CHECK-NEXT: da analyze - anti [* * *|<]!
; CHECK-NEXT: Src: store i32 %add12, ptr %arrayidx2, align 4 --> Dst: store i32 %add12, ptr %arrayidx2, align 4
- ; CHECK-NEXT: da analyze - none!
+ ; CHECK-NEXT: da analyze - output [* * *]!
Comment on lines +482 to +491
Contributor Author

The essential portion of the IR is as follows:

preheader:
  %guard = icmp sgt i32 %o, 0
  br i1 %guard, label %loop, label %exit

loop:
  %i = phi i32 [ 1, %preheader ], [ %inc, %loop ]
  ...
  %inc = add nuw nsw i32 %i, 1
  %exitcond = icmp eq i32 %inc, %o
  br i1 %exitcond, label %exit, label %body

exit:
  ...

From the loop-guard and the induction variable, we know the following:

  • 0 <s %o
  • {1,+,1}<%loop> <s %o
  • {1,+,1}<%loop> is nsw and nuw

IIUC, in principle we can deduce that {0,+,1}<%loop> is also nsw and nuw, but the inference currently fails. Is there a good way to address this case, or do we need to implement a separate inference specifically for it?
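
As a sanity check of the "in principle" claim only, and not a proposal for how ScalarEvolution should infer it, the property can be brute-forced in a scaled-down setting. The sketch below is standalone C++ with a hypothetical helper name; it uses 8-bit signed bounds instead of `i32` and assumes the source-level loop shape `for (i = 1; i < o; i++)`.

```cpp
// For every 8-bit signed bound O > 1, walk `for (i = 1; i < O; i++)` and check
// that the shifted recurrence {0,+,1} (that is, i - 1) stays inside [0, 127],
// so it wraps neither in the signed nor in the unsigned sense.
bool shiftedRecurrenceNeverWraps8() {
  for (int O = 2; O <= 127; ++O) {
    for (int I = 1; I < O; ++I) {
      int Shifted = I - 1; // value of {0,+,1} in the same iteration
      if (Shifted < 0 || Shifted > 127)
        return false;
    }
  }
  return true;
}
```

The check passes because `{0,+,1}` is always one less than `{1,+,1}`, which the guard and exit condition already bound by `O - 1`.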

;
entry:
%cmp49 = icmp sgt i32 %n, 0