Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WRONG code: GVN? Loop opts? #114879

Closed
JonPsson1 opened this issue Nov 4, 2024 · 4 comments · Fixed by #115109
Closed

WRONG code: GVN? Loop opts? #114879

JonPsson1 opened this issue Nov 4, 2024 · 4 comments · Fixed by #115109
Assignees
Labels

Comments

@JonPsson1
Copy link
Contributor

cat wrong0.i

int printf(const char *, ...);
unsigned int IntArr[6];
unsigned int GlobIntONE = 1, GlobIntZERO = 0;
const unsigned int *GlobIntPtr = 0;
unsigned long Res;
unsigned long *ResPtr = &Res;
const unsigned int **GlobIntPtrPTr = &GlobIntPtr;
unsigned int *func() {
  int *GlobIntONEPtr = &GlobIntONE;
  for (int Idx = 0; Idx <= 7; Idx += 1) {
    int Idx2 = 1;
    if (Idx > 0) {
      for (; Idx2 <= 7; Idx2 += 1)
        ;
      return GlobIntONEPtr;
    }
  }
  0 != &GlobIntONEPtr;
  return &GlobIntZERO;
}

int main() {
  IntArr[GlobIntZERO] = GlobIntZERO;
  *GlobIntPtrPTr = func();
  unsigned char Byte = *GlobIntPtr;
  *ResPtr = Byte;
  printf("checksum = %X\n", Res);
}

In the above file, func() should obviously should return GlobIntONEPtr, but it does not if compiled with '-fno-inline -mllvm -unroll-full-max-count=1'.

clang -O0 -march=z16 wrong0.i -o a.out -w -fno-inline  ; ./a.out
checksum = 1

clang -O3 -march=z16 wrong0.i -o a.out -w -fno-inline  ; ./a.out
checksum = 1

clang -O3 -march=z16 wrong0.i -o a.out -w -fno-inline  -mllvm -unroll-full-max-count=1 ; ./a.out
checksum = 0

It may be that GVN does something wrong, as after it @func() is always returning ptr undef. Or it could very well be some of the loop optimizations that is run just before it.

A bisect shows that

d77067d08a3f56dc2d0e6c95bd2852c943df743a is the first bad commit
commit d77067d08a3f56dc2d0e6c95bd2852c943df743a
Author: Nikita Popov <[email protected]>
Date:   Wed Dec 6 14:17:18 2023 +0100

    [ValueTracking] Add dominating condition support in computeKnownBits() (#73662)

@nikic

@dtcxzyw dtcxzyw self-assigned this Nov 5, 2024
@dtcxzyw
Copy link
Member

dtcxzyw commented Nov 5, 2024

Bisected to LoopUnrollFullPass:

; bin/opt -passes=loop-unroll-full -unroll-full-max-count=1 test.ll -S

@GlobIntONE = dso_local global i32 1, align 4
@GlobIntZERO = dso_local local_unnamed_addr global i32 0, align 4

define i32 @main() local_unnamed_addr {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %cleanup
  %retval.0 = phi ptr [ undef, %entry ], [ %retval.2, %cleanup ]
  %cmp1.not = phi i1 [ true, %entry ], [ false, %cleanup ]
  br i1 %cmp1.not, label %cleanup, label %for.cond2.preheader

for.cond2.preheader:                              ; preds = %for.body
  br label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.cond2.preheader
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  %retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ]
  br i1 %cmp1.not, label %for.body, label %cleanup7

cleanup7:                                         ; preds = %cleanup
  %retval.2.lcssa = phi ptr [ %retval.2, %cleanup ]
  %res = load i32, ptr %retval.2.lcssa, align 4
  ret i32 %res
}
source_filename = "test.ll"

@GlobIntONE = dso_local global i32 1, align 4
@GlobIntZERO = dso_local local_unnamed_addr global i32 0, align 4

define i32 @main() local_unnamed_addr {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %for.cond2.preheader.peel

for.cond2.preheader.peel:                         ; preds = %for.body.peel
  br label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.cond2.preheader.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  br i1 false, label %cleanup, label %for.cond2.preheader

for.cond2.preheader:                              ; preds = %for.body
  br label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.cond2.preheader
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  br i1 false, label %for.body, label %cleanup7.loopexit, !llvm.loop !0

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  %res = load i32, ptr %retval.2.lcssa, align 4
  ret i32 %res
}

!0 = distinct !{!0, !1}
!1 = !{!"llvm.loop.peeled.count", i32 1}

@dtcxzyw
Copy link
Member

dtcxzyw commented Nov 5, 2024

simplifyLoopAfterUnroll -> simplifyLoopIVs causes the miscompilation: https://alive2.llvm.org/ce/z/yiBsZt
I will post a fix later.

@dtcxzyw dtcxzyw added llvm:SCEV Scalar Evolution llvm:analysis labels Nov 5, 2024
@dtcxzyw
Copy link
Member

dtcxzyw commented Nov 5, 2024

The root cause is:

  1. simplifyInstruction(%retval.2.peel) returns @GlobIntONE. Thus, ScalarEvolution::createNodeForPHI returns SCEV expr @GlobIntONE for %retval.2.peel.
  2. SimplifyIndvar::replaceIVUserWithLoopInvariant tries to replace the use of %retval.2.peel in %retval.2.lcssa.ph with @GlobIntONE.
  3. SCEVExpander::expand reuses %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ] to generate code for @GlobIntONE. It is incorrect.

Two possible solutions:

  1. Set CanUseUndef to false in simplifyInstruction.
  2. Block this case in ScalarEvolution::canReuseInstruction.

@nikic @preames Any thoughts?

@nikic
Copy link
Contributor

nikic commented Nov 5, 2024

I think setting CanUseUndef=false would be the correct thing to do. The reverse replacement here is particularly fishy, but more generally, if we make an assumption about the value of undef at a particular use, we have to be consistent about that assumption, and SCEV cannot really guarantee that this is the case.

dtcxzyw added a commit that referenced this issue Nov 7, 2024
See the following case:
```
@GlobIntONE = global i32 0, align 4

define ptr @src() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  %retval.0 = phi ptr [ %retval.2.peel, %entry.peel.newph ], [ %retval.2, %cleanup ]
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  %retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ]
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}

define ptr @tgt() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}
```
1. `simplifyInstruction(%retval.2.peel)` returns `@GlobIntONE`. Thus,
`ScalarEvolution::createNodeForPHI` returns SCEV expr `@GlobIntONE` for
`%retval.2.peel`.
2. `SimplifyIndvar::replaceIVUserWithLoopInvariant` tries to replace the
use of `%retval.2.peel` in `%retval.2.lcssa.ph` with `@GlobIntONE`.
3. `simplifyLoopAfterUnroll -> simplifyLoopIVs -> SCEVExpander::expand`
reuses `%retval.2.peel = phi ptr [ undef, %for.body.peel ], [
@GlobIntONE, %cleanup.loopexit.peel ]` to generate code for
`@GlobIntONE`. It is incorrect.

This patch disallows simplifying `phi(undef, X)` to `X` by setting
`CanUseUndef` to false.
Closes #114879.
nikic added a commit that referenced this issue Nov 11, 2024
The expansion of a SCEVUnknown is trivial (it's just the wrapped value).
If we try to reuse an existing value it might be a more complex
expression that simplifies to the SCEVUnknown.

This is inspired by #114879,
because SCEVExpander replacing a constant with a phi node is just silly.
(I don't consider this a fix for that issue though.)
Groverkss pushed a commit to iree-org/llvm-project that referenced this issue Nov 15, 2024
The expansion of a SCEVUnknown is trivial (it's just the wrapped value).
If we try to reuse an existing value it might be a more complex
expression that simplifies to the SCEVUnknown.

This is inspired by llvm#114879,
because SCEVExpander replacing a constant with a phi node is just silly.
(I don't consider this a fix for that issue though.)
AZero13 pushed a commit to AZero13/llvm-project that referenced this issue Dec 5, 2024
See the following case:
```
@GlobIntONE = global i32 0, align 4

define ptr @src() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  %retval.0 = phi ptr [ %retval.2.peel, %entry.peel.newph ], [ %retval.2, %cleanup ]
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  %retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ]
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}

define ptr @tgt() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}
```
1. `simplifyInstruction(%retval.2.peel)` returns `@GlobIntONE`. Thus,
`ScalarEvolution::createNodeForPHI` returns SCEV expr `@GlobIntONE` for
`%retval.2.peel`.
2. `SimplifyIndvar::replaceIVUserWithLoopInvariant` tries to replace the
use of `%retval.2.peel` in `%retval.2.lcssa.ph` with `@GlobIntONE`.
3. `simplifyLoopAfterUnroll -> simplifyLoopIVs -> SCEVExpander::expand`
reuses `%retval.2.peel = phi ptr [ undef, %for.body.peel ], [
@GlobIntONE, %cleanup.loopexit.peel ]` to generate code for
`@GlobIntONE`. It is incorrect.

This patch disallows simplifying `phi(undef, X)` to `X` by setting
`CanUseUndef` to false.
Closes llvm#114879.

(cherry picked from commit 0b9f1cc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants