
[RegisterCoalescer] Fix SUBREG_TO_REG handling in the RegisterCoalescer. #96839

Merged
merged 4 commits
Jul 24, 2024

Conversation

stefanp-ibm
Contributor

@stefanp-ibm stefanp-ibm commented Jun 27, 2024

The issue with the handling of SUBREG_TO_REG is that we don't join the subranges correctly when we join live ranges across the SUBREG_TO_REG instruction. For example, when joining across this:

32B	  %2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit

we want to join these live ranges:

%0 [16r,32r:0) 0@16r  weight:0.000000e+00
%2 [32r,112r:0) 0@32r  weight:0.000000e+00

Before the fix the range for the resulting merged %2 is:

%2 [16r,112r:0) 0@16r  weight:0.000000e+00

After the fix it is now this:

%2 [16r,112r:0) 0@16r  L000000000000000F [16r,112r:0) 0@16r  weight:0.000000e+00
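
To make the before/after shapes above concrete, here is a small stand-alone toy model. This is plain C++ with illustrative stand-in structs, not LLVM's actual LiveRange/LiveInterval classes; only the [16,112) bounds and the 0xF lane mask are taken from the dumps above.

```
// Toy model of a live interval with optional per-lane subranges, used only
// to illustrate the before/after shapes quoted above.
#include <cstdint>
#include <cstdio>
#include <vector>

struct Segment { unsigned Start, End; };   // half-open [Start, End)

struct SubRange {
  uint64_t LaneMask;                       // e.g. 0x000000000000000F
  std::vector<Segment> Segments;
};

struct Interval {
  std::vector<Segment> Main;               // main live range
  std::vector<SubRange> Subs;              // per-lane subranges
};

int main() {
  // Before the fix: the main range is widened to [16,112), but nothing
  // records that only the sub_32bit lanes (mask 0xF) are actually defined.
  Interval PreFix{{{16, 112}}, {}};

  // After the fix: the same main range plus a lane-mask-0xF subrange
  // covering [16,112), which is what subreg-liveness verification expects.
  Interval PostFix{{{16, 112}}, {{0xF, {{16, 112}}}}};

  std::printf("pre-fix subranges: %zu, post-fix subranges: %zu\n",
              PreFix.Subs.size(), PostFix.Subs.size());
  return 0;
}
```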

Two tests are added with this fix. The X86 test fails without the patch.
The PowerPC test passes both with and without the patch, but is added as
a way to track possible future failures when register classes are changed
in a later patch.

@llvmbot
Collaborator

llvmbot commented Jun 27, 2024

@llvm/pr-subscribers-backend-x86

Author: Stefan Pintilie (stefanp-ibm)

Changes

Two tests are added with this fix. The X86 test fails without the patch.
The PowerPC test passes both with and without the patch, but is added as
a way to track possible future failures when register classes are changed
in a later patch.


Full diff: https://github.com/llvm/llvm-project/pull/96839.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/RegisterCoalescer.cpp (+8)
  • (added) llvm/test/CodeGen/PowerPC/subreg-coalescer.mir (+26)
  • (added) llvm/test/CodeGen/X86/subreg-fail.mir (+124)
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index 3397cb4f3661fe..75cdcf811fd401 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -3671,6 +3671,14 @@ bool RegisterCoalescer::joinVirtRegs(CoalescerPair &CP) {
     // having stale segments.
     LHSVals.pruneMainSegments(LHS, ShrinkMainRange);
 
+    LHSVals.pruneSubRegValues(LHS, ShrinkMask);
+    RHSVals.pruneSubRegValues(LHS, ShrinkMask);
+  } else if (TrackSubRegLiveness && !CP.getDstIdx() && CP.getSrcIdx()) {
+    LHS.createSubRangeFrom(LIS->getVNInfoAllocator(),
+                           CP.getNewRC()->getLaneMask(), LHS);
+    mergeSubRangeInto(LHS, RHS, TRI->getSubRegIndexLaneMask(CP.getSrcIdx()), CP,
+                      CP.getDstIdx());
+    LHSVals.pruneMainSegments(LHS, ShrinkMainRange);
     LHSVals.pruneSubRegValues(LHS, ShrinkMask);
     RHSVals.pruneSubRegValues(LHS, ShrinkMask);
   }
diff --git a/llvm/test/CodeGen/PowerPC/subreg-coalescer.mir b/llvm/test/CodeGen/PowerPC/subreg-coalescer.mir
new file mode 100644
index 00000000000000..8b04997ff335f7
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/subreg-coalescer.mir
@@ -0,0 +1,26 @@
+# RUN: llc -mtriple powerpc64le-unknown-linux-gnu -mcpu=pwr8 -x mir < %s \
+# RUN:   -verify-machineinstrs --run-pass=register-coalescer -o - | FileCheck %s
+
+---
+name: check_subregs
+alignment:       16
+tracksRegLiveness: true
+body:             |
+  bb.0.entry:
+    liveins: $x3
+
+    %0:g8rc_and_g8rc_nox0 = COPY $x3
+    %3:f8rc, %4:g8rc_and_g8rc_nox0 = LFSUX %0, %0
+    %5:f4rc = FRSP killed %3, implicit $rm
+    %22:vslrc = SUBREG_TO_REG 1, %5, %subreg.sub_64
+    %11:vrrc = XVCVDPSP killed %22, implicit $rm
+    $v2 = COPY %11
+    BLR8 implicit $lr8, implicit $rm, implicit $v2
+...
+
+# CHECK:         %0:g8rc_and_g8rc_nox0 = COPY $x3
+# CHECK-NEXT:    %1:f8rc, dead %2:g8rc_and_g8rc_nox0 = LFSUX %0, %0
+# CHECK-NEXT:    undef %4.sub_64:vslrc = FRSP %1, implicit $rm
+# CHECK-NEXT:    %5:vrrc = XVCVDPSP %4, implicit $rm
+# CHECK-NEXT:    $v2 = COPY %5
+# CHECK-NEXT:    BLR8 implicit $lr8, implicit $rm, implicit $v2
diff --git a/llvm/test/CodeGen/X86/subreg-fail.mir b/llvm/test/CodeGen/X86/subreg-fail.mir
new file mode 100644
index 00000000000000..a563141d59c9fe
--- /dev/null
+++ b/llvm/test/CodeGen/X86/subreg-fail.mir
@@ -0,0 +1,124 @@
+# RUN: llc -mtriple x86_64-unknown-unknown -x mir < %s \
+# RUN:   -verify-machineinstrs -enable-subreg-liveness=true \
+# RUN:   --run-pass=register-coalescer -o - | FileCheck %s
+
+--- |
+  target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+  target triple = "x86_64-unknown-unknown"
+  
+  %pair = type { i64, double }
+  %t21 = type { ptr }
+  %t13 = type { ptr, %t15, %t15 }
+  %t15 = type { i8, i32, i32 }
+  
+  @__force_order = external hidden global i32, align 4
+  @.str = private unnamed_addr constant { [1 x i8], [63 x i8] } zeroinitializer, align 32
+  @a = external global i32, align 4
+  @fn1.g = private unnamed_addr constant [9 x ptr] [ptr null, ptr @a, ptr null, ptr null, ptr null, ptr null, ptr null, ptr null, ptr null], align 16
+  @e = external global i32, align 4
+  @__stack_chk_guard = external dso_local global ptr
+  
+  ; Function Attrs: nounwind ssp
+  define i32 @test1() #0 {
+  entry:
+    %tmp5.i = load volatile i32, ptr undef, align 4
+    %conv.i = zext i32 %tmp5.i to i64
+    %tmp12.i = load volatile i32, ptr undef, align 4
+    %conv13.i = zext i32 %tmp12.i to i64
+    %shl.i = shl i64 %conv13.i, 32
+    %or.i = or i64 %shl.i, %conv.i
+    %add16.i = add i64 %or.i, 256
+    %shr.i = lshr i64 %add16.i, 8
+    %conv19.i = trunc i64 %shr.i to i32
+    store volatile i32 %conv19.i, ptr undef, align 4
+    ret i32 undef
+  }
+...
+---
+name:            test1
+alignment:       16
+exposesReturnsTwice: false
+legalized:       false
+regBankSelected: false
+selected:        false
+failedISel:      false
+tracksRegLiveness: true
+hasWinCFI:       false
+callsEHReturn:   false
+callsUnwindInit: false
+hasEHCatchret:   false
+hasEHScopes:     false
+hasEHFunclets:   false
+isOutlined:      false
+debugInstrRef:   true
+failsVerification: false
+tracksDebugUserValues: false
+registers:
+  - { id: 0, class: gr32, preferred-register: '' }
+  - { id: 1, class: gr64, preferred-register: '' }
+  - { id: 2, class: gr64_nosp, preferred-register: '' }
+  - { id: 3, class: gr32, preferred-register: '' }
+  - { id: 4, class: gr64, preferred-register: '' }
+  - { id: 5, class: gr64, preferred-register: '' }
+  - { id: 6, class: gr64, preferred-register: '' }
+  - { id: 7, class: gr64, preferred-register: '' }
+  - { id: 8, class: gr64, preferred-register: '' }
+  - { id: 9, class: gr32, preferred-register: '' }
+  - { id: 10, class: gr64, preferred-register: '' }
+  - { id: 11, class: gr32, preferred-register: '' }
+liveins:         []
+frameInfo:
+  isFrameAddressTaken: false
+  isReturnAddressTaken: false
+  hasStackMap:     false
+  hasPatchPoint:   false
+  stackSize:       0
+  offsetAdjustment: 0
+  maxAlignment:    1
+  adjustsStack:    false
+  hasCalls:        false
+  stackProtector:  ''
+  functionContext: ''
+  maxCallFrameSize: 4294967295
+  cvBytesOfCalleeSavedRegisters: 0
+  hasOpaqueSPAdjustment: false
+  hasVAStart:      false
+  hasMustTailInVarArgFunc: false
+  hasTailCall:     false
+  isCalleeSavedInfoValid: false
+  localFrameSize:  0
+  savePoint:       ''
+  restorePoint:    ''
+fixedStack:      []
+stack:           []
+entry_values:    []
+callSites:       []
+debugValueSubstitutions: []
+constants:       []
+machineFunctionInfo:
+  amxProgModel:    None
+body:             |
+  bb.0.entry:
+    %0:gr32 = MOV32rm undef %1:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
+    %2:gr64_nosp = SUBREG_TO_REG 0, killed %0, %subreg.sub_32bit
+    %3:gr32 = MOV32rm undef %4:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
+    %5:gr64 = SUBREG_TO_REG 0, killed %3, %subreg.sub_32bit
+    %6:gr64 = COPY killed %5
+    %6:gr64 = SHL64ri %6, 32, implicit-def dead $eflags
+    %7:gr64 = LEA64r killed %6, 1, killed %2, 256, $noreg
+    %8:gr64 = COPY killed %7
+    %8:gr64 = SHR64ri %8, 8, implicit-def dead $eflags
+    %9:gr32 = COPY killed %8.sub_32bit
+    MOV32mr undef %10:gr64, 1, $noreg, 0, $noreg, killed %9 :: (volatile store (s32) into `ptr undef`)
+    RET 0, undef $eax
+
+...
+
+# CHECK:         undef %2.sub_32bit:gr64_nosp = MOV32rm undef %1:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
+# CHECK-NEXT:    undef %6.sub_32bit:gr64_with_sub_8bit = MOV32rm undef %4:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
+# CHECK-NEXT:    %6:gr64_with_sub_8bit = SHL64ri %6, 32, implicit-def dead $eflags
+# CHECK-NEXT:    %8:gr64_with_sub_8bit = LEA64r %6, 1, %2, 256, $noreg
+# CHECK-NEXT:    %8:gr64_with_sub_8bit = SHR64ri %8, 8, implicit-def dead $eflags
+# CHECK-NEXT:    MOV32mr undef %10:gr64, 1, $noreg, 0, $noreg, %8.sub_32bit :: (volatile store (s32) into `ptr undef`)
+# CHECK-NEXT:    RET 0, undef $eax
+

@llvmbot
Collaborator

llvmbot commented Jun 27, 2024

@llvm/pr-subscribers-llvm-regalloc

@llvmbot
Collaborator

llvmbot commented Jun 27, 2024

@llvm/pr-subscribers-backend-powerpc

@bzEq
Collaborator

bzEq commented Jun 27, 2024

I think we should pre-commit the X86 test case to show where failure occurs.

@arsenm
Contributor

arsenm commented Jun 27, 2024

Title should be descriptive of what the problem is, not a vague "fix issue"

alignment: 16
tracksRegLiveness: true
body: |
bb.0.entry:
Contributor

Remove .entry block labels

liveins: $x3

%0:g8rc_and_g8rc_nox0 = COPY $x3
%3:f8rc, %4:g8rc_and_g8rc_nox0 = LFSUX %0, %0
Contributor

Can you run run-pass=none to compact the virtual register numbers

@@ -0,0 +1,26 @@
# RUN: llc -mtriple powerpc64le-unknown-linux-gnu -mcpu=pwr8 -x mir < %s \
# RUN: -verify-machineinstrs --run-pass=register-coalescer -o - | FileCheck %s
Contributor

-verify-coalescing would be more useful than -verify-machineinstrs. Also, should use update_mir_test_checks

Contributor

Also not using stdin and directly reading the file avoids the need for -x mir

@qcolombet
Collaborator

Title should be descriptive of what the problem is, not a vague "fix issue"

+1 on that.

Could you describe what the issue was and how this patch fixes it?

Contributor

@arsenm arsenm left a comment

Test looks better but title and description still need details

# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple powerpc64le-unknown-linux-gnu -mcpu=pwr8 -x mir < %s \
# RUN: -verify-coalescing --run-pass=register-coalescer -o - | FileCheck %s

Contributor

Can you add a comment explaining the test?

# RUN: llc -mtriple x86_64-unknown-unknown -x mir < %s \
# RUN: -verify-coalescing -enable-subreg-liveness=true \
# RUN: --run-pass=register-coalescer -o - | FileCheck %s

Contributor

Can you add a comment explaining the test?

@stefanp-ibm stefanp-ibm changed the title [RegisterCoalescer] Fix issue in the RegisterCoalescer. [RegisterCoalescer] Fix SUBREG_TO_REG handling in the RegisterCoalescer. Jun 27, 2024
@stefanp-ibm
Contributor Author

I think we should pre-commit the X86 test case to show where failure occurs.

I can do this once the reviewers are happy with the test cases. For the X86 test case I will probably just use -enable-subreg-liveness=false so that it actually passes, and then set it to true when this patch goes in.

@qcolombet
Collaborator

%2 [32r,112r:0) 0@32r L000000000000000F [16r,112r:0) 0@16r weight:0.000000e+00

What were we generating previously?

Also, shouldn't it be L000000000000000F [16r,32r:0) 0@16r, i.e., 32r instead of 112r?

@bzEq
Collaborator

bzEq commented Jun 28, 2024

%2 [32r,112r:0) 0@32r L000000000000000F [16r,112r:0) 0@16r weight:0.000000e+00

Is it a typo? The main live range should not be shorter than the subrange. I tried this patch; it looks like it should be
Before coalescing

%2 [32r,112r:0) 0@32r  weight:0.000000e+00

After coalescing

%2 [16r,112r:0) 0@16r  L000000000000000F [16r,112r:0) 0@16r  weight:0.000000e+00

During coalescing,

32B %2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit                                                                                                                               
  Considering merging to GR64_NOSP with %0 in %2:sub_32bit                                                                                                                                   
    RHS = %0 [16r,32r:0) 0@16r  weight:0.000000e+00                                                                                                                                          
    LHS = %2 [32r,112r:0) 0@32r  weight:0.000000e+00                                                                                                                                         
    merge %2:0@32r into %0:0@16r --> @16r                                                                                                                                                    
    merge %2:0@32r into %0:0@16r --> @16r                                                                                                                                                    
    joined lanes: 000000000000000F [16r,112r:0) 0@16r                                                                                                                                        
    Expecting instruction removal at 32r                                                                                                                                                     
    erased: 32r %2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit
    updated: 16B  undef %2.sub_32bit:gr64_nosp = MOV32rm undef %1:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)                                                       
  Success: %0:sub_32bit -> %2                                                                                                                                                                
  Result = %2 [16r,112r:0) 0@16r  L000000000000000F [16r,112r:0) 0@16r  weight:0.000000e+00

@stefanp-ibm
Contributor Author

%2 [32r,112r:0) 0@32r L000000000000000F [16r,112r:0) 0@16r weight:0.000000e+00

What were we generating previously?

Also, shouldn't it be L000000000000000F [16r,32r:0) 0@16r, i.e., 32r instead of 112r?

@qcolombet

I'm sorry the line above is a typo. The range should be 16r to 112r.

Before we added the fix we would have:

32B	%2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit
	Considering merging to GR64_NOSP with %0 in %2:sub_32bit
		RHS = %0 [16r,32r:0) 0@16r  weight:0.000000e+00
		LHS = %2 [32r,112r:0) 0@32r  weight:0.000000e+00
		merge %2:0@32r into %0:0@16r --> @16r
		erased:	32r	%2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit
AllocationOrder(GR64) = [ $rax $rcx $rdx $rsi $rdi $r8 $r9 $r10 $r11 $rbx $r14 $r15 $r12 $r13 $rbp ]
AllocationOrder(GR64_NOSP) = [ $rax $rcx $rdx $rsi $rdi $r8 $r9 $r10 $r11 $rbx $r14 $r15 $r12 $r13 $rbp ]
		updated: 16B	undef %2.sub_32bit:gr64_nosp = MOV32rm undef %1:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
	Success: %0:sub_32bit -> %2
	Result = %2 [16r,112r:0) 0@16r  weight:0.000000e+00

So, basically, the result is %2 [16r,112r:0) 0@16r weight:0.000000e+00.
When this is verified with -verify-coalescing, we get the following error:

*** Bad machine code: Live interval for subreg operand has no subranges ***
- function:    test1
- basic block: %bb.0  (0xf2913e5e220) [0B;208B)
- instruction: 16B	undef %2.sub_32bit:gr64_nosp = MOV32rm undef %1:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
- operand 0:   undef %2.sub_32bit:gr64_nosp

After we add the fix we get the following:

32B	%2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit
	Considering merging to GR64_NOSP with %0 in %2:sub_32bit
		RHS = %0 [16r,32r:0) 0@16r  weight:0.000000e+00
		LHS = %2 [32r,112r:0) 0@32r  weight:0.000000e+00
		merge %2:0@32r into %0:0@16r --> @16r
		merge %2:0@32r into %0:0@16r --> @16r
		joined lanes: 000000000000000F [16r,112r:0) 0@16r
		Expecting instruction removal at 32r
		erased:	32r	%2:gr64_nosp = SUBREG_TO_REG 0, %0:gr32, %subreg.sub_32bit
AllocationOrder(GR64) = [ $rax $rcx $rdx $rsi $rdi $r8 $r9 $r10 $r11 $rbx $r14 $r15 $r12 $r13 $rbp ]
AllocationOrder(GR64_NOSP) = [ $rax $rcx $rdx $rsi $rdi $r8 $r9 $r10 $r11 $rbx $r14 $r15 $r12 $r13 $rbp ]
		updated: 16B	undef %2.sub_32bit:gr64_nosp = MOV32rm undef %1:gr64, 1, $noreg, 0, $noreg :: (volatile load (s32) from `ptr undef`)
	Success: %0:sub_32bit -> %2
	Result = %2 [16r,112r:0) 0@16r  L000000000000000F [16r,112r:0) 0@16r  weight:0.000000e+00

So the result is %2 [16r,112r:0) 0@16r L000000000000000F [16r,112r:0) 0@16r weight:0.000000e+00 and in this case the verification doesn't find any issues with it.
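
In API terms, the difference comes down to whether the joined LiveInterval carries any subranges at all. As a rough illustration only (a sketch against the public LiveInterval API, not the MachineVerifier's actual check):

```
#include "llvm/CodeGen/LiveInterval.h"
#include "llvm/MC/LaneBitmask.h"
#include "llvm/Support/raw_ostream.h"

// Sketch: report whether a coalesced interval carries per-lane subranges.
// Before the fix, the interval for %2 has none, which is what the
// "Live interval for subreg operand has no subranges" error is about;
// after the fix it has one subrange with lane mask 000000000000000F.
static void reportSubRanges(const llvm::LiveInterval &LI) {
  if (!LI.hasSubRanges()) {
    llvm::errs() << "no subranges\n";
    return;
  }
  for (const llvm::LiveInterval::SubRange &SR : LI.subranges())
    llvm::errs() << "subrange with lane mask "
                 << llvm::PrintLaneMask(SR.LaneMask) << '\n';
}
```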

@stefanp-ibm
Contributor Author

%2 [32r,112r:0) 0@32r L000000000000000F [16r,112r:0) 0@16r weight:0.000000e+00

Is it a typo? The main live range should not be shorter than the subrange. I tried this patch; it looks like it should be Before coalescing

Yes, this is a typo and I will fix the description. As you mentioned the result should be:

%2 [16r,112r:0) 0@16r  L000000000000000F [16r,112r:0) 0@16r  weight:0.000000e+00

@stefanp-ibm
Contributor Author

I think we should pre-commit the X86 test case to show where failure occurs.

I would like to do this, but the test actually hits an assert, so we don't generate any code for the X86 test unless I remove the -verify-coalescing option, which then shows the actual failure.

@stefanp-ibm
Contributor Author

Gentle ping.

@bzEq bzEq requested review from jayfoad and arsenm July 18, 2024 03:25
@stefanp-ibm stefanp-ibm force-pushed the stefanp/RegCoalesceFix branch 2 times, most recently from 303826f to 6c707ed Compare July 18, 2024 19:40
@@ -0,0 +1,37 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
Collaborator

Pre-commit this case to show where MachineVerifier complains when specifying -verify-coalescing -enable-subreg-liveness=true?

@bzEq
Collaborator

bzEq commented Jul 23, 2024

I would like to do this but the test actually hits an assert so we don't actually generate any code for the X86 test unless I remove the -verify-coalescing option which shows the actual failure.

We can include error messages in the pre-commit case to show where the problem is. Something like

not --crash llc .... | FileCheck ...
; CHECK: <error information>

I think it's OK that we don't have code generated for an expected failing case.

Contributor

@arsenm arsenm left a comment

lgtm with nit

@@ -0,0 +1,34 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple powerpc64le-unknown-linux-gnu -mcpu=pwr8 -x mir < %s \
Contributor

Suggested change
# RUN: llc -mtriple powerpc64le-unknown-linux-gnu -mcpu=pwr8 -x mir < %s \
# RUN: llc -mtriple powerpc64le-unknown-linux-gnu -mcpu=pwr8 %s \

; CHECK-NEXT: BLR8 implicit $lr8, implicit $rm, implicit $v2
%0:g8rc_and_g8rc_nox0 = COPY $x3
%1:f8rc, %2:g8rc_and_g8rc_nox0 = LFSUX %0, %0
%3:f4rc = FRSP killed %1, implicit $rm
Contributor

Compact the register numbers, can do this with -run-pass=none

Contributor Author

I did do this but the numbers didn't change. I think it is because both %1 and %2 are defined on line 27.

@@ -0,0 +1,37 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple x86_64-unknown-unknown -x mir < %s \
Contributor

Suggested change
# RUN: llc -mtriple x86_64-unknown-unknown -x mir < %s \
# RUN: llc -mtriple x86_64-unknown-unknown %s \

@@ -0,0 +1,37 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple x86_64-unknown-unknown -x mir < %s \
# RUN: -verify-coalescing -enable-subreg-liveness=true \
Contributor

Suggested change
# RUN: -verify-coalescing -enable-subreg-liveness=true \
# RUN: -verify-coalescing -enable-subreg-liveness \

Collaborator

@qcolombet qcolombet left a comment

Thanks for the update.

Makes sense now!

@stefanp-ibm stefanp-ibm merged commit 26fa399 into llvm:main Jul 24, 2024
4 of 7 checks passed
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
…er. (#96839)

@daltenty daltenty added this to the LLVM 19.X Release milestone Jul 29, 2024
@stefanp-ibm
Contributor Author

/cherry-pick 26fa399

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Jul 29, 2024
…er. (llvm#96839)

(cherry picked from commit 26fa399)
@llvmbot
Collaborator

llvmbot commented Jul 29, 2024

/pull-request #101071

tru pushed a commit that referenced this pull request Jul 31, 2024
…er. (#96839)

(cherry picked from commit 26fa399)
Harini0924 pushed a commit to Harini0924/llvm-project that referenced this pull request Aug 1, 2024
…er. (llvm#96839)
