Skip to content

Conversation

@asb
Copy link
Contributor

@asb asb commented Jul 20, 2025

This refactor was suggested in
#144703.

I have checked for unexpected changes by comparing builds of llvm-test-suite with/without this refactor, including with preferWInst force enabled.


After this, the logic changes for #144703 can be implemented with just:

--- a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+++ b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
@@ -736,7 +736,8 @@ bool RISCVOptWInstrs::canonicalizeWSuffixes(MachineFunction &MF,
     for (MachineInstr &MI : MBB) {
       std::optional<unsigned> WOpc;
       std::optional<unsigned> NonWOpc;
-      switch (MI.getOpcode()) {
+      unsigned OrigOpc = MI.getOpcode();
+      switch (OrigOpc) {
       default:
         continue;
       case RISCV::ADDW:
@@ -786,7 +787,8 @@ bool RISCVOptWInstrs::canonicalizeWSuffixes(MachineFunction &MF,
         MadeChange = true;
         continue;
       }
-      if (ShouldPreferW && WOpc.has_value() && hasAllWUsers(MI, ST, MRI)) {
+      if ((ShouldPreferW || OrigOpc == RISCV::LWU) && WOpc.has_value() &&
+          hasAllWUsers(MI, ST, MRI)) {
         LLVM_DEBUG(dbgs() << "Replacing " << MI);
         MI.setDesc(TII.get(WOpc.value()));
         MI.clearFlag(MachineInstr::MIFlag::NoSWrap);

…es into canonicalizeWSuffixes

This refactor was suggested in
<llvm#144703>.

I have checked for unexpected changes by comparing builds of
llvm-test-suite with/without this refactor, including with preferWInst
force enabled.
@llvmbot
Copy link
Member

llvmbot commented Jul 20, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Alex Bradbury (asb)

Changes

This refactor was suggested in
<#144703>.

I have checked for unexpected changes by comparing builds of llvm-test-suite with/without this refactor, including with preferWInst force enabled.


After this, the logic changes for #144703 can be implemented with just:

--- a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+++ b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
@@ -736,7 +736,8 @@ bool RISCVOptWInstrs::canonicalizeWSuffixes(MachineFunction &amp;MF,
     for (MachineInstr &amp;MI : MBB) {
       std::optional&lt;unsigned&gt; WOpc;
       std::optional&lt;unsigned&gt; NonWOpc;
-      switch (MI.getOpcode()) {
+      unsigned OrigOpc = MI.getOpcode();
+      switch (OrigOpc) {
       default:
         continue;
       case RISCV::ADDW:
@@ -786,7 +787,8 @@ bool RISCVOptWInstrs::canonicalizeWSuffixes(MachineFunction &amp;MF,
         MadeChange = true;
         continue;
       }
-      if (ShouldPreferW &amp;&amp; WOpc.has_value() &amp;&amp; hasAllWUsers(MI, ST, MRI)) {
+      if ((ShouldPreferW || OrigOpc == RISCV::LWU) &amp;&amp; WOpc.has_value() &amp;&amp;
+          hasAllWUsers(MI, ST, MRI)) {
         LLVM_DEBUG(dbgs() &lt;&lt; "Replacing " &lt;&lt; MI);
         MI.setDesc(TII.get(WOpc.value()));
         MI.clearFlag(MachineInstr::MIFlag::NoSWrap);

Full diff: https://github.com/llvm/llvm-project/pull/149710.diff

1 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp (+41-53)
diff --git a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
index 09a31fb2306de..35e0f733062f0 100644
--- a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+++ b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
@@ -69,10 +69,9 @@ class RISCVOptWInstrs : public MachineFunctionPass {
   bool runOnMachineFunction(MachineFunction &MF) override;
   bool removeSExtWInstrs(MachineFunction &MF, const RISCVInstrInfo &TII,
                          const RISCVSubtarget &ST, MachineRegisterInfo &MRI);
-  bool stripWSuffixes(MachineFunction &MF, const RISCVInstrInfo &TII,
-                      const RISCVSubtarget &ST, MachineRegisterInfo &MRI);
-  bool appendWSuffixes(MachineFunction &MF, const RISCVInstrInfo &TII,
-                       const RISCVSubtarget &ST, MachineRegisterInfo &MRI);
+  bool canonicalizeWSuffixes(MachineFunction &MF, const RISCVInstrInfo &TII,
+                             const RISCVSubtarget &ST,
+                             MachineRegisterInfo &MRI);
 
   void getAnalysisUsage(AnalysisUsage &AU) const override {
     AU.setPreservesCFG();
@@ -723,51 +722,38 @@ bool RISCVOptWInstrs::removeSExtWInstrs(MachineFunction &MF,
   return MadeChange;
 }
 
-bool RISCVOptWInstrs::stripWSuffixes(MachineFunction &MF,
-                                     const RISCVInstrInfo &TII,
-                                     const RISCVSubtarget &ST,
-                                     MachineRegisterInfo &MRI) {
+// Strips or adds W suffixes to eligible instructions depending on the
+// subtarget preferences.
+bool RISCVOptWInstrs::canonicalizeWSuffixes(MachineFunction &MF,
+                                            const RISCVInstrInfo &TII,
+                                            const RISCVSubtarget &ST,
+                                            MachineRegisterInfo &MRI) {
+  bool ShouldStripW = !(DisableStripWSuffix || ST.preferWInst());
+  bool ShouldPreferW = ST.preferWInst();
   bool MadeChange = false;
-  for (MachineBasicBlock &MBB : MF) {
-    for (MachineInstr &MI : MBB) {
-      unsigned Opc;
-      // clang-format off
-      switch (MI.getOpcode()) {
-      default:
-        continue;
-      case RISCV::ADDW:  Opc = RISCV::ADD;  break;
-      case RISCV::ADDIW: Opc = RISCV::ADDI; break;
-      case RISCV::MULW:  Opc = RISCV::MUL;  break;
-      case RISCV::SLLIW: Opc = RISCV::SLLI; break;
-      case RISCV::SUBW:  Opc = RISCV::SUB;  break;
-      }
-      // clang-format on
 
-      if (hasAllWUsers(MI, ST, MRI)) {
-        LLVM_DEBUG(dbgs() << "Replacing " << MI);
-        MI.setDesc(TII.get(Opc));
-        LLVM_DEBUG(dbgs() << "     with " << MI);
-        ++NumTransformedToNonWInstrs;
-        MadeChange = true;
-      }
-    }
-  }
-
-  return MadeChange;
-}
-
-bool RISCVOptWInstrs::appendWSuffixes(MachineFunction &MF,
-                                      const RISCVInstrInfo &TII,
-                                      const RISCVSubtarget &ST,
-                                      MachineRegisterInfo &MRI) {
-  bool MadeChange = false;
   for (MachineBasicBlock &MBB : MF) {
     for (MachineInstr &MI : MBB) {
-      unsigned WOpc;
-      // TODO: Add more?
+      std::optional<unsigned> WOpc;
+      std::optional<unsigned> NonWOpc;
       switch (MI.getOpcode()) {
       default:
         continue;
+      case RISCV::ADDW:
+        NonWOpc = RISCV::ADD;
+        break;
+      case RISCV::ADDIW:
+        NonWOpc = RISCV::ADDI;
+        break;
+      case RISCV::MULW:
+        NonWOpc = RISCV::MUL;
+        break;
+      case RISCV::SLLIW:
+        NonWOpc = RISCV::SLLI;
+        break;
+      case RISCV::SUBW:
+        NonWOpc = RISCV::SUB;
+        break;
       case RISCV::ADD:
         WOpc = RISCV::ADDW;
         break;
@@ -781,7 +767,7 @@ bool RISCVOptWInstrs::appendWSuffixes(MachineFunction &MF,
         WOpc = RISCV::MULW;
         break;
       case RISCV::SLLI:
-        // SLLIW reads the lowest 5 bits, while SLLI reads lowest 6 bits
+        // SLLIW reads the lowest 5 bits, while SLLI reads lowest 6 bits.
         if (MI.getOperand(2).getImm() >= 32)
           continue;
         WOpc = RISCV::SLLIW;
@@ -792,19 +778,27 @@ bool RISCVOptWInstrs::appendWSuffixes(MachineFunction &MF,
         break;
       }
 
-      if (hasAllWUsers(MI, ST, MRI)) {
+      if (ShouldStripW && NonWOpc.has_value() && hasAllWUsers(MI, ST, MRI)) {
         LLVM_DEBUG(dbgs() << "Replacing " << MI);
-        MI.setDesc(TII.get(WOpc));
+        MI.setDesc(TII.get(NonWOpc.value()));
+        LLVM_DEBUG(dbgs() << "     with " << MI);
+        ++NumTransformedToNonWInstrs;
+        MadeChange = true;
+        continue;
+      }
+      if (ShouldPreferW && WOpc.has_value() && hasAllWUsers(MI, ST, MRI)) {
+        LLVM_DEBUG(dbgs() << "Replacing " << MI);
+        MI.setDesc(TII.get(WOpc.value()));
         MI.clearFlag(MachineInstr::MIFlag::NoSWrap);
         MI.clearFlag(MachineInstr::MIFlag::NoUWrap);
         MI.clearFlag(MachineInstr::MIFlag::IsExact);
         LLVM_DEBUG(dbgs() << "     with " << MI);
         ++NumTransformedToWInstrs;
         MadeChange = true;
+        continue;
       }
     }
   }
-
   return MadeChange;
 }
 
@@ -821,12 +815,6 @@ bool RISCVOptWInstrs::runOnMachineFunction(MachineFunction &MF) {
 
   bool MadeChange = false;
   MadeChange |= removeSExtWInstrs(MF, TII, ST, MRI);
-
-  if (!(DisableStripWSuffix || ST.preferWInst()))
-    MadeChange |= stripWSuffixes(MF, TII, ST, MRI);
-
-  if (ST.preferWInst())
-    MadeChange |= appendWSuffixes(MF, TII, ST, MRI);
-
+  MadeChange |= canonicalizeWSuffixes(MF, TII, ST, MRI);
   return MadeChange;
 }

@asb asb changed the title [RISCV][NFC] Combine RISCVOpwInstrs::stripWSuffixes and appendWSuffixes into canonicalizeWSuffixes [RISCV][NFC] Combine RISCVOptWInstrs::stripWSuffixes and appendWSuffixes into canonicalizeWSuffixes Jul 20, 2025
Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@asb asb merged commit 9311f38 into llvm:main Jul 21, 2025
11 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 21, 2025

LLVM Buildbot has detected a new failure on builder clang-hip-vega20 running on hip-vega20-0 while building llvm at step 3 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/23763

Here is the relevant piece of the build log for the reference
Step 3 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/hip-build.sh --jobs=' (failure)
...
[57/59] Linking CXX executable External/HIP/math_h-hip-6.3.0
[58/59] Building CXX object External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o
[59/59] Linking CXX executable External/HIP/TheNextWeek-hip-6.3.0
@@@BUILD_STEP Testing HIP test-suite@@@
+ build_step 'Testing HIP test-suite'
+ echo '@@@BUILD_STEP Testing HIP test-suite@@@'
+ ninja check-hip-simple
[0/1] cd /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP && /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/llvm/bin/llvm-lit -sv array-hip-6.3.0.test empty-hip-6.3.0.test with-fopenmp-hip-6.3.0.test saxpy-hip-6.3.0.test memmove-hip-6.3.0.test split-kernel-args-hip-6.3.0.test builtin-logb-scalbn-hip-6.3.0.test TheNextWeek-hip-6.3.0.test algorithm-hip-6.3.0.test cmath-hip-6.3.0.test complex-hip-6.3.0.test math_h-hip-6.3.0.test new-hip-6.3.0.test blender.test
-- Testing: 14 tests, 14 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90
FAIL: test-suite :: External/HIP/blender.test (14 of 14)
******************** TEST 'test-suite :: External/HIP/blender.test' FAILED ********************

/home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP/Output/blender.test.out --redirect-input /dev/null --summary /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP/Output/blender.test.time /bin/bash test_blender.sh
/bin/bash verify_blender.sh /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP/Output/blender.test.out
Begin Blender test.
TEST_SUITE_HIP_ROOT=/opt/botworker/llvm/External/hip
Render /opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo_release.blend
Blender 4.1.1 (hash e1743a0317bc built 2024-04-15 23:47:45)
Read blend: "/opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo_release.blend"
Could not open as Ogawa file from provided streams.
Unable to open /opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.002", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.003", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.004", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.001", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
Could not open as Ogawa file from provided streams.
Unable to open /opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.002", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.003", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.004", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.001", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
I0721 09:51:33.234767 2366584 device.cpp:39] HIPEW initialization succeeded
I0721 09:51:33.236997 2366584 device.cpp:45] Found HIPCC hipcc
I0721 09:51:33.294867 2366584 device.cpp:207] Device has compute preemption or is not used for display.
I0721 09:51:33.294893 2366584 device.cpp:211] Added device "" with id "HIP__0000:a3:00".
I0721 09:51:33.294979 2366584 device.cpp:568] Mapped host memory limit set to 536,444,985,344 bytes. (499.60G)
I0721 09:51:33.295236 2366584 device_impl.cpp:63] Using AVX2 CPU kernels.
Fra:1 Mem:524.00M (Peak 524.70M) | Time:00:00.75 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Eyepiece_rim
Fra:1 Mem:524.00M (Peak 524.70M) | Time:00:00.75 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Rivets.021
Fra:1 Mem:524.00M (Peak 524.70M) | Time:00:00.75 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Rivets.039
Fra:1 Mem:525.62M (Peak 525.62M) | Time:00:00.75 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Hoses.003
Fra:1 Mem:536.23M (Peak 536.23M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors
Fra:1 Mem:536.23M (Peak 536.23M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.001
Fra:1 Mem:536.45M (Peak 536.45M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.002
Fra:1 Mem:536.45M (Peak 536.45M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.006
Fra:1 Mem:536.56M (Peak 536.56M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.007
Step 12 (Testing HIP test-suite) failure: Testing HIP test-suite (failure)
@@@BUILD_STEP Testing HIP test-suite@@@
+ build_step 'Testing HIP test-suite'
+ echo '@@@BUILD_STEP Testing HIP test-suite@@@'
+ ninja check-hip-simple
[0/1] cd /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP && /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/llvm/bin/llvm-lit -sv array-hip-6.3.0.test empty-hip-6.3.0.test with-fopenmp-hip-6.3.0.test saxpy-hip-6.3.0.test memmove-hip-6.3.0.test split-kernel-args-hip-6.3.0.test builtin-logb-scalbn-hip-6.3.0.test TheNextWeek-hip-6.3.0.test algorithm-hip-6.3.0.test cmath-hip-6.3.0.test complex-hip-6.3.0.test math_h-hip-6.3.0.test new-hip-6.3.0.test blender.test
-- Testing: 14 tests, 14 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90
FAIL: test-suite :: External/HIP/blender.test (14 of 14)
******************** TEST 'test-suite :: External/HIP/blender.test' FAILED ********************

/home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP/Output/blender.test.out --redirect-input /dev/null --summary /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP/Output/blender.test.time /bin/bash test_blender.sh
/bin/bash verify_blender.sh /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/test-suite-build/External/HIP/Output/blender.test.out
Begin Blender test.
TEST_SUITE_HIP_ROOT=/opt/botworker/llvm/External/hip
Render /opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo_release.blend
Blender 4.1.1 (hash e1743a0317bc built 2024-04-15 23:47:45)
Read blend: "/opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo_release.blend"
Could not open as Ogawa file from provided streams.
Unable to open /opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.002", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.003", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.004", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.001", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
Could not open as Ogawa file from provided streams.
Unable to open /opt/botworker/llvm/External/hip/Blender_Scenes/290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.002", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.003", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.004", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
WARN (bke.modifier): source/blender/blenkernel/intern/modifier.cc:425 BKE_modifier_set_error: Object: "GEO-flag.001", Modifier: "MeshSequenceCache", Could not create reader for file //290skydemo2_flags.abc
I0721 09:51:33.234767 2366584 device.cpp:39] HIPEW initialization succeeded
I0721 09:51:33.236997 2366584 device.cpp:45] Found HIPCC hipcc
I0721 09:51:33.294867 2366584 device.cpp:207] Device has compute preemption or is not used for display.
I0721 09:51:33.294893 2366584 device.cpp:211] Added device "" with id "HIP__0000:a3:00".
I0721 09:51:33.294979 2366584 device.cpp:568] Mapped host memory limit set to 536,444,985,344 bytes. (499.60G)
I0721 09:51:33.295236 2366584 device_impl.cpp:63] Using AVX2 CPU kernels.
Fra:1 Mem:524.00M (Peak 524.70M) | Time:00:00.75 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Eyepiece_rim
Fra:1 Mem:524.00M (Peak 524.70M) | Time:00:00.75 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Rivets.021
Fra:1 Mem:524.00M (Peak 524.70M) | Time:00:00.75 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Rivets.039
Fra:1 Mem:525.62M (Peak 525.62M) | Time:00:00.75 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Hoses.003
Fra:1 Mem:536.23M (Peak 536.23M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors
Fra:1 Mem:536.23M (Peak 536.23M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.001
Fra:1 Mem:536.45M (Peak 536.45M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.002
Fra:1 Mem:536.45M (Peak 536.45M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.006
Fra:1 Mem:536.56M (Peak 536.56M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.007
Fra:1 Mem:537.37M (Peak 537.37M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.011
Fra:1 Mem:538.70M (Peak 538.70M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Wires
Fra:1 Mem:539.39M (Peak 539.39M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | GEO-Curve_Connectors.012
Fra:1 Mem:540.85M (Peak 540.84M) | Time:00:00.76 | Mem:0.00M, Peak:0.00M | Scene, View Layer | Synchronizing object | Cylinder.029

asb added a commit that referenced this pull request Jul 21, 2025
After the refactoring in #149710 the logic change is trivial.

Motivation for preferring sign-extended 32-bit loads (LW) vs
zero-extended (LWU):
* LW is compressible while LWU is not.
* Helps to minimise the diff vs RV32 (e.g. LWU vs LW)
* Helps to minimise distracting diffs vs GCC. I see this come up
frequently when comparing GCC code and in these cases it's a red
herring.

Similar normalisation could be done for LHU and LH, but this is less
well motivated as there is a compressed LHU (and if performing the
change in RISCVOptWInstrs it wouldn't be done for RV32). There is a
compressed LBU but not LB, meaning doing a similar normalisation for
byte-sized loads would actually be a regression in terms of code size.
Load narrowing when allowed by hasAllNBitUsers isn't explored in this
patch.

This changes ~20500 instructions in an RVA22 build of the
llvm-test-suite including SPEC 2017. As part of the review, the option
of doing the change at ISel time was explored but was found to be less
effective.
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
…xes into canonicalizeWSuffixes (llvm#149710)

This refactor was suggested in
<llvm#144703>.

I have checked for unexpected changes by comparing builds of
llvm-test-suite with/without this refactor, including with preferWInst
force enabled.
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
After the refactoring in llvm#149710 the logic change is trivial.

Motivation for preferring sign-extended 32-bit loads (LW) vs
zero-extended (LWU):
* LW is compressible while LWU is not.
* Helps to minimise the diff vs RV32 (e.g. LWU vs LW)
* Helps to minimise distracting diffs vs GCC. I see this come up
frequently when comparing GCC code and in these cases it's a red
herring.

Similar normalisation could be done for LHU and LH, but this is less
well motivated as there is a compressed LHU (and if performing the
change in RISCVOptWInstrs it wouldn't be done for RV32). There is a
compressed LBU but not LB, meaning doing a similar normalisation for
byte-sized loads would actually be a regression in terms of code size.
Load narrowing when allowed by hasAllNBitUsers isn't explored in this
patch.

This changes ~20500 instructions in an RVA22 build of the
llvm-test-suite including SPEC 2017. As part of the review, the option
of doing the change at ISel time was explored but was found to be less
effective.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants