AMDGPU: Move LICM after AMDGPUCodeGenPrepare

The commit that added this LICM run says its purpose is to hoist the uniform parts of the integer division expansion. That expansion is performed later, in AMDGPUCodeGenPrepare, so the run had no effect in that case. Move the run after AMDGPUCodeGenPrepare so the original test shows the improvement. This also saves a run of "Canonicalize natural loops".

It is not clear why this still appears to get a separate loop pass manager run. It also feels a bit heavy to run LICM just for divide. Is there a way to hoist the divide sequence specifically when it is expanded?
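As a rough illustration of why the ordering matters, the sketch below models an expanded division in plain C++. It is not the sequence AMDGPUCodeGenPrepare emits: the double-precision reciprocal, the one-step correction, and the names divideWithReciprocal, divideAllUnhoisted, and divideAllHoisted are all assumptions made for this example. The point is only that the divisor-dependent work becomes visible, and therefore hoistable, once the division has been expanded.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for an expanded unsigned division: multiply by a
// floating-point reciprocal, then correct the estimate by at most one.
// Hypothetical helper; NOT the sequence AMDGPUCodeGenPrepare emits.
static uint32_t divideWithReciprocal(uint32_t X, uint32_t D, double Recip) {
  uint32_t Q = static_cast<uint32_t>(static_cast<double>(X) * Recip);
  uint64_t Prod = static_cast<uint64_t>(Q) * D;
  if (Prod > X)
    --Q; // estimate was one too large
  else if (X - Prod >= D)
    ++Q; // estimate was one too small
  return Q;
}

// Shape of the loop right after expansion: the reciprocal depends only on
// the loop-invariant divisor D, yet it is recomputed on every iteration.
std::vector<uint32_t> divideAllUnhoisted(const std::vector<uint32_t> &Xs,
                                         uint32_t D) {
  assert(D != 0 && "division by zero");
  std::vector<uint32_t> Out;
  Out.reserve(Xs.size());
  for (uint32_t X : Xs) {
    double Recip = 1.0 / static_cast<double>(D); // loop invariant
    Out.push_back(divideWithReciprocal(X, D, Recip));
  }
  return Out;
}

// Shape of the loop after a LICM-style transform: the invariant reciprocal
// is computed once before the loop, and only X-dependent work stays inside.
std::vector<uint32_t> divideAllHoisted(const std::vector<uint32_t> &Xs,
                                       uint32_t D) {
  assert(D != 0 && "division by zero");
  const double Recip = 1.0 / static_cast<double>(D); // hoisted
  std::vector<uint32_t> Out;
  Out.reserve(Xs.size());
  for (uint32_t X : Xs)
    Out.push_back(divideWithReciprocal(X, D, Recip));
  return Out;
}
```

Before expansion, the loop body contains a single opaque division, so a hoisting pass that runs at that point has no separate divisor-only computation to move. Both functions return the same quotients; only the placement of the reciprocal differs.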
stepthomas · Jun 10, 2023 · 5b657f5
1 parent 8fdedcd
commit 5b657f5
Showing 9 changed files with 438 additions and 534 deletions.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -967,7 +967,6 @@ void AMDGPUPassConfig::addEarlyCSEOrGVNPass() {
 }
 
 void AMDGPUPassConfig::addStraightLineScalarOptimizationPasses() {
-  addPass(createLICMPass());
   addPass(createSeparateConstOffsetFromGEPPass());
   // ReassociateGEPs exposes more opportunities for SLSR. See
   // the example in reassociate-geps-and-slsr.ll.
@@ -1039,6 +1038,11 @@ void AMDGPUPassConfig::addIRPasses() {
       // TODO: May want to move later or split into an early and late one.
       addPass(createAMDGPUCodeGenPreparePass());
     }
+
+    // Try to hoist loop invariant parts of divisions AMDGPUCodeGenPrepare may
+    // have expanded.
+    if (TM.getOptLevel() > CodeGenOpt::Less)
+      addPass(createLICMPass());
   }
 
   TargetPassConfig::addIRPasses();