Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EntryExitInstrumenter] Move passes out of clang into LLVM default pipelines #92171

Merged
merged 20 commits into from
May 31, 2024

Conversation

pasko
Copy link
Contributor

@pasko pasko commented May 14, 2024

Move EntryExitInstrumenter(PostInlining=true) to as late as possible and
EntryExitInstrumenter(PostInlining=false) to an early pre-inlining stage
(but skip for ThinLTO post-link).

This should fix the issues reported in
rust-lang/rust#92109 and
#52853. These are caused
by https://reviews.llvm.org/D97608.

This change is not ready for landing yet.

Move EntryExitInstrumenter(PostInlining=true) to as late as possible and
EntryExitInstrumenter(PostInlining=false) to an early pre-inlining stage
(but skip for ThinLTO post-link).
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen llvm:transforms labels May 14, 2024
@llvmbot
Copy link
Member

llvmbot commented May 14, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Egor Pasko (pasko)

Changes

This change is not ready for landing yet.

Move EntryExitInstrumenter(PostInlining=true) to as late as possible and EntryExitInstrumenter(PostInlining=false) to an early pre-inlining stage (but skip for ThinLTO post-link).


Full diff: https://github.com/llvm/llvm-project/pull/92171.diff

10 Files Affected:

  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+5-16)
  • (modified) llvm/include/llvm/InitializePasses.h (+2)
  • (modified) llvm/include/llvm/LinkAllPasses.h (+2)
  • (modified) llvm/include/llvm/Passes/CodeGenPassBuilder.h (-3)
  • (modified) llvm/include/llvm/Transforms/Utils.h (+9)
  • (modified) llvm/lib/CodeGen/TargetPassConfig.cpp (+3)
  • (modified) llvm/lib/Transforms/Scalar/Scalar.cpp (+2)
  • (modified) llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp (+62)
  • (modified) llvm/tools/llc/llc.cpp (+2)
  • (modified) llvm/tools/opt/optdriver.cpp (+2)
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 22c3f8642ad8e..f6ff605bcef87 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -987,22 +987,6 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
                                            /*DropTypeTests=*/true));
           });
 
-    if (CodeGenOpts.InstrumentFunctions ||
-        CodeGenOpts.InstrumentFunctionEntryBare ||
-        CodeGenOpts.InstrumentFunctionsAfterInlining ||
-        CodeGenOpts.InstrumentForProfiling) {
-      PB.registerPipelineStartEPCallback(
-          [](ModulePassManager &MPM, OptimizationLevel Level) {
-            MPM.addPass(createModuleToFunctionPassAdaptor(
-                EntryExitInstrumenterPass(/*PostInlining=*/false)));
-          });
-      PB.registerOptimizerLastEPCallback(
-          [](ModulePassManager &MPM, OptimizationLevel Level) {
-            MPM.addPass(createModuleToFunctionPassAdaptor(
-                EntryExitInstrumenterPass(/*PostInlining=*/true)));
-          });
-    }
-
     // Register callbacks to schedule sanitizer passes at the appropriate part
     // of the pipeline.
     if (LangOpts.Sanitize.has(SanitizerKind::LocalBounds))
@@ -1016,6 +1000,11 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
     if (!IsThinLTOPostLink) {
       addSanitizers(TargetTriple, CodeGenOpts, LangOpts, PB);
       addKCFIPass(TargetTriple, LangOpts, PB);
+      PB.registerPipelineStartEPCallback(
+          [](ModulePassManager &MPM, OptimizationLevel Level) {
+            MPM.addPass(createModuleToFunctionPassAdaptor(
+                EntryExitInstrumenterPass(/*PostInlining=*/false)));
+          });
     }
 
     if (std::optional<GCOVOptions> Options =
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index 9ba75d491c1c9..e970706e8202d 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -101,6 +101,7 @@ void initializeEarlyMachineLICMPass(PassRegistry&);
 void initializeEarlyTailDuplicatePass(PassRegistry&);
 void initializeEdgeBundlesPass(PassRegistry&);
 void initializeEHContGuardCatchretPass(PassRegistry &);
+void initializeEntryExitInstrumenterPass(PassRegistry&);
 void initializeExpandLargeFpConvertLegacyPassPass(PassRegistry&);
 void initializeExpandLargeDivRemLegacyPassPass(PassRegistry&);
 void initializeExpandMemCmpLegacyPassPass(PassRegistry &);
@@ -230,6 +231,7 @@ void initializePostDomOnlyViewerWrapperPassPass(PassRegistry &);
 void initializePostDomPrinterWrapperPassPass(PassRegistry &);
 void initializePostDomViewerWrapperPassPass(PassRegistry &);
 void initializePostDominatorTreeWrapperPassPass(PassRegistry&);
+void initializePostInlineEntryExitInstrumenterPass(PassRegistry&);
 void initializePostMachineSchedulerPass(PassRegistry&);
 void initializePostRAHazardRecognizerPass(PassRegistry&);
 void initializePostRAMachineSinkingPass(PassRegistry&);
diff --git a/llvm/include/llvm/LinkAllPasses.h b/llvm/include/llvm/LinkAllPasses.h
index 30e7c22f31460..b432040b052a9 100644
--- a/llvm/include/llvm/LinkAllPasses.h
+++ b/llvm/include/llvm/LinkAllPasses.h
@@ -113,6 +113,8 @@ namespace {
       (void)llvm::createTLSVariableHoistPass();
       (void) llvm::createConstantHoistingPass();
       (void)llvm::createCodeGenPrepareLegacyPass();
+      (void) llvm::createEntryExitInstrumenterPass();
+      (void) llvm::createPostInlineEntryExitInstrumenterPass();
       (void) llvm::createEarlyCSEPass();
       (void) llvm::createGVNPass();
       (void) llvm::createPostDomTree();
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 2e94a19502131..b34f6e82fb7be 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -670,9 +670,6 @@ void CodeGenPassBuilder<Derived, TargetMachineT>::addIRPasses(
       !Opt.DisablePartialLibcallInlining)
     addPass(PartiallyInlineLibCallsPass());
 
-  // Instrument function entry and exit, e.g. with calls to mcount().
-  addPass(EntryExitInstrumenterPass(/*PostInlining=*/true));
-
   // Add scalarization of target's unsupported masked memory intrinsics pass.
   // the unsupported intrinsic will be replaced with a chain of basic blocks,
   // that stores/loads element one-by-one if the appropriate mask bit is set.
diff --git a/llvm/include/llvm/Transforms/Utils.h b/llvm/include/llvm/Transforms/Utils.h
index c6a6a05f3fddb..c3cb2908c8049 100644
--- a/llvm/include/llvm/Transforms/Utils.h
+++ b/llvm/include/llvm/Transforms/Utils.h
@@ -36,6 +36,15 @@ extern char &LowerInvokePassID;
 FunctionPass *createLowerSwitchPass();
 extern char &LowerSwitchID;
 
+//===----------------------------------------------------------------------===//
+//
+// EntryExitInstrumenter pass - Instrument function entry/exit with calls to
+// mcount(), @__cyg_profile_func_{enter,exit} and the like. There are two
+// variants, intended to run pre- and post-inlining, respectively.
+//
+FunctionPass *createEntryExitInstrumenterPass();
+FunctionPass *createPostInlineEntryExitInstrumenterPass();
+
 //===----------------------------------------------------------------------===//
 //
 // BreakCriticalEdges - Break all of the critical edges in the CFG by inserting
diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index 8832b51333d91..6f19db623fe33 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -871,6 +871,9 @@ void TargetPassConfig::addIRPasses() {
   // passes since it emits those kinds of intrinsics.
   addPass(createExpandVectorPredicationPass());
 
+  // TODO: comment
+  addPass(createPostInlineEntryExitInstrumenterPass());
+
   // Add scalarization of target's unsupported masked memory intrinsics pass.
   // the unsupported intrinsic will be replaced with a chain of basic blocks,
   // that stores/loads element one-by-one if the appropriate mask bit is set.
diff --git a/llvm/lib/Transforms/Scalar/Scalar.cpp b/llvm/lib/Transforms/Scalar/Scalar.cpp
index 400b15284c1b8..cb37579f25051 100644
--- a/llvm/lib/Transforms/Scalar/Scalar.cpp
+++ b/llvm/lib/Transforms/Scalar/Scalar.cpp
@@ -48,4 +48,6 @@ void llvm::initializeScalarOpts(PassRegistry &Registry) {
   initializeSpeculativeExecutionLegacyPassPass(Registry);
   initializeStraightLineStrengthReduceLegacyPassPass(Registry);
   initializePlaceBackedgeSafepointsLegacyPassPass(Registry);
+  initializeEntryExitInstrumenterPass(Registry);
+  initializePostInlineEntryExitInstrumenterPass(Registry);
 }
diff --git a/llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp b/llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp
index 59a7dd1a00ed4..6326f2231f1dd 100644
--- a/llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp
+++ b/llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp
@@ -15,7 +15,10 @@
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Type.h"
+#include "llvm/InitializePasses.h"
 #include "llvm/TargetParser/Triple.h"
+#include "llvm/Pass.h"
+#include "llvm/Transforms/Utils.h"
 
 using namespace llvm;
 
@@ -135,6 +138,65 @@ static bool runOnFunction(Function &F, bool PostInlining) {
   return Changed;
 }
 
+namespace {
+struct EntryExitInstrumenter : public FunctionPass {
+  static char ID;
+  EntryExitInstrumenter() : FunctionPass(ID) {
+    initializeEntryExitInstrumenterPass(*PassRegistry::getPassRegistry());
+  }
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.addPreserved<GlobalsAAWrapperPass>();
+    AU.addPreserved<DominatorTreeWrapperPass>();
+  }
+  bool runOnFunction(Function &F) override { return ::runOnFunction(F, false); }
+};
+char EntryExitInstrumenter::ID = 0;
+
+struct PostInlineEntryExitInstrumenter : public FunctionPass {
+  static char ID;
+  PostInlineEntryExitInstrumenter() : FunctionPass(ID) {
+    initializePostInlineEntryExitInstrumenterPass(
+        *PassRegistry::getPassRegistry());
+  }
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.addPreserved<GlobalsAAWrapperPass>();
+    AU.addPreserved<DominatorTreeWrapperPass>();
+  }
+  bool runOnFunction(Function &F) override { return ::runOnFunction(F, true); }
+};
+char PostInlineEntryExitInstrumenter::ID = 0;
+}
+
+INITIALIZE_PASS_BEGIN(
+    EntryExitInstrumenter, "ee-instrument",
+    "Instrument function entry/exit with calls to e.g. mcount() (pre inlining)",
+    false, false)
+INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
+INITIALIZE_PASS_END(
+    EntryExitInstrumenter, "ee-instrument",
+    "Instrument function entry/exit with calls to e.g. mcount() (pre inlining)",
+    false, false)
+
+INITIALIZE_PASS_BEGIN(
+    PostInlineEntryExitInstrumenter, "post-inline-ee-instrument",
+    "Instrument function entry/exit with calls to e.g. mcount() "
+    "(post inlining)",
+    false, false)
+INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
+INITIALIZE_PASS_END(
+    PostInlineEntryExitInstrumenter, "post-inline-ee-instrument",
+    "Instrument function entry/exit with calls to e.g. mcount() "
+    "(post inlining)",
+    false, false)
+
+FunctionPass *llvm::createEntryExitInstrumenterPass() {
+  return new EntryExitInstrumenter();
+}
+
+FunctionPass *llvm::createPostInlineEntryExitInstrumenterPass() {
+  return new PostInlineEntryExitInstrumenter();
+}
+
 PreservedAnalyses
 llvm::EntryExitInstrumenterPass::run(Function &F, FunctionAnalysisManager &AM) {
   if (!runOnFunction(F, PostInlining))
diff --git a/llvm/tools/llc/llc.cpp b/llvm/tools/llc/llc.cpp
index b292f70ba89de..16fd7fcec2486 100644
--- a/llvm/tools/llc/llc.cpp
+++ b/llvm/tools/llc/llc.cpp
@@ -335,6 +335,8 @@ int main(int argc, char **argv) {
   initializeCodeGen(*Registry);
   initializeLoopStrengthReducePass(*Registry);
   initializeLowerIntrinsicsPass(*Registry);
+  initializeEntryExitInstrumenterPass(*Registry);
+  initializePostInlineEntryExitInstrumenterPass(*Registry);
   initializeUnreachableBlockElimLegacyPassPass(*Registry);
   initializeConstantHoistingLegacyPassPass(*Registry);
   initializeScalarOpts(*Registry);
diff --git a/llvm/tools/opt/optdriver.cpp b/llvm/tools/opt/optdriver.cpp
index 948148bb80498..dc15329515718 100644
--- a/llvm/tools/opt/optdriver.cpp
+++ b/llvm/tools/opt/optdriver.cpp
@@ -439,6 +439,8 @@ extern "C" int optMain(
   initializeIndirectBrExpandLegacyPassPass(Registry);
   initializeInterleavedLoadCombinePass(Registry);
   initializeInterleavedAccessPass(Registry);
+  initializeEntryExitInstrumenterPass(Registry);
+  initializePostInlineEntryExitInstrumenterPass(Registry);
   initializeUnreachableBlockElimLegacyPassPass(Registry);
   initializeExpandReductionsPass(Registry);
   initializeExpandVectorPredicationPass(Registry);

@pasko pasko marked this pull request as draft May 14, 2024 20:39
clang/lib/CodeGen/BackendUtil.cpp Outdated Show resolved Hide resolved
llvm/include/llvm/Passes/CodeGenPassBuilder.h Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp Outdated Show resolved Hide resolved
@aeubanks
Copy link
Contributor

can you add links to https://reviews.llvm.org/D97608, rust-lang/rust#92109, #52853

@pasko
Copy link
Contributor Author

pasko commented May 15, 2024

can you add links to https://reviews.llvm.org/D97608, rust-lang/rust#92109, #52853

Done (Updated the toplevel comment on the PR)

@pasko
Copy link
Contributor Author

pasko commented May 15, 2024

@aeubanks Thanks for the first set of comments. I think I addressed them all and checked that instrumentation is still WAI in my ThinkLTO example. I did not run/update tests yet, this is TBD. Another round of review pre-tests would be appreciated.

@pasko pasko requested a review from aeubanks May 15, 2024 12:20
Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good to me

for testing the pre-inliner one, we should add some tests in llvm/test/Transforms/EntryExitInstrumenter/ that invoke things like opt -passes='default<O1>', opt -passes='thinlto-pre-link<O2>', opt -passes='thinlto<O3>' to make sure that the pass did/didn't insert the call given some IR with the appropriate function attribute

for testing the post-inliner one, an llc x86-64 test that checks that a call to the function was generated in the assembly given some IR with the appropriate function attribute is enough

llvm/include/llvm/InitializePasses.h Outdated Show resolved Hide resolved
llvm/lib/Passes/PassBuilderPipelines.cpp Outdated Show resolved Hide resolved
pasko added 2 commits May 16, 2024 14:59
A few more tests to go, still a WIP.

Change a number of existing tests to swallow the newly inserted pass
name.

Add two new tests:
* CodeGen/X86/instrument-function-inlined.ll
* llvm/test/Transforms/EntryExitInstrumenter/pre-inliner-instrumentation.ll
@pasko
Copy link
Contributor Author

pasko commented May 21, 2024

Hello @aeubanks,

I added a couple of new tests as you suggested. I do not have good intuition on their coverage. I think I will need to add a few cases, like always_inline or slightly more complicated code. Can you please take a look at the general structure? Does it roughly match your proposal above?

I would appreciate if you leave comments with specific suggestions if they come to your mind immediately.

This is still a WIP, but I thought that this PR would converge faster if I publish the current state now. A few trivial changes are still needed to tests, and I did not remove Clang tests yet.

@pasko pasko requested a review from aeubanks May 21, 2024 19:19
@@ -0,0 +1,31 @@
; RUN: opt -passes="default<O1>" -S < %s | FileCheck %s
; RUN: opt -passes="thinlto-pre-link<O2>" -S < %s | FileCheck %s
; RUN: opt -passes="thinlto-pre-link<O2>,thinlto<O3>" -S < %s | FileCheck %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to run thinlto<O3> here, but we should test that thinlto/lto don't run this pass by adding different RUN lines with a different FileCheck %s --checkprefix=NO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good idea! I still went with O2 for lto/thinlto because even at O1 it does not inline the alwaysinline leaf, oops.

@@ -0,0 +1,31 @@
; RUN: opt -passes="default<O1>" -S < %s | FileCheck %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should also test O0 versions of these, plus lto-pre-link

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I think, but it is evening, and I might be missing something.

define void @leaf_function() #0 {
; CHECK-LABEL: leaf_function:
; CHECK: callq __cyg_profile_func_enter_bare
; CHECK: movq leaf_function@GOTPCREL(%rip), %rdi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably don't need this CHECK line, it just makes the test more brittle

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was also thinking that the use of a particular relocation makes it more brittle. I think it is nice to check that func_exit is used without an argument (unlike __cyg_profile_func_enter_bare). This can be done with the CHECK-NEXT on the line that follows. Though I did not notice this kind of parameter passing is already checked in llvm/test/Transforms/EntryExitInstrumenter/mcount.ll, hence now I am not sure how much value there is in this check.

I would not probably regret removing the line, but I'd like to try one more thing:

; CHECK:       {{.*}} %rdi
; CHECK-NEXT:  callq  __cyg_profile_func_exit

Will it blend?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checking the specifics of how the function is called shouldn't be in this test that tests the pipeline, the point of this test is just a quick end-to-end test testing we inserted instrumentation. if we want to test this we should have a separate test focused on the exact instruction sequence. but I don't think we need to add that test coverage in this PR

so I'd remove the %rdi CHECK line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I removed the argument passing check

@pasko pasko requested a review from aeubanks May 22, 2024 19:08
define void @leaf_function() #0 {
; CHECK-LABEL: leaf_function:
; CHECK: callq __cyg_profile_func_enter_bare
; CHECK: movq leaf_function@GOTPCREL(%rip), %rdi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checking the specifics of how the function is called shouldn't be in this test that tests the pipeline, the point of this test is just a quick end-to-end test testing we inserted instrumentation. if we want to test this we should have a separate test focused on the exact instruction sequence. but I don't think we need to add that test coverage in this PR

so I'd remove the %rdi CHECK line

clang/test/Frontend/gnu-mcount.c Show resolved Hide resolved
Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgot to hit "submit review"...

clang/test/CodeGen/X86/x86_64-instrument-functions.c Outdated Show resolved Hide resolved
clang/test/CodeGen/X86/x86_64-instrument-functions.c Outdated Show resolved Hide resolved
clang/test/CodeGen/X86/x86_64-instrument-functions.c Outdated Show resolved Hide resolved
@pasko pasko changed the title wip: Move instrumentation passes Move instrumentation passes May 24, 2024
@pasko pasko marked this pull request as ready for review May 24, 2024 18:45
@pasko pasko requested a review from aeubanks May 24, 2024 18:45
Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good, can you update the commit title to something like [EntryExitInstrumenter] Move passes out of clang into LLVM default pipelines

I've added test coverage for mcount-aix in 3ec57a7, can you merge that in?

clang/test/CodeGen/mcount-aix.c Outdated Show resolved Hide resolved
@@ -1030,6 +1036,12 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
Phase != ThinOrFullLTOPhase::ThinLTOPostLink)
MPM.addPass(SampleProfileProbePass(TM));

// Instrument function entry and exit before all inlining.
if (!isLTOPostLink(Phase)) {
MPM.addPass(createModuleToFunctionPassAdaptor(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually is it possible to add this into EarlyFPM below? it's nice to have fewer module->function adaptors. (e.g. in EarlyFPM we run all the function passes on a single function before moving to the next one which is nice for memory locality)

sorry for suggesting this so late, since it'll require updating all the pipeline tests... but first check that none of the other tests fail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this, and all non-pipeline tests pass. Though in my local FDO+ThinLTO reproducer I saw this causes __cyg_profile_func_enter_bare to be inserted multiple times per toplevel function.

This behaviour surprised me. The bare instrumentation does not move with the change. Is removing the module to function pass adaptor making something easier to inline later on?

I'd like to investigate, but I am afraid I'll need some clues :)

The change:

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 3a7634f90c07..cae340a19fb6 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -397,11 +397,6 @@ static bool isLTOPreLink(ThinOrFullLTOPhase Phase) {
          Phase == ThinOrFullLTOPhase::FullLTOPreLink;
 }
 
-static bool isLTOPostLink(ThinOrFullLTOPhase Phase) {
-  return Phase == ThinOrFullLTOPhase::ThinLTOPostLink ||
-         Phase == ThinOrFullLTOPhase::FullLTOPostLink;
-}
-
 // TODO: Investigate the cost/benefit of tail call elimination on debugging.
 FunctionPassManager
 PassBuilder::buildO1FunctionSimplificationPipeline(OptimizationLevel Level,
@@ -1039,12 +1034,6 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
       Phase != ThinOrFullLTOPhase::ThinLTOPostLink)
     MPM.addPass(SampleProfileProbePass(TM));
 
-  // Instrument function entry and exit before all inlining.
-  if (!isLTOPostLink(Phase)) {
-    MPM.addPass(createModuleToFunctionPassAdaptor(
-        EntryExitInstrumenterPass(/*PostInlining=*/false)));
-  }
-
   bool HasSampleProfile = PGOOpt && (PGOOpt->Action == PGOOptions::SampleUse);
 
   // In ThinLTO mode, when flattened profile is used, all the available
@@ -1081,6 +1070,9 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
     MPM.addPass(CoroEarlyPass());
 
     FunctionPassManager EarlyFPM;
+    if (Phase != ThinOrFullLTOPhase::FullLTOPostLink) {
+      EarlyFPM.addPass(EntryExitInstrumenterPass(/*PostInlining=*/false));
+    }
     // Lower llvm.expect to metadata before attempting transforms.
     // Compare/branch metadata may alter the behavior of passes like
     // SimplifyCFG.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the check for Phase != ThinOrFullLTOPhase::FullLTOPostLink is unnecessary, buildModuleSimplificationPipeline isn't called for FullLTO post link

I'm not sure why you're seeing that behavior if we're only running the post-inline instrumenter once in the codegen pipeline

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you have a small local repro, you can pass -Wl,-mllvm,-print-after-all to the link command to have it print IR after every pass and see where the calls are getting added. or to see changes in the pre-link optimization pipeline, -mllvm=-print-after-all (or -mllvm=-print-changed=quiet, which doesn't work with the IR parts of the codegen pipeline for reasons)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the check for Phase != ThinOrFullLTOPhase::FullLTOPostLink is unnecessary, buildModuleSimplificationPipeline isn't called for FullLTO post link

Removed. Indeed there is an assertion for this a few lines prior.

I'm not sure why you're seeing that behavior if we're only running the post-inline instrumenter once in the codegen pipeline

Apparently I made a mistake when building the reproducer. When checking today I saw everything working as expected, i.e. <=1 bare hooks inserted per function in the final DSO. Sorry for the noise.

@pasko pasko changed the title Move instrumentation passes [EntryExitInstrumenter] Move passes out of clang into LLVM default pipelines May 27, 2024
@pasko pasko requested a review from aeubanks May 30, 2024 14:15
@aeubanks
Copy link
Contributor

looks like CodeGen/AMDGPU/llc-pipeline.ll is failing

@pasko
Copy link
Contributor Author

pasko commented May 31, 2024

looks like CodeGen/AMDGPU/llc-pipeline.ll is failing

Fixed.

@pasko pasko requested a review from aeubanks May 31, 2024 15:06
@aeubanks
Copy link
Contributor

thanks for doing this!

@aeubanks aeubanks merged commit cab81dd into llvm:main May 31, 2024
6 checks passed
Copy link

@pasko Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested
by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as
the builds can include changes from many authors. It is not uncommon for your
change to be included in a build that fails due to someone else's changes, or
infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself.
This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants