Skip to content
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
f34ab2d
[llvm-profgen] Loading binary functions from .symtab when DWARF info …
HighW4y2H3ll Oct 9, 2025
a1f32b5
Merge branch 'main' into fix-profgen-symbol
HighW4y2H3ll Oct 15, 2025
0fd352d
formatting
HighW4y2H3ll Oct 15, 2025
c097d37
Fix branch target check when an instruction branches to itself. (i.e.…
HighW4y2H3ll Oct 16, 2025
a19064d
Making the API compatible with non-ELF binaries
HighW4y2H3ll Oct 17, 2025
e12e694
Fix
HighW4y2H3ll Oct 20, 2025
5eead6b
Clean up getSymbolSize API and warnings
HighW4y2H3ll Oct 22, 2025
8f59bfa
Add cmdline option --load-function-from-symbol
HighW4y2H3ll Oct 23, 2025
a967994
Get symbol size only for ELFObjectFile
HighW4y2H3ll Oct 23, 2025
0dc2c66
Add unit test
HighW4y2H3ll Oct 23, 2025
75dc996
Nit
HighW4y2H3ll Oct 24, 2025
5600e83
Cleanup loggings and comments
HighW4y2H3ll Oct 30, 2025
0df504e
Mark function FromSymtab too if new range is found
HighW4y2H3ll Oct 31, 2025
644fce9
Formatting
HighW4y2H3ll Oct 31, 2025
9188eff
Fix suffix strip && refactor the function checks
HighW4y2H3ll Nov 1, 2025
b6ae0ba
Fixup corrupted DWARF function names using symbol table info
HighW4y2H3ll Nov 12, 2025
b1cbdd0
Fixup overwritten DWARF symbol name when decoding pseudo probe
HighW4y2H3ll Nov 14, 2025
8927a27
Fixup GuidFilter && Pseudo probe callee mismatch
HighW4y2H3ll Nov 16, 2025
108bc08
format
HighW4y2H3ll Nov 17, 2025
6e3a5ac
Clean up fixup logic in pseudo probe
HighW4y2H3ll Nov 17, 2025
3d8ad53
Further cleanup and add more GuidFilters
HighW4y2H3ll Nov 18, 2025
1344cdd
Infer callee name with pseudo probe names
HighW4y2H3ll Nov 21, 2025
2978e21
promote eligible entry point functions
HighW4y2H3ll Nov 22, 2025
eaab9df
Update pseudo probe search range && range checks
HighW4y2H3ll Nov 22, 2025
c53e0ae
Iterate through ProfiledFunctions to reduce scope
HighW4y2H3ll Nov 24, 2025
b707e80
Add usePseudoProbes() guard and comments
HighW4y2H3ll Nov 25, 2025
57ab1cd
Clean-ups
HighW4y2H3ll Dec 2, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 8 additions & 2 deletions llvm/include/llvm/ProfileData/SampleProf.h
Original file line number Diff line number Diff line change
Expand Up @@ -1214,12 +1214,18 @@ class FunctionSamples {
// Note the sequence of the suffixes in the knownSuffixes array matters.
// If suffix "A" is appended after the suffix "B", "A" should be in front
// of "B" in knownSuffixes.
const char *KnownSuffixes[] = {LLVMSuffix, PartSuffix, UniqSuffix};
SmallVector<StringRef> KnownSuffixes({LLVMSuffix, PartSuffix, UniqSuffix});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is widely used and maybe be expensive. Now with this changing to use vector, I suspect it would introduce more overheads(alloc, dealloc ..), if so, would it possible to continue using the static array or constexpr?

return getCanonicalFnName(FnName, KnownSuffixes, Attr);
}

static StringRef getCanonicalFnName(StringRef FnName,
const SmallVector<StringRef> &Suffixes,
StringRef Attr = "selected") {
if (Attr == "" || Attr == "all")
return FnName.split('.').first;
if (Attr == "selected") {
StringRef Cand(FnName);
for (const auto &Suf : KnownSuffixes) {
for (const auto &Suf : Suffixes) {
StringRef Suffix(Suf);
// If the profile contains ".__uniq." suffix, don't strip the
// suffix for names in the IR.
Expand Down
32 changes: 32 additions & 0 deletions llvm/tools/llvm-profgen/ProfileGenerator.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -449,29 +449,61 @@ bool ProfileGeneratorBase::collectFunctionsFromRawProfile(
// Go through all the stacks, ranges and branches in sample counters, use
// the start of the range to look up the function it belongs and record the
// function.
uint64_t ErrStkAddr = 0, ErrFuncRange = 0, ErrSrc = 0, ErrTgt = 0;
uint64_t TotalStkAddr = 0, TotalFuncRange = 0, TotalSrc = 0, TotalTgt = 0;
for (const auto &CI : *SampleCounters) {
if (const auto *CtxKey = dyn_cast<AddrBasedCtxKey>(CI.first.getPtr())) {
for (auto StackAddr : CtxKey->Context) {
uint64_t inc = Binary->addressIsCode(StackAddr) ? 1 : 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about doing the stats using the weighted number, i,e. the sample count in SampleCounters
We do this for the other stats(https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-profgen/PerfReader.cpp#L1293), it may be more insightful for performance investigations.

TotalStkAddr += inc;
if (FuncRange *FRange = Binary->findFuncRange(StackAddr))
ProfiledFunctions.insert(FRange->Func);
else
ErrStkAddr += inc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: inc --> Inc , same to other places below.

}
}

for (auto Item : CI.second.RangeCounter) {
uint64_t StartAddress = Item.first.first;
uint64_t inc = Binary->addressIsCode(StartAddress) ? 1 : 0;
TotalFuncRange += inc;
if (FuncRange *FRange = Binary->findFuncRange(StartAddress))
ProfiledFunctions.insert(FRange->Func);
else
ErrFuncRange += inc;
}

for (auto Item : CI.second.BranchCounter) {
uint64_t SourceAddress = Item.first.first;
uint64_t TargetAddress = Item.first.second;
uint64_t srcinc = Binary->addressIsCode(SourceAddress) ? 1 : 0;
uint64_t tgtinc = Binary->addressIsCode(TargetAddress) ? 1 : 0;
TotalSrc += srcinc;
if (FuncRange *FRange = Binary->findFuncRange(SourceAddress))
ProfiledFunctions.insert(FRange->Func);
else
ErrSrc += srcinc;
TotalTgt += tgtinc;
if (FuncRange *FRange = Binary->findFuncRange(TargetAddress))
ProfiledFunctions.insert(FRange->Func);
else
ErrTgt += tgtinc;
}
}

if (ErrStkAddr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: No need to check the value here.
There are checks in the emitWarningSummary (https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-profgen/ErrorHandling.h#L50)

WithColor::warning() << "Cannot find Stack Address from DWARF Info: "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we have a helper function for emitting warning(though may need to rephrase the message info ), https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-profgen/PerfReader.cpp#L1325-L1337

Also we have a similar warning in "of samples are from ranges that do not belong to any functions."(https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-profgen/PerfReader.cpp#L1328-L1330), the goal should be same as the code here, I wonder if we can consolidate them or perhaps remove the earlier one if it's redundant, that'd be great.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've made an update on the logging! For a comparison, it seems the earlier warning from "Ranges" has a little bit more samples counted comparing to the later warning from the aggregated "SampleCounters". Maybe it's caused by the filtering here?

if (Binary->addressIsCode(StartAddress) &&
Binary->addressIsCode(EndAddress) &&
isValidFallThroughRange(StartAddress, EndAddress, Binary))
Counter.recordRangeCount(StartAddress, EndAddress, Repeat);
EndAddress = SourceAddress;

warning: 0.07%(3501587/4906133148) of samples are from ranges that do not belong to any functions.
warning: 0.07%(3500591/4835840652) of function range samples do not belong to any function

<< ErrStkAddr << "/" << TotalStkAddr << " missing\n";
if (ErrFuncRange)
WithColor::warning() << "Cannot find Function Range from DWARF Info: "
<< ErrFuncRange << "/" << TotalFuncRange
<< " missing\n";
if (ErrSrc)
WithColor::warning() << "Cannot find LBR Source Addr from DWARF Info: "
<< ErrSrc << "/" << TotalSrc << " missing\n";
if (ErrTgt)
WithColor::warning() << "Cannot find LBR Target Addr from DWARF Info: "
<< ErrTgt << "/" << TotalTgt << " missing\n";
return true;
}

Expand Down
44 changes: 44 additions & 0 deletions llvm/tools/llvm-profgen/ProfiledBinary.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -257,6 +257,8 @@ void ProfiledBinary::load() {
if (ShowDisassemblyOnly)
decodePseudoProbe(Obj);

populateSymbolsFromElf(Obj);

// Disassemble the text sections.
disassemble(Obj);

Expand Down Expand Up @@ -820,6 +822,48 @@ void ProfiledBinary::populateSymbolAddressList(const ObjectFile *Obj) {
}
}

void ProfiledBinary::populateSymbolsFromElf(const ObjectFile *Obj) {
Copy link
Member

@dtellenbach dtellenbach Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this is intended to work for ELF only, so it needs to be conditional on the object format being ELF, at least. I think an abstraction in between would be helpful to allow other object formats to implement similar functionality if needed.

Copy link
Contributor

@wlei-llvm wlei-llvm Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! llvm-profgen does support for non-ELF format, e.g. #158207, then this may break other format.

cc @HaohaiWen

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedbacks! Let me fix up the non-ELF compatibility first. I added another SymbolRef::getSize() API to take care of that. Not sure if this is the best way to abstract things, but I'm thinking of following with another NFC patch that renames ObjectFile::getCommonSymbolSizeImpl to ObjectFile::getSymbolSizeImpl. (so this API no longer attaches to the SymbolRef::SF_Common flag?)

// Load binary functions from ELF symbol table when DWARF info is incomplete
StringRef FileName = Obj->getFileName();
for (const ELFSymbolRef Symbol : Obj->symbols()) {
const SymbolRef::Type Type = unwrapOrError(Symbol.getType(), FileName);
const uint64_t Addr = unwrapOrError(Symbol.getAddress(), FileName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe Addr --> StartAddr

const StringRef Name = unwrapOrError(Symbol.getName(), FileName);
const uint64_t Size = Symbol.getSize();

if (Size == 0 || Type != SymbolRef::ST_Function)
continue;

SmallVector<StringRef> Suffixes(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my previous comment, this is inside a hot loop, I guess this introduce the alloc/dealloc overheads, we may find a more efficient way(at least we can hoist loop invariant part).

{".destroy", ".resume", ".llvm.", ".cold", ".warm"});
const StringRef SymName =
FunctionSamples::getCanonicalFnName(Name, Suffixes);

auto Ret = BinaryFunctions.emplace(SymName, BinaryFunction());
auto &Func = Ret.first->second;
if (Ret.second)
Func.FuncName = Ret.first->first;

if (auto Range = findFuncRange(Addr)) {
if (Ret.second && ShowDetailedWarning)
WithColor::warning()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this result in a large number of warnings since most of the symbols can be obtained from DWARF?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Ret.second check shall skip all the symbols that already obtained from DWARF. (the symbol name will exists in the BinaryFunctions and the emplace shall prevent the insertation and Ret.second will be set to "false")

auto Ret = BinaryFunctions.emplace(SymName, BinaryFunction());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symbol table finding entry already existing in dwarf should not be a warning.

When that happens, we should check if they both point to same ranges, and only warn when

  1. symbol table and dwarf conflicts for the actual range.
  2. something is in symbol table, but not in dwarf (suggesting dwarf corrupted, e.g. due to relo overflow)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this part and the warning shall now capture case 1. And I've merged (2) into the PerfScriptReader::warnInvalidRange :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Can you test on a big binary to make sure this warning does not fire?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually didn't see this warning firing at all in my test, but it shall capture the naming conflicts between dwarf and symbol table, like... in case I missed to strip some internal suffixes...

<< "Symbol " << Name << " start address "
<< format("%8" PRIx64, Addr) << " already exists in DWARF at "
<< format("%8" PRIx64, Range->StartAddress) << " in function "
<< Range->getFuncName() << "\n";
} else {
// Store/Update Function Range from SymTab
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be storing additional range, not "updating" existing range, right?

Also suggest to structure the code like below for clarify:

auto &Func = Ret.first->second;
if (Ret.second) {
   // Function from symbol table not found previously in DWARF, store ranges.
   Func.FuncName = Ret.first->first;
   Func.FromSymtab = true;
   HashBinaryFunctions[MD5Hash(StringRef(SymName))] = &Func;
   ...
}
else {
   // Function already found from DWARF, check consistency between symbol table and DWARF. 
   ...
}

Copy link
Member Author

@HighW4y2H3ll HighW4y2H3ll Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, right.... "updating" is actually not a correct behavior. let me flag that in a warning too :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually... we probably still need to "update" the range.. because like, coroutine creates a few functions: "foo", "foo.destroy", "foo.resume", "foo.cleanup", and each of them has a different address range in the symbol table. Even Ret.second == false could still be some function added from the symbol table...

Maybe the StartAddr works better to check if a function is already found in DWARF or not? I'm changing the logic to something like this:

auto Range = findFuncRange(StartAddr);
if (!Range || Range->StartAddress != StartAddr) {
  // Function from symbol table not found previously in DWARF, store ranges.
  ...
} else if (SymName != Range->getFuncName() && ShowDetailedWarning) {
  // Function already found from DWARF, check consistency between symbol table and DWARF.
  WithColor::warning()...
}

Func.Ranges.emplace_back(Addr, Addr + Size);

auto R = StartAddrToFuncRangeMap.emplace(Addr, FuncRange());
FuncRange &FRange = R.first->second;
FRange.Func = &Func;
FRange.StartAddress = Addr;
FRange.EndAddress = Addr + Size;
}
}
}

void ProfiledBinary::loadSymbolsFromDWARFUnit(DWARFUnit &CompilationUnit) {
for (const auto &DieInfo : CompilationUnit.dies()) {
llvm::DWARFDie Die(&CompilationUnit, &DieInfo);
Expand Down
3 changes: 3 additions & 0 deletions llvm/tools/llvm-profgen/ProfiledBinary.h
Original file line number Diff line number Diff line change
Expand Up @@ -356,6 +356,9 @@ class ProfiledBinary {
// Create symbol to its start address mapping.
void populateSymbolAddressList(const object::ObjectFile *O);

// Load functions from its symbol table (when DWARF info is missing).
void populateSymbolsFromElf(const object::ObjectFile *O);

// A function may be spilt into multiple non-continuous address ranges. We use
// this to set whether start a function range is the real entry of the
// function and also set false to the non-function label.
Expand Down
Loading