[docs][IRPGO]Document two binary formats for instrumentation-based profiles, with a focus on IRPGO. #76105
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,387 @@ | ||
| ===================== | ||
| IRPGO Profile Format | ||
| ===================== | ||
|
|
||
| .. contents:: | ||
| :local: | ||
|
|
||
|
|
||
| Overview | ||
| ========== | ||
|
|
||
| IR-based instrumentation (IRPGO) and its context-sensitive variant (CS-IRPGO) | ||
|
||
| insert ``llvm.instrprof.*`` `code generator intrinsics <https://llvm.org/docs/LangRef.html#code-generator-intrinsics>`_ | ||
| in LLVM IR to generate profiles. This document describes two binary profile | ||
| formats (raw and indexed) used by IR-based instrumentation. | ||
|
||
|
|
||
| .. note:: | ||
|
|
||
| Both the compiler-rt profiling infrastructure and profile format are general | ||
|
||
| and could support other use cases (e.g., coverage and temporal profiling). | ||
| This document will focus on IRPGO while briefly introducing other use cases | ||
| with pointers. | ||
|
|
||
| Raw PGO Profile Format | ||
| ======================== | ||
|
|
||
| The raw PGO profile is generated by running the instrumented binary. It is a | ||
| memory dump of the profile data. | ||
|
|
||
| Two kinds of frequently used profile information are a function's basic block | ||
|
||
| counters and its (various flavors of) value profiles. A function's profiled | ||
|
||
| information spans several sections in the profile. | ||
|
|
||
| General Storage Layout | ||
| ----------------------- | ||
|
|
||
| A raw profile for an executable [1]_ consists of a profile header and several | ||
|
||
| sections. The storage layout is illustrated below. Generally, when a raw profile | ||
| is read into a memory buffer, the actual byte offset of a section is inferred | ||
| from the section's order in the layout and size information of all sections | ||
| ahead of it. | ||
|
|
||
| :: | ||
|
|
||
|   +----+-----------------------+ | ||
|   |    |        Magic          | | ||
|   |    +-----------------------+ | ||
|   |    |        Version        | | ||
|   |    +-----------------------+ | ||
|   H    |   Size Info for       | | ||
|   E    |      Section 1        | | ||
|   A    +-----------------------+ | ||
|   D    |   Size Info for       | | ||
|   E    |      Section 2        | | ||
|   R    +-----------------------+ | ||
|   |    |         ...           | | ||
|   |    +-----------------------+ | ||
|   |    |   Size Info for       | | ||
|   |    |      Section N        | | ||
|   +----+-----------------------+ | ||
|   P    |       Section 1       | | ||
|   A    +-----------------------+ | ||
|   Y    |       Section 2       | | ||
|   L    +-----------------------+ | ||
|   O    |         ...           | | ||
|   A    +-----------------------+ | ||
|   D    |       Section N       | | ||
|   +----+-----------------------+ | ||
|
|
||
|
|
||
| .. note:: | ||
| Sections might be padded to meet platform-specific alignment requirements. | ||
| For simplicity, header fields and data sections used solely for padding | ||
| are omitted from the layout diagram above and from the rest of this document. | ||
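The offset inference described in this section amounts to a running sum over section sizes. Below is a minimal Python sketch of that idea. It assumes a hypothetical ordered list of ``(name, byte size)`` pairs already derived from the header, and it ignores padding. The header size and section sizes are made-up numbers, not values taken from a real profile.

```python
from itertools import accumulate


def section_offsets(header_size, sections):
    """Map each section name to its byte offset in the buffer.

    `sections` is an ordered list of (name, byte_size) pairs whose order
    matches the storage layout. Padding is ignored in this sketch.
    """
    sizes = [size for _, size in sections]
    # Offset of section i = header size + sizes of all sections before it.
    starts = accumulate([header_size] + sizes[:-1])
    return {name: start for (name, _), start in zip(sections, starts)}


# Hypothetical sizes, for illustration only.
layout = [("binary_ids", 32), ("data", 128), ("counters", 64), ("names", 40)]
offsets = section_offsets(header_size=112, sections=layout)
```

A real reader would also need to account for the padding mentioned in the note above.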
|
|
||
| Header | ||
| ------- | ||
|
|
||
| ``Magic`` | ||
| The magic number lets a data consumer detect the profile format and the | ||
| endianness of the data, and quickly tell whether and how to continue reading. | ||
|
||
|
|
||
| ``Version`` | ||
| The lower 32 bits specify the actual version and the most significant 32 | ||
| bits specify the variant types of the profile. IRPGO and CS-IRPGO are two | ||
| variant types. | ||
|
|
||
| ``BinaryIdsSize`` | ||
| The byte size of the binary id section. | ||
|
|
||
| ``NumData`` | ||
| The number of per-function profile data control structures. The byte size of | ||
| the profile data section can be computed from this field. | ||
|
|
||
| ``NumCounter`` | ||
| The number of entries in the profile counter section. The byte size of the | ||
| counter section can be computed from this field. | ||
|
|
||
| ``NumBitmapBytes`` | ||
| The number of bytes in the profile bitmap section. | ||
|
|
||
| ``NamesSize`` | ||
| The number of bytes in the name section. | ||
|
|
||
| ``CountersDelta`` | ||
| Records the in-memory address difference between the data and counter sections, | ||
| i.e., ``start(__llvm_prf_cnts) - start(__llvm_prf_data)``. It is used jointly | ||
| with the in-memory address difference between a profile data record and its | ||
| counter to find the counter of that record. See calculation-of-counter-offset_ | ||
| for details. | ||
|
|
||
| ``BitmapDelta`` | ||
| Records the in-memory address difference between the data and bitmap sections, | ||
| i.e., ``start(__llvm_prf_bits) - start(__llvm_prf_data)``. It is used jointly | ||
| with the in-memory address difference between a profile data record and its | ||
| bitmap to find the bitmap of that record, in a way similar to how counters are | ||
| referenced, as explained in calculation-of-counter-offset_. | ||
|
|
||
| ``NamesDelta`` | ||
| Records the in-memory address of the compressed name section. Not used except for | ||
|
||
| raw profile reader error checking. | ||
|
|
||
| ``ValueKindLast`` | ||
| Records the number of value kinds. As of this writing, two kinds of value | ||
| profiles are supported. ``IndirectCallTarget`` profiles the frequent callees of | ||
| indirect call instructions, and ``MemOPSize`` is used for memory intrinsic | ||
| function size profiling. | ||
|
|
||
| The number of value kinds affects the byte size of the per-function profile | ||
| data control structure. | ||
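As an illustration of how the ``Version`` field packs two pieces of information, the following Python sketch splits a 64-bit value into its lower 32 bits (the version) and upper 32 bits (the variant bits). The mask names and the sample value are hypothetical. The sketch does not claim which individual bit marks IRPGO or CS-IRPGO.

```python
VERSION_MASK = 0x00000000FFFFFFFF  # lower 32 bits: format version
VARIANT_MASK = 0xFFFFFFFF00000000  # upper 32 bits: variant type bits


def split_version(raw_version):
    """Split the 64-bit Version field into (version, variant_bits)."""
    version = raw_version & VERSION_MASK
    variant_bits = (raw_version & VARIANT_MASK) >> 32
    return version, variant_bits


# Hypothetical field value: version 9 with a single variant bit set.
version, variant_bits = split_version((1 << 56) | 9)
```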
|
|
||
| Payload Sections | ||
| ------------------ | ||
|
|
||
| Binary Ids | ||
| ^^^^^^^^^^^ | ||
| Stores the binary ids of the instrumented binaries to associate binaries with | ||
| profiles for source code coverage. See `Binary Id RFC`_ for introduction. | ||
|
|
||
| .. _`Binary Id RFC`: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151154.html | ||
|
|
||
| Profile Data | ||
| ^^^^^^^^^^^^^ | ||
|
|
||
| This section stores the per-function profile data control structures. The | ||
| in-memory representation of a control structure is ``__llvm_profile_data``, and | ||
| its fields are defined by the ``INSTRPROFDATA`` macro. Some fields are used to | ||
| reference data from other sections in the profile. The fields are documented | ||
| as follows: | ||
|
|
||
| ``NameRef`` | ||
| The MD5 of the function's IRPGO name. The IRPGO name has the format | ||
| ``[<filepath>;]<linkage-name>``, where ``<filepath>;`` is added for local-linkage | ||
| functions to tell apart functions that would otherwise share the same name. | ||
|
|
||
| ``FuncHash`` | ||
| A fingerprint of the function's control flow graph. | ||
|
||
|
|
||
| ``CounterPtr`` | ||
|
||
| The in-memory address difference between profile data and its corresponding counters. | ||
|
|
||
| ``BitmapPtr`` | ||
| The in-memory address difference between profile data and its bitmap. | ||
|
|
||
| ``FunctionPointer`` | ||
| Records the function address when the instrumented binary runs. This is used to | ||
| map the profiled callee address of indirect calls to the ``NameRef`` during | ||
| conversion from raw to indexed profiles. | ||
|
|
||
| ``Values`` | ||
| Represents value profiles in a two-dimensional array. The number of elements | ||
| in the first dimension is the number of instrumented value sites across all | ||
| kinds. Each element in the first dimension is the head of a linked list, and | ||
| each element in the second dimension is a linked list element carrying | ||
| ``<profiled-value, count>`` as payload. This is used by the compiler runtime when | ||
| writing out value profiles. | ||
|
|
||
| ``NumCounters`` | ||
| The number of counters for the instrumented function. | ||
|
|
||
| ``NumValueSites`` | ||
| This is an array of counters, and each counter represents the number of | ||
| instrumented sites for a kind of value in the function. | ||
|
|
||
| ``NumBitmapBytes`` | ||
| The number of bitmap bytes for the function. | ||
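To make the ``NameRef`` field more concrete, here is a small Python sketch that builds an IRPGO name in the ``[<filepath>;]<linkage-name>`` format and hashes it with MD5. It assumes the 64-bit value is taken from the first eight bytes of the digest, read as little-endian. That truncation detail is an assumption not stated above, and the function names and file paths are made up.

```python
import hashlib


def irpgo_name(linkage_name, filepath=None):
    """Build `[<filepath>;]<linkage-name>`.

    Pass `filepath` only for local-linkage functions.
    """
    return f"{filepath};{linkage_name}" if filepath else linkage_name


def name_ref(pgo_name):
    """Hash an IRPGO name with MD5 and keep 64 bits (assumed truncation)."""
    digest = hashlib.md5(pgo_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


global_ref = name_ref(irpgo_name("main"))
# Two local-linkage functions with the same linkage name in different files.
local_a = name_ref(irpgo_name("helper", "a.c"))
local_b = name_ref(irpgo_name("helper", "b.c"))
```

The file path prefix is what keeps ``local_a`` and ``local_b`` distinct even though both functions are named ``helper``.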
|
|
||
| Profile Counters | ||
| ^^^^^^^^^^^^^^^^^ | ||
|
|
||
| For IRPGO [2]_, the counters within an instrumented function are stored contiguously | ||
| and in an order that is consistent with basic block selection in the instrumentation | ||
| pass. | ||
|
|
||
| .. _calculation-of-counter-offset: | ||
|
|
||
| So how are function counters associated with a function? | ||
|
|
||
| Basically, the profile reader iterates over the per-function control structures | ||
| (from the profile data section) and makes use of the recorded relative | ||
| distances, as illustrated below. | ||
|
||
|
|
||
| :: | ||
|
|
||
|          + --> start(__llvm_prf_data) --> +---------------------+ ------------+ | ||
|          |                                |       Data 1        |             | | ||
|          |                                +---------------------+  =====||    | | ||
|          |                                |       Data 2        |       ||    | | ||
|          |                                +---------------------+       ||    | | ||
|          |                                |        ...          |       ||    | | ||
|   Counter|                                +---------------------+       ||    | | ||
|   Delta  |                                |       Data N        |       ||    | | ||
|          |                                +---------------------+       ||    |   CounterPtr1 | ||
|          |                                                              ||    | | ||
|          |                                              CounterPtr2     ||    | | ||
|          |                                                              ||    | | ||
|          |                                                              ||    | | ||
|          + --> start(__llvm_prf_cnts) --> +---------------------+       ||    | | ||
|                                           |        ...          |       ||    | | ||
|                                           +---------------------+  -----||----+ | ||
|                                           |      Counter 1      |       || | ||
|                                           +---------------------+       || | ||
|                                           |        ...          |       || | ||
|                                           +---------------------+  =====|| | ||
|                                           |      Counter 2      | | ||
|                                           +---------------------+ | ||
|                                           |        ...          | | ||
|                                           +---------------------+ | ||
|                                           |      Counter N      | | ||
|                                           +---------------------+ | ||
|
|
||
|
|
||
| In the graph, | ||
|
|
||
| * The profile header records ``CountersDelta`` with the value ``start(__llvm_prf_cnts) - start(__llvm_prf_data)``. | ||
| We will call it ``CounterDeltaInitVal`` below for convenience. | ||
| * For each profile data record, ``CounterPtrN`` is recorded as ``start(Counter) - start(ProfileData)``. | ||
|
||
|
|
||
| Each time the reader advances to the next data record, it decreases ``CountersDelta`` by the size of one ``ProfileData``. | ||
|
|
||
| For the counter corresponding to the first data record, the byte offset | ||
| relative to the start of the counter section is calculated as ``CounterPtr1 - CounterDeltaInitVal``. | ||
| When the profile reader advances to the second data record, ``CountersDelta`` is now ``CounterDeltaInitVal - sizeof(ProfileData)``. | ||
| Thus the byte offset relative to the start of the counter section is calculated as ``CounterPtr2 - (CounterDeltaInitVal - sizeof(ProfileData))``. | ||
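The calculation above can be sketched in Python as follows. This is an illustrative model rather than the reader's actual code: the function name, the addresses, and the record size are all hypothetical, chosen only so the arithmetic is easy to follow.

```python
def counter_offsets(counter_ptrs, counters_delta_init, sizeof_profile_data):
    """Return the byte offset into the counter section for each data record.

    counter_ptrs[i] is the recorded `start(Counter) - start(ProfileData)`
    for record i, in the order records appear in the data section.
    """
    offsets = []
    counters_delta = counters_delta_init
    for counter_ptr in counter_ptrs:
        offsets.append(counter_ptr - counters_delta)
        # Advancing to the next record moves the data pointer forward by one
        # record, so the delta shrinks by the size of one record.
        counters_delta -= sizeof_profile_data
    return offsets


# Hypothetical in-memory layout used to derive the recorded CounterPtr values.
DATA_START, CNTS_START, RECORD_SIZE = 0x1000, 0x2000, 64
expected = [0, 24, 40]  # where each record's counters begin in the section
ptrs = [(CNTS_START + off) - (DATA_START + i * RECORD_SIZE)
        for i, off in enumerate(expected)]
offsets = counter_offsets(ptrs, CNTS_START - DATA_START, RECORD_SIZE)
```

With these inputs, ``offsets`` recovers the ``expected`` list, which shows that the per-record adjustment of the delta cancels out the growing distance between the start of the data section and each record.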
|
|
||
| Bitmap | ||
| ^^^^^^^ | ||
| This section is used for source-based MC/DC code coverage. See `Bitmap RFC`_ | ||
| for details. | ||
|
|
||
| .. _`Bitmap RFC`: https://discourse.llvm.org/t/rfc-source-based-mc-dc-code-coverage/59244 | ||
|
|
||
| Names | ||
| ^^^^^^ | ||
|
|
||
| This section contains the concatenated string of function IRPGO names. If the | ||
| section is compressed, the zlib compression algorithm is used. | ||
|
|
||
| Function names serve as keys in the PGO data hash table when raw profiles are | ||
| converted into indexed profiles. They are also crucial for `llvm-profdata` to | ||
| show the profiles in a human-readable way. | ||
|
|
||
| Value Profile Data | ||
| ^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| This section contains the profile data for value profiling. | ||
|
|
||
| The value profiles corresponding to a profile data record are serialized | ||
| contiguously as one record, and value profile records are stored in the same | ||
| order as the respective profile data records. A raw profile reader therefore | ||
| advances the pointer to profile data and the pointer to value profile records | ||
| simultaneously [3]_ to find the value profiles for each per-function, | ||
| per-CFG-fingerprint profile data record. | ||
|
|
||
| Indexed PGO Profile Format | ||
| =========================== | ||
|
|
||
| General Storage Layout | ||
| ----------------------- | ||
|
|
||
| :: | ||
|
|
||
|                   +-----------------------+---+ | ||
|                   |        Magic          |   | | ||
|                   +-----------------------+   | | ||
|                   |        Version        |   | | ||
|                   +-----------------------+   | | ||
|                   |        HashType       |   H | ||
|                   +-----------------------+   E | ||
|           +-------|       HashOffset      |   A | ||
|           |       +-----------------------+   D | ||
|       +-----------|     MemProfOffset     |   E | ||
|       |   |       +-----------------------+   R | ||
|       |   |       |     BinaryIdOffset    |   | | ||
|       |   |       +-----------------------+   | | ||
|   +---------------|     TemporalProf-     |   | | ||
|   |   |   |       |     TracesOffset      |   | | ||
|   |   |   |       +-----------------------+---+ | ||
|   |   |   |       |    Profile Summary    |   | | ||
|   |   |   |       +-----------------------+   P | ||
|   |   |   +------>|    Function PGO data  |   A | ||
|   |   |           +-----------------------+   Y | ||
|   |   +---------->|  MemProf profile data |   L | ||
|   |               +-----------------------+   O | ||
|   |               |       Binary Ids      |   A | ||
|   |               +-----------------------+   D | ||
|   +-------------->|    Temporal profiles  |   | | ||
|                   +-----------------------+---+ | ||
|
|
||
| Header | ||
| -------- | ||
|
|
||
| ``Magic`` | ||
| The purpose of the magic number is to be able to quickly tell if the profile | ||
| is an indexed profile. | ||
|
|
||
| ``Version`` | ||
| Similar to the raw profile version, the lower 32 bits specify the version of the | ||
| indexed profile and the most significant 32 bits are reserved to specify the | ||
| variant types of the profile. | ||
|
|
||
| ``HashType`` | ||
| The hashing scheme for on-disk hash table keys. Only MD5 hashing is used as of | ||
| writing. | ||
|
|
||
| ``HashOffset`` | ||
| An on-disk hash table stores the per-function profile records. | ||
| Precisely speaking, `HashOffset` records the offset of this hash table's | ||
| metadata (i.e., the number of buckets and entries), which follows right after | ||
| the payload of the entire hash table. | ||
|
|
||
| ``MemProfOffset`` | ||
| Records the byte offset of MemProf profiling data. | ||
|
|
||
| ``BinaryIdOffset`` | ||
| Records the byte offset of the binary id section. | ||
|
|
||
| ``TemporalProfTracesOffset`` | ||
| Records the byte offset of temporal profiles. | ||
|
|
||
| Payload Sections | ||
| ------------------ | ||
|
|
||
| (CS) Profile Summary | ||
| ^^^^^^^^^^^^^^^^^^^^^ | ||
| This section is right after the profile header. It stores the serialized profile | ||
| summary. For context-sensitive IRPGO, this section stores an additional profile | ||
| summary corresponding to the context-sensitive profiles. | ||
|
|
||
| Function PGO data | ||
| ^^^^^^^^^^^^^^^^^^ | ||
| This section stores functions and their PGO profiling data as an on-disk hash | ||
| table. The key of a hash table entry is a function's PGO name, and the in-memory | ||
| representation of the value is a map. The key of this map is the CFG hash, and | ||
| the value is the C++ struct ``llvm::InstrProfRecord``, which collects profiling | ||
| information such as counters and value profiles. | ||
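The two-level lookup described above (function PGO name first, then CFG hash) can be modeled with nested dictionaries. The Python sketch below only illustrates the in-memory shape. The ``Record`` class is a hypothetical stand-in for ``llvm::InstrProfRecord``, and the names, hashes, and counts are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Illustrative stand-in for `llvm::InstrProfRecord`."""
    counts: list = field(default_factory=list)
    value_profiles: dict = field(default_factory=dict)


# Function PGO name -> {CFG hash -> record}
function_data = {}


def add_record(name, cfg_hash, record):
    """Store a record under its function name and CFG hash."""
    function_data.setdefault(name, {})[cfg_hash] = record


def lookup(name, cfg_hash):
    """Return the record for this name and CFG hash, or None if absent."""
    return function_data.get(name, {}).get(cfg_hash)


# The same function name can carry records for different CFG hashes.
add_record("main", 0x1234, Record(counts=[10, 3]))
add_record("main", 0x5678, Record(counts=[7]))
```

Keying the inner map by CFG hash means a profile entry is only found when both the function name and its control flow graph fingerprint match.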
|
|
||
| MemProf Profile data | ||
| ^^^^^^^^^^^^^^^^^^^^^^ | ||
| This section stores functions' memory profiling data. See | ||
| `MemProf binary serialization format RFC`_ for the design. | ||
|
|
||
| .. _`MemProf binary serialization format RFC`: https://lists.llvm.org/pipermail/llvm-dev/2021-September/153007.html | ||
|
|
||
| Binary Ids | ||
| ^^^^^^^^^^^^^^^^^^^^^^ | ||
| This section carries over binary-id information from raw profiles. | ||
|
|
||
| Temporal Profile Traces | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^ | ||
| This section carries over temporal profile information from raw profiles. | ||
| See `Temporal profiling RFC`_ for an overview. | ||
|
|
||
| .. _`Temporal profiling RFC`: https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068 | ||
|
|
||
| Profile Data Usage | ||
| ======================================= | ||
|
|
||
| `llvm-profdata` is the command line tool to display and process profile data. | ||
| For supported usages, check out its `documentation <https://llvm.org/docs/CommandGuide/llvm-profdata.html>`_. | ||
|
|
||
|
|
||
| .. [1] A raw profile file could contain multiple raw profiles. The raw profile | ||
| reader can parse all raw profiles in the file. | ||
| .. [2] The counter section is used by a few variant types (like coverage and | ||
| temporal profiling) and might have different semantics there. | ||
| .. [3] The step size of the data pointer is ``sizeof(ProfileData)``, and the step | ||
| size of the value profile pointer is calculated based on the number of collected | ||
| values. | ||
IRPGO --> Instrumentation PGO. Note that Frontend PGO uses the same format.
done.