[BOLT] Compress YAML pseudo probe information #166680
base: users/aaupov/spr/main.bolt-compress-yaml-pseudo-probe-information
Created using spr 1.3.4
You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions cpp,h -- bolt/include/bolt/Profile/ProfileYAMLMapping.h bolt/include/bolt/Profile/YAMLProfileWriter.h bolt/include/bolt/Utils/CommandLineOpts.h bolt/lib/Profile/DataAggregator.cpp bolt/lib/Profile/YAMLProfileReader.cpp bolt/lib/Profile/YAMLProfileWriter.cpp bolt/lib/Rewrite/PseudoProbeRewriter.cpp --diff_from_common_commit
View the diff from clang-format here.
diff --git a/bolt/lib/Profile/YAMLProfileWriter.cpp b/bolt/lib/Profile/YAMLProfileWriter.cpp
index aad433a8b..817590cb9 100644
--- a/bolt/lib/Profile/YAMLProfileWriter.cpp
+++ b/bolt/lib/Profile/YAMLProfileWriter.cpp
@@ -485,7 +485,7 @@ std::error_code YAMLProfileWriter::writeProfile(const RewriteInstance &RI) {
// Add probe inline tree nodes.
InlineTreeDesc InlineTree;
if (const MCPseudoProbeDecoder *Decoder =
- opts::ProfileWritePseudoProbes ? BC.getPseudoProbeDecoder() : nullptr)
+ opts::ProfileWritePseudoProbes ? BC.getPseudoProbeDecoder() : nullptr)
std::tie(BP.PseudoProbeDesc, InlineTree) = convertPseudoProbeDesc(*Decoder);
// Add all function objects.
Pseudo probe matching (#100446) needs callee information for call probes. Embed call probe information (probe id, inline tree node, indirect flag) into CallSiteInfo. As a consequence:
- Remove call probes from PseudoProbeInfo to avoid duplication, making it only contain block probes.
- Probe grouping across inline tree nodes becomes more potent + allows to unambiguously elide block id 1 (common case).

Block mask (blx) encoding becomes a low-ROI optimization and will be replaced by a more compact encoding leveraging simplified PseudoProbeInfo in #166680.

The size increase is ~3% for an XL profile (461->475MB). Compact block probe encoding shrinks it by ~6%.

Test Plan: updated pseudoprobe-decoding-{inline,noinline}.test

Reviewers: paschalis-mpeis, ayermolo, yota9, yozhu, rafaelauler, maksfb

Reviewed By: rafaelauler

Pull Request: #165490
🐧 Linux x64 Test Results
The build failed before running any tests. Click on a failure below to see the details.
tools/bolt/lib/Profile/CMakeFiles/LLVMBOLTProfile.dir/YAMLProfileReader.cpp.o
If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the
Introduce an optional value compact for profile-write-pseudo-probes flag, enabling compressed encoding of pseudo probe information in YAML profile:
inline tree nodes in format [probes]_[nodes]:[from,...,to] are coalesced into from^run_length, using base36-encoding.
This significantly reduces the size of YAML profile and makes parsing faster (as it replaces nested FlowVectors with strings).
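To illustrate the from^run_length coalescing described above, here is a minimal standalone sketch. It is not the code from this PR. The helper names toBase36 and encodeIds are hypothetical, and two details are assumptions the description does not confirm: run_length is taken to count the ids that follow from, and entries are joined with commas.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode an unsigned value in base 36 (0-9, then a-z).
std::string toBase36(uint64_t Value) {
  static const char Digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
  if (Value == 0)
    return "0";
  std::string Out;
  while (Value) {
    Out.insert(Out.begin(), Digits[Value % 36]);
    Value /= 36;
  }
  return Out;
}

// Hypothetical helper: coalesce sorted, unique ids into a string.
// A run of consecutive ids from..to becomes "from^run_length"; a lone id is
// written as just "from".
// Assumptions (not confirmed by the PR text): run_length is the number of
// ids following `from`, and entries are comma-separated.
std::string encodeIds(const std::vector<uint64_t> &Ids) {
  std::string Out;
  for (size_t I = 0; I < Ids.size();) {
    // Find the end of the run of consecutive ids starting at index I.
    size_t J = I + 1;
    while (J < Ids.size() && Ids[J] == Ids[J - 1] + 1)
      ++J;
    if (!Out.empty())
      Out += ',';
    Out += toBase36(Ids[I]);
    // Emit the run length only when the run has more than one id.
    if (uint64_t Run = J - I - 1)
      Out += '^' + toBase36(Run);
    I = J;
  }
  return Out;
}
```

Under these assumptions, the ids 2, 3, 4, 10, 40, 41 would be written as the single string 2^2,a,14^1 rather than as a nested list of six integers, which is the kind of saving the description attributes to replacing FlowVectors with strings.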
Test Plan:
TBD: compress pseudo_probe_desc. Use DFS for probes, and BFS for inline tree.
Measure emission time.