Profiles: simplify profile stack trace representation by felixge · Pull Request #645 · open-telemetry/opentelemetry-proto

felixge · 2025-04-17T14:46:58Z

Changes

Introduce a first-class Stack message type and lookup table.
Replace location index range based stack trace encoding on Sample with a single stack_index reference.
Remove the location_indices lookup table.

Motivation

tl;dr: Let's KISS. The stack trace encoding proposed in this PR is competitive with more advanced stack trace encoding formats, but much simpler to implement. See table below or @christos68k's benchmark for details.

	Uncompressed	vs Dedup	Compressed	vs Dedup
ExportDedup (current)	688709	0.0%	277955	0.0%
ExportStacks (this PR)	703704	2.8%	277713	-0.1%
ExportArrays (PR #651)	602920	-12.5%	305235	9.8%

The current stack trace encoding is somewhat complex as it tries to optimize for CPU profiling stack traces where it's relatively common to see prefixes of a stack trace.

For example, a program that looks like this ...

func a() { b() }
func b() { c() }

func c() {
	for {
		d()
		e()
	}
}

... will produce a CPU profile with stack traces like the ones shown below. The number indicate samples taken at different locations (program counters) within the same function:

a;b;c;d1
a;b;c;d2
a;b;c;d3
...
a;b;c;e1
a;b;c;e2
a;b;c;e3
...
a;b;c

This makes it attractive to optimize away the prefix repetition (a;b;c) in the stack trace dictionary table. But the current format doesn't quite succeed in this, as it would only allow us to get rid of the last stack trace in the table. This probably explains why it doesn't end up performing very differently from the encoding proposed in this PR.

The alternative double array proposal from @aalexand achieves over 12% reduction in profile size before compression, because it can exploit this pattern effectively. However, for some reason the resulting data patterns end up being much harder to compress, and the encoding loses by almost 10% after compression has been applied.

Last but not least, not all profiling data will benefit from prefix-encoding as much as CPU profiles. For example memory allocations and especially Off-CPU stack traces have significantly less prefix repetition. We don't have a good benchmark for this, but I hope this argument is easy to follow regardless.

Given the above, I'm proposing to chose the most trivial stack trace encoding format as outlined in this PR as it seems to hit a sweet spot between complexity and efficiency. The main tasks for reviewers is to validate benchmarks from @christos68k's and confirm if they're okay with this direction.

It seems like we had rough consensus around this change in previous SIG meetings.

florianl

Especially mixed reports of profiles, e.g. on- and off-CPU profiles, will benefit from this approach.

In open-telemetry#645 a simplified stack trace representation is proposed. A discussion there asks for a more efficient way to represent stacks that slightly differ at the leaf. A specialized solution proposed there would put the leaf location as a separate field. That seems partial and potentially error-prone. This PR proposes an alternative stack trace representation, shaped as a tree in the form of two arrays. See the code for details and an example. The advantage of this approach is that stacks with the same prefix are encoded more compactly. Also, since we use only two arrays for the encoding, the in-memory representation is also more efficient in terms of the number of allocations required. The main disadvantage is that this approach is more complex semantically and for this reason can be more error-prone.

christos68k · 2025-06-13T14:57:59Z

See open-telemetry/opentelemetry-ebpf-profiler#524 for benchmarks that also cover this proposal.

- Introduce a first-class Stack message type and lookup table. - Replace location index range based stack trace encoding on Sample with a single stack_index reference. - Remove the location_indices lookup table. The primary motivation is laying the ground work for [timestamp based profiling][timestamp proposal] where the same stack trace needs to be referenced much more frequently compared to aggregation based on low cardinality attributes. Timestamp based profiling is also expected to be used with the the upcoming [Off-CPU profiling][off-cpu pr] feature in the eBPF profiler. Off-CPU stack traces have a different distribution compared to CPU samples. In particular stack traces are much more repetitive because they only occur at call sites such as syscalls. For the same reason it is also uncommon to see a stack trace are a root-prefix of a previously observed stack trace. We might need to revisit the previous [previous benchmarks][benchmarks] to confirm these claims. The secondary motivation is simplicitly. Arguably the proposed change here will make it easier to write exporters, processors as well as receivers. It seems like we had rough consensus around this change in previous SIG meetings, and it seems like a good incremental step to make progress on the timestamp proposal. [timestamp proposal]: open-telemetry#594 [off-cpu pr]: open-telemetry/opentelemetry-ebpf-profiler#196 [benchmarks]: https://docs.google.com/spreadsheets/d/1Q-6MlegV8xLYdz5WD5iPxQU2tsfodX1-CDV1WeGzyQ0/edit?gid=2069300294#gid=2069300294

felixge · 2025-07-24T19:44:16Z

This PR has been rebased and the description has been updated. It's ready for another round of review.

christos68k · 2025-07-25T13:40:27Z

  bool has_inline_frames = 9;
 }

+// A Stack represents a stack trace as a list of locations. The first location


We should also add this to and update the helper ascii graph at the beginning of the file.

Opened felixge#2

tigrannajaryan · 2025-08-08T02:30:48Z

Please resolve comments so that I can merge.

aalexand · 2025-08-21T15:52:40Z

  bool has_inline_frames = 9;
 }

+// A Stack represents a stack trace as a list of locations. The first location


nit: "The first location is the leaf frame" - would it make sense to put this in the field comment rather than message?

I added this here.

aalexand · 2025-08-21T15:53:45Z

  repeated AttributeUnit attribute_units = 7;
+
+  // Stacks referenced by samples via Sample.stack_index.
+  repeated Stack stack_table = 8;


Maybe group stack_table with location_table and friends and keep attribute_units last since it's somewhat special in semantics?

I guess with #672 it's less special .

tigrannajaryan · 2025-08-26T15:09:22Z

We are planning to release 1.8.0. Do you need this merged before that?

christos68k · 2025-08-26T15:23:25Z

We are planning to release 1.8.0. Do you need this merged before that?

Before if possible, otherwise we'll need to trigger another release very soon.

tigrannajaryan · 2025-08-26T16:29:28Z

We are planning to release 1.8.0. Do you need this merged before that?

Before if possible, otherwise we'll need to trigger another release very soon.

OK, in that case please address / close open comments and resolve the merge conflicts so that this PR can be merged. We can wait a day or so but if it is going to take much longer then it will have to wait until the release after that.

christos68k · 2025-08-26T17:51:44Z

We are planning to release 1.8.0. Do you need this merged before that?

Before if possible, otherwise we'll need to trigger another release very soon.

OK, in that case please address / close open comments and resolve the merge conflicts so that this PR can be merged. We can wait a day or so but if it is going to take much longer then it will have to wait until the release after that.

@felixge will probably be busy until the end of this week so I'm guessing we won't be able to resolve conflicts and merge #645, but I've opened #708 which is essentially #645 with updates and hopefully we can get that in before 1.8.0 is out.

tigrannajaryan · 2025-08-26T22:11:39Z

We are planning to release 1.8.0. Do you need this merged before that?

Before if possible, otherwise we'll need to trigger another release very soon.

OK, in that case please address / close open comments and resolve the merge conflicts so that this PR can be merged. We can wait a day or so but if it is going to take much longer then it will have to wait until the release after that.

@felixge will probably be busy until the end of this week so I'm guessing we won't be able to resolve conflicts and merge #645, but I've opened #708 which is essentially #645 with updates and hopefully we can get that in before 1.8.0 is out.

If you have a particular reason this change has to go together with some previous ones we can wait a few more days. If it doesn't need to be released together then we can release 1.8.0 now and then will do 1.9.0 immediate after your change is merged. Releases are not hard and we don't time them to any particular cadence, we can release when we want.

christos68k · 2025-08-27T07:27:06Z

If you have a particular reason this change has to go together with some previous ones we can wait a few more days. If it doesn't need to be released together then we can release 1.8.0 now and then will do 1.9.0 immediate after your change is merged. Releases are not hard and we don't time them to any particular cadence, we can release when we want.

There's no special reason other than speeding up our roadmap (and also updating the profiling agent to include all recent protocol changes) but being able to do another 1.9.0 release soon is great and therefore I see no reason to delay 1.8.0.

pellared · 2025-08-28T07:44:19Z

@christos68k, should this PR be closed given #708 is merged?

christos68k · 2025-08-28T12:14:51Z

Closing, @tigrannajaryan has merged #708 which contains the changes in this PR.

felixge mentioned this pull request Apr 17, 2025

Simplify profile stack trace representation #615

Closed

beaubelgrave reviewed Apr 17, 2025

View reviewed changes

Comment thread opentelemetry/proto/profiles/v1development/profiles.proto

florianl approved these changes May 2, 2025

View reviewed changes

aalexand reviewed May 5, 2025

View reviewed changes

Comment thread opentelemetry/proto/profiles/v1development/profiles.proto Outdated

aalexand mentioned this pull request May 6, 2025

Profiles: optimize profile stack trace representation. #651

Closed

reyang changed the title ~~Simplify profile stack trace representation~~ Profiles: simplify profile stack trace representation May 21, 2025

reyang added the spec:profiles label May 21, 2025

aalexand mentioned this pull request Jun 4, 2025

profile: add support for custom attributes on individual samples #668

Closed

aalexand mentioned this pull request Jun 12, 2025

Add wire protocol benchmark agent and stub gRPC server open-telemetry/opentelemetry-ebpf-profiler#524

Closed

felixge force-pushed the simpler-stack-trace-v2 branch from 52ad266 to d972082 Compare July 10, 2025 14:30

felixge mentioned this pull request Jul 17, 2025

proposal: new Recorder types in runtime/pprof golang/go#74545

Open

felixge marked this pull request as ready for review July 24, 2025 14:50

felixge requested review from a team July 24, 2025 14:50

christos68k approved these changes Jul 25, 2025

View reviewed changes

christos68k mentioned this pull request Jul 25, 2025

Missing locationStartIndex from Data open-telemetry/opentelemetry-ebpf-profiler#654

Closed

aalexand reviewed Aug 21, 2025

View reviewed changes

aalexand approved these changes Aug 21, 2025

View reviewed changes

christos68k mentioned this pull request Aug 22, 2025

Update ASCII diagram for new Stack message felixge/opentelemetry-proto#2

Closed

christos68k mentioned this pull request Aug 26, 2025

Profiles: simplify profile stack trace representation (ASCII fix, conflicts resolved) #708

Merged

christos68k closed this Aug 28, 2025

Conversation

felixge commented Apr 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Motivation

Uh oh!

Uh oh!

florianl left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

christos68k commented Jun 13, 2025

Uh oh!

felixge commented Jul 24, 2025

Uh oh!

christos68k Jul 25, 2025

Choose a reason for hiding this comment

Uh oh!

christos68k Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

tigrannajaryan commented Aug 8, 2025

Uh oh!

aalexand Aug 21, 2025

Choose a reason for hiding this comment

Uh oh!

christos68k Aug 22, 2025

Choose a reason for hiding this comment

Uh oh!

aalexand Aug 21, 2025

Choose a reason for hiding this comment

Uh oh!

aalexand Aug 21, 2025

Choose a reason for hiding this comment

Uh oh!

tigrannajaryan commented Aug 26, 2025

Uh oh!

christos68k commented Aug 26, 2025

Uh oh!

tigrannajaryan commented Aug 26, 2025

Uh oh!

christos68k commented Aug 26, 2025

Uh oh!

tigrannajaryan commented Aug 26, 2025

Uh oh!

christos68k commented Aug 27, 2025

Uh oh!

pellared commented Aug 28, 2025

Uh oh!

christos68k commented Aug 28, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

felixge commented Apr 17, 2025 •

edited

Loading