Skip to content

Profiles: optimize profile stack trace representation.#651

Closed
aalexand wants to merge 6 commits intoopen-telemetry:mainfrom
aalexand:stack-tree
Closed

Profiles: optimize profile stack trace representation.#651
aalexand wants to merge 6 commits intoopen-telemetry:mainfrom
aalexand:stack-tree

Conversation

@aalexand
Copy link
Copy Markdown
Member

@aalexand aalexand commented May 6, 2025

In #645 a simplified stack trace representation is proposed. A discussion there asks for a more efficient way to represent stacks that slightly differ at the leaf. A specialized solution proposed there would put the leaf location as a separate field. That seems partial and potentially error-prone.

This PR proposes an alternative stack trace representation, shaped as a tree in the form of two arrays. See the code for details and an example.

The advantage of this approach is that stacks with the same prefix are encoded more compactly. Also, since we use only two arrays for the encoding, the in-memory representation is also more efficient in terms of the number of allocations required. The main disadvantage is that this approach is more complex semantically and for this reason can be more error-prone.

In open-telemetry#645 a simplified stack trace representation is proposed. A
discussion there asks for a more efficient way to represent stacks that
slightly differ at the leaf. A specialized solution proposed there would
put the leaf location as a separate field. That seems partial and
potentially error-prone.

This PR proposes an alternative stack trace representation, shaped as
a tree in the form of two arrays. See the code for details and an
example.

The advantage of this approach is that stacks with the same prefix are
encoded more compactly. Also, since we use only two arrays for the
encoding, the in-memory representation is also more efficient in terms
of the number of allocations required. The main disadvantage is that
this approach is more complex semantically and for this reason can be
more error-prone.
// location_table = ["", "main", "foo", "bar", "baz1", "baz2"]
//
// And the stack tree arrays would be:
// (stack index) = 0 1 2 3 4 5 6
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great you found the time to come up with an implementation!

Just some comments

  1. In the example there are 4 stack traces, but you encode 6 (index 0 ignored) because you also encode partial stacks line main and main -> foo. I don't think this is required because these partial stacks can easily be constructed from the complete stacks if needed.
  2. Add a step-by-step example to reconstruct one of the stacks.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All those nodes are needed since they serve as parents to the child nodes, actually.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification!

What if we use 0 as the root index instead of -1?
And additionally, use 0 in stack_location_index for non-leaf frames, which a) potentially improves compression and b) marks complete stacks?

  // (index)              =   0  1  2  3  4  5  6
  // stack_parent_array   = [ 0, 0, 1, 2, 3, 3, 2]
  // stack_location_index = [ 0, 0, 0, 3, 4, 5, 5]

When constructing a stack, start at a leaf frame (index > 0) and stop at index 0?
Example for index 6
stack_location_index[6] is 5 --> "baz2"
parent_parent_array[6] is 2 --> "foo"
parent_parent_array[2] is 1 --> "main"
parent_parent_array[1] is 0 --> "" (root), stop

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there could be a convention that the node at index 0 is the nil node, instead of using -1 to indicate that. I don't have strong preference.

We can't use 0 for non-leaf frames' location ID. In your decoding example, for the nodes 2 and 1 you need to know that they correspond to the foo and main locations. If you store the zero as the code location for them then this information is lost.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might got it wrong. Can you provide a step-by-step example (in the proto file comment) to to properly reconstruct the stacks?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized there was a mistake in the example, I corrected it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a decoding example in Go:

package main

import (
	"fmt"
)

// Stack tree encoding these stacks (the number is the stack index):
//
//	main|0 -> foo|1 -> bar|2
//	main|0 -> foo|1 -> bar|2 -> baz1|3
//	main|0 -> foo|1 -> bar|2 -> baz2|4
//	main|0 -> foo|1 -> baz2|5
var locationTable = []string{"", "main", "foo", "bar", "baz1", "baz2"}
var stackParentArray = []int{-1, 0, 1, 2, 2, 1}
var stackLocationIndex = []int{1, 2, 3, 4, 5, 5}

func main() {
	stackIdx := 4
	fmt.Printf("stackIdx=%d:", stackIdx)
	for stackIdx != -1 {
		fmt.Printf(" %s", locationTable[stackLocationIndex[stackIdx]])
		stackIdx = stackParentArray[stackIdx]
	}
	fmt.Println()
}

This outputs:

stackIdx=4: baz2 bar foo main

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a few reads (and this thread helped a lot), I like this approach. Thanks for putting this together!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also took me a few reads. Here are some suggestions for making the idea more clear in the comments, feel free to merge @aalexand .

https://github.com/aalexand/opentelemetry-proto/pull/1/files

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @felixge - merged.

@aalexand
Copy link
Copy Markdown
Member Author

aalexand commented May 6, 2025

In theory, one more disadvantage of this approach is that the implementation would have to check for cycles when dealing with the parent links. In practice I think we can just require stack_parent_array[i] < i which is a simple check.

@jhalliday
Copy link
Copy Markdown
Contributor

Any benchmarking available to indicate how much of a space saving we can expect from this extra complexity?

@aalexand
Copy link
Copy Markdown
Member Author

aalexand commented May 7, 2025

Any benchmarking available to indicate how much of a space saving we can expect from this extra complexity?

No, but we should benchmark both this and #645.

@felixge
Copy link
Copy Markdown
Member

felixge commented May 12, 2025

Thanks for writing this down. Looking forward to discussing it this Thursday.

profiles: improve stack tree description
Based on internal feedback, putting the location and stack indexed lists
into the same table is very confusing because they are really two
separate tables.

Make this clear by splitting them up into two.
@felixge
Copy link
Copy Markdown
Member

felixge commented May 13, 2025

@aalexand maybe you can also merge aalexand#2 ? I shared this PR with some people in my team, and they still struggled to follow the idea because they got confused by the location data being in the same table as the stack data ... which was indeed confusing because the values at the same index don't belong together across those two tables.

profiles: split the stack tree ASCII table in the comment into two
@aalexand
Copy link
Copy Markdown
Member Author

@aalexand maybe you can also merge aalexand#2 ? I shared this PR with some people in my team, and they still struggled to follow the idea because they got confused by the location data being in the same table as the stack data ... which was indeed confusing because the values at the same index don't belong together across those two tables.

Good point - merged.

@rockdaboot
Copy link
Copy Markdown
Contributor

Here is an example implementation (buildArrays and buildStacks): rockdaboot#1
It's possibly useful for benchmarks and a real world implementation.

@reyang reyang changed the title Optimized profile stack trace representation. Profiles: optimize profile stack trace representation. May 21, 2025
@rockdaboot
Copy link
Copy Markdown
Contributor

Generic implementation to encode the arrays from (generic) stacks and decode from arrays back into stacks: https://github.com/rockdaboot/go-trie

// Index into ProfilesData.{stack_parent_array, stack_location_index} stack
// table, representing the stack tree node of the sample.
int32 stack_index = 1;
// The type and unit of each value is defined by the corresponding
Copy link
Copy Markdown

@tsint tsint Jun 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aalexand
As you said here, have you considered adding a field to indicate the personalized attributes of single samples?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I may have been unclear. There are no plans to add ability to specify different attributes to timestamps within a sample. All timestamps within a sample correspond to the same stack and attributes. What I meant is that with a single field fo referencing the stack the sizeof(Sample) is smaller.

@christos68k
Copy link
Copy Markdown
Member

See open-telemetry/opentelemetry-ebpf-profiler#524 for benchmarks that also cover this proposal.

@aalexand
Copy link
Copy Markdown
Member Author

Closing since we agree to move on with #645.

@aalexand aalexand closed this Jul 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants