Increase frame buffer to max 1024 frames per trace #908

Closed

dalehamel wants to merge 1 commit into open-telemetry:main from Shopify:bump-frame-buffer-to-1024

Conversation

@dalehamel (Contributor) commented Oct 31, 2025

Most Ruby / Rails apps have deep stacks. As part of my efforts to make the otel eBPF profiler work with Ruby, I'm requesting an increase of the maximum number of frames per trace to 1024, which fixes #760.

This is particularly needed for Ruby, as it is very common to have wrapper functions in frameworks like Rails, such as request middlewares that each implement `call(env)`. When Ruby and native frames are intermixed, this becomes even more problematic.

This was originally part of #907, but I'm pulling it out into a separate PR at @florianl's request so it can be discussed separately and the memory implications for other interpreters / users can be considered.

Per @fabled's thoughts in #760, though, I'm hoping this change is reasonable:

> Having said this, it probably would be OK to bump the limit. And languages like Ruby would see instant improvement on it (but e.g. Java would not get the full benefit).

@dalehamel dalehamel requested review from a team as code owners October 31, 2025 20:08
@ivoanjo commented Nov 3, 2025

+1, this is quite valuable for improving Ruby profiling's out-of-the-box usability.

Our experience at Datadog for our Ruby profiler matches @dalehamel's experience of Ruby apps consistently having deeper stacks than most other languages/runtimes. This is why the Datadog Ruby profiler defaults to 400 frames, and I know of customers that needed to increase this further.

Comment thread on support/ebpf/types.h:

```diff
-#define MAX_FRAME_UNWINDS 128
+// 1024 is the maximum power of 2 we can fit in a single perf event, and
+// in a single per CPU array entry
+#define MAX_FRAME_UNWINDS 1024
```
@florianl (Member) commented Nov 5, 2025

Not every environment runs Ruby applications, so not everyone will directly benefit from this change. Since this change requires eight times more memory for every trace reported from kernel to user space, did you measure its impact for various workloads?

A contributor replied:

The link you shared is the race sanitizer workaround for exactly that fact (and it applies only to the race sanitizer): we do not send the whole trace to user space. We should not be sending empty frames:

```c
const u64 send_size = sizeof(Trace) - sizeof(Frame) * num_empty_frames;
```
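For context, a minimal sketch of what such a truncated send looks like in eBPF. The type names, field layout, and the `trace_events` perf event map are assumptions for illustration, not the profiler's actual identifiers:

```c
// Sketch only: send just the populated prefix of the trace to user space.
typedef struct { u64 file_id, addr_or_line; } Frame;
typedef struct {
  u32 stack_len;                    // frames actually unwound
  Frame frames[MAX_FRAME_UNWINDS];  // fixed-size buffer; the tail may be empty
} Trace;

static inline void report_trace(void *ctx, Trace *trace) {
  u64 num_empty_frames = MAX_FRAME_UNWINDS - trace->stack_len;
  const u64 send_size = sizeof(Trace) - sizeof(Frame) * num_empty_frames;
  // bpf_perf_event_output() copies only send_size bytes, so the empty tail
  // of the frame buffer never crosses the kernel/user-space boundary.
  bpf_perf_event_output(ctx, &trace_events, BPF_F_CURRENT_CPU, trace, send_size);
}
```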

@florianl (Member) replied:

Sorry, I mixed up the race sanitizer workaround. As this was changed only recently, it was the first thing that popped into my head.

Another component that will grow with this increase is the per_cpu_records eBPF map.

How close do we actually get to unwinding 1024 frames? Without further changes, we might be limited:

| Unwinder | Unwinder limit | Unwinder limit * tail call limit |
| --- | --- | --- |
| native | NATIVE_FRAMES_PER_PROGRAM = 4 | 4 * 32 = 128 |
| perl | PERL_FRAMES_PER_PROGRAM = 12 | 12 * 32 = 384 |
| php | FRAMES_PER_WALK_PHP_STACK = 19 | 19 * 32 = 608 |
| python | FRAMES_PER_WALK_PYTHON_STACK = 12 | 12 * 32 = 384 |
| ruby | FRAMES_PER_WALK_RUBY_STACK = 27 | 27 * 32 = 864 |
| v8 | V8_FRAMES_PER_PROGRAM = 8 | 8 * 32 = 256 |
| dotnet | DOTNET_FRAMES_PER_PROGRAM = 6 | 6 * 32 = 192 |

With further jumps between the different unwinders, the actual number might be lower.
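For reference, the ceiling arithmetic in the table can be reproduced with a few lines of C. This is a sketch: the 32 is the chained tail-call budget assumed in the table, while the kernel's own hard cap is MAX_TAIL_CALL_CNT.

```c
// Each unwinder walks at most per_program frames per eBPF program run,
// then tail-calls itself, so depth <= per_program * tail-call budget.
#include <stdio.h>

int main(void) {
  const struct { const char *name; int per_program; } unwinders[] = {
      {"native", 4}, {"perl", 12}, {"php", 19}, {"python", 12},
      {"ruby", 27},  {"v8", 8},    {"dotnet", 6},
  };
  for (size_t i = 0; i < sizeof(unwinders) / sizeof(unwinders[0]); i++)
    printf("%-8s %4d\n", unwinders[i].name, unwinders[i].per_program * 32);
  return 0;
}
```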

@dalehamel (Contributor, Author) replied:

@florianl yeah, so just about everything is potentially truncated right now.

> Another component that will grow with this increase is the per_cpu_records eBPF map.

I picked 1024 because it is the largest power of 2 that still fits into a single perf event, which is the limit for the per-CPU record; any higher power of 2 won't fit. There is a discussion about how to potentially go higher than that, as the per-CPU record is currently an array of size 1, but it doesn't necessarily need to be.

It is actually half the buffer size of other Ruby profilers (stackprof and vernier allow a buffer of 2048, believe it or not, and rbspy is theoretically unlimited). My PR #907 raises the number of frames per tail call to 32, so the theoretical maximum for Ruby would actually be ~32 * 32 = 1024.

I don't believe we will really be wasting any memory here, as profilers that don't write this many frames won't really touch the extra space. The per-CPU array entry size does increase to something like 24 KiB, but that isn't much of an increase for the system overall, even with many processors. The benefit of having untruncated stacks is, I think, worth it.

On the Go side, we should only end up transferring as much data as was actually written, so if memory use increases it should be because we are legitimately sending more (useful) data to user space.
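To make the arithmetic concrete: the 24-byte frame record below is an assumption inferred from the ~24 KiB figure above, not the profiler's actual layout.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRAME_UNWINDS 1024

// Hypothetical 24-byte frame record; the real layout may differ.
typedef struct {
  uint64_t file_id;
  uint64_t addr_or_line;
  uint64_t kind;
} FrameSketch;

// 1024 frames * 24 B = 24 KiB, which still fits in a per-CPU map value
// (commonly capped at 32 KiB); 2048 frames (48 KiB) would not.
static_assert(sizeof(FrameSketch) * MAX_FRAME_UNWINDS == 24 * 1024,
              "1024 frames at 24 bytes each is 24 KiB");
```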

@fabled (Contributor) commented Nov 6, 2025

I think this should be done together with fixing #940. There we can just have the variable-length uint64_t buffer hold the maximum number of elements that fit in a perf event. That PR should also fix the Trace caching so that there is no huge GC pressure on the agent side. Currently each Trace and []Frame slice is probably allocated separately, without reusing data structures; I suspect this is a big GC pressure point right now.
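A rough sketch of what such a variable-length layout could look like; the flexible array member and all names here are assumptions about #940's approach, not its final design:

```c
#include <stdint.h>

typedef struct {
  uint64_t file_id;
  uint64_t addr_or_line;
} FrameSketch;

// Sketch: a fixed header followed by only the frames actually captured.
// The consumer sizes each event as
//   sizeof(TraceVarLenSketch) + stack_len * sizeof(FrameSketch).
typedef struct {
  uint64_t stack_len;    // number of valid frames that follow
  FrameSketch frames[];  // flexible array member, sized per event
} TraceVarLenSketch;
```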

@dalehamel (Contributor, Author) commented:

> I think this should be done together with fixing #940. [..] I suspect this is a big GC pressure point right now.

Yeah, I agree. From the sound of it, that's a better solution overall.

@florianl (Member) commented:

> [..] this should be done together when fixing #940 [..]

Can we then close this PR?

@dalehamel (Contributor, Author) commented:

> [..] this should be done together when fixing #940 [..]
>
> Can we then close this PR?

Yup, I'll close it and wait for #943.

Development

Successfully merging this pull request may close these issues:

Max stack depth of 128 is insufficient for a variety of cases

5 participants