ruby: Detect garbage collection and push special frames with GC mode #1101

Merged: fabled merged 9 commits into open-telemetry:main from Shopify:ruby-gc-frames-upstream on Jan 26, 2026

@dalehamel
Contributor

What

When GC is running, we insert dummy frames that indicate the GC state and do not unwind the Ruby stack:

[Image: Screenshot 2025-11-06 at 10 46 54 AM]

Why

Fixes #936

When GC runs, we are no longer actually running interpreter code. The current approach erroneously attributes the native GC frames to whatever Ruby stack the VM happens to be in, even though that Ruby stack had nothing to do with triggering GC.

This also allows us to keep track of GC overhead very easily, as the various GC modes (predominantly marking and sweeping) are grouped under a single "garbage collection" frame.

How

Taking inspiration from stackprof, we follow the same convention it uses for dummy frames:

https://github.com/tmm1/stackprof/blob/8085169f071b2e25d5d798482bd1737e012af877/ext/stackprof/stackprof.c#L29-L33
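
For readers who do not follow the link, the convention amounts to a small table of placeholder frame labels, one per GC phase. The sketch below is written from memory of stackprof's source and is not copied from this PR: the identifiers (`fake_frame`, `fake_frame_name`) are hypothetical, and the label strings should be checked against the linked lines.

```c
#include <assert.h>

/* Hypothetical identifiers for illustration. The parenthesised labels
 * follow stackprof's dummy-frame convention as recalled; verify them
 * against the linked source before relying on them. */
enum fake_frame {
  FAKE_FRAME_GC = 0,
  FAKE_FRAME_MARK = 1,
  FAKE_FRAME_SWEEP = 2,
  FAKE_FRAME_COUNT
};

static const char *const fake_frame_names[] = {
  "(garbage collection)",
  "(marking)",
  "(sweeping)",
};

/* Keep the label table and the enum in sync at compile time. */
static_assert(sizeof(fake_frame_names) / sizeof(fake_frame_names[0]) == FAKE_FRAME_COUNT,
              "one label per fake frame");

/* Map a fake frame id to its display label; out-of-range ids get a fallback. */
static const char *fake_frame_name(int id)
{
  if (id < 0 || id >= FAKE_FRAME_COUNT)
    return "(unknown)";
  return fake_frame_names[id];
}
```

With a table like this, a sample taken during marking would show a "(marking)" frame nested under "(garbage collection)" instead of an unrelated Ruby stack.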

This gives us a view similar to what stackprof shows for request profiles:

[Image] [Image]

The difference is that we get additional detail: we can see the actual native code that the Ruby VM is running.

To do this, we check objspace for the flags that indicate whether GC is running and, if so, in which mode. If GC is running, we push a special "GC" frame type that records the current GC mode, and handle it accordingly on the userspace side.
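
The flag check described above can be sketched in plain C. This is not the PR's eBPF code: the bit positions, the enum names, and `classify_gc_flags` are all assumptions made for illustration, and the real layout of the objspace flags depends on the Ruby version being profiled.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: illustrative bit layout only. In a real profiler these
 * positions would come from per-version knowledge of Ruby's objspace. */
#define GC_FLAG_MODE_MASK 0x3u      /* assumed: low two bits hold the GC mode */
#define GC_FLAG_DURING_GC (1u << 5) /* assumed: bit set while GC is running */

/* Hypothetical mode values carried in the mode bits. */
enum gc_mode {
  GC_MODE_NONE = 0,
  GC_MODE_MARKING = 1,
  GC_MODE_SWEEPING = 2,
  GC_MODE_COMPACTING = 3
};

/* Hypothetical frame kinds the unwinder could push for userspace. */
enum gc_frame_kind {
  GC_FRAME_NONE = 0, /* GC not running: unwind the Ruby stack as usual */
  GC_FRAME_RUNNING,  /* GC running, no specific phase */
  GC_FRAME_MARKING,
  GC_FRAME_SWEEPING,
  GC_FRAME_COMPACTING
};

/* Decide which special frame, if any, to push for a raw flags word. */
static enum gc_frame_kind classify_gc_flags(uint32_t gc_flags)
{
  if ((gc_flags & GC_FLAG_DURING_GC) == 0)
    return GC_FRAME_NONE;

  switch (gc_flags & GC_FLAG_MODE_MASK) {
  case GC_MODE_MARKING:
    return GC_FRAME_MARKING;
  case GC_MODE_SWEEPING:
    return GC_FRAME_SWEEPING;
  case GC_MODE_COMPACTING:
    return GC_FRAME_COMPACTING;
  default:
    return GC_FRAME_RUNNING;
  }
}
```

In this sketch the "during GC" bit gates everything, so stale mode bits alone never produce a GC frame. Whether the actual implementation gates on the same condition is not stated in the description.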

@dalehamel requested review from a team as code owners on January 21, 2026 19:52
@dalehamel
Contributor Author

@florianl I've uploaded the coredump moduledata to drive, which should make the failing tests pass.

Note that for now I only have aarch64 coredumps. Please take an initial look, and if the approach looks good, I'll generate coredumps for amd64 as well.

Member

@florianl left a comment

Just minor comments. Overall I think this is in a good state and shows good progress 👍

Comment thread tools/coredump/testdata/README-RUBY-GC-CORES.md Outdated
Comment thread interpreter/ruby/ruby.go
Comment thread interpreter/ruby/ruby.go Outdated
Comment thread interpreter/ruby/ruby.go Outdated
  u32 gc_flags;

  if (bpf_probe_read_user(
        &thread_ptr, sizeof(thread_ptr), (void *)(current_ctx_addr + rubyinfo->thread_ptr))) {
Member

Since we're doing arithmetic on void pointers, let's add -std=gnu17 to the Makefile FLAGS to be explicit.

Contributor Author

Added. FYI, these were already in use extensively in ruby_tracer.ebpf.c; I just cargo-culted them in #907, this PR, and others. I'm open to casting them in a follow-up PR if that's preferred.

Member

@christos68k Jan 22, 2026

I think it's fine to use the GNU extensions that gnu17 enables (this is also the default, implicit C standard in clang). Let's just make it explicit.
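
For context on the casting alternative mentioned in this thread, here is a minimal sketch of computing a field address without relying on `void *` arithmetic. The struct and the helper `field_addr` are hypothetical and are not taken from ruby_tracer.ebpf.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a structure whose fields are read by offset. */
struct demo_ctx {
  uint64_t header;
  uint64_t thread_ptr;
};

/* Portable form: cast to a character pointer before adding a byte offset.
 * With GNU extensions enabled (for example -std=gnu17), writing
 * `base + off` directly on a `void *` yields the same address because
 * sizeof(void) is treated as 1, but ISO C does not define that operation. */
static const void *field_addr(const void *base, size_t off)
{
  return (const char *)base + off;
}
```

Given a `struct demo_ctx ctx`, `field_addr(&ctx, offsetof(struct demo_ctx, thread_ptr))` yields the address of `ctx.thread_ptr`. Either style produces the same address under gnu17; the cast only removes the dependence on the extension.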

@dalehamel
Contributor Author

@florianl I uploaded some amd64 coredumps and added the tests for them as well, which should make CI fail due to the missing coredump modules.

Member

@florianl left a comment

Just uploaded new coredumps and retriggered CI.

Comment thread support/ebpf/errors.h

Development

Successfully merging this pull request may close these issues.

[feat][Ruby] Support detecting GC state and handle it accordingly

4 participants