ruby: Detect garbage collection and push special frames with GC mode #1101
@florianl I've uploaded the coredump moduledata to drive, which should make the failing tests pass. Note that for now I only have aarch64 coredumps. Please take an initial look, and if the approach looks good, I'll generate coredumps for amd64 as well.
florianl left a comment
Just minor comments. Overall I think it is a good state and progress 👍
u32 gc_flags;

if (bpf_probe_read_user(
      &thread_ptr, sizeof(thread_ptr), (void *)(current_ctx_addr + rubyinfo->thread_ptr))) {
Since we're doing arithmetic on void pointers, let's add -std=gnu17 to the Makefile FLAGS to be explicit.
Added. FYI, these were already in use extensively in ruby_tracer.ebpf.c; I just cargo-culted them in #907, this PR, and others. I'm open to casting them in a follow-up PR if that's preferred.
I think it's fine to use GNU extensions which gnu17 enables (this is also the default/implicit C standard in clang), let's just make it explicit.
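For readers unfamiliar with the extension being discussed, here is a minimal sketch of the three ways a byte offset can be applied to an untyped address. The function names are hypothetical and are not taken from ruby_tracer.ebpf.c. The first form only compiles in GNU modes such as -std=gnu17; the other two are valid ISO C.

```c
#include <stddef.h>
#include <stdint.h>

/* GNU extension: sizeof(void) is treated as 1, so `base + off` advances by
 * `off` bytes. Rejected under strict ISO modes with -pedantic-errors. */
static const void *advance_gnu(const void *base, size_t off)
{
  return base + off;
}

/* Portable ISO C: do the arithmetic on a byte pointer instead. */
static const void *advance_iso(const void *base, size_t off)
{
  return (const unsigned char *)base + off;
}

/* Integer arithmetic followed by a cast, the shape used when the address is
 * already held in an integer type. */
static const void *advance_int(uintptr_t base, size_t off)
{
  return (const void *)(base + off);
}
```

All three return the same address for the same inputs, so the choice is about which language mode the build commits to rather than about behavior.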
@florianl I uploaded some amd64 coredumps and added the tests for them as well, which should make CI fail due to the missing coredump modules.
florianl left a comment
Just uploaded new coredumps and retriggered CI.
Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co> Co-authored-by: Florian Lehner <florianl@users.noreply.github.com>
f35f84a to ec5426b
What
When GC is running, we insert dummy frames that indicate the GC state instead of unwinding the Ruby stack.
Why
Fixes #936
When GC runs, we are no longer actually running the interpreter code. The current approach will erroneously attribute the GC native frames to whatever the state of the ruby VM is, even though that ruby stack had nothing to do with the triggering of GC.
This also allows us to keep track of GC overhead very easily, as the various GC modes (predominantly marking and sweeping) are grouped under a single "garbage collection" frame.
How
Taking inspiration from stackprof, we copy the same convention they use for dummy frames:
https://github.com/tmm1/stackprof/blob/8085169f071b2e25d5d798482bd1737e012af877/ext/stackprof/stackprof.c#L29-L33
This gives us a view similar to what stackprof shows for request profiles.
In addition, we get more detail: we can see the actual native code that the Ruby VM is running.
To do this, we check objspace for the flags that indicate whether GC is running and, if so, in which mode. If GC is running, we push a special "GC" frame type indicating which GC mode we are in, and handle it accordingly on the userspace side.
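The flow described above can be sketched in plain C as follows. This is an illustration under stated assumptions, not the code from this PR: the bit positions of the mode and during-GC flags, the frame kind value, and all type and function names are hypothetical, and the real objspace layout differs between Ruby versions. The frame labels follow the stackprof naming as I understand it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED flag layout, for illustration only. */
#define GC_MODE_MASK     0x3u       /* low two bits: current GC mode */
#define GC_DURING_GC_BIT (1u << 5)  /* set while the collector is running */

enum gc_mode { GC_MODE_NONE = 0, GC_MODE_MARKING = 1, GC_MODE_SWEEPING = 2 };

#define FRAME_KIND_RUBY_GC 0x7f     /* hypothetical frame type marker */
#define MAX_FRAMES 8

typedef struct { uint8_t kind; uint64_t data; } frame_t;
typedef struct { frame_t frames[MAX_FRAMES]; size_t len; } trace_t;

/* Returns 1 if a GC frame was pushed, in which case the caller should skip
 * unwinding the Ruby stack. Returns 0 if GC is not running or the trace is
 * already full. */
static int push_gc_frame_if_needed(trace_t *t, uint32_t gc_flags)
{
  assert(t != NULL);
  if ((gc_flags & GC_DURING_GC_BIT) == 0 || t->len >= MAX_FRAMES)
    return 0;
  t->frames[t->len++] = (frame_t){
    .kind = FRAME_KIND_RUBY_GC,
    .data = gc_flags & GC_MODE_MASK,  /* carry the mode to userspace */
  };
  return 1;
}

/* Userspace side: turn the mode carried in the frame into a label. */
static const char *gc_frame_name(uint64_t mode)
{
  switch (mode) {
  case GC_MODE_MARKING:  return "(marking)";
  case GC_MODE_SWEEPING: return "(sweeping)";
  default:               return "(garbage collection)";
  }
}
```

Carrying only the mode in the frame keeps the kernel side small; the mapping to labels, and the grouping under a single "garbage collection" parent, can then live entirely in userspace.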