Initial support for BEAM (Erlang/Elixir) by GregMefford · Pull Request #289 · open-telemetry/opentelemetry-ebpf-profiler

GregMefford · 2025-01-01T18:56:13Z

This PR wraps up the work scaffolded in initial plumbing and minimal unwinder with a complete unwinder that symbolizes the stack frames for BEAM code as well as the runtime's native code.

It supports OTP 27 and 28 running on 64-bit x86 and ARM architectures. It currently depends on the static symbol r not being stripped from the binary, but otherwise doesn't require any special runtime flags or code instrumentation. It can detect whether frame pointer support is enabled (e.g. via the +JPperf true VM flag) and use frame pointers if available to unwind the stack more efficiently, but it also works without them.

DevFiler showing 4 cores running 4 BEAM schedulers. The magenta stack frames are BEAM code:

Zooming in on just one of the BEAM code sections, we can see the details of the call stack. In this case, it's showing a sample app that I created for testing, which uses Plug and Bandit to serve HTTP requests generated by a GenServer process using the Finch library.

Zooming in more and hovering over the stack frames, we can see that the agent has resolved the symbols to show the source code line numbers as well as the module/function/arity information.

Original PR description in case it's relevant for following the discussion thread later

I have begun to work on support for BEAM languages like Erlang and Elixir, and wanted to open this PR early as a draft, so that I can get any feedback you may have to help the process go more smoothly. I don't have much experience with Go or eBPF, so any feedback you have is very welcome. What I have so far is mostly based on digging through the existing support for other languages as well as the BEAM / OTP source code, and trying to understand how all the parts fit together.

I have also been digging through the BEAM / OTP source code, and also the gdb scripts that it includes for working directly with the memory image of a running system or core dump.

So far, I am able to see the logs from my Go code coming through, and confirming that it's working correctly as far as loading and attaching the interpreter support, like this:

$ sudo ./ebpf-profiler -collection-agent=127.0.0.1:11000 -disable-tls
INFO[0000] Starting OTEL profiling agent v0.0.0 (revision main-67f28f2b, build timestamp 1735697412)
INFO[0000] Interpreter tracers: perl,php,python,hotspot,ruby,beam
INFO[0000] Determined PAC mask to be 0x007F000000000000
INFO[0000] Found offsets: task stack 0x38, pt_regs 0x3eb0, tpbase 0x1c30
INFO[0000] Supports generic eBPF map batch operations
INFO[0000] Supports LPM trie eBPF map batch operations
INFO[0000] eBPF tracer loaded
INFO[0000] Attached tracer program
INFO[0000] Attached sched monitor
INFO[0026] BEAM interpreter found: [beam.smp]
INFO[0026] read symbol value etp_otp_release: 27
INFO[0026] read symbol value etp_erts_version: 15.1.3
INFO[0026] BEAM loaded, otp_version: 27, interpRanges: [{718368 718496}]
INFO[0026] BEAM interpreter attaching

However, I can't seem to get any tracing logs out of the eBPF program, so I suspect that it's never being run. If I modify the native eBPF script to write the same kind of log there, I can confirm that I'm seeing it in /sys/kernel/tracing/trace_pipe after doing the following to narrow down the logs I want to see, but the same doesn't work for my beam program:

# echo 1 > /sys/kernel/tracing/tracing_on
# echo 0 > /sys/kernel/tracing/events/enable
# echo 1 > /sys/kernel/tracing/events/bpf_trace/bpf_trace_printk/enable

I was thinking that this was because OTP 27 includes a JIT, so the interpreter might never be used, but I am also not seeing any frames for the native JIT code executing, so I'd love any advice you may have there in terms of how I might go about troubleshooting that. Maybe the native unwinder is just missing some heuristic that's needed for the way the ASMJIT / BEAMJIT works? I'm not clear on how the profiler resolves symbols for JIT code or how those should show up in devfiler, so maybe it is working and I just don't know how to use the tool... 😅 But from what I can tell, I don't think the frames are showing up there for anything but the C code for Erlang itself (and built-in C functions). I was expecting to be able to see which Erlang code was running, for example.

I also tried building OTP 27 with the JIT disabled to confirm my theory that it just wasn't working, but it behaved the same (though with a different memory address showing for the interpRanges, which confirms that it really did build a different set of code).

fabled · 2025-01-02T11:05:58Z

I was thinking that this was because OTP 27 includes a JIT, so the interpreter might never be used, but I am also not seeing any frames for the native JIT code executing, so I'd love any advice you may have there in terms of how I might go about troubleshooting that.

Typically JIT code lives in mmapped anonymous memory. You will need to add hooks to call your unwinder for these memory mappings. For a generic catch-all example, see the v8 unwinder's code at https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/interpreter/nodev8/v8.go#L544

If you can extract the exact memory area where JIT code exists directly from the VM, you can refer to hotspot unwinder code at https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/interpreter/hotspot/instance.go#L784

Maybe the native unwinder is just missing some heuristic that's needed for the way the ASMJIT / BEAMJIT works?

The native unwinder will not have a heuristic for it. You need to implement the code to hook your unwinder for the memory areas where the JIT code is located (see above).

After that you'll need to have eBPF code that actually unwinds the JIT code. It might be simple if the JIT frame layout is frame pointer based, see e.g. v8 unwinder https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/support/ebpf/v8_tracer.ebpf.c, or highly complicated if there is a custom frame layout, see e.g. hotspot unwinder https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/support/ebpf/hotspot_tracer.ebpf.c. The unwinder will need to collect the extra data needed by the symbolization in the next step.

I'm not clear on how the profiler resolves symbols for JIT code or how those should show up in devfiler, so maybe it is working and I just don't know how to use the tool... 😅

Once the unwinding is done, the core will call the interpreter plugin's symbolization code, which will need to extract the symbol data from the target process. Again, see some examples of how it's done for hotspot https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/interpreter/hotspot/instance.go#L860 or v8 https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/interpreter/nodev8/v8.go#L1689.

But from what I can tell, I don't think the frames are showing up there for anything but the C code for Erlang itself (and built-in C functions). I was expecting to be able to see which Erlang code was running, for example.

Correct, you will need to implement both the unwinding and symbolization yourself. Depending on the VM internals, this can be highly complicated and extensive work that is needed to cover all the corner cases within the ebpf constraints.

GregMefford · 2025-01-03T03:24:03Z

Typically JIT code lives in mmapped anonymous memory. You will need to add hooks to call your unwinder for these memory mappings. For a generic catch-all example, see the v8 unwinder's code at https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/interpreter/nodev8/v8.go#L544

If you can extract the exact memory area where JIT code exists directly from the VM, you can refer to hotspot unwinder code at https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/interpreter/hotspot/instance.go#L784

Aha! Thanks, this was the connection I was missing. I was thinking that since I can't statically know about all the JIT code that might be generated in the future, I can't possibly add it all to the maps, but I believe the BEAM does have ways to pretty easily locate the memory of all the JITted code, so I'll dig into that and the v8 example.

The BEAM does use frame pointers, so I believe it should be relatively straightforward to figure out.

Thanks for the tips! ❤️ 🚀

fabled · 2025-01-03T13:13:59Z

INFO[0026] BEAM loaded, otp_version: 27, interpRanges: [{718368 718496}]

The range for the interpreter looks suspiciously small: only 128 bytes. This can be valid if it's just a small stub that is guaranteed to be on the stack for interpreter frames.

Alternatively, this could be a function doing something else that is not necessarily on stack when executing interpreted code.

You might want to double check which functions are on the stack when executing interpreted code. If it can be a set of multiple functions (e.g. several functions with the same signature tail-calling each other, since the compiler can convert a call to a jump), you need to extract the range that covers all of these. It would become a problem if these functions are not contiguous in the executable area.

Aha! Thanks, this was the connection I was missing. I was thinking that since I can't statically know about all the JIT code that might be generated in the future, I can't possibly add it all to the maps, but I believe the BEAM does have ways to pretty easily locate the memory of all the JITted code, so I'll dig into that and the v8 example.

No problem. But in short, you'll need to manually extract those areas and then call UpdatePidInterpreterMapping. Internally, the eBPF code "misuses" the network prefix lookup by looking up a memory address in a prefix lookup map. So these calls should properly preprocess memory ranges into prefix lists, as shown by the examples.
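As a rough illustration of that preprocessing step, the sketch below splits an address range into aligned power-of-two blocks, each of which can be expressed as one prefix. It is a self-contained approximation and does not use the profiler's own helpers; the `prefix` type and the `rangeToPrefixes` name are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// prefix is a hypothetical stand-in for an LPM trie key: the first Length
// bits of Key must match the address being looked up.
type prefix struct {
	Key    uint64
	Length uint32
}

// rangeToPrefixes splits the half-open range [start, end) into aligned
// power-of-two blocks, each of which is expressible as a single prefix.
func rangeToPrefixes(start, end uint64) []prefix {
	var out []prefix
	for start < end {
		// Largest block size permitted by the alignment of start
		// (TrailingZeros64 returns 64 when start is 0).
		blockBits := bits.TrailingZeros64(start)
		// Largest power of two that still fits in the remaining range.
		if fit := bits.Len64(end-start) - 1; fit < blockBits {
			blockBits = fit
		}
		out = append(out, prefix{Key: start, Length: uint32(64 - blockBits)})
		start += uint64(1) << uint(blockBits)
	}
	return out
}

func main() {
	// Example region: base 0xec8d54800000, length 0x4000000.
	base := uint64(0xec8d54800000)
	for _, p := range rangeToPrefixes(base, base+0x4000000) {
		fmt.Printf("%#x/%d\n", p.Key, p.Length)
	}
}
```

A range that is not aligned to its own size needs several prefixes, which is why the ranges are expanded into a list before being inserted into the map.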

You can also provide a little bit of context data for each memory area. This could be useful if there's some auxiliary data connected to each memory area that the unwinder needs.

The BEAM does use frame pointers, so I believe it should be relatively straightforward to figure out.

Nice! Then the v8 or dotnet unwinders are the closest equivalents. The hotspot unwinder implements a custom frame layout. v8 is simplest because all the extra data is accessible directly from stack data. dotnet is a step more complicated, as mapping from PC to auxiliary data is a non-trivial step.

Thanks for the tips! ❤️ 🚀

You're welcome. Looking forward to the BEAM support! Thank you for working on this!

florianl · 2025-01-10T10:08:26Z

FYI: I have opened open-telemetry/semantic-conventions#1735 with OTel semconv to add a type for beam.

GregMefford · 2025-01-12T02:24:20Z

I had some more time today to make progress, by copying the v8 implementation of SynchronizeMappings, and also by using a JITDUMP file that the BEAM outputs when it's running with the frame pointer support that we'll want anyway for unwinding (it's the +JPperf true Erlang runtime flag). Either way, it looks like I'm seeing these new frames showing up in DevFiler, so I think that's forward progress, but I still don't see any evidence that my eBPF code is getting called.

With the "catch-all" mappings, I get logs like this out of the agent, so I assume that means it's working and there's just one large region of memory it's treating as BEAM JIT code:

INFO[0000] Enabling BEAM for 0xec8d54800000/0x4000000

With the fancier JITDump code, I get logs like this, so I am pretty sure it's parsing the file correctly. Those function names look sensible to me and they're function names I recognize as things I did include in my test app I'm running.

INFO[1164] JITDump Code Load 'Elixir.Mint.HTTP1.Parse':token_list_sep_downcase/2 @ 0xec8d55485834 (364 bytes)
INFO[1164] JITDump Code Load 'Elixir.Mint.HTTP1.Parse':transfer_encoding_header/1-CodeInfoPrologue @ 0xec8d554859a0 (52 bytes)
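For readers unfamiliar with the format, here is a minimal sketch of decoding one JIT_CODE_LOAD record following the field layout in the perf jitdump specification. It is not the code from this PR: the type and function names are hypothetical, it assumes a little-endian file, and `encodeCodeLoad` exists only to produce demo input.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	jitCodeLoad = 0  // record id of JIT_CODE_LOAD
	headerLen   = 16 // id (u32), total_size (u32), timestamp (u64)
	fixedLen    = 40 // pid, tid (u32 each); vma, code_addr, code_size, code_index (u64 each)
)

// codeLoad holds the fields of a JIT_CODE_LOAD record used for symbolization.
type codeLoad struct {
	Name     string
	CodeAddr uint64
	CodeSize uint64
}

// parseCodeLoad decodes one record, record header included.
func parseCodeLoad(rec []byte) (codeLoad, error) {
	le := binary.LittleEndian
	if len(rec) < headerLen+fixedLen {
		return codeLoad{}, errors.New("record too short")
	}
	if le.Uint32(rec[0:4]) != jitCodeLoad {
		return codeLoad{}, errors.New("not a JIT_CODE_LOAD record")
	}
	total := int(le.Uint32(rec[4:8]))
	if total < headerLen+fixedLen || total > len(rec) {
		return codeLoad{}, errors.New("bad total_size")
	}
	body := rec[headerLen:total]
	// The NUL-terminated function name follows the fixed-size fields.
	name := body[fixedLen:]
	end := bytes.IndexByte(name, 0)
	if end < 0 {
		return codeLoad{}, errors.New("unterminated function name")
	}
	return codeLoad{
		Name:     string(name[:end]),
		CodeAddr: le.Uint64(body[16:24]),
		CodeSize: le.Uint64(body[24:32]),
	}, nil
}

// encodeCodeLoad builds a record without native code bytes (demo input only).
func encodeCodeLoad(name string, addr, size uint64) []byte {
	le := binary.LittleEndian
	rec := make([]byte, headerLen+fixedLen+len(name)+1)
	le.PutUint32(rec[0:4], jitCodeLoad)
	le.PutUint32(rec[4:8], uint32(len(rec)))
	le.PutUint64(rec[headerLen+16:], addr)
	le.PutUint64(rec[headerLen+24:], size)
	copy(rec[headerLen+fixedLen:], name)
	return rec
}

func main() {
	rec := encodeCodeLoad("'Elixir.Mint.HTTP1.Parse':token_list_sep_downcase/2", 0xec8d55485834, 364)
	cl, err := parseCodeLoad(rec)
	fmt.Printf("%s @ %#x (%d bytes) err=%v\n", cl.Name, cl.CodeAddr, cl.CodeSize, err)
}
```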

I was hoping to see logs from the eBPF code using the following (as root), but nothing is showing up:

echo 0 > /sys/kernel/tracing/events/enable
echo 1 > /sys/kernel/tracing/events/bpf_trace/bpf_trace_printk/enable
echo 1 > /sys/kernel/tracing/tracing_on
cat /sys/kernel/tracing/trace_pipe

This is the kind of thing I'm seeing in DevFiler, so it's looking promising that it's doing something, but I'm not clear on where to look next for signs of life / why the eBPF unwinder does not seem to be getting called like I was expecting.

florianl · 2025-01-13T08:22:42Z

I was hoping to see logs from the eBPF code using the following (as root), but nothing is showing up:

To get log lines using the DEBUG_PRINT macro, like here, you need to compile the eBPF code with the target debug-amd64 in support/ebpf before compiling the Go part of the agent code. Once done, you should be able to see output via bpftool prog tracelog.
Just using make debug-agent from the main Makefile will not call the debug-amd64 target in support/ebpf iirc.

GregMefford · 2025-01-14T14:37:50Z

To get log lines using the DEBUG_PRINT macro [...]

Ah, thanks! I wasn't sure how to make that work, which is why I did it this way instead, which did work when I put it in one of the other eBPF programs, but I don't see anything coming from my program still.

florianl · 2025-01-14T14:57:58Z

With #145 things changed a bit and I missed that part. Sorry that I have missed this one in the first place.

Here are steps I used to generate output to bpftool prog tracelog:

$ make debug-amd64 -C support/ebpf
$ make debug-agent
$ sudo ./ebpf-profiler -collection-agent=127.0.0.1:11000 -disable-tls -v

I missed that -v is now coupled with the debug eBPF blobs that produce the output via DEBUG_PRINT. Hope this helps.

GregMefford · 2025-02-02T03:06:50Z

I spent some more time today to make progress on this, and was able to get past where I was stuck before, and now I am able to see that my eBPF unwinder is running (not sure yet if it's doing the correct thing, but it's doing something) and my Symbolize function is getting called on the Golang side. I think I can see what some of the next steps are in terms of making it work correctly, but I think I might be stuck on something that I'm still unsure how to solve.

In the output from the ebpf-profiler agent, I get occasional errors like the following, and I believe it's preventing the trace from showing up at all in DevFiler (I tried downloading the latest 0.11.0 as well, so that's the version I'm using).

ERRO[0037] Request failed: rpc error: code = InvalidArgument desc = unsupported frame kind: beam

Is this because of the PR you mentioned opening to add that as a supported type in the OTEL spec, and it hasn't yet been pulled into DevFiler? Or is there just somewhere in the ebpf-profiler code that is doing some validation that I need to update in my PR to allow that as a valid frame kind? I wasn't able to find that error string in the code, so I'm guessing it's coming from somewhere else, and given that it's an rpc error, I'm guessing it's DevFiler.

Thanks again for all your help!

florianl · 2025-02-03T08:43:44Z

Is this because of the PR you mentioned opening to add that as a supported type in the OTEL spec, and it hasn't yet been pulled into DevFiler?

Yes - devfiler uses a filter on frame types. This might change in the future. In the meantime, and to enable you to continue your work, I have created a version of devfiler that handles beam frame types:

curl -L -H 'Authorization: 6ac0e3549c910483' -o 'devfiler-v0.11.0-beam.tar.gz' https://upload.elastic.co/d/7a3f64a4545109bdcc99a85ec80038f55f8b4522aa5d3978e2d56575a0246b79

Working on live processes can be tricky and reproducing edge cases can be hard. For that reason, I recommend looking into coredump. With the tool coredump one can import a core dump of a process and run all the Go and eBPF code in user space just like a regular Go test. Please feel free to ping me, if you need help.

GregMefford · 2025-02-09T19:12:14Z

Sorry it took so long to make some time to actually give it a try, but when I tried running the custom devfiler build on my M1 Mac using the apple-silicon version, it crashes with the following error. I'm not familiar with this kind of error, but it looks like it's trying to dynamically load a library that I don't have installed - should that have been statically compiled into the app instead?

-------------------------------------
Translated Report (Full Report Below)
-------------------------------------

Process:               devfiler [75900]
Path:                  /Applications/devfiler.app/Contents/MacOS/devfiler
Identifier:            org.nixos.devfiler
Version:               ???
Code Type:             ARM-64 (Native)
Parent Process:        launchd [1]
User ID:               501

Date/Time:             2025-02-09 14:05:58.3904 -0500
OS Version:            macOS 15.3 (24D60)
Report Version:        12
Anonymous UUID:        3B6A222A-F595-8AE9-0C62-BC866F9E1776

Sleep/Wake UUID:       CB6571CC-1695-408A-BF33-2158F4604E6C

Time Awake Since Boot: 190000 seconds
Time Since Wake:       2159 seconds

System Integrity Protection: enabled

Crashed Thread:        0

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000

Termination Reason:    Namespace DYLD, Code 1 Library missing
Library not loaded: /nix/*/libc++abi.1.0.dylib
Referenced from: <FBBC286B-508A-36A4-9257-CC5C94E09504> /Applications/devfiler.app/Contents/MacOS/devfiler
Reason: tried: '/nix/*/libc++abi.1.0.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/nix/*/libc++abi.1.0.dylib' (no such file), '/nix/*/libc++abi.1.0.dylib' (no such file), '/usr/local/lib/libc++abi.1.0.dylib' (no such file), '/usr/lib/libc++abi.1.0.dylib' (no such file, not in dyld cache)
(terminated at launch; ignore backtrace)

I'm planning to try to figure out how to get the core dump testing thing working, though, so not blocking at the moment. I just wanted to see whether it got me further along with the path I was on before, having a version of devfiler with support for the beam type.

florianl · 2025-02-13T09:59:56Z

sorry - this was not supposed to happen 🙏

GregMefford · 2025-02-15T19:39:19Z

With the updated version of devfiler that supports the Beam type, I think it's actually pretty close to working! I just need to sort out the actual symbolization, because for now I'm just sending a static "Some Bogus Name" for the file to see that it's working at all.

fabled · 2025-02-19T08:03:10Z

+// Minimal JITDUMP file reader for BEAM
+
+// This has the minimal code we need to read the JITDUMP files that the BEAM
+// writes to `/tmp/jit-<pid>.dump`. It isn't BEAM-specific, so it could probably
+// be used more generally. The spec for this file format is at:
+// https://raw.githubusercontent.com/torvalds/linux/refs/heads/master/tools/perf/Documentation/jitdump-specification.txt


The general idea has been to natively support VMs without jitdump. The main three reasons are:

- On many VMs, enabling jitdump can have a significant negative performance impact on the VM.
- The jitdump output is usually inferior and gives only symbol names. Most of our plugins are superior in that they extract source-code line-level information and decode inlined function information.
- ebpf-profiler was designed to be a profiler that requires zero changes to the system, and enabling jitdump often requires changes to the system.

In other words, native support is preferred if possible. Have you looked at all into whether extracting the symbolization information from the process directly is doable? I do understand this is more work, but as explained above it also gives much better results.

OTOH, we have had discussions on supporting jitdump earlier. And if supporting BEAM directly is not feasible or possible, I think no one will object to adding a jitdump plugin either. But then I think this should be renamed to a jitdump plugin and made generic. Though, understandably, this may require changes in core code, e.g. to automatically enable the jitdump plugin if the corresponding jitdump file is found (instead of using ELF-file-specific regexes).

Another potential issue with the jitdump format is that it's a linear dump of everything the VM does with its JIT output. Basically the file can grow without bound, and reading/parsing it may require a non-trivial amount of memory. In other words, depending on the VM workload, the profiler may need a huge amount of memory to track all JITted functions.

Typically the interpreter plugins only track the functions they see in the traces, which helps a lot to keep the memory usage within reasonable limits. Though there are caveats here also.

Just a few comments in advance to note the observations we had earlier on the approach of using jitdump. While the initial implementation might be simple(r), the complexities come from trying to enable jitdump in long-living / large processes on a production system.

Hello! Erlang VM developer jumping into the discussion...

In other words, native support is preferred if possible. Have you looked at all into whether extracting the symbolization information from the process directly is doable? I do understand this is more work, but as explained above it also gives much better results.

I don't know anything about what is available to you when running eBPF, but if you can get symbol locations and then read any memory from the executing process (which it sounds like you can do), then it is definitely possible to get the symbols without dumping.

This gdb macro function etp-cp-func-info-1 shows how you can get to that information.

We also have a jit-reader plugin for gdb that does the same thing: https://github.com/erlang/otp-gdb-tools/blob/master/jit-reader.c.

Thanks for the input! I had started to look into using jitdump originally because I thought it was necessary for identifying which memory addresses to map to the BEAM process for unwinding, but have later learned that isn't necessary. I left it in there for now in case I ended up needing or wanting it for some other purpose.

Ultimately I agree that it shouldn't be necessary to use jitdump, and I don't mind figuring out how to make it work without that. It seems like it's possible. I've been staring at the etp macros and I think I mostly understand what they're doing - I just need to figure out how to properly implement them in eBPF.

If there's interest in having a jitdump plugin for other non-BEAM processes, we could talk about splitting it out into a separate PR once we get close to finishing this one up.

garazdawi · 2025-02-27T21:09:13Z

it's running with the frame pointer support that we'll want anyway for unwinding

Using frame-pointers will make your life easier, but they should not technically be needed. We added frame-pointers in order for perf to work, as adding a separate crawling scheme to it was not trivial. However, crawling the Erlang stack without frame-pointers is quite easy, as you just have to rewind the stack looking for all the values that have the two least significant bits set to 0, stopping at a certain end-marker.
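The scan described above can be sketched in a few lines. This is only an illustration over an in-memory slice of stack words: the `endMarker` value and the function name are hypothetical, and the comment above does not say what the real end-marker looks like.

```go
package main

import "fmt"

// endMarker is a hypothetical sentinel value chosen for this sketch only.
const endMarker uint64 = 0xfffffffffffffffc

// scanContinuationPointers walks the stack words in order and keeps every
// word whose two least significant bits are zero, stopping at the end marker.
// Words with non-zero low bits are assumed to be tagged Erlang terms.
func scanContinuationPointers(stack []uint64) []uint64 {
	var frames []uint64
	for _, word := range stack {
		if word == endMarker {
			break
		}
		if word&0x3 == 0 {
			frames = append(frames, word)
		}
	}
	return frames
}

func main() {
	stack := []uint64{
		0x1003,         // tagged term, skipped
		0xec8d55485834, // candidate return address
		0x2f,           // tagged term, skipped
		0xec8d554859a0, // candidate return address
		endMarker,
		0xec8d55000000, // never reached
	}
	for _, pc := range scanContinuationPointers(stack) {
		fmt.Printf("%#x\n", pc)
	}
}
```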

Maybe having a look at the gdb unwind code that we have can be inspirational? https://github.com/erlang/otp-gdb-tools/blob/master/jit-reader.c#L222-L312

or the gdb scripts if that's more to your taste: etp-stacktrace-1.

One problem that you will notice, though, is that we write to rsp when calling C code. That is, when executing Erlang code we run on an Erlang stack, and then when calling any C code, we switch stacks to a native stack (that is much larger than the Erlang process stack). So when crawling you will have to know if you are crawling C code or not. I suppose you can look at the instruction pointer and see if it is JIT:ed code or not, and from there figure out how you should unwind.
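A minimal sketch of that decision, assuming the JIT code regions are already known as sorted, non-overlapping address ranges. The `codeRange` type, the function names, and the strategy labels are hypothetical and only illustrate the idea of choosing an unwinding strategy from the instruction pointer.

```go
package main

import (
	"fmt"
	"sort"
)

// codeRange is a half-open address range [Start, End).
type codeRange struct{ Start, End uint64 }

// inJIT reports whether pc falls inside one of the ranges, which must be
// sorted by Start and must not overlap.
func inJIT(ranges []codeRange, pc uint64) bool {
	// Find the first range that ends beyond pc, then check its start.
	i := sort.Search(len(ranges), func(i int) bool { return ranges[i].End > pc })
	return i < len(ranges) && ranges[i].Start <= pc
}

// unwindStrategy picks how to continue unwinding from the given pc.
func unwindStrategy(ranges []codeRange, pc uint64) string {
	if inJIT(ranges, pc) {
		return "erlang-stack"
	}
	return "native-stack"
}

func main() {
	jit := []codeRange{{Start: 0xec8d54800000, End: 0xec8d58800000}}
	fmt.Println(unwindStrategy(jit, 0xec8d55485834))
	fmt.Println(unwindStrategy(jit, 0x400000))
}
```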

fabled · 2025-02-28T08:33:55Z

Maybe having a look at the gdb unwind code that we have can be inspirational? https://github.com/erlang/otp-gdb-tools/blob/master/jit-reader.c#L222-L312

I think this is the ideal approach for the profiler too. Basically rewrite the gdb jit reader plugin as an ebpf unwinder and a host agent plugin.

GregMefford · 2025-03-01T18:01:36Z

Thanks for the tips, @fabled and @garazdawi!

I've spent a lot of time staring at how the gdb scripts work, and how it might be different when running on ARM / Apple Silicon vs. x86, but I hadn't dug into the jit-reader.c code yet, so I think that'll be really helpful (if nothing else, it's another implementation to compare that I'm more familiar with than with GDB macros).

It looks like there's a pretty clear path forward, I just need to understand how to traverse the symbols and memory offsets in eBPF to get to where I need to read the function info, and then how to pass that from eBPF to the Golang agent code.

GregMefford · 2025-04-09T00:24:02Z

Sorry for the long silence! Here's an encouraging update!

With the most recent commit I just pushed, I have things working pretty well. I'm still struggling to sort out why I can't get the symbolization to work by directly reading the memory, but I am able to confirm that the pieces all do fall into place when I use the JITDump output to do the symbolization. So I think all I need to do is figure out what's different between what the BEAM is writing to the JITDump and what's in its internal beam_ranges location in memory. My guess is there's just some memory offset I'm missing to make it all click into place.

I also obviously need to get this PR cleaned up and rebased in general, just due to drift as I've been slowly chipping away at this integration.

Here are some fun screen shots, though! 🎉 🚀

100% zoomed out view for overall context:

We can see that there are 4 schedulers (it's a 4-core machine) and they're doing similar work. We can also see the cyan frames coming from native code, and when I index the beam.smp binary in devfiler, it correctly shows the references into the C code as well as the magenta ones for BEAM code.

Zooming in to just erts_sched_1, which is the first "regular" BEAM scheduler (not dirty I/O or dirty CPU) that's running the testing app I wrote:

It's pretty cool that just about any Elixir developer could pretty quickly tell from this that the app is using the Phoenix and Bandit libraries to build a web server.

If we zoom in on some of the smaller narrower/less-obvious stacks:

We can see that the other major thing the app is doing is something called ElixirLoadApp.Worker (the name of the test app I made) which is calling into Finch (the HTTP client library I'm using). That makes sense, because the only thing this test app is doing is hosting an HTTP API and then starting some worker processes that call that API in a loop as fast as it can, to keep the system busy.

There are a few things I'd like to clean up, like the module names looking like 'Elixir.Finch':'-request/3-fun-1'/4- instead of Finch.request/4 like an Elixir developer would expect to see, but that's just because that's the Erlang-formatted string name for the code at this memory address that JITDump is outputting, so it should be possible to parse that module/function/arity differently once I sort out the memory offset issue.

GregMefford · 2025-05-24T19:39:54Z

Status update time!

After a long struggle against pointer arithmetic, I have an initial working version that traverses the in-memory C structs to resolve the module/function/arity as well as the file name and line number. 🚀 They're also now formatted in a friendly way based on whether it's an Erlang (some_module:some_function/1) or Elixir (SomeModule.some_function/2) module.
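The two naming styles can be illustrated with a short sketch. It relies only on the fact that Elixir module atoms carry an "Elixir." prefix; the function name is hypothetical and the PR's actual formatting rules may handle more cases (for example generated closures).

```go
package main

import (
	"fmt"
	"strings"
)

// formatMFA renders module/function/arity in Elixir style when the module
// atom has the "Elixir." prefix, and in Erlang style otherwise.
func formatMFA(module, function string, arity int) string {
	const elixirPrefix = "Elixir."
	if strings.HasPrefix(module, elixirPrefix) {
		return fmt.Sprintf("%s.%s/%d", strings.TrimPrefix(module, elixirPrefix), function, arity)
	}
	return fmt.Sprintf("%s:%s/%d", module, function, arity)
}

func main() {
	fmt.Println(formatMFA("some_module", "some_function", 1))
	fmt.Println(formatMFA("Elixir.SomeModule", "some_function", 2))
}
```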

The Go code still needs some clean-up, but I wanted to capture this working baseline first and then figure out if there's some more obvious way to do it. Any help there in terms of Go idioms or how I should be using the remotememory library would be greatly appreciated. I am pretty sure that I'm just using the remotememory.Ptr function incorrectly, because I can't understand how it works, but it does seem to work.

I also need help understanding if there's some better way to introspect into the struct offsets I need here, so that they aren't hard-coded and they don't break if some new field is added to the Erlang structs. That might take some change in Erlang itself to expose a stable API e.g. via erl_etp.c, but I'm not sure. If there's some way to look up the struct offset by name, I can do that for now, though, and that would hopefully at least be more stable than hard-coding the offsets. I assume there's a way, because gdb seems to be able to do it, but I'm not familiar with how ELF and DWARF work, so I'll need to do some research. Help appreciated there if there's some easy way to do it already.

I am currently assuming that the +JPperf option is being used to enable frame pointers, but I actually think it wouldn't be hard to detect whether that's enabled or not, and scan the stack for continuation pointers in the case that it's not. That would make one less step that users would need to consider before collecting profiles in their environments.

I'm not sure what those UNREPORTED frames are, but I see errors in the log saying that the PC address is in an unexpected range, so I need to track down why that's happening. Maybe it's a case where Erlang code is calling into C code or something else unusual? 🤷

Other than those, I think I just need to do some testing to see how things work on different architectures and with different OTP versions, so that I can either support them or at least detect that they're not supported and just not try.

GregMefford · 2025-12-11T23:34:06Z

I've updated the original PR description and reworked this existing PR to be based on the others we've recently merged. Hopefully that's not too confusing, but I wanted to preserve the history for anyone who was watching this PR to keep up to date on the status.

It's now ready for review / final polishing to get it merged.

fabled

Nice! Some initial comments added.

GregMefford · 2026-01-03T20:41:00Z

Sorry for the delay - I ended up being more busy than I expected over the holiday break, but now I'm back at it and I believe I have addressed the outstanding feedback.

fabled

Thanks! Looks pretty good now. Some (mostly stylistic/doc related) comments added.

Would you be able to also generate some coredump test cases to ensure this works as expected?

fabled · 2026-01-05T09:42:47Z

+	hashMFA := func(key beamMfa) uint32 {
+		data := make([]byte, 12)
+		binary.LittleEndian.PutUint32(data[0:4], key.module)
+		binary.LittleEndian.PutUint32(data[4:8], key.function)
+		binary.LittleEndian.PutUint32(data[8:12], key.arity)
+		return crc32.ChecksumIEEE(data)
+	}


Seems this is mixing three 32-bit values. We have libpf.hash to transform Uint32. Typically CRC is a bit slow, and we are not really using it for hashing anywhere (seems we have one use of it in pfelf, but that's to match on-disk file format values).

Could you use libpf/hash, or for hashing []byte we have used zeebo/xxh3 (wondering if there should be a wrapper for this in libpf/hash to keep hashing code in sync everywhere).

I'm not sure if there's style guide on this, but I'd prefer this to be a top level function instead of lambda looking definition.
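As a sketch of what a top-level version could look like without CRC, the example below mixes the three 32-bit fields with the MurmurHash3 32-bit finalizer. That mixer is a stand-in chosen to keep the example self-contained; it is not the libpf/hash API, and `mix32` is a hypothetical helper.

```go
package main

import "fmt"

// beamMfa mirrors the key from the diff above: three 32-bit values.
type beamMfa struct {
	module, function, arity uint32
}

// mix32 is the 32-bit finalizer from MurmurHash3, used here as a stand-in
// for whichever integer hash the project standardizes on.
func mix32(h uint32) uint32 {
	h ^= h >> 16
	h *= 0x85ebca6b
	h ^= h >> 13
	h *= 0xc2b2ae35
	h ^= h >> 16
	return h
}

// hashMFA combines the three fields without allocating a byte slice.
func hashMFA(key beamMfa) uint32 {
	h := mix32(key.module)
	h = mix32(h ^ key.function)
	return mix32(h ^ key.arity)
}

func main() {
	fmt.Printf("%#x\n", hashMFA(beamMfa{module: 1, function: 2, arity: 3}))
}
```

Compared with the closure in the diff, this avoids the 12-byte allocation on every call and keeps the hash function testable on its own.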

Yep, no problem - that's exactly what I was trying to do and just didn't know what the preferred solution was for that.

fabled · 2026-01-05T09:43:37Z

+	codeHeader := libpf.Address(frame.File)
+	pc := libpf.Address(frame.Lineno)


This needs an update due to merge of PR #943

Ah, this looks like a great improvement, because I was feeling rather constrained about how to send over the information I wanted and now I have more freedom to send over several different pieces of data.

fabled · 2026-01-05T09:46:17Z

+	numFunctions := i.rm.Uint32(codeHeader + libpf.Address(vms.beamCodeHeader.numFunctions))
+	functions := codeHeader + libpf.Address(vms.beamCodeHeader.functions)
+
+	midBuffer := make([]byte, 16)


Could you add a link for the beam code for this buffer struct/layout?
In addition/alternatively, it would improve readability/maintainability to define the size/offset constants in vmStructs to have symbolic names for these.

Yeah I will try to clarify that, but it's not a specific 16-byte data structure here. It's just a buffer space that gets used inside the loop to combine two sequential reads of 8-byte pointers to the start and end of a memory range:

midStart := nopanicslicereader.Ptr(midBuffer, 0)
midEnd := nopanicslicereader.Ptr(midBuffer, 8)

fabled · 2026-01-05T09:47:15Z

 	"go.opentelemetry.io/ebpf-profiler/interpreter"
 	"go.opentelemetry.io/ebpf-profiler/libpf"
 	"go.opentelemetry.io/ebpf-profiler/lpm"
+	"go.opentelemetry.io/ebpf-profiler/nopanicslicereader"


In the rest of the code we alias this to npsr to make the code shorter. Perhaps the same alias could be used in this file?

fabled · 2026-01-05T09:48:15Z

+	lineTable := i.rm.Ptr(codeHeader + libpf.Address(vms.beamCodeHeader.lineTable))
+	functionTable := lineTable + libpf.Address(vms.beamCodeLineTab.funcTab)
+
+	lineRange := make([]byte, 16)


Same for this buffer, source link and/or vmStructs symbolic names for size/offsets.

fabled · 2026-01-05T09:51:05Z

+		}
+	}
+
+	nameString := libpf.Intern(string(name))


to avoid data duplication:

Suggested change

-	nameString := libpf.Intern(string(name))
+	nameString := libpf.Intern(pfunsafe.ToString(name))
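As background on why this avoids duplication: `string(name)` allocates and copies the bytes, while a zero-copy conversion reuses the slice's backing array. The sketch below shows the general technique with a hypothetical `toString` helper; it is an assumption about what a helper like `pfunsafe.ToString` does, not its actual implementation, and it requires Go 1.20 or later for `unsafe.String` and `unsafe.SliceData`.

```go
package main

import (
	"fmt"
	"unsafe"
)

// toString (hypothetical) returns a string that shares memory with b
// instead of copying it. The caller must not modify b afterwards,
// because Go strings are expected to be immutable.
func toString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	name := []byte("lists")
	s := toString(name) // no allocation, no copy
	fmt.Println(s, len(s))
}
```

This is only safe when the result is either short-lived or handed to something that stores its own copy, which is presumably why the suggestion pairs it with the interning call.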

GregMefford · 2026-01-19T16:05:15Z

Would you also be able to generate some coredump test cases to ensure this works as expected?

I've addressed the other items in the PR. Would you be OK if we tackle the coredump tests as a separate PR since this one is already super long and complicated, or do you want to add them here to wrap this up as a "finished" integration?

For now, I'll work on getting a commit in a separate PR based on this one, and we can decide whether to merge it with this one or separately. I just didn't want to block this PR any longer than we need to while we're iterating on those.

fabled · 2026-01-20T14:12:31Z

I've addressed the other items in the PR. Would you be OK if we tackle the coredump tests as a separate PR since this one is already super long and complicated, or do you want to add them here to wrap this up as a "finished" integration?

Sounds good to me. I think this looks good to go, and potential fixes can be done as a follow-up if needed. If you are working on the coredumps, it's OK with me to ship them as a separate PR and land this now. Approving. Thanks!

github-advanced-security AI found potential problems Jan 1, 2025

Comment thread interpreter/beam/beam.go Fixed

GregMefford force-pushed the beam_support branch from 93d0726 to caa65c3 Compare January 12, 2025 02:02

github-advanced-security AI found potential problems Jan 12, 2025

Comment thread interpreter/beam/beam.go Fixed

florianl closed this Feb 13, 2025

florianl reopened this Feb 13, 2025

GregMefford force-pushed the beam_support branch 2 times, most recently from f8f3cb5 to ea28f8a Compare February 16, 2025 19:13

fabled reviewed Feb 19, 2025

gnurizen mentioned this pull request Feb 22, 2025

Erlang VM support parca-dev/parca-agent#3008

Open

GregMefford force-pushed the beam_support branch from 797ce5f to afa5580 Compare June 8, 2025 02:31

GregMefford force-pushed the beam_support branch 2 times, most recently from 8c8ffd3 to 373543b Compare June 19, 2025 22:20

GregMefford force-pushed the beam_support branch from 52b6a8d to e4b2cb4 Compare December 8, 2025 00:55

GregMefford marked this pull request as ready for review December 11, 2025 23:32

GregMefford requested review from a team as code owners December 11, 2025 23:32

fabled reviewed Dec 12, 2025

GregMefford force-pushed the beam_support branch 2 times, most recently from c0c0beb to ff0c9b5 Compare January 3, 2026 20:34

fabled reviewed Jan 5, 2026

christos68k mentioned this pull request Jan 15, 2026

README: Add Erlang to list of supported HLL #1073

Closed

GregMefford added 8 commits January 19, 2026 15:51

Symbolize BEAM stack frames

24fc561

Consolidate reads and user more LRU caches

eca4e28

Replace Uint64 with Ptr for pointers

1b97e1e

Simplify hashMFA function with libpf.hash

e32379b

Link to reference implementations for MFA and line info lookups

6a222ce

Alias nopanicslicereader as npsr per convention

d8d97be

Parse otpRelease to an integer on load instead of storing the string

7e706e0

Avoid duplicate data when interning atom names

87f89c4

GregMefford force-pushed the beam_support branch from 8c8bd1c to 87f89c4 Compare January 19, 2026 15:52

github-advanced-security AI found potential problems Jan 19, 2026

Comment thread interpreter/beam/beam.go Fixed

Enforce otpRelease fits in uint8

8320ea7

fabled approved these changes Jan 20, 2026

christos68k approved these changes Jan 20, 2026

florianl approved these changes Jan 20, 2026

fabled merged commit 821251b into open-telemetry:main Jan 20, 2026
28 checks passed

GregMefford deleted the beam_support branch January 24, 2026 19:16

gnurizen mentioned this pull request Mar 9, 2026

Optimize distro QEMU tests to be more efficient parca-dev/opentelemetry-ebpf-profiler#228

Closed

