Skip to content

feat: generic probe support#953

Merged
christos68k merged 7 commits intoopen-telemetry:mainfrom
Shopify:generic-probe-support
Dec 4, 2025
Merged

feat: generic probe support#953
christos68k merged 7 commits intoopen-telemetry:mainfrom
Shopify:generic-probe-support

Conversation

@manuelfelipe
Copy link
Copy Markdown
Contributor

@manuelfelipe manuelfelipe commented Nov 10, 2025

Summary

Refactors probe infrastructure introduced in #651 to support kprobes, kretprobes, uprobes, and uretprobes through a unified interface with syntax inspired by bpftrace. Previously only uprobes were supported.

Motivation

We wanted to understand what causes shared memory pages to turn into private memory pages (copy-on-write) in our Ruby workers. Pitchfork workers start with everything shared post-fork but quickly grow private memory as requests come in.

Being able to attach kprobes to kernel functions like do_wp_page allows us to capture stack traces when CoW events occur, showing exactly which Ruby code is causing pages to be copied. This data could help us identify memory hotspots and optimization opportunities.

Previous uprobe-only support limited profiling to user space functions. Adding kprobe support enables profiling kernel-level events (page faults, syscalls, etc.) alongside regular CPU profiling, giving a complete picture of system behavior.

What Changed

Renamed uprobe-specific code to generic probe terminology:

  • --uprobe-link--probe-link flag
  • UProbeLinksProbeLinks config
  • TraceOriginUProbeTraceOriginProbe
  • support/ebpf/uprobe.ebpf.cgeneric_probe.ebpf.c

Added kprobe support:

  • New kprobe__generic eBPF entry point
  • Unified probe attachment logic in tracer/probe.go
  • Support for kprobe/kretprobe/uprobe/uretprobe types

Enhanced probe-ctrl tool:

  • Works with all probe types
  • Added --list flag to show loaded eBPF programs
  • Better error messages

Usage


# Attach to kernel function
./ebpf-profiler -collection-agent=127.0.0.1:11000 -disable-tls -probe-link "kprobe:do_wp_page`

# Attach to user space function  
./ebpf-profiler -collection-agent=127.0.0.1:11000 -disable-tls -probe-link "uprobe:/root/ebpf-profiler:runtime.(*mheap).alloc"

Running ^ locally, with traces going to devfiler:

kprobe-do_wp_page Screenshot 2025-11-10 at 3 19 27 PM

@manuelfelipe manuelfelipe requested review from a team as code owners November 10, 2025 21:23
Comment thread tracer/probe.go
ProgName string
}

func ParseProbe(spec string) (*ProbeSpec, error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we cover USDTs here as well?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. good idea. can we do that in a follow up PR?

from a quick look, i failed trying to find any support for this in cilium/ebpf. looks like we would need to parse SDT notes, and then, in certain cases, based on https://www.sourceware.org/systemtap/wiki/UserSpaceProbeImplementation, deal with semaphores ourselves?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth noting that i also looked into adding support for generic tracepoints but was not trivial either because we need struct pt_regs *.

my guess is that we could maybe do something like what happens here, but then that would only get us support for tracepoints on syscalls

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow up sounds good to me.

Comment thread tools/probe-ctrl/cmd/probe-ctrl/probe.go Outdated
Comment thread tracer/probe.go
Comment thread tracer/probe.go
Comment thread support/ebpf/generic_probe.ebpf.c Outdated
}

// uprobe__generic serves as entry point for uprobe based profiling.
SEC("uprobe/generic")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following feedback from #651 (comment), we should be able to skip uprobe/generic and just use kprobe/generic and attach it acccordingly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah. good point. change it in e7dd520 to make use of kprobe/generic for all types

Comment thread tracer/probe.go
}

func ParseProbe(spec string) (*ProbeSpec, error) {
parts := strings.SplitN(spec, ":", 3)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we handle here user input, should we handle the case where capitalized letters are input?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure how much further we want to get into parsing the user input.

In last rev i added https://github.com/Shopify/opentelemetry-ebpf-profiler/blob/2e9f35d274a89f8e96c1f79d07b331244e496eca/tracer/probe.go#L75 to make sure the probe type is always lower case before any comparison.

As for the other parts of the input, i feel that is a concern of the link pkg. with an invalid probe name, we get something like the error msg below which feels like the right behaviour to me:

{"level":"error","msg":"failed to attach probes: creating perf_kprobe PMU (arch-specific fallback for \"do_WP_page\"): token __x64_do_WP_page: not found: no such file or directory","time":"2025-11-14T14:07:17.902603356Z"} 

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no strong opinions here. strings.ToLower() is always a good start and for the rest pkg link will return an error if it does not fit.

Comment thread tracer/probe.go Outdated
Comment thread tools/probe-ctrl/cmd/probe-ctrl/main.go Outdated
Comment thread tools/probe-ctrl/cmd/probe-ctrl/main.go Outdated
Comment thread tools/probe-ctrl/cmd/probe-ctrl/main.go Outdated
Comment thread tools/probe-ctrl/cmd/probe-ctrl/main.go Outdated
Comment thread tracer/probe.go Outdated
Comment thread tracer/probe.go Outdated
Comment thread tracer/probe.go
Comment on lines +112 to +115

default:
return nil, fmt.Errorf("unsupported probe type: %s", probeTypeStr)
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can never happen (also maybe some linters can warn if we don't have a default here but do not exhaustively check all values for this type)

Suggested change
default:
return nil, fmt.Errorf("unsupported probe type: %s", probeTypeStr)
}
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without this golang complains about the missing return. I could just move it out of the switch. just not sure it makes much of a difference

tracer/probe.go:115:1: missing return 

@christos68k
Copy link
Copy Markdown
Member

@manuelfelipe Can you resolve conflicts?

manuelfelipe and others added 7 commits December 4, 2025 10:14
Rename uprobe-specific code to generic probe terminology and add
support for kprobes alongside uprobes. Update probe-ctrl tool to
handle both probe types.
* Add docs to functions and fields from probe struct
Co-authored-by: Florian Lehner <florianl@users.noreply.github.com>
@manuelfelipe manuelfelipe force-pushed the generic-probe-support branch from ba86928 to b0807bd Compare December 4, 2025 15:15
@manuelfelipe
Copy link
Copy Markdown
Contributor Author

done @christos68k. rebased against latest main

@christos68k christos68k merged commit 9d4a7fd into open-telemetry:main Dec 4, 2025
29 checks passed
dalehamel pushed a commit to Shopify/opentelemetry-ebpf-profiler that referenced this pull request Dec 5, 2025
Co-authored-by: Florian Lehner <florianl@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants