feat(perf): add eBPF CPU profiler with ring buffer streaming #182
Open
jra3 wants to merge 9 commits into main from PROFILING-clean
Replace dual lifecycle management (Stop() + context) with context-only cancellation. This simplifies the interface and follows Go idioms for context-based lifecycle management.

Key changes:
- Remove Stop() method from ContinuousCollector interface
- Update all collectors to use context cancellation for cleanup
- Update tests to use context cancellation instead of Stop() calls

Benefits:
- Simpler interface with a single cancellation mechanism
- No more "double stop" edge cases to handle
- Reduced code duplication across collectors
- More maintainable and idiomatic Go code
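A minimal sketch of what context-only cancellation can look like for a collector. The interface shape, the `tickCollector` type, and the `firstValue` helper are assumptions made for illustration; the real ContinuousCollector signature is not shown in this PR text.

```go
package main

import (
	"context"
	"time"
)

// ContinuousCollector is a hypothetical shape for the interface once Stop()
// is gone: the context passed to Start is the only cancellation mechanism.
type ContinuousCollector interface {
	Start(ctx context.Context) (<-chan int, error)
}

// tickCollector is an illustrative collector that emits a counter on a timer.
type tickCollector struct {
	interval time.Duration
}

// Start launches the collection goroutine. When ctx is cancelled the goroutine
// stops its ticker and closes the output channel, so no Stop() call is needed.
func (c *tickCollector) Start(ctx context.Context) (<-chan int, error) {
	out := make(chan int)
	go func() {
		defer close(out) // runs exactly once, whichever path exits the loop
		ticker := time.NewTicker(c.interval)
		defer ticker.Stop()
		n := 0
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				n++
				select {
				case out <- n:
				case <-ctx.Done():
					return
				}
			}
		}
	}()
	return out, nil
}

// firstValue shows the caller's side: read one value, cancel the context,
// then drain the channel until the collector closes it.
func firstValue(c ContinuousCollector) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	ch, err := c.Start(ctx)
	if err != nil {
		return 0, err
	}
	v := <-ch
	cancel()
	for range ch { // loop ends once the collector has shut down
	}
	return v, nil
}
```

Because cleanup hangs off a single `defer` in the goroutine, calling `cancel()` more than once is harmless, which is the "double stop" case the commit message refers to.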
- Add MetricTypeProfile to supported metric types
- Add ProfileStats struct for eBPF profiler output data
- Add ProfileStack struct for stack trace representation with counts
- Add ProfileProcess struct for process-level profiling aggregation
- Integrate profiler types with existing performance monitoring system
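To make the relationship between per-stack counts and per-process aggregation concrete, here is a rough sketch. The type names come from the commit message, but every field name, the `sample` input type, and the `aggregate` function are assumptions, not the PR's actual definitions.

```go
package main

import (
	"fmt"
	"sort"
)

// ProfileStack pairs one stack trace with the number of samples that hit it.
// Field names are assumed for illustration.
type ProfileStack struct {
	Frames []uint64 // instruction pointers, leaf frame first
	Count  uint64
}

// ProfileProcess groups the stacks observed for one process (fields assumed).
type ProfileProcess struct {
	PID     uint32
	Samples uint64
	Stacks  []ProfileStack
}

// sample is a hypothetical decoded profiler event.
type sample struct {
	PID    uint32
	Frames []uint64
}

// aggregate folds raw samples into per-process stack counts and returns the
// processes ordered by total sample count, busiest first.
func aggregate(samples []sample) []ProfileProcess {
	byPID := map[uint32]*ProfileProcess{}
	index := map[string]int{} // (pid, frames) key -> position in Stacks
	for _, s := range samples {
		p, ok := byPID[s.PID]
		if !ok {
			p = &ProfileProcess{PID: s.PID}
			byPID[s.PID] = p
		}
		p.Samples++
		key := fmt.Sprint(s.PID, s.Frames)
		if i, seen := index[key]; seen {
			p.Stacks[i].Count++
			continue
		}
		index[key] = len(p.Stacks)
		p.Stacks = append(p.Stacks, ProfileStack{Frames: s.Frames, Count: 1})
	}
	out := make([]ProfileProcess, 0, len(byPID))
	for _, p := range byPID {
		out = append(out, *p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Samples > out[j].Samples })
	return out
}
```

The string key is the simplest way to deduplicate variable-length stacks in a sketch; a real collector would more likely key on the stack IDs the kernel already assigns.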
- Add profiler.bpf.c eBPF program for CPU event sampling
- Implement ring buffer streaming for efficient data transfer
- Add stack trace collection for user and kernel space
- Add profiler_types.h with shared data structures (ProfileEvent)
- Support perf event attachment with drop counter tracking
- Provide 8MB ring buffer for high-frequency sampling
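On the userspace side, each ring buffer record has to be decoded from raw bytes into the shared ProfileEvent structure. The sketch below shows one way to do that in Go. The 32-byte little-endian layout and all field names are assumptions; the actual layout lives in profiler_types.h and is not reproduced in this PR text.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// profileEventSize is the size of the assumed fixed record layout, in bytes.
const profileEventSize = 32

// ProfileEvent mirrors a hypothetical C struct shared with the eBPF program.
type ProfileEvent struct {
	TimestampNs   uint64
	PID           uint32
	TID           uint32
	UserStackID   int32 // negative when no user stack was captured
	KernelStackID int32 // negative when no kernel stack was captured
	CPU           uint32
	Flags         uint32
}

var errShortRecord = errors.New("profile event record too short")

// parseProfileEvent decodes one ring buffer record. It rejects records that
// are shorter than the assumed layout instead of reading out of bounds.
func parseProfileEvent(raw []byte) (ProfileEvent, error) {
	if len(raw) < profileEventSize {
		return ProfileEvent{}, errShortRecord
	}
	le := binary.LittleEndian
	return ProfileEvent{
		TimestampNs:   le.Uint64(raw[0:8]),
		PID:           le.Uint32(raw[8:12]),
		TID:           le.Uint32(raw[12:16]),
		UserStackID:   int32(le.Uint32(raw[16:20])),
		KernelStackID: int32(le.Uint32(raw[20:24])),
		CPU:           le.Uint32(raw[24:28]),
		Flags:         le.Uint32(raw[28:32]),
	}, nil
}
```

Decoding field by field avoids reflection on the hot path, which matters when the profiler samples at high frequency.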
… support
- Add ProfilerCollector implementing ContinuousCollector interface
- Support flexible perf event configuration (hardware/software/PMU events)
- Implement cross-platform design with Linux implementation and non-Linux stubs
- Add comprehensive perf event enumeration and discovery system
- Support multiple CPU attachment with online CPU detection
- Add ring buffer reading and stack trace aggregation
- Include graceful degradation for missing PMU access
- Provide runtime event validation and helpful error messages
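Attaching a perf event per CPU requires knowing which CPUs are online. On Linux that list is exposed in `/sys/devices/system/cpu/online` as a compact range string such as `0-3,8,10-11`. The sketch below parses that format; whether the PR reads this file or uses another mechanism is an assumption, and `parseCPUList` is an illustrative name.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUList expands a kernel CPU list such as "0-3,8,10-11" into
// individual CPU IDs, in the order they appear.
func parseCPUList(s string) ([]int, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return nil, fmt.Errorf("empty CPU list")
	}
	var cpus []int
	for _, part := range strings.Split(s, ",") {
		lo, hi, isRange := strings.Cut(part, "-")
		first, err := strconv.Atoi(lo)
		if err != nil {
			return nil, fmt.Errorf("bad CPU id %q: %w", lo, err)
		}
		last := first
		if isRange {
			if last, err = strconv.Atoi(hi); err != nil {
				return nil, fmt.Errorf("bad CPU id %q: %w", hi, err)
			}
		}
		if last < first {
			return nil, fmt.Errorf("descending range %q", part)
		}
		for c := first; c <= last; c++ {
			cpus = append(cpus, c)
		}
	}
	return cpus, nil
}
```

A caller would typically pass the contents of the sysfs file, e.g. the result of `os.ReadFile("/sys/devices/system/cpu/online")` converted to a string, and open one perf event per returned ID.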
- Add unit tests for profiler configuration and setup
- Add integration tests for full profiler lifecycle and multi-CPU scenarios
- Add hardware tests requiring bare metal PMU access
- Add ring buffer unit tests for event parsing and binary format validation
- Add stability tests for long-running validation and memory leak detection
- Add perf event enumeration tests for event discovery validation
- Include ring buffer benchmark tests for performance validation
- Support proper build tags for different test environments (linux/hardware/integration)
- Add comprehensive perf event enumeration guide covering hardware/software events
- Document cross-architecture compatibility and PMU event portability
- Add streaming profiler testing methodology with validation procedures
- Include bare metal testing setup instructions for Hetzner servers
- Document troubleshooting procedures for perf event issues
- Provide performance validation guidelines and hardware requirements
- Remove duplicate function declarations between profiler_helpers.go and profiler_perf_events.go
- Fix import issues by removing unused syscall imports
- Temporarily stub out perf event attachment pending proper cilium/ebpf API implementation
- Clean up unused imports (unsafe, unix)
Remove unused github.com/stretchr/testify/assert import from kernel_compat_integration_test.go that was causing CI build failures
…tests

The perf event tests were failing in CI because they require actual system interaction through perf_event_open() syscalls. These tests need either root permissions or specific perf_event_paranoid settings (<=1).

Changes:
- Renamed profiler_perf_events_test.go to profiler_perf_events_integration_test.go
- Added 'integration' build tag to exclude from unit test runs
- Added proper skip conditions when perf events aren't available
- Removed unnecessary GetPerfEventParanoid checks in favor of simpler availability checks

This ensures unit tests can run in restricted CI environments without failing due to missing system permissions.
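The permission requirement stated above (root, or perf_event_paranoid <= 1) can be written down as a small predicate. Note that this commit moves away from explicit paranoid checks toward a simpler availability probe, so the sketch below only encodes the stated rule and is not the PR's skip logic. Both function names are hypothetical, and the rule ignores capabilities such as CAP_PERFMON.

```go
package main

import (
	"strconv"
	"strings"
)

// parseParanoid parses the contents of /proc/sys/kernel/perf_event_paranoid,
// which is a single integer (possibly negative) followed by a newline.
func parseParanoid(contents string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(contents))
}

// perfEventsLikelyAvailable applies the rule from the commit message: the
// tests need either root (effective UID 0) or a paranoid level of 1 or lower.
// It is a heuristic; only an actual perf_event_open() call is authoritative.
func perfEventsLikelyAvailable(euid, paranoid int) bool {
	return euid == 0 || paranoid <= 1
}
```

In an integration test file guarded by `//go:build integration`, a predicate like this (or a direct probe) would feed a `t.Skip` call so that restricted CI runners skip the test rather than fail it.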
This was referenced Oct 6, 2025
Summary
Key Features
✨ eBPF Profiler Implementation
🎯 Context-Driven Lifecycle
📊 Perf Event Enumeration
Architecture Changes
Testing
Documentation
Related Issues
Closes: