Move DynamicPIDSelector to its own mutually exclusive swarm node#1584
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1584 +/- ##
==========================================
+ Coverage 44.59% 44.63% +0.03%
==========================================
Files 333 334 +1
Lines 35928 36026 +98
==========================================
+ Hits 16022 16079 +57
- Misses 18900 18938 +38
- Partials 1006 1009 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
bc10a1e to
c1624f4
Compare
|
converting back to wip need to do a little more digging on this |
a489255 to
2452b79
Compare
2452b79 to
04c0525
Compare
|
This is ready for review now |
There was a problem hiding this comment.
Good finding! I wonder if we can simplify the implementation. Instead of adding the dynamic PIDs logic in the criteria matcher, and fill it with if-elses to consider separating the DyamicPIDSelector to another node:
- Change the signature of
criteriaMatcherProviderto returnswarm.Instancer - If DynamicPidSelector != nil, criteriaMatcherProvider will just return
swarm.EmptyRunFunc, that would prevent any message from flowing through there. - Move the PIDs filtering logic to a simple node with similar input/output channels. It would return the PIDs matching function, or
swarm.EmptyRunFuncif the dynamic PID selector is null.
The architecture would be something like:
flowchart LR
DockerEnricher --> |if pidsMatcher == nil| CriteriaMatcher --> ExecTyper
DockerEnricher --> |if pidsMatcher != nil| PidsMatcher --> ExecTyper
There was a problem hiding this comment.
Pull request overview
This PR makes DynamicPIDSelector an exclusive discovery criterion so that PID-targeted instrumentation can’t be broadened by default/empty config selectors, preventing unintended instrumentation of unrelated processes.
Changes:
- Forces the App O11y pipeline on when
WithDynamicPIDSelectoris provided (even ifFeatureAppO11yis disabled) to support PID-only instrumentation. - In dynamic-PID mode, prevents config-based discovery criteria from being used alongside the dynamic selector (avoids OR-matching unintended processes).
- Ensures an empty dynamic PID set (or PID not in set) never falls through to attribute-based matching.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/instrumenter/instrumenter.go | Enables App O11y when a dynamic PID selector is present. |
| pkg/appolly/discover/finder.go | Drops config criteria in dynamic-PID mode to avoid parallel default discovery. |
| pkg/appolly/discover/matcher.go | Uses only the dynamic selector when present and prevents empty-selector fallthrough matches. |
|
Regarding my previous suggestion, this PR shows an example on how to use the EmptyRunFunc to run two mutually exclusive nodes #1636 |
|
@mariomac thanks, I agree. DynamicPIDSelector is showing more and more that it fits as its own thing and not quite into the existing flows. Even more so, I am picturing this being expanded to more than just PID selection as an overall dynamic matcher. On the surface, that seems like it would fit into the existing flows but in practice I think it stands alone. I'll try out the refactor you suggested and ping your for another round. Thanks for the example too! |
de5c7a9 to
9d0e32f
Compare
9d0e32f to
924dabf
Compare
|
@mariomac thanks for the feedback, I moved DynamicSelector to its own swarm node like you suggested and cleaned a lot out of the static matcher that was dynamic-related in the process There is still some decent code duplication between the two. But I think without a clear need for more types of selectors, it might be overengineering to try and break out the common code. This DynamicSelector can be expanded for more criteria itself and I think over time that drift will become evident if so. but let me know what you think |
|
fyi, made a change to pipe/queue to make subscribe() thread safe, after this failure in the new test that caught a data race between the 2 nodes subscribed to the same channel: https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation/actions/runs/23570678888/job/68632644708?pr=1584#step:7:123 even though in practice those two nodes are mutually exclusive on actually consuming the channel, it still caught the potential |
#1388 added DynamicPIDSelector to be able to restrict instrumentation to specific processes.
The problem I was seeing is that even when selecting just 1 pid with the Dynamic selector, I was still seeing OBI traces from other processes in the cluster. It seems like there are several default/empty fallback matchers that needed to be excluded when using DynamicPIDSelector.
Trying to add more checks for that was complicated, and @mariomac suggested moving DynamicSelector to its own swarm node, with the static matcher being mutually exclusive based on if dynamicselector != nil.
This PR does that, along with an updated change to pipe/msg/queue to ensure thread safety, that was caught by go race tests for both consumers subscribing to the same input queue
Checklist