
Move DynamicPIDSelector to its own mutually exclusive swarm node #1584

Merged
mariomac merged 4 commits into open-telemetry:main from damemi:dynamic-pid-no-default-selector
Mar 26, 2026


@damemi
Member

@damemi damemi commented Mar 18, 2026

#1388 added DynamicPIDSelector to be able to restrict instrumentation to specific processes.

The problem I was seeing is that even when selecting just one PID with the dynamic selector, I still saw OBI traces from other processes in the cluster. It seems there are several default/empty fallback matchers that need to be excluded when using DynamicPIDSelector.

Adding more checks for that was getting complicated, so @mariomac suggested moving DynamicSelector to its own swarm node, making it mutually exclusive with the static matcher based on whether the dynamic selector is nil.

This PR does that, and also updates pipe/msg/queue to make it thread-safe. Go's race tests caught a data race when both consumers subscribed to the same input queue.


@damemi damemi requested a review from a team as a code owner March 18, 2026 15:17
@damemi damemi added the bug Something isn't working label Mar 18, 2026
@codecov

codecov Bot commented Mar 18, 2026

Codecov Report

❌ Patch coverage is 62.93706% with 53 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.63%. Comparing base (a3b7b67) to head (24eed0b).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/appolly/discover/matcher_dynamic.go | 57.50% | 45 Missing and 6 partials ⚠️ |
| pkg/pipe/msg/queue.go | 75.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1584      +/-   ##
==========================================
+ Coverage   44.59%   44.63%   +0.03%     
==========================================
  Files         333      334       +1     
  Lines       35928    36026      +98     
==========================================
+ Hits        16022    16079      +57     
- Misses      18900    18938      +38     
- Partials     1006     1009       +3     

| Flag | Coverage Δ |
|---|---|
| integration-test | 21.31% <17.35%> (+0.28%) ⬆️ |
| integration-test-arm | 0.00% <0.00%> (ø) |
| integration-test-vm-x86_64-5.15.152 | ? |
| integration-test-vm-x86_64-6.10.6 | ? |
| k8s-integration-test | 2.18% <0.00%> (+<0.01%) ⬆️ |
| oats-test | 0.00% <0.00%> (ø) |
| unittests | 47.89% <74.38%> (+0.03%) ⬆️ |

@MrAlias MrAlias added this to the v0.7.0 milestone Mar 18, 2026
@damemi damemi force-pushed the dynamic-pid-no-default-selector branch from bc10a1e to c1624f4 Compare March 18, 2026 16:20
@damemi damemi marked this pull request as draft March 18, 2026 18:10
@damemi
Member Author

damemi commented Mar 18, 2026

Converting back to WIP; I need to do a little more digging on this.

@damemi damemi force-pushed the dynamic-pid-no-default-selector branch 2 times, most recently from a489255 to 2452b79 Compare March 20, 2026 14:42
@damemi damemi changed the title from "Enable AppO11y with only DynamicPIDSelector" to "DynamicPIDSelector: enforce exclusivity from default/empty config criteria" Mar 20, 2026
@damemi damemi force-pushed the dynamic-pid-no-default-selector branch from 2452b79 to 04c0525 Compare March 20, 2026 16:04
@damemi damemi marked this pull request as ready for review March 20, 2026 16:17
@damemi
Member Author

damemi commented Mar 20, 2026

This is ready for review now

Contributor

@mariomac mariomac left a comment


Good finding! I wonder if we can simplify the implementation. Instead of adding the dynamic PIDs logic to the criteria matcher and filling it with if-elses, consider separating the DynamicPIDSelector into another node:

  1. Change the signature of criteriaMatcherProvider to return swarm.Instancer
  2. If DynamicPidSelector != nil, criteriaMatcherProvider will just return swarm.EmptyRunFunc, which would prevent any message from flowing through there.
  3. Move the PID filtering logic to a simple node with similar input/output channels. It would return the PID-matching function, or swarm.EmptyRunFunc if the dynamic PID selector is nil.

The architecture would be something like:

flowchart LR
    DockerEnricher --> |if pidsMatcher == nil| CriteriaMatcher --> ExecTyper
    DockerEnricher --> |if pidsMatcher != nil| PidsMatcher --> ExecTyper
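
A minimal sketch of the two mutually exclusive nodes in the flowchart above. The `RunFunc` type, `emptyRun`, `pidSelector`, and the `run` helper are hypothetical stand-ins written for this illustration; they are not the real swarm package API or the code merged in this PR. Each node gets its own input subscription, and exactly one of them does any work depending on whether the selector is nil.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// RunFunc is a local stand-in for a pipeline node's run function.
// It is NOT the real swarm type; it only illustrates the idea.
type RunFunc func(ctx context.Context)

// emptyRun plays the role the review assigns to swarm.EmptyRunFunc:
// a node that returns immediately and forwards nothing.
func emptyRun(context.Context) {}

// pidSelector is a hypothetical stand-in for DynamicPIDSelector.
type pidSelector struct{ pids map[int]struct{} }

// criteriaMatcher is active only when no dynamic selector is configured.
func criteriaMatcher(sel *pidSelector, in <-chan int, out chan<- string) RunFunc {
	if sel != nil {
		return emptyRun
	}
	return func(context.Context) {
		for pid := range in {
			out <- fmt.Sprintf("criteria:%d", pid)
		}
	}
}

// pidsMatcher is active only when a dynamic selector is configured.
func pidsMatcher(sel *pidSelector, in <-chan int, out chan<- string) RunFunc {
	if sel == nil {
		return emptyRun
	}
	return func(context.Context) {
		for pid := range in {
			if _, ok := sel.pids[pid]; ok {
				out <- fmt.Sprintf("pid:%d", pid)
			}
		}
	}
}

// run feeds the same events to both nodes and collects what reaches the output.
func run(sel *pidSelector, events []int) []string {
	critIn := make(chan int, len(events))
	pidsIn := make(chan int, len(events))
	out := make(chan string, 2*len(events))
	for _, e := range events {
		critIn <- e
		pidsIn <- e
	}
	close(critIn)
	close(pidsIn)

	var wg sync.WaitGroup
	nodes := []RunFunc{criteriaMatcher(sel, critIn, out), pidsMatcher(sel, pidsIn, out)}
	for _, node := range nodes {
		wg.Add(1)
		go func(n RunFunc) {
			defer wg.Done()
			n(context.Background())
		}(node)
	}
	wg.Wait()
	close(out)

	var got []string
	for s := range out {
		got = append(got, s)
	}
	return got
}

func main() {
	events := []int{10, 20, 30}
	fmt.Println(run(nil, events))
	fmt.Println(run(&pidSelector{pids: map[int]struct{}{20: {}}}, events))
}
```

With a nil selector only the criteria path emits; with a selector containing PID 20 only the PID path emits, and only for that PID.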

Comment thread pkg/appolly/discover/finder.go Outdated
Contributor

Copilot AI left a comment


Pull request overview

This PR makes DynamicPIDSelector an exclusive discovery criterion so that PID-targeted instrumentation can’t be broadened by default/empty config selectors, preventing unintended instrumentation of unrelated processes.

Changes:

  • Forces the App O11y pipeline on when WithDynamicPIDSelector is provided (even if FeatureAppO11y is disabled) to support PID-only instrumentation.
  • In dynamic-PID mode, prevents config-based discovery criteria from being used alongside the dynamic selector (avoids OR-matching unintended processes).
  • Ensures an empty dynamic PID set (or PID not in set) never falls through to attribute-based matching.
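
The last point above (an empty PID set, or a PID outside the set, never matches) could be sketched roughly as follows. `DynamicPIDs` and its methods are hypothetical names chosen for this example, not the project's actual selector type; the only behavior taken from the overview is that a miss is a plain "no match" with no fallback to attribute-based criteria.

```go
package main

import (
	"fmt"
	"sync"
)

// DynamicPIDs is a hypothetical, concurrency-safe set of PIDs to instrument.
type DynamicPIDs struct {
	mu   sync.RWMutex
	pids map[uint32]struct{}
}

// NewDynamicPIDs returns an empty selector, which matches no process.
func NewDynamicPIDs() *DynamicPIDs {
	return &DynamicPIDs{pids: map[uint32]struct{}{}}
}

// Add registers PIDs for instrumentation.
func (d *DynamicPIDs) Add(pids ...uint32) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for _, p := range pids {
		d.pids[p] = struct{}{}
	}
}

// Remove unregisters PIDs.
func (d *DynamicPIDs) Remove(pids ...uint32) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for _, p := range pids {
		delete(d.pids, p)
	}
}

// Matches reports whether pid is in the set. An empty set, or a PID that
// is not in the set, yields false; there is deliberately no fallthrough
// to any other matching rule.
func (d *DynamicPIDs) Matches(pid uint32) bool {
	d.mu.RLock()
	defer d.mu.RUnlock()
	_, ok := d.pids[pid]
	return ok
}

func main() {
	sel := NewDynamicPIDs()
	fmt.Println(sel.Matches(1234)) // empty set
	sel.Add(1234)
	fmt.Println(sel.Matches(1234), sel.Matches(5678))
	sel.Remove(1234)
	fmt.Println(sel.Matches(1234))
}
```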

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| pkg/instrumenter/instrumenter.go | Enables App O11y when a dynamic PID selector is present. |
| pkg/appolly/discover/finder.go | Drops config criteria in dynamic-PID mode to avoid parallel default discovery. |
| pkg/appolly/discover/matcher.go | Uses only the dynamic selector when present and prevents empty-selector fallthrough matches. |

Comment thread pkg/appolly/discover/matcher.go Outdated
Comment thread pkg/instrumenter/instrumenter.go
Comment thread pkg/appolly/discover/finder.go Outdated
Comment thread pkg/appolly/discover/matcher.go Outdated
@mariomac
Contributor

mariomac commented Mar 24, 2026

Regarding my previous suggestion, #1636 shows an example of how to use EmptyRunFunc to run two mutually exclusive nodes.

@damemi
Member Author

damemi commented Mar 24, 2026

@mariomac thanks, I agree. DynamicPIDSelector is showing more and more that it fits as its own thing and not quite into the existing flows. Even more so, I am picturing this being expanded to more than just PID selection as an overall dynamic matcher. On the surface, that seems like it would fit into the existing flows but in practice I think it stands alone.

I'll try out the refactor you suggested and ping you for another round. Thanks for the example too!

@damemi damemi force-pushed the dynamic-pid-no-default-selector branch 2 times, most recently from de5c7a9 to 9d0e32f Compare March 25, 2026 21:15
@damemi damemi force-pushed the dynamic-pid-no-default-selector branch from 9d0e32f to 924dabf Compare March 26, 2026 00:09
@damemi
Member Author

damemi commented Mar 26, 2026

@mariomac thanks for the feedback. I moved DynamicSelector to its own swarm node like you suggested and, in the process, cleaned a lot of dynamic-related code out of the static matcher.

There is still a fair amount of code duplication between the two. But without a clear need for more types of selectors, I think it might be overengineering to try to break out the common code. This DynamicSelector can itself be expanded with more criteria, and if the two drift apart, I think that will become evident over time. Let me know what you think.

@damemi
Member Author

damemi commented Mar 26, 2026

FYI, I made a change to pipe/msg/queue to make subscribe() thread-safe after this failure in the new test, which caught a data race between the two nodes subscribed to the same channel: https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation/actions/runs/23570678888/job/68632644708?pr=1584#step:7:123

Even though in practice those two nodes are mutually exclusive in actually consuming the channel, the test still caught the potential race.
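
To illustrate the kind of race described above, here is a minimal sketch of a fan-out queue whose subscriber list is guarded by a mutex. The `Queue` type and its `Subscribe` and `Send` methods are hypothetical and written only for this example; they are not the actual pipe/msg/queue implementation. Without the lock, two goroutines appending to the subscriber slice at the same time would be flagged by `go test -race`.

```go
package main

import (
	"fmt"
	"sync"
)

// Queue is a hypothetical fan-out queue: every subscriber receives every message.
type Queue[T any] struct {
	mu   sync.Mutex
	subs []chan T
}

// Subscribe registers a new buffered subscriber channel.
// The mutex makes concurrent calls from different nodes safe.
func (q *Queue[T]) Subscribe(buf int) <-chan T {
	ch := make(chan T, buf)
	q.mu.Lock()
	q.subs = append(q.subs, ch)
	q.mu.Unlock()
	return ch
}

// Send copies the subscriber list under the lock, then delivers outside it
// so that a slow subscriber cannot block new subscriptions.
func (q *Queue[T]) Send(v T) {
	q.mu.Lock()
	subs := append([]chan T(nil), q.subs...)
	q.mu.Unlock()
	for _, ch := range subs {
		ch <- v
	}
}

func main() {
	q := &Queue[int]{}
	chans := make([]<-chan int, 2)

	// Two nodes subscribe concurrently, as the two matcher nodes do at startup.
	var wg sync.WaitGroup
	for i := range chans {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			chans[i] = q.Subscribe(1)
		}(i)
	}
	wg.Wait()

	q.Send(42)
	for _, ch := range chans {
		fmt.Println(<-ch)
	}
}
```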

@damemi damemi changed the title from "DynamicPIDSelector: enforce exclusivity from default/empty config criteria" to "Move DynamicPIDSelector to its own mutually exclusive swarm node" Mar 26, 2026
@mariomac mariomac merged commit 3d31c23 into open-telemetry:main Mar 26, 2026
68 checks passed

Labels

bug Something isn't working


5 participants