-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate odigos-instrumentation
label to Source CRD
#2231
base: main
Are you sure you want to change the base?
Conversation
8fda813
to
bd21fe1
Compare
fd744ba
to
804b47a
Compare
// isDeleteInstrumentationConfigSource returns true if the Source object is relevant to starting language detection. | ||
// This means that the Source must be either: | ||
// 1) A normal (non-excluding) Source AND terminating, or | ||
// 2) An excluding Source AND NOT terminating | ||
// In either of these cases, we want to check if workloads should start to be instrumented. | ||
var SourcePredicates = predicate.Funcs{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment does not match the name.
Don't we check in the reconciler for IsActiveSource
? since we are not looking at the diff here, maybe worth just removing this predicate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, a predicate makes sense to filter out checking irrelevant Sources in the reconciler. We don't need to look at the diff to tell whether this Source is relevant. If anything I think we can remove the same check in the Reconcile function
Otherwise, both startlangdetection and deleteinstrumentationconfig simultaneously try to reconcile() every Source. It's a no-op on the irrelevant controller because of the check, but moving the check to a predicate shifts it further up the chain
76fc703
to
773dc66
Compare
instrumentor/controllers/deleteinstrumentationconfig/source_controller.go
Outdated
Show resolved
Hide resolved
instrumentor/controllers/deleteinstrumentationconfig/source_controller.go
Show resolved
Hide resolved
newReplicas := wNew.AvailableReplicas() | ||
oldReplicas := wOld.AvailableReplicas() | ||
|
||
// 1. workload became enabled and has available (running) replicas | ||
if becameEnabled && newReplicas > 0 { | ||
// 1. workload has available (running) replicas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will increase the amount of events the reconciler handles significantly (from only workloads instrumented by Odigos to all workloads), It doesn't matter fro correctness, I'm just wondering about performance.
In the reconcilers it looks like we'll do a few operations for each workload even those that are not instrumented (and for every update) - maybe we should call sourceutils.IsObjectInstrumentedBySource
as early as possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was originally leaning away from calling IsObjectInstrumentedBySource
because it means an extra API call before we even get to the reconciler. The concern is API server traffic. And since that's one of the first API calls we make inside the reconciler anyway, that tradeoff seemed like a wash.
But, for non-instrumented workloads, not filtering them would actually mean 3 unnecessary API calls: one to Reconcile()
, one to getWorkloadObject()
, and then IsObjectInstrumentedBySource
. In contrast, we already have the object in the Predicate, so we could filter out non-instrumented workloads with just 1 API call, like you're suggesting.
However, that means an extra unnecessary API call for every instrumented workload. And instrumented workloads are really what we care about. In fact, we hope that instrumented workloads should largely outnumber non-instrumented workloads for Odigos users :)
To put it in numbers, if a cluster has at least 4 instrumented workloads for every 1 non-instrumented workload, calling IsObjectInstrumentedBySource
in the predicate will be a net increase to unnecessary API server traffic
So I think the efficiency gain for non-instrumented workloads isn't worth the efficiency loss for workloads we actually care about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this analysis.
However, I have seen cases of instrumenting single or a few workloads in a cluster for a few reasons. It might be a new client just starting and doesn't want to commit to instrumenting all the applications, or maybe most of the apps are not relevant/ written in unsupported languages.
So, in those cases (which I think are not rare) - we will have some regression in terms of performance. I guess it also depends on how often deployments are updated in the cluster. If some automatic tools are involved I think that can also increase the amount of times the reconciler will trigger.
Not saying this should block or anything, just worth to consider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely a good consideration. I also think that the "extra" API call should usually be using the controller's cache, while the event getting passed to Reconcile()
is actual traffic every time. So I think my point above isn't always valid anyway.
I added a basic version of the previous check in b6ad6ac, ptal. Some notes:
- We don't have a context to use in the requests, so the
IsObjectInstrumentedBySource
call needs to just usecontext.Background
. - We can't retry on an error, so we need to either ignore the error and let the object through (where the controller can retry it -- say a client hiccup) or just block the request entirely. I think the former is safer so we don't miss relevant objects.
- While we're migrating, we still need to check the label here. That's fine, but I simplified the check in
Update
to just look at if the new object is instrumented instead of comparing if the old object was not instrumented. There's no equivalent way with Sources to know if the old object didn't have a Source, so I think that's fine.
Let me know what you think
…levant, check errs before deleting finalizers, update test README
b6ad6ac
to
cc31bbe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few nits, this looks great 🥇
if !instrumented || !workload.IsObjectLabeledForInstrumentation(e.Object) { | ||
return false | ||
} | ||
|
||
return w.AvailableReplicas() > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we can move this check before the API calls, to potentially save calls.
instrumented, _ := sourceutils.IsObjectInstrumentedBySource(context.Background(), i.Client, e.ObjectNew) | ||
if !instrumented || !workload.IsObjectLabeledForInstrumentation(e.ObjectNew) { | ||
return false | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same nit as above: maybe better to do the replicas check before this.
|
||
sourceutils "github.com/odigos-io/odigos/k8sutils/pkg/source" | ||
k8sutils "github.com/odigos-io/odigos/k8sutils/pkg/utils" | ||
"github.com/odigos-io/odigos/k8sutils/pkg/workload" | ||
) | ||
|
||
type NamespacesReconciler struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe worth marking this as deprecated reconciler
just making sure, we are going to remove it once we fully remove support for labels, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we should still have workload/namespace reconcilers. Since Sources are detached from the workloads, a workload can be redeployed or come up after the Source (this is by design). If that happens, we should trigger on the event for that workload too. Maybe we could deprecate the Namespace reconciler, since Sources need to exist in a namespace. But if we ever support cross-namespace Sources we'll have similar uses for it. It's also just good to have in case we miss a Source event
This adds a migration for the current
odigos-instrumentation
label to Source CRDs.If a workload or namespace has
odigos-instrumentation: enabled
, the startlangdetection controller will automatically create a Source for it (if one doesn't already exist) and log a deprecation warning.If a workload or namespace has
odigos-instrumentation: disabled
, the deleteinstrumentationconfig controller will automatically create a Source withdisableInstrumentation: true
.In both cases, if a Source already exists, the controller will update the existing Source. This means that the label can be toggled back and forth.
However, deleting the label will not inherently un-instrument a workload that doesn't have inheriting Namespace instrumentation. We can't tell if the label was deleted without checking the change during the event, which is bad to do. This is the only drop in feature parity from the current label.