-
Notifications
You must be signed in to change notification settings - Fork 130
Fix bug causing status to be cleared after adding Finalizer #804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Christian Artin <[email protected]>
Signed-off-by: Christian Artin <[email protected]>
Signed-off-by: Christian Artin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gravufo for the PR and sorry for the late response here!
The PR looks good to me in general, except adding Statused
to the managed
interface feels a bit off since there is ConditionedStatus
which also controls part of status and related to why unit test not working as expected (see my comment below).
If we want to proceed with the proposed approach, we will also need a controller tools PR so that all managed resources satisfy the extended interface.
Another solution could be, instead of changing managed reconciler, we could fix APIFinalizer
's AddFinalizer()
method such that it won't override the status. A simple deepcopy and using copied object in the Update call would be helpful, also eliminating the need to update managed interface/reconciler.
func (a *APIFinalizer) AddFinalizer(ctx context.Context, obj Object) error {
if meta.FinalizerExists(obj, a.finalizer) {
return nil
}
meta.AddFinalizer(obj, a.finalizer)
- return errors.Wrap(a.client.Update(ctx, obj), errUpdateObject)
+ // Copy object to ensure we don't modify the original object, e.g. not to lose status.
+ o := obj.DeepCopyObject().(Object)
+ return errors.Wrap(a.client.Update(ctx, o), errUpdateObject)
}
While this change looks simpler (and wouldn't require a crossplane tools), I am not comfortable since it will have a bigger blast radius due to APIFinalizer has other usages. If there are callers of that method assuming they got the latest object after the add finalizer call, they might get broken. For example, the next update call will fail since the object don't have the latest resource version any more 🤔 We could patch status in flight to latest meta+spec, but again things are getting weird as we are working with Objects (no schema).
Sooo, probably the direction in this PR is the best way to go providing a very scoped fix to the problem.
@@ -0,0 +1,29 @@ | |||
/* | |||
Copyright 2019 The Crossplane Authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
Copyright 2019 The Crossplane Authors. | |
Copyright 2025 The Crossplane Authors. |
// See https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties | ||
|
||
// A Status reflects the observed status of a resource. | ||
type Status map[string]runtime.RawExtension |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not simply runtime.RawExtension
? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think to be able to reach in and get/set conditions directly?
}, | ||
want: want{result: reconcile.Result{Requeue: true}}, | ||
}, | ||
"AddFinalizerShouldNotClearStatus": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to comment out managed.SetStatus(managedPreOp.GetStatus())
and expected this test to fail. But it passed again which implies that we don't actually test what we want. Upon further investigation, issue seems to be related how our mock stores the conditions (i.e. not changes status at all).
No problem, I didn't push too much on this one, because I am very hesitant with my proposed fix. The problem is that this fixes only a single use-case, but doesn't really fix all the other use-cases (wanting to modify status during a Create/Update for instance). Basically, any call that will update the object spec makes us lose the status sub-object modifications. In this case it's a little worse because we literally lose the full sub-object since it's a brand new MR which had no status inside kubernetes yet. I really don't know what the proper way forward would be to fix all use-cases. Fixing them one by one with a "hack" kind of feels weird and also it would be easy to reintroduce the problem if there are any new calls done during this flow in the future. I am open to ideas :) |
// GetStatus of this Composed resource. | ||
func (cr *Unstructured) GetStatus() xpv1.Status { | ||
status := xpv1.Statused{} | ||
if err := fieldpath.Pave(cr.Object).GetValueInto("status", &status); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am concerned about status.conditions
and .status
being stored independently.
|
||
// A Statused may have status set or retrieved. Status is typically | ||
// a struct containing various runtime status properties. | ||
type Statused interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if we need this... perhaps we just need Pave.SyncStatus(old, new)
?
// The AddFinalizer call may overwrite the managed object with what is | ||
// currently in the kube API which would make us lose information that may | ||
// have been populated by the Observe call. | ||
managed.SetStatus(managedPreOp.GetStatus()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we try to make a new pave function that copies the original pre-op status into managed rather than introduce the .Set/GetStatus?
I think it has the same effect and does not spread out the API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I absolutely don't mind, I simply followed initial advice provided on the issue in this comment.
Either way, I'm really unsure about this fix overall.
FWIW, we are good with other cases in the managed reconciler, and this (after observe, patching finalizer) is the only remaining one.
It is clear what we need to do, the question is simply how :D
I still lean on (1) but no strong opinions if we chose (2) or (3). @negz, interested to hear your thoughts. |
Maybe I didn't understand properly, but usually the create is only called once, so if that first time it's called we lose the status, we'll never restore what the Create wanted to save, no? |
You're right. However, in practice, I am not aware of any information that is unique to the Create call and not obtained in an Observe call. The only exception could be sensitive information that external API returning only after Create (e.g. API secret key), but those are stored into secrets as connection details instead of object status. So, the next Observe is expected to form the latest status by making a fresh observation and we're not worried about the status after Create getting lost. |
@turkenh and I chatted about this a bit the other day. I agree updating resource.Managed to have GetStatus and SetStatus methods would be the most idiomatic way to do it. That'd require a lot of updates across the ecosystem though, and ultimately those methods would need to return and accept opaque data. So maybe we may as well just use fieldpath.Pave? It kinda kills me that we have to do this (save and restore status) at all though. It seems like an anti-pattern in that it basically works around optimistic concurrency. If we ever added another controller that wanted to update MR status we'd have a race:
|
Description of your changes
This PR will fix the issue where the status sub-object would be cleared when adding the Finalizer. This would cause Observe-only resources to wait until the next SyncPeriod/PollPeriod to become Ready=true even though it should have been set at the first loop.
Fixes #791
Note: I am really unsure if these are the right places to do those changes, so please don't hesitate to leave comments pointing to the right locations.
I have:
earthly +reviewable
to ensure this PR is ready for review.backport release-x.y
labels to auto-backport this PR.