Bug 1870158: Use mergo to merge webhooks instead of custom apply #707
This replaces our custom apply logic with a simpler merging algorithm, implemented using mergo. The idea is that we should be able to preserve things like the CABundle, which is set outside the scope of this reconciliation, but overwrite the webhooks to a known expected configuration whenever we reconcile and find a difference.
I've done some manual testing of this and am happy with the way it's working: I can update fields and it puts everything back to the way we expect. If I modify a field that isn't managed, it doesn't replace it; this is expected and desired. Anything we deem critical to the configuration should be managed by the operator.
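As a rough illustration of the managed versus unmanaged distinction described above, here is a minimal sketch in plain Go. It does not use mergo itself; it hand-rolls the "only fill fields the expected object leaves empty" behaviour for a single struct. The `Webhook` type, its fields, and `fillEmpty` are hypothetical stand-ins, not the operator's real types.

```go
package main

import "fmt"

// Webhook is a hypothetical, trimmed-down stand-in for a webhook entry.
type Webhook struct {
	Name          string
	CABundle      []byte // set by something else, not managed by this reconcile
	FailurePolicy string // managed: always set in the expected configuration
}

// fillEmpty copies a field from current into expected only when expected
// leaves that field empty, so every field set in expected wins.
func fillEmpty(expected *Webhook, current Webhook) {
	if expected.Name == "" {
		expected.Name = current.Name
	}
	if len(expected.CABundle) == 0 {
		expected.CABundle = current.CABundle
	}
	if expected.FailurePolicy == "" {
		expected.FailurePolicy = current.FailurePolicy
	}
}

func main() {
	expected := Webhook{Name: "example.webhook", FailurePolicy: "Ignore"}
	current := Webhook{Name: "example.webhook", CABundle: []byte("ca"), FailurePolicy: "Fail"}

	fillEmpty(&expected, current)

	// The unmanaged CABundle is carried over; the managed FailurePolicy is reset.
	fmt.Printf("%s %q %s\n", expected.Name, expected.CABundle, expected.FailurePolicy)
}
```

The trade-off this makes visible is the one raised in review: any field the expected object does not set is kept from the live object, whether or not that was intended.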
michaelgugino
left a comment
I don't understand why we're doing merges. We know what field we want to persist. We should scrape any info we want to persist (ca certs) from the old copy if present and inject into the new copy, and we should just apply the object definition.
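For comparison, the alternative suggested in this comment could look roughly like the sketch below: copy only the CA bundle from the old object into the new one, then apply the new object wholesale. The `Webhook` type and `injectCABundles` are hypothetical names for illustration, and matching webhooks by name is an assumption.

```go
package main

import "fmt"

// Webhook is a hypothetical, trimmed-down stand-in for a webhook entry.
type Webhook struct {
	Name     string
	CABundle []byte
}

// injectCABundles copies the CA bundle from each existing webhook into the
// desired webhook with the same name. Desired webhooks with no existing
// counterpart are left untouched, and extra existing webhooks are ignored.
func injectCABundles(desired, existing []Webhook) {
	bundles := make(map[string][]byte, len(existing))
	for _, w := range existing {
		bundles[w.Name] = w.CABundle
	}
	for i := range desired {
		if b, ok := bundles[desired[i].Name]; ok {
			desired[i].CABundle = b
		}
	}
}

func main() {
	desired := []Webhook{{Name: "a"}, {Name: "b"}}
	existing := []Webhook{{Name: "a", CABundle: []byte("ca-a")}, {Name: "user-added"}}

	injectCABundles(desired, existing)

	for _, w := range desired {
		fmt.Printf("%s %q\n", w.Name, w.CABundle)
	}
}
```

With this approach nothing else from the old copy survives, so user-added webhooks, labels, or rule changes would be dropped on the next apply.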
Danil-Grigorev
left a comment
I mostly like this implementation for its simplicity, but I have concerns about its flexibility, and about the fact that we are moving away from both "spec hashing" and generation comparison. Objectively, the latter would allow us to simplify the code even more and delegate the task of resource comparison to the server, with all its defaulting, etc. If we are ok with some things being preserved just because we don't set them, then it is fine, I guess.
    }

    // The webhook already exists, so merge the existing fields with the desired fields
    if err := mergo.Merge(expected, current); err != nil {
Is it going to preserve custom labels from the current resource that we don't set? Are we ok with that? cc @michaelgugino
Yes, it will. Custom labels and annotations, or even finalizers should someone wish to add one, will be preserved. I think this is desirable. We should try to be nice and allow users to modify things if they so desire, as long as it doesn't affect the operation of our webhooks, which in this case I don't think it will.
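To make the label behaviour concrete, here is a small sketch of a map merge with the semantics described in this reply: keys we set win, and keys only present on the live object are kept. `mergeLabels` is a hypothetical helper written for illustration and is not taken from the PR.

```go
package main

import "fmt"

// mergeLabels returns the labels that would result from merging: every key
// from current is kept, and any key also present in expected takes the
// expected value.
func mergeLabels(expected, current map[string]string) map[string]string {
	merged := make(map[string]string, len(expected)+len(current))
	for k, v := range current {
		merged[k] = v
	}
	for k, v := range expected {
		merged[k] = v
	}
	return merged
}

func main() {
	expected := map[string]string{"managed-by": "operator"}
	current := map[string]string{"managed-by": "someone-else", "team": "custom"}

	merged := mergeLabels(expected, current)
	fmt.Println(merged["managed-by"], merged["team"])
}
```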
    withCABundle := defaultConfiguration.DeepCopy()
    for i, webhook := range withCABundle.Webhooks {
    webhook.ClientConfig.CABundle = []byte("test")
    webhook.TimeoutSeconds = pointer.Int32Ptr(10)
I don't think this should be there. If someone changes the default timeout to 10 seconds, we should bring it back to the one we want to see there (even if the values are equal, if we don't set it ourselves we still need to apply it).
Does the timeout affect the operation of the webhook? If you think having the timeout as a particular value is critical to the operation of the webhooks then we should be forcing it. I'm currently leaning towards we don't really care about this value (hence using it in the tests), but am open to discussing it if you feel strongly about this.
We discussed this yesterday in the arch call so would be good to know if you've changed your mind at all, but to summarise my thoughts on this:
I think being more prescriptive in this case is actually beneficial. For example, in the current merge algorithm, users can (IIRC) add extra verbs to the rules and the algorithm will keep them, or they can add extra webhooks to the list and these will be preserved. I would argue that this is not desirable behaviour and that we should be dropping these extra items that the users are adding.

We already have generation comparison, though this PR is removing it. I don't think having just generation comparison is a viable solution here. With or without generation comparison, we still need some way to merge the current and desired specifications. All generation comparison gets us (and we could include it with this method) is that we short circuit earlier if we notice that the generation hasn't changed since the last time we observed it.

I'm on the fence about how much we should or shouldn't be relying on the server to do the defaulting. Bear in mind that if everyone takes the approach of throwing it at the API server and having it handle all of the logic, the API server in a cluster will become much busier than it may need to be. Long term, if we can handle some of this logic ourselves and determine that we actually have nothing to patch, we can do some of the work and take this load off the API server, spreading the workload out and theoretically making the entire system more stable.
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: enxebre
@JoelSpeed: This pull request references Bugzilla bug 1870158, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
/lgtm
@JoelSpeed: All pull requests linked via external trackers have merged: Bugzilla bug 1870158 has been moved to the MODIFIED state.
I will update this description once I have tested this on a real cluster, hopefully today, but I think this should be ok as it is.
I have one concern, which is that we are using Update to update the resource. This could cause the webhooks to keep increasing their generation, as we will always call update even if it's a no-op. An alternative which would reduce API calls is to switch to using a patch. If the testing reveals that this is an issue, I will investigate switching to patching, which should allow us to avoid sending an API call whenever there is nothing to be done.
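One way to avoid the no-op call described above, independent of whether Update or a patch is used, is to compare the merged result with the live object and only send a request when they differ. The sketch below uses `reflect.DeepEqual` from the standard library on hypothetical `Webhook` and `WebhookConfig` types; it is an illustration of the idea, not what the PR does.

```go
package main

import (
	"fmt"
	"reflect"
)

// Webhook and WebhookConfig are hypothetical, trimmed-down stand-ins.
type Webhook struct {
	Name     string
	CABundle []byte
}

type WebhookConfig struct {
	Labels   map[string]string
	Webhooks []Webhook
}

// needsUpdate reports whether the merged configuration differs from the one
// already on the cluster. Note that reflect.DeepEqual treats a nil slice or
// map as different from an empty one, which may cause spurious updates.
func needsUpdate(current, merged WebhookConfig) bool {
	return !reflect.DeepEqual(current, merged)
}

func main() {
	current := WebhookConfig{Webhooks: []Webhook{{Name: "a", CABundle: []byte("ca")}}}
	same := WebhookConfig{Webhooks: []Webhook{{Name: "a", CABundle: []byte("ca")}}}
	drifted := WebhookConfig{Webhooks: []Webhook{{Name: "a"}}}

	fmt.Println(needsUpdate(current, same))    // nothing to send
	fmt.Println(needsUpdate(current, drifted)) // an API call is warranted
}
```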