Update Apply{DaemonSet,Deployment} to rely on a hash of the spec by marun · Pull Request #773 · openshift/library-go

marun · 2020-04-16T05:10:37Z

This removes the need for the caller to supply an expected generation and be able to force a rollout. Any state that is not present in the spec should instead be added as an annotation so that a rollout will be triggered when the annotation changes.

This is intentionally a breaking change to encourage switching to the revised function. Callers who want to continue using the previous (deprecated) implementation will need to change the name of the function they are calling to ApplyDeploymentWithForce.

Canaries:

marun · 2020-04-16T05:10:51Z

/cc @deads2k @jsafrane

bertinatto

@marun also, we probably want the behaviour of ApplyDaemonSet() to match ApplyDeployment().

Instead of doing a breaking change, what about introducing a new function that adds the annotation and calls Apply[Deployment|DaemonSet]()?

marun · 2020-04-16T18:59:06Z

@marun also, we probably want the behaviour of ApplyDaemonSet() to match ApplyDeployment().

For sure. Want to make sure the approach is sound first.

Instead of doing a breaking change, what about introducing a new function that adds the annotation and calls Apply[Deployment|DaemonSet]()?

As per the PR description, the breakage is intentional to force callers to choose between keeping the old behavior (for now) or adapting to the new. I don't think this is too much to ask - a quick survey of the code of existing operators shows that the majority have only a single call to ApplyDeployment while some have at most a few.

jsafrane · 2020-04-17T07:43:14Z

While I am in favor of this change, it breaks unit tests - a new annotation appears and needs to be added on operator input (when the test checks that the operator does nothing when nothing is expected) and ignored when comparing expected / actual objects on output.

Exposing the hash computation as a public function would help to fix the unit tests.

bertinatto · 2020-04-17T08:22:49Z

@marun also, we probably want the behaviour of ApplyDaemonSet() to match ApplyDeployment().

For sure. Want to make sure the approach is sound first.

Instead of doing a breaking change, what about introducing a new function that adds the annotation and calls Apply[Deployment|DaemonSet]()?

As per the PR description, the breakage is intentional to force callers to choose between keeping the old behavior (for now) or adapting to the new. I don't think this is too much to ask - a quick survey of the code of existing operators shows that the majority have only a single call to ApplyDeployment while some have at most a few.

Right, I guess I'm not a big fan of a breaking change here, but we'll use whatever approach you decide.

bertinatto · 2020-04-17T08:35:20Z

pkg/operator/resource/resourceapply/apps.go

+	if err != nil {
+		return nil, false, err
+	}
+	specHash := fmt.Sprintf("%x", sha256.Sum256(jsonBytes))


The deployment controller creates the replicaSet with a hash in its name. This hash is calculated from the deployment's pod template [0]. Same for the daemonSet controller.

It's not the same as what we're doing here, as we're hashing the whole deployment's spec, but it'd be nice to use a similar approach. For that, we could either:

Hash the deployment.Spec.Template as well, and use the publicly available utility function [1] that does that. The downside is that ApplyDeployment() wouldn't update the deployment if we change the number of replicas, for instance.

Create our own hashing function based on the approach of [1] that calculates the deployment.Spec hash (as opposed to deployment.Spec.Template).

Does this make sense?

[0] https://github.com/kubernetes/kubernetes/blob/98e65951dccfd40d3b4f31949c2ab8df5912d93e/pkg/controller/deployment/sync.go#L189
[1] https://github.com/kubernetes/kubernetes/blob/98e65951dccfd40d3b4f31949c2ab8df5912d93e/pkg/controller/controller_utils.go#L1129

deployment controller: hash to come up with a unique replicaset name
operator: hash to ensure that a change in intent results in a rollout

Given that the use cases are different, why would it be desirable to match the implementation of the deployment controller?
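The trade-off between the two options can be illustrated with a minimal sketch. It assumes simplified, hypothetical stand-ins (`deploymentSpec`, `podTemplate`) for the real `appsv1` types and reuses the JSON plus SHA-256 approach from the diff; it is not the code from this PR or from the deployment controller.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// podTemplate and deploymentSpec are hypothetical, trimmed-down stand-ins
// for the appsv1 types; only the fields needed for the comparison exist.
type podTemplate struct {
	Image string `json:"image"`
}

type deploymentSpec struct {
	Replicas int32       `json:"replicas"`
	Template podTemplate `json:"template"`
}

// hashOf serializes v to JSON and returns the hex-encoded SHA-256 digest,
// following the fmt.Sprintf("%x", sha256.Sum256(jsonBytes)) line in the diff.
func hashOf(v interface{}) (string, error) {
	jsonBytes, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(jsonBytes)), nil
}

func main() {
	a := deploymentSpec{Replicas: 1, Template: podTemplate{Image: "example:v1"}}
	b := a
	b.Replicas = 3 // only the replica count differs

	specA, _ := hashOf(a)
	specB, _ := hashOf(b)
	tmplA, _ := hashOf(a.Template)
	tmplB, _ := hashOf(b.Template)

	fmt.Println("whole-spec hash changed:", specA != specB)
	fmt.Println("template-only hash changed:", tmplA != tmplB)
}
```

Hashing only the template would miss the replica change, which is the downside noted for the first option.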

marun · 2020-04-17T14:57:24Z

While I am in favor of this change, it breaks unit tests - a new annotation appears and needs to be added on operator input (when the test checks that the operator does nothing when nothing is expected) and ignored when comparing expected / actual objects on output.

I think a unit test that cares about an internal detail of ApplyDeployment is a brittle test. Why not just strip the annotation for the purposes of comparison?

jsafrane · 2020-04-17T16:40:56Z

The unit test needs to add the annotation to the fake API server object, so when ApplyDeployment gets it (as existing), the annotation is there.

Stripping the annotation after the test, when comparing results, is a good idea, but it's orthogonal to ^.

deads2k · 2020-04-17T19:43:45Z

pkg/operator/resource/resourceapply/apps.go

+// ApplyDeployment merges ObjectMeta of the provided deployment with an existing one
+// if it exists and updates the API if the deployment spec and metadata differ. To
+// ensure an update in response to state external to the deployment spec, the caller
+// should set an annotation representing that external state.


give an example of such an annotation
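One possible shape for such an annotation, as a sketch only: a digest of configuration that lives outside the deployment spec (for instance the data of a ConfigMap the pods mount). The annotation key `example.com/config-hash` and both helper functions are hypothetical and are not defined by library-go.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// configHashAnnotation is a hypothetical key chosen for this sketch.
const configHashAnnotation = "example.com/config-hash"

// hashConfigData returns a deterministic digest of a ConfigMap-like map.
// Keys are sorted so that map iteration order cannot change the result.
func hashConfigData(data map[string]string) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		// Length-prefix each field so that different key/value splits of
		// the same bytes cannot produce the same input to the hash.
		fmt.Fprintf(h, "%d:%s=%d:%s;", len(k), k, len(data[k]), data[k])
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

// setConfigHashAnnotation records the digest of the external state in the
// annotations of the required object, so that a change in that state also
// changes the object the operator asks to apply.
func setConfigHashAnnotation(annotations, data map[string]string) map[string]string {
	if annotations == nil {
		annotations = map[string]string{}
	}
	annotations[configHashAnnotation] = hashConfigData(data)
	return annotations
}

func main() {
	before := setConfigHashAnnotation(nil, map[string]string{"loglevel": "2"})
	after := setConfigHashAnnotation(nil, map[string]string{"loglevel": "4"})
	fmt.Println("annotation changed:", before[configHashAnnotation] != after[configHashAnnotation])
}
```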

deads2k · 2020-04-17T19:45:34Z

pkg/operator/resource/resourceapply/apps.go

+const specHashAnnotation = "operator.openshift.io/spec-hash"
+
+// ApplyDeployment merges ObjectMeta of the provided deployment with an existing one
+// if it exists and updates the API if the deployment spec and metadata differ. To


if the deployment spec or metadata differ from the previously required spec and metadata. To be reliable, the input of the required spec from the operator should be stable. It does not need to set all fields, since some fields are defaulted server-side.

Detection of spec drift from intent by other actors is determined by generation, not by spec comparison.

deads2k · 2020-04-17T19:47:51Z

pkg/operator/resource/resourceapply/apps.go

+	}
+	required.Annotations[specHashAnnotation] = specHash
+
+	return ApplyDeploymentWithForce(client, recorder, required, 0, false)


I think you need an expected generation in order to know if another actor mutated the spec. The hash of the input spec lets you know if the operator decided a new value was needed (--loglevel for instance) and the generation lets you know if an external actor changed the deployment spec.

Because defaults can change on the server (new field appearing in a new level of kube for instance), you cannot reliably read and compare a previous hash.
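The two signals described in this comment can be combined as in the following sketch. `existingState` and `needsUpdate` are hypothetical names introduced for illustration; only the annotation key comes from the diff, and the actual decision logic in library-go may differ.

```go
package main

import "fmt"

// specHashAnnotation is the key shown in the diff under review.
const specHashAnnotation = "operator.openshift.io/spec-hash"

// existingState is a hypothetical, trimmed-down view of the live object.
type existingState struct {
	Generation  int64
	Annotations map[string]string
}

// needsUpdate reports whether the operator should write its required spec:
//   - a different stored hash means the operator's own intent changed;
//   - a generation other than the one the operator last observed means some
//     other actor changed the spec.
//
// The live spec itself is never compared, since server-side defaulting can
// add fields the operator did not set.
func needsUpdate(existing existingState, requiredSpecHash string, expectedGeneration int64) bool {
	intentChanged := existing.Annotations[specHashAnnotation] != requiredSpecHash
	externallyModified := existing.Generation != expectedGeneration
	return intentChanged || externallyModified
}

func main() {
	live := existingState{
		Generation:  4,
		Annotations: map[string]string{specHashAnnotation: "abc"},
	}
	fmt.Println("in sync:", !needsUpdate(live, "abc", 4))
	fmt.Println("operator changed intent:", needsUpdate(live, "def", 4))
	fmt.Println("external actor edited spec:", needsUpdate(live, "abc", 3))
}
```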

deads2k · 2020-04-17T20:21:11Z

@marun also, we probably want the behaviour of ApplyDaemonSet() to match ApplyDeployment().

For sure. Want to make sure the approach is sound first.

Instead of doing a breaking change, what about introducing a new function that adds the annotation and calls Apply[Deployment|DaemonSet]()?

As per the PR description, the breakage is intentional to force callers to choose between keeping the old behavior (for now) or adapting to the new. I don't think this is too much to ask - a quick survey of the code of existing operators shows that the majority have only a single call to ApplyDeployment while some have at most a few.

Right, I guess I'm not a big fan of a breaking change here, but we'll use whatever approach you decide.

@bertinatto we're sensitive to the fact that this causes some pain, but working with @jsafrane I think we've found a better pattern for handling updates in a reliable way that works how our callers really expect them to run.

To make the change "easy" to adopt, we'll first make the change obvious and @marun should point to the renamed legacy method for a release before we remove or hide it. Given that the old method defied expectations, I think it's better in this case to break compilation and make callers look at the doc briefly to decide about updating right then or switching to the older method.

marun · 2020-04-18T22:09:21Z

The unit test needs to add the annotation to the fake API server object, so when ApplyDeployment gets it (as existing), the annotation is there.

Stripping the annotation after the test, when comparing results, is a good idea, but it's orthogonal to ^.

Understood. Updated to expose SetSpecHashAnnotation.
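A sketch of how a unit test might use an exposed helper for both concerns raised above: seeding the fake "existing" object with the annotation, and stripping it before comparing results. The types are simplified stand-ins, `setSpecHashAnnotation` only imitates the call shape seen in the diff (the real function body is not shown in this thread), and `withoutSpecHash` is a hypothetical test helper.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
	"reflect"
)

const specHashAnnotation = "operator.openshift.io/spec-hash"

// objectMeta, spec and deployment are hypothetical stand-ins for API types.
type objectMeta struct {
	Name        string
	Annotations map[string]string
}

type spec struct {
	Replicas int32 `json:"replicas"`
}

type deployment struct {
	ObjectMeta objectMeta
	Spec       spec
}

// setSpecHashAnnotation stores a SHA-256 digest of the JSON-encoded spec.
// Assumption: this mirrors the intent of the exposed helper, not its code.
func setSpecHashAnnotation(meta *objectMeta, s interface{}) error {
	jsonBytes, err := json.Marshal(s)
	if err != nil {
		return err
	}
	if meta.Annotations == nil {
		meta.Annotations = map[string]string{}
	}
	meta.Annotations[specHashAnnotation] = fmt.Sprintf("%x", sha256.Sum256(jsonBytes))
	return nil
}

// withoutSpecHash returns a copy of d whose annotations omit the spec hash,
// so expected and actual objects can be compared without that detail.
func withoutSpecHash(d deployment) deployment {
	out := d
	out.ObjectMeta.Annotations = nil
	for k, v := range d.ObjectMeta.Annotations {
		if k == specHashAnnotation {
			continue
		}
		if out.ObjectMeta.Annotations == nil {
			out.ObjectMeta.Annotations = map[string]string{}
		}
		out.ObjectMeta.Annotations[k] = v
	}
	return out
}

func main() {
	expected := deployment{ObjectMeta: objectMeta{Name: "driver"}, Spec: spec{Replicas: 1}}

	// Seed the object that the fake API server would return as "existing".
	existing := expected
	if err := setSpecHashAnnotation(&existing.ObjectMeta, existing.Spec); err != nil {
		panic(err)
	}

	fmt.Println("equal after stripping:", reflect.DeepEqual(expected, withoutSpecHash(existing)))
}
```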

deads2k · 2020-04-20T12:03:30Z

pkg/operator/resource/resourceapply/apps.go

+func ApplyDeployment(client appsclientv1.DeploymentsGetter, recorder events.Recorder,
+	required *appsv1.Deployment, expectedGeneration int64) (*appsv1.Deployment, bool, error) {
+
+	err := SetSpecHashAnnotation(&required.ObjectMeta, required.Spec)


need to make a copy of required so you don't mutate your input to ApplyDeployment which would be unexpected.

Done. Also added a commit to ensure ApplyDeploymentWithForce and ApplyDaemonSet copy before mutating.
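The copy-before-mutate pattern requested in this review can be sketched as follows. The `deployment` type, its `deepCopy` method and `applyWithHash` are hypothetical; on real API types the generated `DeepCopy()` method would play the role of `deepCopy` here.

```go
package main

import "fmt"

// deployment is a hypothetical stand-in for an API object.
type deployment struct {
	Name        string
	Annotations map[string]string
}

// deepCopy duplicates the struct and its map so that the copy shares no
// mutable state with the original.
func (d *deployment) deepCopy() *deployment {
	out := &deployment{Name: d.Name}
	if d.Annotations != nil {
		out.Annotations = make(map[string]string, len(d.Annotations))
		for k, v := range d.Annotations {
			out.Annotations[k] = v
		}
	}
	return out
}

// applyWithHash annotates a copy of required and returns that copy; the
// caller's object is left exactly as it was passed in.
func applyWithHash(required *deployment, specHash string) *deployment {
	toWrite := required.deepCopy()
	if toWrite.Annotations == nil {
		toWrite.Annotations = map[string]string{}
	}
	toWrite.Annotations["operator.openshift.io/spec-hash"] = specHash
	return toWrite
}

func main() {
	required := &deployment{Name: "driver", Annotations: map[string]string{"owner": "operator"}}
	written := applyWithHash(required, "abc")

	fmt.Println("input annotations:", len(required.Annotations))
	fmt.Println("written annotations:", len(written.Annotations))
}
```

A plain struct assignment would not be enough here, because the copy would still share the `Annotations` map with the caller's object.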

This removes the need for the caller to be able to force a rollout. Any state that is not present in the spec should instead be added as an annotation so that a rollout will occur when the external state changes. This is an intentional breaking change to encourage switching to the revised function. Callers who want to continue using the previous (deprecated) implementation will need to change the name of the function they are calling to ApplyDeploymentWithForce.

This removes the need for the caller to be able to force a rollout. Any state that is not present in the spec should instead be added as an annotation so that a rollout will occur when the external state changes. This is an intentional breaking change to encourage switching to the revised function. Callers who want to continue using the previous (deprecated) implementation will need to change the name of the function they are calling to ApplyDaemonSetWithForce.

marun · 2020-04-20T19:51:02Z

Added a new commit to perform the same change to ApplyDaemonSet.

deads2k · 2020-04-21T12:09:14Z

/lgtm

openshift-ci-robot · 2020-04-21T12:10:38Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, marun


Needs approval from an approver in each of these files:

~~OWNERS~~ [deads2k]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

marun · 2020-04-23T07:15:20Z

/cherry-pick release-4.4

openshift-cherrypick-robot · 2020-04-23T07:15:34Z

@marun: #773 failed to apply on top of branch "release-4.4":

Applying: Update ApplyDeployment to rely on a hash of the spec
error: Failed to merge in the changes.
Using index info to reconstruct a base tree...
M	pkg/operator/resource/resourceapply/apps.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/operator/resource/resourceapply/apps.go
CONFLICT (content): Merge conflict in pkg/operator/resource/resourceapply/apps.go
Patch failed at 0002 Update ApplyDeployment to rely on a hash of the spec


openshift-ci-robot requested review from deads2k, jsafrane and smarterclayton April 16, 2020 05:10

bertinatto reviewed Apr 16, 2020

bertinatto reviewed Apr 17, 2020

deads2k reviewed Apr 17, 2020

deads2k reviewed Apr 20, 2020

marun added 3 commits April 20, 2020 12:06

Fix Apply{Deployment,Deployment} to copy input before mutatating

a20f1d9

marun changed the title ~~Update ApplyDeployment to rely on a hash of the spec~~ Update Apply{DaemonSet,Deployment} to rely on a hash of the spec Apr 20, 2020

openshift-ci-robot assigned deads2k Apr 21, 2020

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 21, 2020

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 21, 2020

openshift-merge-robot merged commit 346ac43 into openshift:master Apr 21, 2020

marun mentioned this pull request Apr 23, 2020

[release-4.4] Update Apply{DaemonSet,Deployment} to rely on a hash of the spec #782

Merged

jsafrane mentioned this pull request Apr 23, 2020

Update library-go for Apply{DaemonSet,Deployment} fixes openshift/aws-ebs-csi-driver-operator#48

Closed

bertinatto pushed a commit to bertinatto/library-go that referenced this pull request Jul 2, 2020

Merge pull request openshift#773 from marun/apply-deployment

2f0bcdb

Update Apply{DaemonSet,Deployment} to rely on a hash of the spec
