Skip to content

Update Apply{DaemonSet,Deployment} to rely on a hash of the spec#773

Merged
openshift-merge-robot merged 3 commits intoopenshift:masterfrom
marun:apply-deployment
Apr 21, 2020
Merged

Update Apply{DaemonSet,Deployment} to rely on a hash of the spec#773
openshift-merge-robot merged 3 commits intoopenshift:masterfrom
marun:apply-deployment

Conversation

@marun
Copy link
Contributor

@marun marun commented Apr 16, 2020

This removes the need for the caller to supply an expected generation and be able to force a rollout. Any state that is not present in the spec should instead be added as an annotation so that a rollout will be triggered when the annotation changes.

This is intentionally a breaking change to encourage switching to the revised function. Callers who want to continue using the previous (deprecated) implementation will need to change the name of the function they are calling to ApplyDeploymentWithForce.

Canaries:

@marun
Copy link
Contributor Author

marun commented Apr 16, 2020

/cc @deads2k @jsafrane

Copy link
Member

@bertinatto bertinatto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marun also, we probably want the behaviour of ApplyDaemonSet() to match ApplyDeployment().

Instead of doing a breaking change, what about introducing a new function adds the annotation and calls Apply[Deployment|DaemonSet]()?

@marun
Copy link
Contributor Author

marun commented Apr 16, 2020

@marun also, we probably want the behaviour of ApplyDaemonSet() to match ApplyDeployment().

For sure. Want to make sure the approach is sound first.

Instead of doing a breaking change, what about introducing a new function adds the annotation and calls Apply[Deployment|DaemonSet]()?

As per the PR description, the breakage is intentional to force callers to choose between keeping the old behavior (for now) or adapting to the new. I don't think this is too much to ask - a quick survey of the code of existing operators shows that majority have only a single call to ApplyDeployment while some have at most a few.

@jsafrane
Copy link
Contributor

While I am in favor of this change, it breaks unit tests - a new annotations appears and needs to be be added on operator input (when the tests checks that operator does nothing when nothing is expected) and ignored when comparing expected / actual objects on output.

Exposing the hash computation as a public function would help to fix the unit tests.

@bertinatto
Copy link
Member

@marun also, we probably want the behaviour of ApplyDaemonSet() to match ApplyDeployment().

For sure. Want to make sure the approach is sound first.

Instead of doing a breaking change, what about introducing a new function adds the annotation and calls Apply[Deployment|DaemonSet]()?

As per the PR description, the breakage is intentional to force callers to choose between keeping the old behavior (for now) or adapting to the new. I don't think this is too much to ask - a quick survey of the code of existing operators shows that majority have only a single call to ApplyDeployment while some have at most a few.

Right, I guess I'm not a big fan of a breaking change here, but we'll use whatever approach you decide.

if err != nil {
return nil, false, err
}
specHash := fmt.Sprintf("%x", sha256.Sum256(jsonBytes))
Copy link
Member

@bertinatto bertinatto Apr 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deployment controller creates the replicaSet with a hash in its name. This hash is calculated from the deployment's pod template [0]. Same for the daemonSet controller.

It's not the same as we're doing here, as we're hashing the whole deployment's spec, but it'd be nice to use a similar approach. For that, we could either:

  1. Hash the deployment.Spec.Template as well, and use the public available utility function [1] that does that. The downside is that ApplyDeployment() wouldn't update the deployment if we change the number of replicas, for instance.

  2. Create our own hashing function based on the approach of[1] that calculates the deployment.Spec hash (as opposed to deployment.Spec.Template.

Does this make sense?

[0] https://github.com/kubernetes/kubernetes/blob/98e65951dccfd40d3b4f31949c2ab8df5912d93e/pkg/controller/deployment/sync.go#L189
[1] https://github.com/kubernetes/kubernetes/blob/98e65951dccfd40d3b4f31949c2ab8df5912d93e/pkg/controller/controller_utils.go#L1129

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deployment controller: hash to come up with a unique replicaset name
operator: hash to ensure that a change in intent results in a rollout

Given that the use cases are different, why would it be desirable to match the implementation of the deployment controller?

@marun
Copy link
Contributor Author

marun commented Apr 17, 2020

While I am in favor of this change, it breaks unit tests - a new annotations appears and needs to be be added on operator input (when the tests checks that operator does nothing when nothing is expected) and ignored when comparing expected / actual objects on output.

I think a unit test that cares about an internal detail of ApplyDeployment is a brittle test. Why not just strip the annotation for the purposes of comparison?

@jsafrane
Copy link
Contributor

The unit test needs to add the annotation to fake API server object, so when ApplyDeployment gets it (as existing), the annotation is there.

Stripping the annotation after the test, when comparing result is a good idea, but it's orthogonal to ^.

// ApplyDeployment merges ObjectMeta of the provided deployment with an existing one
// if it exists and updates the API if the deployment spec and metadata differ. To
// ensure an update in response to state external to the deployment spec, the caller
// should set an annotation representing that external state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

give an example of such an annotation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

const specHashAnnotation = "operator.openshift.io/spec-hash"

// ApplyDeployment merges ObjectMeta of the provided deployment with an existing one
// if it exists and updates the API if the deployment spec and metadata differ. To
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the deployment spec or metadata differ from the previously required spec and metadata. To be reliable, the input of the required spec from the operator should be stable. It does not need to set all fields, since some fields are defaulted server-side.

Detection of spec drift from intent by other actors is determined by generation, not by spec comparison.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
required.Annotations[specHashAnnotation] = specHash

return ApplyDeploymentWithForce(client, recorder, required, 0, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need an expected generation in order to know if another actor mutated the spec. The hash of the input spec lets you know if the operator decided a new value was needed (--loglevel for instance) and the generation lets you know if an external actor changed the deployment spec.

Because defaults can change on the server (new field appearing in a new level of kube for instance), you cannot reliably read and compare a previous hash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@deads2k
Copy link
Contributor

deads2k commented Apr 17, 2020

@marun also, we probably want the behaviour of ApplyDaemonSet() to match ApplyDeployment().

For sure. Want to make sure the approach is sound first.

Instead of doing a breaking change, what about introducing a new function adds the annotation and calls Apply[Deployment|DaemonSet]()?

As per the PR description, the breakage is intentional to force callers to choose between keeping the old behavior (for now) or adapting to the new. I don't think this is too much to ask - a quick survey of the code of existing operators shows that majority have only a single call to ApplyDeployment while some have at most a few.

Right, I guess I'm not a big fan of a breaking change here, but we'll use whatever approach you decide.

@bertinatto we're sensitive to the fact that this causes some pain, but working with @jsafrane I think we've found a better pattern for handling updates in a reliable way that works how our callers really expect them to run.

To make the change "easy" to adopt, we'll first make the change obvious and @marun should point to the renamed legacy method for a release before we remove or hide it. Given that the old method defied expectations, I think it's better in this case to break compilation and make callers look at the doc briefly to decide about updating right then or switching to the older method.

@marun
Copy link
Contributor Author

marun commented Apr 18, 2020

The unit test needs to add the annotation to fake API server object, so when ApplyDeployment gets it (as existing), the annotation is there.

Stripping the annotation after the test, when comparing result is a good idea, but it's orthogonal to ^.

Understood. Updated to expose SetSpecHashAnnotation.

func ApplyDeployment(client appsclientv1.DeploymentsGetter, recorder events.Recorder,
required *appsv1.Deployment, expectedGeneration int64) (*appsv1.Deployment, bool, error) {

err := SetSpecHashAnnotation(&required.ObjectMeta, required.Spec)
Copy link
Contributor

@deads2k deads2k Apr 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to make a copy of required so you don't mutate your input to ApplyDeployment which would be unexpected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also added a commit to ensure ApplyDeploymentWithForce and ApplyDaemonset copy before mutating.

marun added 3 commits April 20, 2020 12:06
This removes the need for the caller to be able to force a
rollout. Any state that is not present in the spec should instead be
added as an annotation so that a rollout will occur when the external
state changes.

This is an intentional breaking change to encourage switching to the
revised function. Callers who want to continue using the
previous (deprecated) implementation will need to change the name of
the function they are calling to ApplyDeploymentWithForce.
This removes the need for the caller to be able to force a
rollout. Any state that is not present in the spec should instead be
added as an annotation so that a rollout will occur when the external
state changes.

This is an intentional breaking change to encourage switching to the
revised function. Callers who want to continue using the
previous (deprecated) implementation will need to change the name of
the function they are calling to ApplyDaemonSetWithForce.
@marun marun changed the title Update ApplyDeployment to rely on a hash of the spec Update Apply{DaemonSet,Deployment} to rely on a hash of the spec Apr 20, 2020
@marun
Copy link
Contributor Author

marun commented Apr 20, 2020

Added a new commit to perform the same change to ApplyDaemonSet.

@deads2k
Copy link
Contributor

deads2k commented Apr 21, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 21, 2020
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, marun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 21, 2020
@openshift-merge-robot openshift-merge-robot merged commit 346ac43 into openshift:master Apr 21, 2020
@marun
Copy link
Contributor Author

marun commented Apr 23, 2020

/cherry-pick release-4.4

@openshift-cherrypick-robot

@marun: #773 failed to apply on top of branch "release-4.4":

Applying: Update ApplyDeployment to rely on a hash of the spec
error: Failed to merge in the changes.
Using index info to reconstruct a base tree...
M	pkg/operator/resource/resourceapply/apps.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/operator/resource/resourceapply/apps.go
CONFLICT (content): Merge conflict in pkg/operator/resource/resourceapply/apps.go
Patch failed at 0002 Update ApplyDeployment to rely on a hash of the spec

Details

In response to this:

/cherry-pick release-4.4

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

bertinatto pushed a commit to bertinatto/library-go that referenced this pull request Jul 2, 2020
Update Apply{DaemonSet,Deployment} to rely on a hash of the spec
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants