Skip to content

Update the operand when image changes#32

Merged
openshift-merge-robot merged 1 commit intoopenshift:masterfrom
jsafrane:add-image-check
Apr 18, 2020
Merged

Update the operand when image changes#32
openshift-merge-robot merged 1 commit intoopenshift:masterfrom
jsafrane:add-image-check

Conversation

@jsafrane
Copy link
Contributor

@jsafrane jsafrane commented Apr 14, 2020

The operator may need to update the operands (Deployment, DaemonSet) when:

  • The images are changed by env. variables.
  • Log level changes in CR.
  • New operator starts and has updated bindata.

It's nearly impossible to detect these changes, therefore hash whole Deployment / DaemonSet .Spec and make sure the new objects are applied when the hash gets different.

cc @bertinatto @gnufied

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 14, 2020
if c.containerImagesNeedsUpdate(&existingDeployment.Spec.Template.Spec, &deploy.Spec.Template.Spec) {
forceRollout = true
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered doing this in library-go's ApplyDeployment()? I think it'd be cleaner, as ApplyDeployment needs to Get the object anyway.

Updating the object when there's a new container image seems to be desired, so it shouldn't be a problem for other users of ApplyDeployment()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, perhaps even better would be to compare the Spec, something like:

diff --git a/pkg/operator/resource/resourceapply/apps.go b/pkg/operator/resource/resourceapply/apps.go
index 253b17ef..1a8ecb0a 100644
--- a/pkg/operator/resource/resourceapply/apps.go
+++ b/pkg/operator/resource/resourceapply/apps.go
 	if !*modified && existingCopy.ObjectMeta.Generation == expectedGeneration && !forceRollout {
-		return existingCopy, false, nil
+		if equality.Semantic.DeepEqual(required.Spec, existingCopy.Spec) {
+			return existingCopy, false, nil
+		}
 	}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has bothered me for awhile but none of the Apply* functions in library-go check for spec. Are you proposing that we patch it in library-go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO you can't check Spec, some fields gets defaulted by API server. That's IMO the reason why Generation was introduced.

And each team will have different opinion what should be checked - now we want conainer images, but someone will want also cmdline arguments (that's where the log level is) and whatnot.

It would be better to change resourceapply functions to accept existing object, if it is known.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k, are we missing anything? This PR tries to re-deploy operand (DaemonSet + Deployment) when the operator is restarted with env. variables pointing to new images. Someone must have it solved already. How other operators detect this?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

library-go's ApplyDeployment uses generation, not spec to determine whether it needs to be applied. See the check here: https://github.com/openshift/library-go/blob/master/pkg/operator/resource/resourceapply/apps.go#L39-L42

The recommended .status for operators keep track of the last applied generation. You can see usage here: https://github.com/openshift/cluster-openshift-apiserver-operator/blob/1bc63aca70a3258876befa7db6aeb8e6e05cee87/pkg/operator/workload/workload_openshiftapiserver_v311_00_sync.go#L396

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k Is there a point to tracking the generation if ApplyDeployment sets the deployment spec as an annotation?

@gnufied
Copy link
Member

gnufied commented Apr 15, 2020

@jsafrane can we remove operand and operator image version checks then?

@gnufied
Copy link
Member

gnufied commented Apr 15, 2020

namely - https://github.com/openshift/aws-ebs-csi-driver-operator/blob/master/pkg/operator/operator.go#L45 if we are not going to use them. We also need to update the CSV

@jsafrane
Copy link
Contributor Author

can we remove operand and operator image version checks then?

Maybe we can remove the operand check. I would keep the operator check in place - a new operator version may introduce changed assets in its bindata and we should apply them.

@jsafrane
Copy link
Contributor Author

jsafrane commented Apr 15, 2020

Talked on Slack with @deads2k and @marun, some operators add "configuration" that was used to render the required Deployment to the Deployment.Annotations

ApplyDeployment then spots the change in metadata and applies the new version. We should do the same. Hashing whole json of Deployment.Spec and putting it into annotation might be an interesting option - we could forget to add a new item to the "configuration annotation", but hash of whole Deployment.Spec will catch it.

@gnufied
Copy link
Member

gnufied commented Apr 15, 2020

@jsafrane if I understand this correctly, are we planning to store all the imagePullSpec or just of the operator based on the fact that if operand image changes, operator image changes too?

@gnufied
Copy link
Member

gnufied commented Apr 15, 2020

I can get behind generating SHA of spec too, because that way we may not have to track each operand individually..

@bertinatto
Copy link
Member

Talked on Slack with @deads2k and @marun, some operators add "configuration" that was used to render the required Deployment to the Deployment.Annotations

ApplyDeployment then spots the change in metadata and applies the new version. We should do the same. Hashing whole json of Deployment.Spec and putting it into annotation might be an interesting option - we could forget to add a new item to the "configuration annotation", but hash of whole Deployment.Spec will catch it.

Sounds like an elegant solution to what we need. Even if some spec fields are changed/defaulted by the apiserver, the hash won't change and Apply*() won't update the object.

@marun
Copy link

marun commented Apr 16, 2020

I have a PR up if you want to test with it: openshift/library-go#773

if daemonSet.Annotations == nil {
daemonSet.Annotations = map[string]string{}
}
daemonSet.Annotations[specHashAnnotation] = specHash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest moving this to the getDaemonSet() function down below (`getDeployment() for the deployment). So far we've been "patching" the objects only in those functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect the hashing to be done by ApplyDaemonSet soon, so I want to keep it close.

return deployment
}

func (c *csiDriverOperator) containerImagesNeedsUpdate(actual, required *corev1.PodSpec) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is no longer used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@jsafrane
Copy link
Contributor Author

Fixed unit tests with a bit of refactoring.

@jsafrane jsafrane changed the title Add image check Update the operand when image changes Apr 17, 2020
@jsafrane
Copy link
Contributor Author

Squashed and updated commit message.

The operator may need to update the operands (Deployment, DaemonSet) when:

- The images are changed by env. variables.
- Log level changes in CR.
- New operator starts and has updated bindata.

It's nearly impossible to detect these changes, therefore hash whole
Deployment / DaemonSet .Spec and make sure the new objects are applied when
the hash gets different.
needApply := forceRollout
if existing.GetGeneration() != expectedGeneration {
needApply = true
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check the generation if we're checking the spec hash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need both hash and generation. Spec hash checks for changes in the operator (new images, new bindata, ...). Generation checks for changes in the cluster (user changing the object).

@bertinatto
Copy link
Member

Onde minor comment, otherwise LGTM.

}

crClient := client.Resource(credentialsRequestResourceGVR).Namespace(required.GetNamespace())
if err := addCredetialsRequestHash(required); err != nil {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/Credetials/Credentials/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, fixed

@gnufied
Copy link
Member

gnufied commented Apr 18, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 18, 2020
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied, jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 9d6998e into openshift:master Apr 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants