Allow CVO to deploy CBO #25
Conversation
manifests/0000_30_cluster-baremetal-operator_03_serviceaccount.yaml (comment thread outdated and resolved)
sadasu left a comment
I have a generic question inline. Other than that this looks good.
  @@ -0,0 +1,8 @@
  apiVersion: config.openshift.io/v1
This should probably come before the deployment, since the deployment uses it.
I'm not sure whether it matters. Shouldn't CBO create the CO deployment if it's not found anyway? FYI, MAO uses the same runlevel.
Based on the note in this section of the docs, I think the ClusterOperator resource has to come after the Deployment. The CVO doesn't use the manifest to actually create the resource; it only uses it to know which CO resources it should wait for before continuing with the next step of the install/upgrade. So, I think this is correct.
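For context on what the CVO is waiting for, here is a minimal sketch of a ClusterOperator manifest of the kind being discussed; beyond the apiVersion line visible in the diff above, the resource name and status fields below are assumptions based on common OpenShift conventions, not taken from this PR:

  apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    name: baremetal              # assumed name for CBO's ClusterOperator
  spec: {}
  status:
    versions:
    - name: operator
      version: "0.0.1-snapshot"  # placeholder, substituted when the payload is built

As described above, the CVO would not create this resource from the manifest; it only uses the manifest to learn which ClusterOperator to wait on (and, by convention, which version to expect) before moving to the next step of the install/upgrade.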
  - name: cluster-baremetal-operator
    env:
    - name: RELEASE_VERSION
      value: "0.0.1-snapshot"
Does something in the build process change this value?
Not 100% sure. But this value is used in the CO to signal upgrades.
This section of the docs includes an example with a comment that the value "is substituted in the manifests when the payload is built" so I guess something about building the release image modifies the value.
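As a rough illustration of that guess (the post-build version string below is invented for the example, not taken from any real payload), only the value should change between the in-repo manifest and the built release image:

  # In the repository (placeholder):
  - name: RELEASE_VERSION
    value: "0.0.1-snapshot"

  # In a built release payload (illustrative value only):
  - name: RELEASE_VERSION
    value: "4.6.0-0.nightly-2020-10-01-123456"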
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andfasano, dhellmann

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/lgtm
…ader election

The option has been available for years:

  $ git blame main.go | grep enable-leader-election
  dcbe86f (Sandhya Dasu 2020-08-18 21:09:29 -0400 72)  flag.BoolVar(&enableLeaderElection, "enable-leader-election", false,

and without it overlapping operator pods can crash-loop [1]:

  : [sig-cluster-lifecycle] pathological event should not see excessive Back-off restarting failed containers
  0s
  { event [namespace/openshift-machine-api node/ip-10-0-62-147.us-west-2.compute.internal pod/cluster-baremetal-operator-574577fbcb-z8nd4 hmsg/bf39bb17ae - Back-off restarting failed container cluster-baremetal-operator in pod cluster-baremetal-operator-574577fbcb-z8nd4_openshift-machine-api(441969c1-b430-412c-b67f-4ae2f7797f4f)] happened 26 times
  event [namespace/openshift-machine-api node/ip-10-0-62-147.us-west-2.compute.internal pod/cluster-baremetal-operator-574577fbcb-z8nd4 hmsg/bf39bb17ae - Back-off restarting failed container cluster-baremetal-operator in pod cluster-baremetal-operator-574577fbcb-z8nd4_openshift-machine-api(441969c1-b430-412c-b67f-4ae2f7797f4f)] happened 51 times}

while fighting each other over the same ClusterOperator status:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.16-upgrade-from-stable-4.15-e2e-aws-ovn-upgrade/1737335551998038016/artifacts/e2e-aws-ovn-upgrade/gather-audit-logs/artifacts/audit-logs.tar | tar -xz --strip-components=2
  $ zgrep -h '"resource":"clusteroperators","name":"baremetal"' kube-apiserver/*audit*log.gz | jq -r 'select(.verb == "create" or .verb == "update") | .stageTimestamp + " " + .verb + " " + (.responseStatus.code | tostring) + " " + (.objectRef.subresource) + " " + .user.username + " " + .user.extra["authentication.kubernetes.io/pod-name"][0]' | grep 'T06:08:.*cluster-baremetal-operator' | sort
  2023-12-20T06:08:21.757799Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-574577fbcb-z8nd4
  2023-12-20T06:08:21.778638Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-7fbb57959b-s9v9g
  2023-12-20T06:08:21.780378Z update 409 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-574577fbcb-z8nd4
  2023-12-20T06:08:21.790000Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-7fbb57959b-s9v9g
  2023-12-20T06:08:21.802780Z update 200 status system:serviceaccount:openshift-machine-api:cluster-baremetal-operator cluster-baremetal-operator-7fbb57959b-s9v9g

Using a leader lock will avoid this contention, and the system should be able to coast through brief moments after an outgoing leader leaves until a replacement leader picks things back up.

I'm also setting a Recreate strategy [2], because:

1. Incoming pod surged by the default Deployment strategy.
2. Incoming pod attempts to acquire the Lease, but the outgoing pod is holding it.
3. Outgoing pod releases the lease and exits.
4. Incoming pod tries again, and this time acquires the lease.

can be slow in the 3-to-4 phase, while:

1. Outgoing pod releases the lease and exits.
2. Incoming pod created, scheduled, and acquires the lease.

tends to be faster. And again, the component should be able to coast through small durations without a functioning leader.

See openshift/machine-config-operator@7530ded86 (install: Recreate and delayed default ServiceAccount deletion, 2023-08-29, openshift/machine-config-operator#3895) for another example of how Recreate can help that way.

To get the leader election working, I'm also setting LeaderElectionID and LeaderElectionNamespace, pattern-matching from [3]. Using cluster-baremetal-operator as the ID avoids collisions with baremetal-operator [4]. COMPONENT_NAMESPACE is from:

  - name: COMPONENT_NAMESPACE
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: metadata.namespace

using the pattern documented in [5]. We've been asking for that environment variable since f368a4f (Add sa, deployment and co manifests, 2020-09-29, openshift#25), although grepping through history, I don't see anything consuming it. But with this commit, we now have a consumer :).

[1]: https://issues.redhat.com/browse/OCPBUGS-25766
[2]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#recreate-deployment
[3]: https://github.com/openshift/baremetal-operator/blob/749609dd0fb858be72a783c5d87853cad2d303db/main.go#L207-L208
[4]: https://github.com/openshift/baremetal-operator/blob/749609dd0fb858be72a783c5d87853cad2d303db/main.go#L71
[5]: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
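To make the commit message above concrete, here is a minimal sketch of how the Recreate strategy, the --enable-leader-election flag, and the COMPONENT_NAMESPACE downward-API variable could fit together in the operator Deployment; the labels, image, and overall shape below are illustrative assumptions, not the repository's actual manifest:

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: cluster-baremetal-operator
    namespace: openshift-machine-api
  spec:
    replicas: 1
    strategy:
      type: Recreate        # remove the outgoing pod before starting its replacement,
                            # so the incoming pod never waits on a still-held leader Lease
    selector:
      matchLabels:
        k8s-app: cluster-baremetal-operator        # placeholder label
    template:
      metadata:
        labels:
          k8s-app: cluster-baremetal-operator
      spec:
        containers:
        - name: cluster-baremetal-operator
          image: registry.example.com/cluster-baremetal-operator:latest   # placeholder image
          args:
          - --enable-leader-election               # opt in to the existing flag
          env:
          - name: COMPONENT_NAMESPACE              # namespace used for the leader Lease
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace

With Recreate, the old pod is removed before its replacement is created, which matches the faster two-step handover described in the commit message.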
This PR adds the manifests required by the CVO to deploy CBO, as specified in the related CVO documentation.