OTA-1349: *: Propagate RetrievedUpdates from ClusterVersion up to HostedCluster #4744
To make it easier for folks working at the management-cluster level to notice and address issues with update-recommendation retrieval. Especially useful since de4bcbe (api/v1beta1/hostedcluster_types: Add spec.updateService, 2024-03-29, openshift#3576) made updateService a configurable knob, because:

* Users might configure a broken updateService, and the new condition would tell them that it wasn't working.
* Users might install a 4.15 or earlier HostedCluster, where the HostedControlPlane controller (which is extracted from the HostedCluster's release image) was too old to have de4bcbe, so it doesn't propagate the configured updateService, and the cluster-version operator ends up using the default upstream update service. That doesn't work in disconnected/restricted-network environments where api.openshift.com is unreachable. The new condition would tell them that it wasn't working, and that the CVO was using the api.openshift.com service. They'd be left on their own to realize that the updateService feature required a 4.16 or later HostedCluster release image.
…Updates

Generated with:

    $ make api-docs
✅ Deploy Preview for hypershift-docs ready!
… to pick up ClusterVersionRetrievedUpdates

Generated with:

    $ go mod tidy
    $ go mod vendor

using:

    $ go version
    go version go1.23.1 linux/amd64
8ff1494 to 5e8cf46
/test e2e-aws
e2e-aws-4.17:

    - lastTransitionTime: "2024-09-18T01:45:13Z"
      message: Condition not found in the CVO.
      observedGeneration: 3
      reason: StatusUnknown
      status: Unknown
      type: ClusterVersionRetrievedUpdates

The HostedCluster doesn't set …
Ok, and in e2e-aws, where the HostedControlPlane controller understands the piping:

    - lastTransitionTime: "2024-09-18T02:55:19Z"
      message: The update channel has not been configured.
      observedGeneration: 3
      reason: NoChannel
      status: "False"
      type: ClusterVersionRetrievedUpdates

So now we have examples of the new HyperShift operator code vs. both new (works) and old ( …
…wn ClusterVersionRetrievedUpdates

ClusterVersion has had the RetrievedUpdates condition since 2018 [1], before OpenShift v4 went GA. But until e438076 (api/v1beta1/hostedcluster_types: Add channel, availableUpdates, and conditionalUpdates, 2023-01-21, openshift#1954), HostedCluster wasn't interested in update-service recommendations. And the HostedControlPlane only learned how to propagate the condition in 5216915 (*: Propagate RetrievedUpdates from ClusterVersion up to HostedCluster, 2024-09-17, openshift#4744).

To avoid long-running status like [2]:

    - lastTransitionTime: "2024-09-18T01:45:13Z"
      message: Condition not found in the CVO.
      observedGeneration: 3
      reason: StatusUnknown
      status: Unknown
      type: ClusterVersionRetrievedUpdates

when a new HyperShift operator (with e438076) is running a 4.17 or earlier HostedCluster (and its HostedControlPlane operator, without e438076), this commit silently drops the StatusUnknown condition from HostedCluster. We can revert this commit once all still-supported HostedControlPlane operator versions understand how to propagate the condition.

This commit's "if we could find it, pass it along" approach is also compatible with 4.17.z HostedControlPlane controllers picking up the ability to propagate the condition in a backport, if for some reason we decide that behavior is worth backporting.

[1]: openshift/cluster-version-operator@286641d#diff-4229ccef40cdb3dd7a8e5ca230d85fa0e74bbc265511ddd94f53acffbcd19b79R100
[2]: openshift#4744 (comment)
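The "if we could find it, pass it along" behaviour described in this commit message can be sketched roughly as below. This is only an illustration, not the code from the PR: the `Condition` struct is a simplified local stand-in for the Kubernetes condition type, and `findCondition` and `propagateIfFound` are hypothetical helper names. It also assumes the condition carries the same type string on both objects.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a Kubernetes status condition
// (assumption: only the fields needed for this illustration).
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// findCondition returns a pointer to the condition of the given type,
// or nil when the slice does not contain one.
func findCondition(conds []Condition, condType string) *Condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// propagateIfFound (hypothetical name) returns dst with condType replaced by
// the copy from src when src has it. When src lacks the condition, any
// existing copy is dropped from dst rather than replaced by an Unknown
// placeholder.
func propagateIfFound(dst, src []Condition, condType string) []Condition {
	out := make([]Condition, 0, len(dst)+1)
	for _, c := range dst {
		if c.Type != condType {
			out = append(out, c)
		}
	}
	if c := findCondition(src, condType); c != nil {
		out = append(out, *c)
	}
	return out
}

func main() {
	const condType = "ClusterVersionRetrievedUpdates"
	hostedCluster := []Condition{{Type: "ClusterVersionAvailable", Status: "True"}}

	// Older HostedControlPlane: the condition is absent, so nothing is added.
	fmt.Println(len(propagateIfFound(hostedCluster, nil, condType)))

	// Newer HostedControlPlane: the condition is present and is passed along.
	hcp := []Condition{{
		Type:    condType,
		Status:  "False",
		Reason:  "NoChannel",
		Message: "The update channel has not been configured.",
	}}
	for _, c := range propagateIfFound(hostedCluster, hcp, condType) {
		fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
	}
}
```

Because the helper only copies what it finds, it would keep working unchanged if older HostedControlPlane controllers later started reporting the condition.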
f16d279 to f8f5ceb
f8f5ceb to fa5743e
    - lastTransitionTime: "2024-09-18T23:37:31Z"
      message: The update channel has not been configured.
      observedGeneration: 3
      reason: NoChannel
      status: "False"
      type: ClusterVersionRetrievedUpdates

and e2e-aws-4.17 has:

    $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_hypershift/4744/pull-ci-openshift-hypershift-main-e2e-aws-4-17/1836539839235756032/artifacts/e2e-aws-4-17/hypershift-aws-run-e2e/artifacts/TestCreateCluster/namespaces/e2e-clusters-6lmhs/hypershift.openshift.io/hostedclusters/example-tjwjg.yaml | yaml2json | jq -r '.status.conditions[].type' | grep ClusterVersion | sort
    ClusterVersionAvailable
    ClusterVersionProgressing
    ClusterVersionReleaseAccepted
    ClusterVersionSucceeding
    ClusterVersionUpgradeable

so fa5743e is working as intended.
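For anyone who wants the same check without yaml2json and jq, the filtering half of that pipeline could look something like the Go sketch below. It assumes the HostedCluster artifact has already been converted to JSON. The `hostedCluster` struct and the `clusterVersionConditionTypes` function are hypothetical names that model only the `status.conditions[].type` path used above, and the sample input is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// hostedCluster models only the fields this check reads
// (assumption: the input is JSON, as produced by yaml2json).
type hostedCluster struct {
	Status struct {
		Conditions []struct {
			Type string `json:"type"`
		} `json:"conditions"`
	} `json:"status"`
}

// clusterVersionConditionTypes returns the sorted condition types that
// contain "ClusterVersion", mirroring `grep ClusterVersion | sort`.
func clusterVersionConditionTypes(data []byte) ([]string, error) {
	var hc hostedCluster
	if err := json.Unmarshal(data, &hc); err != nil {
		return nil, err
	}
	var types []string
	for _, c := range hc.Status.Conditions {
		if strings.Contains(c.Type, "ClusterVersion") {
			types = append(types, c.Type)
		}
	}
	sort.Strings(types)
	return types, nil
}

func main() {
	// Made-up sample input, not taken from the CI artifact.
	sample := []byte(`{"status":{"conditions":[
		{"type":"ClusterVersionSucceeding"},
		{"type":"Available"},
		{"type":"ClusterVersionAvailable"}
	]}}`)
	types, err := clusterVersionConditionTypes(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range types {
		fmt.Println(t)
	}
}
```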
/test e2e-aks
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sjenning, wking
@wking: This pull request references OTA-1349 which is a valid jira issue.
/retest-required
ah, possibly we were failing until openshift/cloud-provider-aws#95 merged. Trying again now:
/retest-required
@wking: all tests passed!
[ART PR BUILD NOTIFIER] Distgit: hypershift
What this PR does / why we need it:

This pull request propagates RetrievedUpdates from ClusterVersion up to HostedCluster, to make it easier for folks working at the management-cluster level to notice and address issues with update-recommendation retrieval.

Especially useful since de4bcbe (#3576) made updateService a configurable knob, because:

* Users might configure a broken updateService, and the new condition would tell them that it wasn't working.
* Users might install a 4.15 or earlier HostedCluster, where the HostedControlPlane controller (which is extracted from the HostedCluster's release image) was too old to have de4bcbe, so it doesn't propagate the configured updateService, and the cluster-version operator ends up using the default upstream update service. That doesn't work in disconnected/restricted-network environments where api.openshift.com is unreachable. The new condition would tell them that it wasn't working, and that the CVO was using the api.openshift.com service. They'd be left on their own to realize that the updateService feature required a 4.16 or later HostedCluster release image.

Checklist