OTA-1210: *: Add --update-service command-line option #1035
Conversation
|
@wking: This pull request references OTA-1210 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
[1] includes [2] pitching local update services in managed regions. However, HyperShift currently forces ClusterVersion spec.upstream empty. This commit is part of making the upstream update service configurable on HyperShift, so those clusters can hear about and assess known issues with updates in environments where the default https://api.openshift.com/api/upgrades_info update service is not accessible, or in which an alternative update service is desired for testing or policy reasons. This includes [2], mentioned above, but would also include any other instance of disconnected/restricted-network use. The alternatives for folks who want HyperShift updates in places where api.openshift.com is inaccessible are:

* Don't use an update service and manage that aspect manually. But the update service declares multiple new releases each week, as well as delivering information about known risks/issues for local clusters to evaluate their exposure. That's a lot of information to manage manually if folks decide not to plug into the existing update service tooling.
* Run a local update service, and fiddle with DNS and X.509 certs so that packets aimed at api.openshift.com get routed to your local service. This one requires less long-term effort than manually replacing the entire update service system, but it requires your clusters to trust an X.509 Certificate Authority that is willing to sign certificates for your local service saying "yup, that one is definitely api.openshift.com".

The implementation is similar to e438076 (api/v1beta1/hostedcluster_types: Add channel, availableUpdates, and conditionalUpdates, 2022-12-14, openshift#1954), where I added update-related status properties piped up via CVO -> ClusterVersion.status -> HostedControlPlane.status -> HostedCluster.status. This commit adds a new spec property that is piped down via:

1. HostedCluster (management cluster) spec.upstream.
2. HostedControlPlane (management cluster) spec.upstream.
3. Cluster-version operator Deployment --upstream command-line option (management cluster), and also ClusterVersion spec.upstream.
4. Cluster-version operator container (management cluster).

The CVO's new --upstream option is from [3], and we're using that pathway instead of ClusterVersion spec, because we don't want the update service URI to come via the customer-accessible hosted API. It's ok for the channel (which is used as a query parameter in the update-service GET requests) to continue to flow in via ClusterVersion spec. I'm still populating ClusterVersion's spec.upstream in this commit for discoverability, although because --upstream is being used, the CVO will ignore the ClusterVersion spec.upstream value.

When the HyperShift operator sets this in HostedControlPlane for an older cluster version, the old HostedControlPlane controller (launched from the hosted release payload) will not recognize or propagate the new property. But that's not a terrible thing, and issues with fetching update recommendations from the default update service will still be reported in the RetrievedUpdates condition [4] with a message mentioning the update service URI if there are problems [5].

Although it doesn't look like we capture that condition currently:

  $ grep -r 'ConditionType = "ClusterVersion' api/hypershift/v1beta1/
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionSucceeding ConditionType = "ClusterVersionSucceeding"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionUpgradeable ConditionType = "ClusterVersionUpgradeable"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionFailing ConditionType = "ClusterVersionFailing"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionProgressing ConditionType = "ClusterVersionProgressing"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionAvailable ConditionType = "ClusterVersionAvailable"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionReleaseAccepted ConditionType = "ClusterVersionReleaseAccepted"

We could opt to collect it in follow-up work if issues occur frequently enough that dropping down to the ClusterVersion level to debug becomes tedious. If, in the future, we grow an option to give the hosted CVO kubeconfig access to both the management and hosted Kubernetes APIs, we could drop --upstream and have the hosted CVO reach out and read this configuration off HostedControlPlane or HostedCluster directly.

I'm bumping v1alpha1 as described in 4af5ffa (api/v1alpha1: Catch up with channel, availableUpdates, and conditionalUpdates, 2023-01-17, openshift#1954). The recent 1ae3a36 (Always include AWS default security group in worker security groups, 2024-02-05, openshift#3527) suggests that the "v1alpha1 is a superset of v1beta1" policy remains current.

[1]: https://issues.redhat.com/browse/XCMSTRAT-513
[2]: https://issues.redhat.com/browse/OTA-1185
[3]: openshift/cluster-version-operator#1035
[4]: https://github.com/openshift/api/blob/54b3334bfac52883d515c11118ca191bffba5db7/config/v1/types_cluster_version.go#L702-L706
[5]: https://github.com/openshift/cluster-version-operator/blob/ce6169c7b9b0d44c2e41342e6414ed9db0a31a63/pkg/cvo/availableupdates.go#L356-L365
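Since the message above leans on the RetrievedUpdates condition to surface update-service fetch problems, here is a minimal, self-contained Go sketch of how follow-up work might read that condition and mirror it into a HostedCluster condition. The Condition struct and the "ClusterVersionRetrievedUpdates" name are illustrative assumptions, not the actual HyperShift or openshift/api definitions.

```go
package main

import "fmt"

// Condition is a minimal stand-in for the shape of a ClusterVersion status
// condition (type, status, message); the real type lives in openshift/api.
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Message string
}

// retrievedUpdatesProblem returns the RetrievedUpdates message when that
// condition is not True. A hypothetical HostedCluster condition such as
// "ClusterVersionRetrievedUpdates" could be populated from this result.
func retrievedUpdatesProblem(conditions []Condition) (string, bool) {
	for _, c := range conditions {
		if c.Type == "RetrievedUpdates" && c.Status != "True" {
			return c.Message, true
		}
	}
	return "", false
}

func main() {
	conditions := []Condition{{
		Type:    "RetrievedUpdates",
		Status:  "False",
		Message: "Unable to retrieve available updates from https://updates.example.com/api/upgrades_info", // example message
	}}
	if msg, bad := retrievedUpdatesProblem(conditions); bad {
		fmt.Println("update-service problem:", msg)
	}
}
```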
4aff8b4 to 922014d Compare
[1] includes [2] pitching local update services in managed regions. However, HyperShift had (until this commit) been forcing ClusterVersion spec.upstream empty. This commit is part of making the upstream update service configurable on HyperShift, so those clusters can hear about and assess known issues with updates in environments where the default https://api.openshift.com/api/upgrades_info update service is not accessible, or in which an alternative update service is desired for testing or policy reasons. This includes [2], mentioned above, but would also include any other instance of disconnected/restricted-network use. The alternatives for folks who want HyperShift updates in places where api.openshift.com is inaccessible are:

* Don't use an update service and manage that aspect manually. But the update service declares multiple new releases each week, as well as delivering information about known risks/issues for local clusters to evaluate their exposure. That's a lot of information to manage manually if folks decide not to plug into the existing update service tooling.
* Run a local update service, and fiddle with DNS and X.509 certs so that packets aimed at api.openshift.com get routed to your local service. This one requires less long-term effort than manually replacing the entire update service system, but it requires your clusters to trust an X.509 Certificate Authority that is willing to sign certificates for your local service saying "yup, that one is definitely api.openshift.com".

The implementation is similar to e438076 (api/v1beta1/hostedcluster_types: Add channel, availableUpdates, and conditionalUpdates, 2022-12-14, openshift#1954), where I added update-related status properties piped up via CVO -> ClusterVersion.status -> HostedControlPlane.status -> HostedCluster.status. This commit adds a new spec property that is piped down via:

1. HostedCluster (management cluster) spec.upstream.
2. HostedControlPlane (management cluster) spec.upstream.
3. Cluster-version operator Deployment --upstream command-line option (management cluster), and also ClusterVersion spec.upstream.
4. Cluster-version operator container (management cluster).

The CVO's new --upstream option is from [3], and we're using that pathway instead of ClusterVersion spec, because we don't want the update service URI to come via the customer-accessible hosted API. It's ok for the channel (which is used as a query parameter in the update-service GET requests) to continue to flow in via ClusterVersion spec. I'm still populating ClusterVersion's spec.upstream in this commit for discoverability, although because --upstream is being used, the CVO will ignore the ClusterVersion spec.upstream value.

When the HyperShift operator sets this in HostedControlPlane for an older cluster version, the old HostedControlPlane controller (launched from the hosted release payload) will not recognize or propagate the new property. But that's not a terrible thing, and issues with fetching update recommendations from the default update service will still be reported in the RetrievedUpdates condition [4] with a message mentioning the update service URI if there are problems [5].

Although it doesn't look like we capture that condition currently:

  $ grep -r 'ConditionType = "ClusterVersion' api/hypershift/v1beta1/
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionSucceeding ConditionType = "ClusterVersionSucceeding"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionUpgradeable ConditionType = "ClusterVersionUpgradeable"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionFailing ConditionType = "ClusterVersionFailing"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionProgressing ConditionType = "ClusterVersionProgressing"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionAvailable ConditionType = "ClusterVersionAvailable"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionReleaseAccepted ConditionType = "ClusterVersionReleaseAccepted"

We could opt to collect it in follow-up work if issues occur frequently enough that dropping down to the ClusterVersion level to debug becomes tedious. If, in the future, we grow an option to give the hosted CVO kubeconfig access to both the management and hosted Kubernetes APIs, we could drop --upstream and have the hosted CVO reach out and read this configuration off HostedControlPlane or HostedCluster directly.

I'm bumping v1alpha1 as described in 4af5ffa (api/v1alpha1: Catch up with channel, availableUpdates, and conditionalUpdates, 2023-01-17, openshift#1954). The recent 1ae3a36 (Always include AWS default security group in worker security groups, 2024-02-05, openshift#3527) suggests that the "v1alpha1 is a superset of v1beta1" policy remains current.

The cmd/install/assets, client/applyconfiguration, docs/content/reference/api.md, hack/app-sre/saas_template.yaml, and vendor changes are automatic via:

  $ make update

[1]: https://issues.redhat.com/browse/XCMSTRAT-513
[2]: https://issues.redhat.com/browse/OTA-1185
[3]: openshift/cluster-version-operator#1035
[4]: https://github.com/openshift/api/blob/54b3334bfac52883d515c11118ca191bffba5db7/config/v1/types_cluster_version.go#L702-L706
[5]: https://github.com/openshift/cluster-version-operator/blob/ce6169c7b9b0d44c2e41342e6414ed9db0a31a63/pkg/cvo/availableupdates.go#L356-L365
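To make the "the CVO will ignore ClusterVersion spec.upstream when the command-line value is set" behavior concrete, here is a minimal, self-contained Go sketch of that precedence. The function and constant are illustrative and are not the CVO's actual implementation.

```go
package main

import "fmt"

// defaultUpdateService is the public OpenShift update service.
const defaultUpdateService = "https://api.openshift.com/api/upgrades_info"

// resolveUpdateService picks the update-service URI to query: a non-empty
// command-line value wins, otherwise ClusterVersion spec.upstream, otherwise
// the built-in default. This mirrors the precedence described above.
func resolveUpdateService(flagValue, specUpstream string) string {
	if flagValue != "" {
		return flagValue
	}
	if specUpstream != "" {
		return specUpstream
	}
	return defaultUpdateService
}

func main() {
	// With the command-line option set, spec.upstream is ignored:
	fmt.Println(resolveUpdateService("https://updates.internal.example.com/api/upgrades_info", "https://ignored.example.com"))
	// Without it, spec.upstream (or the default) is used:
	fmt.Println(resolveUpdateService("", ""))
}
```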
20c0fa3 to 64fb226 Compare
|
/cc |
petr-muller
left a comment
/lgtm
/hold
I'm not a great fan of the upstream name, so I'm holding the PR in case you or anyone else has similar doubts, but feel free to ignore that and unhold.
64fb226 to 66cc580 Compare
[1] includes [2] pitching local update services in managed regions. However, HyperShift currently forces ClusterVersion spec.upstream empty [3]. This commit is part of making the upstream update service configurable on HyperShift, so those clusters can hear about and assess known issues with updates in environments where the default https://api.openshift.com/api/upgrades_info update service is not accessible, or in which an alternative update service is desired for testing or policy reasons. This includes [2], mentioned above, but would also include any other instance of disconnected/restricted-network use. The alternatives for folks who want HyperShift updates in places where api.openshift.com is inaccessible are:

* Don't use an update service and manage that aspect manually. But the update service declares multiple new releases each week, as well as delivering information about known risks/issues for local clusters to evaluate their exposure. That's a lot of information to manage manually if folks decide not to plug into the existing update service tooling.
* Run a local update service, and fiddle with DNS and X.509 certs so that packets aimed at api.openshift.com get routed to your local service. This one requires less long-term effort than manually replacing the entire update service system, but it requires your clusters to trust an X.509 Certificate Authority that is willing to sign certificates for your local service saying "yup, that one is definitely api.openshift.com".

One possible data path would be:

1. HostedCluster (management cluster).
2. HostedControlPlane (management cluster).
3. ClusterVersion spec.upstream (hosted API).
4. Cluster-version operator container (management cluster).

That pathway would only require changes to the HyperShift repo. But to avoid the URI passing through the customer-accessible hosted API, this commit adds a new --update-service command-line option to support:

1. HostedCluster (management cluster).
2. HostedControlPlane (management cluster).
3. Cluster-version operator Deployment --update-service command-line option (management cluster).
4. Cluster-version operator container (management cluster).

If, in the future, we grow an option to give the hosted CVO kubeconfig access to both the management and hosted Kubernetes APIs, we could drop --update-service and have the hosted CVO reach out and read this configuration off HostedControlPlane or HostedCluster directly.

I'd initially gone with --upstream, but moved to --update-service and updateService, etc. based on Petr's reasonable point that "upstream" is pretty generic, "update service" is not much longer and is much more specific, and diverging from the ClusterVersion spec.upstream precedent isn't that terrible [4].

[1]: https://issues.redhat.com/browse/XCMSTRAT-513
[2]: https://issues.redhat.com/browse/OTA-1185
[3]: https://github.com/openshift/hypershift/blob/5e50e633fefd88aab9588d660c4b5daddd950d9a/control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go#L1059
[4]: openshift#1035 (review)
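As a rough illustration of the --update-service command-line option itself, the following self-contained Go sketch wires such a flag with the standard library flag package. The real cluster-version-operator registers its options through its own command setup, so treat the structure and messages here as assumptions rather than the actual code.

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// --update-service overrides ClusterVersion spec.upstream when non-empty,
	// keeping the update-service URI out of the customer-accessible hosted API.
	updateService := flag.String("update-service", "",
		"URI of the update service to query instead of ClusterVersion spec.upstream")
	flag.Parse()

	if *updateService != "" {
		fmt.Println("querying update service from command line:", *updateService)
	} else {
		fmt.Println("no --update-service given; falling back to ClusterVersion spec.upstream or the default")
	}
}
```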
66cc580 to 04c32b5 Compare
04c32b5 to b4abe48 Compare
|
/retest-required |
|
Rename request addressed; thanks! /hold cancel |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: petr-muller, wking. The full list of commands accepted by this bot can be found here. The pull request process is described here. Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
PR pre-merge tested and passed.
|
|
/label qe-approved |
|
@wking: This pull request references OTA-1210 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/override ci/prow/e2e-agnostic-ovn |
|
@wking: Overrode contexts on behalf of wking: ci/prow/e2e-agnostic-ovn. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@wking: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
[1] includes [2] pitching local update services in managed regions. However, HyperShift had (until this commit) been forcing ClusterVersion spec.upstream empty. This commit is part of making the upstream update service configurable on HyperShift, so those clusters can hear about and assess known issues with updates in environments where the default https://api.openshift.com/api/upgrades_info update service is not accessible, or in which an alternative update service is desired for testing or policy reasons. This includes [2], mentioned above, but would also include any other instance of disconnected/restricted-network use. The alternatives for folks who want HyperShift updates in places where api.openshift.com is inaccessible are:

* Don't use an update service and manage that aspect manually. But the update service declares multiple new releases each week, as well as delivering information about known risks/issues for local clusters to evaluate their exposure. That's a lot of information to manage manually if folks decide not to plug into the existing update service tooling.
* Run a local update service, and fiddle with DNS and X.509 certs so that packets aimed at api.openshift.com get routed to your local service. This one requires less long-term effort than manually replacing the entire update service system, but it requires your clusters to trust an X.509 Certificate Authority that is willing to sign certificates for your local service saying "yup, that one is definitely api.openshift.com".

The implementation is similar to e438076 (api/v1beta1/hostedcluster_types: Add channel, availableUpdates, and conditionalUpdates, 2022-12-14, openshift#1954), where I added update-related status properties piped up via CVO -> ClusterVersion.status -> HostedControlPlane.status -> HostedCluster.status. This commit adds a new spec property that is piped down via:

1. HostedCluster (management cluster) spec.updateService.
2. HostedControlPlane (management cluster) spec.updateService.
3. Cluster-version operator Deployment --update-service command-line option (management cluster), and also ClusterVersion spec.upstream.
4. Cluster-version operator container (management cluster).

The CVO's new --update-service option is from [3], and we're using that pathway instead of ClusterVersion spec, because we don't want the update service URI to come via the customer-accessible hosted API. It's ok for the channel (which is used as a query parameter in the update-service GET requests) to continue to flow in via ClusterVersion spec. I'm still populating ClusterVersion's spec.upstream in this commit for discoverability, although because --update-service is being used, the CVO will ignore the ClusterVersion spec.upstream value.

When the HyperShift operator sets this in HostedControlPlane for an older cluster version, the old HostedControlPlane controller (launched from the hosted release payload) will not recognize or propagate the new property. But that's not a terrible thing, and issues with fetching update recommendations from the default update service will still be reported in the RetrievedUpdates condition [4] with a message mentioning the update service URI if there are problems [5].

Although it doesn't look like we capture that condition currently:

  $ grep -r 'ConditionType = "ClusterVersion' api/hypershift/v1beta1/
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionSucceeding ConditionType = "ClusterVersionSucceeding"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionUpgradeable ConditionType = "ClusterVersionUpgradeable"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionFailing ConditionType = "ClusterVersionFailing"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionProgressing ConditionType = "ClusterVersionProgressing"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionAvailable ConditionType = "ClusterVersionAvailable"
  api/hypershift/v1beta1/hostedcluster_conditions.go: ClusterVersionReleaseAccepted ConditionType = "ClusterVersionReleaseAccepted"

We could opt to collect it in follow-up work if issues occur frequently enough that dropping down to the ClusterVersion level to debug becomes tedious. If, in the future, we grow an option to give the hosted CVO kubeconfig access to both the management and hosted Kubernetes APIs, we could drop --update-service and have the hosted CVO reach out and read this configuration off HostedControlPlane or HostedCluster directly.

I'm bumping v1alpha1 as described in 4af5ffa (api/v1alpha1: Catch up with channel, availableUpdates, and conditionalUpdates, 2023-01-17, openshift#1954). The recent 1ae3a36 (Always include AWS default security group in worker security groups, 2024-02-05, openshift#3527) suggests that the "v1alpha1 is a superset of v1beta1" policy remains current.

The cmd/install/assets, client/applyconfiguration, docs/content/reference/api.md, hack/app-sre/saas_template.yaml, and vendor changes are automatic via:

  $ make update

[1]: https://issues.redhat.com/browse/XCMSTRAT-513
[2]: https://issues.redhat.com/browse/OTA-1185
[3]: openshift/cluster-version-operator#1035
[4]: https://github.com/openshift/api/blob/54b3334bfac52883d515c11118ca191bffba5db7/config/v1/types_cluster_version.go#L702-L706
[5]: https://github.com/openshift/cluster-version-operator/blob/ce6169c7b9b0d44c2e41342e6414ed9db0a31a63/pkg/cvo/availableupdates.go#L356-L365
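To illustrate the spec.updateService -> --update-service hand-off described above, here is a minimal, self-contained Go sketch of how a control-plane controller might append the option to the hosted CVO's container arguments. The hostedControlPlaneSpec type, field names, and placeholder arguments are illustrative stand-ins, not the real HyperShift API types or the actual reconciler code.

```go
package main

import "fmt"

// hostedControlPlaneSpec is a minimal stand-in for the HostedControlPlane
// spec fields relevant here; the real types live in api/hypershift/v1beta1.
type hostedControlPlaneSpec struct {
	Channel       string
	UpdateService string // e.g. https://updates.internal.example.com/api/upgrades_info
}

// cvoContainerArgs sketches appending --update-service to the hosted CVO's
// container arguments when the spec field is set, while the channel keeps
// flowing in via ClusterVersion spec.
func cvoContainerArgs(spec hostedControlPlaneSpec) []string {
	args := []string{"start", "--release-image=<payload>"} // placeholder for the usual arguments
	if spec.UpdateService != "" {
		args = append(args, fmt.Sprintf("--update-service=%s", spec.UpdateService))
	}
	return args
}

func main() {
	spec := hostedControlPlaneSpec{
		Channel:       "stable-4.16",
		UpdateService: "https://updates.internal.example.com/api/upgrades_info",
	}
	fmt.Println(cvoContainerArgs(spec))
}
```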
|
[ART PR BUILD NOTIFIER] This PR has been included in build cluster-version-operator-container-v4.16.0-202403292312.p0.g69bee91.assembly.stream.el9 for distgit cluster-version-operator. |