Bug 1998552: Enforce OpenShift's defined kubelet version skew policies #1199
Skipping CI for Draft Pull Request.
d00cb9b to 208ebf1
/test all
208ebf1 to e67f925
/test all
4d4d707 to b51007f
/retest
/retest
/retest
pkg/operator/kubeletversionskewcontroller/kubelet_version_skew_controller.go
b51007f to 81714f5
75efab5 to 48d70e4
LGTM (will let others give their feedback)
4d0aa4e to 20ca271
Reviewers, please see the description for the change to a skew policy based on even/odd OCP versions. (FYI @sdodson)
@sanchezl: All pull requests linked via external trackers have merged: Bugzilla bug 1998552 has been moved to the MODIFIED state.
@sanchezl: new pull request created: #1223
@sanchezl: new pull request created: #1224
…xtUpgrade

Before running the post-update unpause monitor. Without this, 4.8->4.9->4.10 jobs are updating successfully to 4.10, unpausing compute, running the post-unpause compute-settling monitor, and failing on [1,2]:

  : [sig-arch][Early] Managed cluster should start all core operators [Skipped:Disconnected] [Suite:openshift/conformance/parallel] 0s
  fail [github.com/onsi/ginkgo@v4.7.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Oct 17 23:28:57.284: Some cluster operators are not ready: kube-apiserver (Upgradeable=False KubeletMinorVersion_KubeletMinorVersionUnsupportedNextUpgrade: KubeletMinorVersionUpgradeable: Kubelet minor versions on nodes ip-10-0-135-91.ec2.internal, ip-10-0-168-151.ec2.internal, and ip-10-0-192-244.ec2.internal will not be supported in the next OpenShift minor version upgrade.)

With this change, that skew guard from [3] is allowed to trip early in the suite. But if it trips for long enough to set off alerts, we'd still fail on that.

[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused/1449821870344900608
[3]: openshift/cluster-kube-apiserver-operator#1199
…xtUpgrade

Before running the post-update unpause monitor. Without this, 4.8->4.9->4.10 jobs are updating successfully to 4.10, unpausing compute, running the post-unpause compute-settling monitor, and failing on [1,2]:

  : [sig-arch][Early] Managed cluster should start all core operators [Skipped:Disconnected] [Suite:openshift/conformance/parallel] 0s
  fail [github.com/onsi/ginkgo@v4.7.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Oct 17 23:28:57.284: Some cluster operators are not ready: kube-apiserver (Upgradeable=False KubeletMinorVersion_KubeletMinorVersionUnsupportedNextUpgrade: KubeletMinorVersionUpgradeable: Kubelet minor versions on nodes ip-10-0-135-91.ec2.internal, ip-10-0-168-151.ec2.internal, and ip-10-0-192-244.ec2.internal will not be supported in the next OpenShift minor version upgrade.)

With this change, that skew guard from [3] is allowed to trip early in the suite. But if it trips for long enough to set off alerts, we'd still fail on that.

Personally, I'd much rather have the motivational message in the commit, and not in the inline comment. Folks could answer "why is this line here?" with 'git blame ...'. But David wanted an inline comment [4], so that's what I've done here.

[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused/1449821870344900608
[3]: openshift/cluster-kube-apiserver-operator#1199
[4]: openshift#26531 (comment)
The kubelet skew guards are from 1471d2c (Bug 1986453: Check for API server and node versions skew, 2021-07-27, openshift#2658). But the Kube API server also landed similar guards in openshift/cluster-kube-apiserver-operator@9ce4f74775 (add KubeletVersionSkewController, 2021-08-26, openshift/cluster-kube-apiserver-operator#1199). openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted the proposal from MCO-guards to KAS-guards, so I'm not entirely clear on why the MCO guards landed at all.

But it's convenient for me that they did, because while I'm dropping them here, I'm recycling the Node lister for a new check. 4.19 is dropping bare-RHEL support, and I want the Node lister to look for RHEL entries like:

  osImage: Red Hat Enterprise Linux 8.6 (Ootpa)

but we are ok with RHCOS entries like:

  osImage: Red Hat Enterprise Linux CoreOS 419.96.202503032242-0
The kubelet skew guard is from 1471d2c (Bug 1986453: Check for API server and node versions skew, 2021-07-27, openshift#2658). But the Kube API server also landed a similar guard in openshift/cluster-kube-apiserver-operator@9ce4f74775 (add KubeletVersionSkewController, 2021-08-26, openshift/cluster-kube-apiserver-operator#1199). openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted the proposal from MCO-guards to KAS-guards, so I'm not clear on why the MCO guard landed. This commit drops it, to consolidate around the KAS-side guard.
The kubelet skew guards are from 1471d2c (Bug 1986453: Check for API server and node versions skew, 2021-07-27, openshift#2658). But the Kube API server also landed similar guards in openshift/cluster-kube-apiserver-operator@9ce4f74775 (add KubeletVersionSkewController, 2021-08-26, openshift/cluster-kube-apiserver-operator#1199). openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted the proposal from MCO-guards to KAS-guards, so I'm not entirely clear on why the MCO guards landed at all.

But it's convenient for me that they did, because while I'm dropping them here, I'm recycling the Node lister for a new check. 4.19 is dropping bare, package-managed RHEL support. I'd initially thought about looking for RHEL entries like:

  osImage: Red Hat Enterprise Linux 8.6 (Ootpa)

while excluding RHCOS entries like:

  osImage: Red Hat Enterprise Linux CoreOS 419.96.202503032242-0

But instead of switching on osImage, I'm using the node.openshift.io/os_id label to find package-managed RHEL Nodes. The machine-config operator is setting up the label [1] based on the ID value in /etc/os-release. On RHCOS instances, the ID value is 'rhcos' [2]. On package-managed RHEL, it's 'rhel' [3,4].

[1]: https://github.com/openshift/machine-config-operator/blob/ddc18e84f4a0650e0e87aa0a4f90f9cf01b5259c/templates/worker/01-worker-kubelet/_base/units/kubelet.service.yaml#L19-L31
[2]: https://github.com/openshift/os/blob/41f6a028d37b750db0bf4257447d809bd9cbe4bf/manifest-ocp-rhel-9.6.yaml#L41
[3]: https://github.com/openshift/enhancements/blob/ea465e192bfb58ec8654f1c904a4af68777f68ec/enhancements/rhcos/split-rhcos-into-layers.md?plain=1#L416
[4]: https://github.com/openshift/machine-config-operator/blob/ddc18e84f4a0650e0e87aa0a4f90f9cf01b5259c/pkg/daemon/osrelease/osrelease.go#L69
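The label-based check described in that commit message can be sketched roughly as below. This is only an illustration, not the code from the commit: the `node` struct is a hypothetical stand-in for a Kubernetes Node object, and `packageManagedRHEL` is a made-up helper name. Only the label key and the 'rhel' / 'rhcos' values come from the message above.

```go
package main

import "fmt"

// osIDLabel is the label key named in the commit message.
const osIDLabel = "node.openshift.io/os_id"

// node is a hypothetical stand-in for the parts of a Kubernetes Node
// that this check needs (assumption: not the real API type).
type node struct {
	Name   string
	Labels map[string]string
}

// packageManagedRHEL returns the names of nodes whose os_id label is "rhel".
// Nodes labeled "rhcos", and nodes without the label, are not reported.
func packageManagedRHEL(nodes []node) []string {
	var names []string
	for _, n := range nodes {
		if n.Labels[osIDLabel] == "rhel" {
			names = append(names, n.Name)
		}
	}
	return names
}

func main() {
	nodes := []node{
		{Name: "worker-a", Labels: map[string]string{osIDLabel: "rhcos"}},
		{Name: "worker-b", Labels: map[string]string{osIDLabel: "rhel"}},
		{Name: "worker-c", Labels: map[string]string{}},
	}
	fmt.Println(packageManagedRHEL(nodes))
}
```

Matching on the label value avoids string-matching osImage, where the RHEL and RHCOS entries share the same "Red Hat Enterprise Linux" prefix.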
Add KubeletVersionSkewController in support of enhancement user story: APIServer - Enforce OpenShift's defined kubelet version skew policies
Condition
Reasons
Kubelet minor version skew limits
Upgradeable=False.
Upgradeable=True.
Supported kubelet minor versions
A restatement of the kubelet minor version skew limits above as supported kubelet minor versions:
Upstream kubelet compatibility
For reference, here is the upstream specification for kubelet/api server compatibility:
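As a rough illustration of what a minor-version skew check involves, here is a minimal sketch. It assumes version strings shaped like `v1.21.1+abc`, compares minor versions only, and takes the allowed skew as a parameter rather than hard-coding a limit. The helper names `minorOf` and `skewOK` are hypothetical and are not taken from the controller in this pull request. The one rule built in is the upstream one that the kubelet must not be newer than kube-apiserver.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minorOf extracts the minor version from strings such as "v1.21.1+abc" or "1.22.0".
// Assumption: the major version is ignored and the minor field is a plain integer.
func minorOf(version string) (int, error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(parts) < 2 {
		return 0, fmt.Errorf("cannot parse minor version from %q", version)
	}
	return strconv.Atoi(parts[1])
}

// skewOK reports whether the kubelet is not newer than the API server and is
// at most maxOlder minor versions behind it. maxOlder is supplied by the caller.
func skewOK(apiServer, kubelet string, maxOlder int) (bool, error) {
	a, err := minorOf(apiServer)
	if err != nil {
		return false, err
	}
	k, err := minorOf(kubelet)
	if err != nil {
		return false, err
	}
	skew := a - k
	return skew >= 0 && skew <= maxOlder, nil
}

func main() {
	ok, err := skewOK("v1.22.1", "v1.21.0+abc", 1)
	fmt.Println(ok, err)
}
```

Keeping the limit as a parameter lets the same helper express different policies, for example a stricter limit for some releases than for others.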