@sanchezl
Contributor

@sanchezl sanchezl commented Aug 2, 2021

Add KubeletVersionSkewController in support of enhancement user story: APIServer - Enforce OpenShift's defined kubelet version skew policies

Condition

// KubeletVersionSkewController sets Upgradeable=False if the kubelet
// version on a node prevents upgrading to a supported OCP version.
//
// For odd OCP minor versions, kubelet versions 0 or 1 minor version
// behind the API server version are supported.
//
// For even OCP minor versions, kubelet versions 0, 1, or 2 minor
// versions behind the API server version are supported.
const KubletVersionSkewLimitUpgradeable = "KubletVersionSkewLimitUpgradeable"

Reasons

| Reason | Upgradeable? | Description |
| --- | --- | --- |
| KubeletVersionUnknown | Unknown | Error determining kubelet version. |
| KubeletMinorVersionsSynced | True | Kubelet and API server minor versions are in sync. |
| KubeletMinorVersionSupportedNextUpgrade | True | Kubelet version on a node will be supported in the next OCP minor version upgrade. |
| KubeletMinorVersionUnsupportedNextUpgrade | False | Kubelet version on a node will not be supported in the next OCP minor version upgrade. |
| KubeletMinorVersionUnsupported | False | Unsupported kubelet minor version on a node is too far behind the expected API server version. |
| KubeletMinorVersionAhead | Unknown | Unsupported kubelet minor version on a node is newer than the expected API server version. |

Kubelet minor version skew limits

  • If OCP minor version is odd, kubelet versions older than 1 minor version of the API server version set Upgradeable=False.
  • If OCP minor version is even, kubelet versions must be in sync with API server version in order to set Upgradeable=True.

Supported kubelet minor versions

A restatement of the kubelet minor version skew limits above, expressed as supported kubelet minor versions:

  • If OCP minor version is odd, kubelet versions 0 or 1 minor version behind the API server version are supported.
  • If OCP minor version is even, kubelet versions 0, 1, or 2 minor versions behind the API server version are supported.
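
The two lists above can be combined into a small decision function. The following is only a sketch of how I read the policy, not the controller's actual code: the names `supportedSkew` and `skewReason` are hypothetical, both versions are assumed to be expressed as OCP minor numbers (for example 9 for 4.9), and the order in which the reasons are checked is my own assumption.

```go
package main

import "fmt"

// supportedSkew returns how many minor versions a kubelet may lag behind
// the API server for the given OCP minor version: 2 for even, 1 for odd.
func supportedSkew(ocpMinor int) int {
	if ocpMinor%2 == 0 {
		return 2
	}
	return 1
}

// skewReason maps an API server minor version and a kubelet minor version
// (both as OCP minor numbers, an assumption of this sketch) to one of the
// reasons listed in the description.
func skewReason(apiMinor, kubeletMinor int) string {
	skew := apiMinor - kubeletMinor
	switch {
	case skew < 0:
		// Kubelet is newer than the API server.
		return "KubeletMinorVersionAhead"
	case skew > supportedSkew(apiMinor):
		// Already outside the window for the current OCP minor version.
		return "KubeletMinorVersionUnsupported"
	case skew == 0:
		return "KubeletMinorVersionsSynced"
	case skew+1 > supportedSkew(apiMinor+1):
		// After the next minor upgrade the skew grows by one and would
		// fall outside the window of the next OCP minor version.
		return "KubeletMinorVersionUnsupportedNextUpgrade"
	default:
		return "KubeletMinorVersionSupportedNextUpgrade"
	}
}

func main() {
	fmt.Println(skewReason(9, 8))  // KubeletMinorVersionSupportedNextUpgrade
	fmt.Println(skewReason(10, 9)) // KubeletMinorVersionUnsupportedNextUpgrade
	fmt.Println(skewReason(10, 8)) // KubeletMinorVersionUnsupportedNextUpgrade
	fmt.Println(skewReason(9, 7))  // KubeletMinorVersionUnsupported
}
```

Under these assumptions, an API server at 4.10 with kubelets still at 4.8 is supported today (even minor, two behind) but yields KubeletMinorVersionUnsupportedNextUpgrade, because the skew would become three after moving to 4.11.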

Upstream kubelet compatibility

For reference, here is the upstream specification for kubelet/api server compatibility:

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Aug 2, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 2, 2021

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot requested review from soltysh and sttts August 2, 2021 20:25
@sanchezl sanchezl force-pushed the kubelet-version-skew branch from d00cb9b to 208ebf1 Compare August 3, 2021 19:56
@sanchezl
Contributor Author

sanchezl commented Aug 4, 2021

/test all

@sanchezl sanchezl force-pushed the kubelet-version-skew branch from 208ebf1 to e67f925 Compare August 4, 2021 15:54
@sanchezl
Contributor Author

sanchezl commented Aug 4, 2021

/test all

@sanchezl sanchezl force-pushed the kubelet-version-skew branch 3 times, most recently from 4d4d707 to b51007f Compare August 6, 2021 13:51
@sanchezl sanchezl marked this pull request as ready for review August 6, 2021 15:15
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Aug 6, 2021
@sanchezl
Contributor Author

sanchezl commented Aug 6, 2021

/retest

@sanchezl
Contributor Author

sanchezl commented Aug 7, 2021

/retest

@sanchezl
Contributor Author

/retest

@sanchezl sanchezl force-pushed the kubelet-version-skew branch from b51007f to 81714f5 Compare August 16, 2021 16:19
@sanchezl sanchezl force-pushed the kubelet-version-skew branch 2 times, most recently from 75efab5 to 48d70e4 Compare August 16, 2021 17:39
@tkashem
Contributor

tkashem commented Aug 17, 2021

LGTM

(will let others give their feedback)

@sanchezl sanchezl force-pushed the kubelet-version-skew branch 2 times, most recently from 4d0aa4e to 20ca271 Compare August 17, 2021 21:00
@sanchezl
Contributor Author

Reviewers, please see the description for the change to a skew policy based on even/odd OCP versions. (FYI @sdodson)

@openshift-ci
Contributor

openshift-ci bot commented Sep 4, 2021

@sanchezl: All pull requests linked via external trackers have merged:

Bugzilla bug 1998552 has been moved to the MODIFIED state.

Details

In response to this:

Bug 1998552: Enforce OpenShift's defined kubelet version skew policies

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@sanchezl: new pull request created: #1223

Details

In response to this:

/cherry-pick release-4.7

@openshift-cherrypick-robot

@sanchezl: new pull request created: #1224

Details

In response to this:

/cherry-pick release-4.8
/cherry-pick release-4.7

wking added a commit to wking/origin that referenced this pull request Oct 19, 2021
…xtUpgrade

Before running the post-update unpause monitor.  Without this,
4.8->4.9->4.10 jobs are updating successfully to 4.10, unpausing
compute, running the post-unpause compute-settling monitor, and
failing on [1,2]:

  : [sig-arch][Early] Managed cluster should start all core operators [Skipped:Disconnected] [Suite:openshift/conformance/parallel]	0s
  fail [github.com/onsi/ginkgo@v4.7.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Oct 17 23:28:57.284: Some cluster operators are not ready: kube-apiserver (Upgradeable=False KubeletMinorVersion_KubeletMinorVersionUnsupportedNextUpgrade: KubeletMinorVersionUpgradeable: Kubelet minor versions on nodes ip-10-0-135-91.ec2.internal, ip-10-0-168-151.ec2.internal, and ip-10-0-192-244.ec2.internal will not be supported in the next OpenShift minor version upgrade.)

With this change, that skew guard from [3] is allowed to trip early in
the suite.  But if it trips for long enough to set off alerts, we'd
still fail on that.

[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused/1449821870344900608
[3]: openshift/cluster-kube-apiserver-operator#1199
wking added a commit to wking/origin that referenced this pull request Oct 20, 2021
…xtUpgrade

Before running the post-update unpause monitor.  Without this,
4.8->4.9->4.10 jobs are updating successfully to 4.10, unpausing
compute, running the post-unpause compute-settling monitor, and
failing on [1,2]:

  : [sig-arch][Early] Managed cluster should start all core operators [Skipped:Disconnected] [Suite:openshift/conformance/parallel]	0s
  fail [github.com/onsi/ginkgo@v4.7.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Oct 17 23:28:57.284: Some cluster operators are not ready: kube-apiserver (Upgradeable=False KubeletMinorVersion_KubeletMinorVersionUnsupportedNextUpgrade: KubeletMinorVersionUpgradeable: Kubelet minor versions on nodes ip-10-0-135-91.ec2.internal, ip-10-0-168-151.ec2.internal, and ip-10-0-192-244.ec2.internal will not be supported in the next OpenShift minor version upgrade.)

With this change, that skew guard from [3] is allowed to trip early in
the suite.  But if it trips for long enough to set off alerts, we'd
still fail on that.

Personally, I'd much rather have the motivational message in the
commit, and not in the inline comment.  Folks could answer "why is
this line here?" with 'git blame ...'.  But David wanted an inline
comment [4], so that's what I've done here.

[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused/1449821870344900608
[3]: openshift/cluster-kube-apiserver-operator#1199
[4]: openshift#26531 (comment)
jsafrane pushed a commit to jsafrane/origin that referenced this pull request Nov 11, 2021
…xtUpgrade

Before running the post-update unpause monitor.  Without this,
4.8->4.9->4.10 jobs are updating successfully to 4.10, unpausing
compute, running the post-unpause compute-settling monitor, and
failing on [1,2]:

  : [sig-arch][Early] Managed cluster should start all core operators [Skipped:Disconnected] [Suite:openshift/conformance/parallel]	0s
  fail [github.com/onsi/ginkgo@v4.7.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Oct 17 23:28:57.284: Some cluster operators are not ready: kube-apiserver (Upgradeable=False KubeletMinorVersion_KubeletMinorVersionUnsupportedNextUpgrade: KubeletMinorVersionUpgradeable: Kubelet minor versions on nodes ip-10-0-135-91.ec2.internal, ip-10-0-168-151.ec2.internal, and ip-10-0-192-244.ec2.internal will not be supported in the next OpenShift minor version upgrade.)

With this change, that skew guard from [3] is allowed to trip early in
the suite.  But if it trips for long enough to set off alerts, we'd
still fail on that.

Personally, I'd much rather have the motivational message in the
commit, and not in the inline comment.  Folks could answer "why is
this line here?" with 'git blame ...'.  But David wanted an inline
comment [4], so that's what I've done here.

[1]: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-upgrade-from-stable-4.8-e2e-aws-upgrade-paused/1449821870344900608
[3]: openshift/cluster-kube-apiserver-operator#1199
[4]: openshift#26531 (comment)
wking added a commit to wking/machine-config-operator that referenced this pull request Mar 26, 2025
The kubelet skew guards are from 1471d2c (Bug 1986453: Check for
API server and node versions skew, 2021-07-27, openshift#2658).  But the Kube
API server also landed similar guards in
openshift/cluster-kube-apiserver-operator@9ce4f74775 (add
KubeletVersionSkewController, 2021-08-26,
openshift/cluster-kube-apiserver-operator#1199).
openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce
skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted
the proposal from MCO-guards to KAS-guards, so I'm not entirely clear
on why the MCO guards landed at all.  But it's convenient for me that
they did, because while I'm dropping them here, I'm recycling the Node
lister for a new check.

4.19 is dropping bare-RHEL support, and I want the Node lister to look
for RHEL entries like:

  osImage: Red Hat Enterprise Linux 8.6 (Ootpa)

but we are ok with RHCOS entries like:

  osImage: Red Hat Enterprise Linux CoreOS 419.96.202503032242-0
wking added a commit to wking/machine-config-operator that referenced this pull request Mar 27, 2025
The kubelet skew guards are from 1471d2c (Bug 1986453: Check for
API server and node versions skew, 2021-07-27, openshift#2658).  But the Kube
API server also landed similar guards in
openshift/cluster-kube-apiserver-operator@9ce4f74775 (add
KubeletVersionSkewController, 2021-08-26,
openshift/cluster-kube-apiserver-operator#1199).
openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce
skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted
the proposal from MCO-guards to KAS-guards, so I'm not entirely clear
on why the MCO guards landed at all.  But it's convenient for me that
they did, because while I'm dropping them here, I'm recycling the Node
lister for a new check.

4.19 is dropping bare-RHEL support, and I want the Node lister to look
for RHEL entries like:

  osImage: Red Hat Enterprise Linux 8.6 (Ootpa)

but we are ok with RHCOS entries like:

  osImage: Red Hat Enterprise Linux CoreOS 419.96.202503032242-0
wking added a commit to wking/machine-config-operator that referenced this pull request Apr 3, 2025
The kubelet skew guard is from 1471d2c (Bug 1986453: Check for API
server and node versions skew, 2021-07-27, openshift#2658).  But the Kube API
server also landed a similar guard in
openshift/cluster-kube-apiserver-operator@9ce4f74775 (add
KubeletVersionSkewController, 2021-08-26,
openshift/cluster-kube-apiserver-operator#1199).
openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce
skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted
the proposal from MCO-guards to KAS-guards, so I'm not clear on why
the MCO guard landed.  This commit drops it, to consolidate around the
KAS-side guard.
wking added a commit to wking/machine-config-operator that referenced this pull request Apr 3, 2025
The kubelet skew guards are from 1471d2c (Bug 1986453: Check for
API server and node versions skew, 2021-07-27, openshift#2658).  But the Kube
API server also landed similar guards in
openshift/cluster-kube-apiserver-operator@9ce4f74775 (add
KubeletVersionSkewController, 2021-08-26,
openshift/cluster-kube-apiserver-operator#1199).
openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce
skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted
the proposal from MCO-guards to KAS-guards, so I'm not entirely clear
on why the MCO guards landed at all.  But it's convenient for me that
they did, because while I'm dropping them here, I'm recycling the Node
lister for a new check.

4.19 is dropping bare-RHEL support, and I want the Node lister to look
for RHEL entries like:

  osImage: Red Hat Enterprise Linux 8.6 (Ootpa)

but we are ok with RHCOS entries like:

  osImage: Red Hat Enterprise Linux CoreOS 419.96.202503032242-0
wking added a commit to wking/machine-config-operator that referenced this pull request Apr 4, 2025
The kubelet skew guard is from 1471d2c (Bug 1986453: Check for API
server and node versions skew, 2021-07-27, openshift#2658).  But the Kube API
server also landed a similar guard in
openshift/cluster-kube-apiserver-operator@9ce4f74775 (add
KubeletVersionSkewController, 2021-08-26,
openshift/cluster-kube-apiserver-operator#1199).
openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce
skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted
the proposal from MCO-guards to KAS-guards, so I'm not clear on why
the MCO guard landed.  This commit drops it, to consolidate around the
KAS-side guard.
wking added a commit to wking/machine-config-operator that referenced this pull request Apr 15, 2025
The kubelet skew guards are from 1471d2c (Bug 1986453: Check for
API server and node versions skew, 2021-07-27, openshift#2658).  But the Kube
API server also landed similar guards in
openshift/cluster-kube-apiserver-operator@9ce4f74775 (add
KubeletVersionSkewController, 2021-08-26,
openshift/cluster-kube-apiserver-operator#1199).
openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce
skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted
the proposal from MCO-guards to KAS-guards, so I'm not entirely clear
on why the MCO guards landed at all.  But it's convenient for me that
they did, because while I'm dropping them here, I'm recycling the Node
lister for a new check.

4.19 is dropping bare, package-managed RHEL support.  I'd initially
thought about looking for RHEL entries like:

  osImage: Red Hat Enterprise Linux 8.6 (Ootpa)

while excluding RHCOS entries like:

  osImage: Red Hat Enterprise Linux CoreOS 419.96.202503032242-0

But instead of switching on osImage, I'm using the
node.openshift.io/os_id label to find package-managed RHEL Nodes.  The
machine-config operator is setting up the label [1] based on the ID
value in /etc/os-release.  On RHCOS instances, the ID value is 'rhcos'
[2].  On package-managed RHEL, it's 'rhel' [3,4].

[1]: https://github.com/openshift/machine-config-operator/blob/ddc18e84f4a0650e0e87aa0a4f90f9cf01b5259c/templates/worker/01-worker-kubelet/_base/units/kubelet.service.yaml#L19-L31
[2]: https://github.com/openshift/os/blob/41f6a028d37b750db0bf4257447d809bd9cbe4bf/manifest-ocp-rhel-9.6.yaml#L41
[3]: https://github.com/openshift/enhancements/blob/ea465e192bfb58ec8654f1c904a4af68777f68ec/enhancements/rhcos/split-rhcos-into-layers.md?plain=1#L416
[4]: https://github.com/openshift/machine-config-operator/blob/ddc18e84f4a0650e0e87aa0a4f90f9cf01b5259c/pkg/daemon/osrelease/osrelease.go#L69
wking added a commit to wking/machine-config-operator that referenced this pull request Apr 29, 2025
The kubelet skew guards are from 1471d2c (Bug 1986453: Check for
API server and node versions skew, 2021-07-27, openshift#2658).  But the Kube
API server also landed similar guards in
openshift/cluster-kube-apiserver-operator@9ce4f74775 (add
KubeletVersionSkewController, 2021-08-26,
openshift/cluster-kube-apiserver-operator#1199).
openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce
skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted
the proposal from MCO-guards to KAS-guards, so I'm not entirely clear
on why the MCO guards landed at all.  But it's convenient for me that
they did, because while I'm dropping them here, I'm recycling the Node
lister for a new check.

4.19 is dropping bare, package-managed RHEL support.  I'd initially
thought about looking for RHEL entries like:

  osImage: Red Hat Enterprise Linux 8.6 (Ootpa)

while excluding RHCOS entries like:

  osImage: Red Hat Enterprise Linux CoreOS 419.96.202503032242-0

But instead of switching on osImage, I'm using the
node.openshift.io/os_id label to find package-managed RHEL Nodes.  The
machine-config operator is setting up the label [1] based on the ID
value in /etc/os-release.  On RHCOS instances, the ID value is 'rhcos'
[2].  On package-managed RHEL, it's 'rhel' [3,4].

[1]: https://github.com/openshift/machine-config-operator/blob/ddc18e84f4a0650e0e87aa0a4f90f9cf01b5259c/templates/worker/01-worker-kubelet/_base/units/kubelet.service.yaml#L19-L31
[2]: https://github.com/openshift/os/blob/41f6a028d37b750db0bf4257447d809bd9cbe4bf/manifest-ocp-rhel-9.6.yaml#L41
[3]: https://github.com/openshift/enhancements/blob/ea465e192bfb58ec8654f1c904a4af68777f68ec/enhancements/rhcos/split-rhcos-into-layers.md?plain=1#L416
[4]: https://github.com/openshift/machine-config-operator/blob/ddc18e84f4a0650e0e87aa0a4f90f9cf01b5259c/pkg/daemon/osrelease/osrelease.go#L69
umohnani8 pushed a commit to umohnani8/machine-config-operator that referenced this pull request Aug 6, 2025
The kubelet skew guard is from 1471d2c (Bug 1986453: Check for API
server and node versions skew, 2021-07-27, openshift#2658).  But the Kube API
server also landed a similar guard in
openshift/cluster-kube-apiserver-operator@9ce4f74775 (add
KubeletVersionSkewController, 2021-08-26,
openshift/cluster-kube-apiserver-operator#1199).
openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce
skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted
the proposal from MCO-guards to KAS-guards, so I'm not clear on why
the MCO guard landed.  This commit drops it, to consolidate around the
KAS-side guard.
umohnani8 pushed a commit to umohnani8/machine-config-operator that referenced this pull request Oct 31, 2025
The kubelet skew guard is from 1471d2c (Bug 1986453: Check for API
server and node versions skew, 2021-07-27, openshift#2658).  But the Kube API
server also landed a similar guard in
openshift/cluster-kube-apiserver-operator@9ce4f74775 (add
KubeletVersionSkewController, 2021-08-26,
openshift/cluster-kube-apiserver-operator#1199).
openshift/enhancements@0ba744e750 (eus-upgrades-mvp: don't enforce
skew check in MCO, 2021-04-29, openshift/enhancements#762) had shifted
the proposal from MCO-guards to KAS-guards, so I'm not clear on why
the MCO guard landed.  This commit drops it, to consolidate around the
KAS-side guard.
Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-medium: Referenced Bugzilla bug's severity is medium for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.

9 participants