@russellb
Contributor

This came up because of a bug report showing that workloads were
scheduled on masters on the baremetal platform regardless of the
schedulableMasters scheduler configuration. The cause is that the
NoSchedule taint is not applied by default on this platform, because
baremetal uses its own platform-specific kubelet unit. We were
going to remove the custom kubelet config once this behavior was
configurable (see PR #993), but it was never removed because we ended
up needing to make some IPv6-related customizations in this file. We
also forgot to re-add the default taint.
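
A minimal sketch for spotting the symptom on a running cluster, assuming the node list has the usual Kubernetes shape (`items[].metadata.labels`, `items[].spec.taints`) and that masters carry the `node-role.kubernetes.io/master` label and taint key. The function name and the sample data are hypothetical and not part of this PR.

```python
# Assumption: masters are identified by this label, and the default taint
# uses the same key with effect NoSchedule.
MASTER_KEY = "node-role.kubernetes.io/master"


def masters_missing_noschedule(node_list):
    """Return the names of master nodes that lack the master NoSchedule taint."""
    missing = []
    for node in node_list.get("items", []):
        meta = node.get("metadata", {})
        labels = meta.get("labels") or {}
        if MASTER_KEY not in labels:
            continue  # not a master, so the taint is not expected
        taints = node.get("spec", {}).get("taints") or []
        tainted = any(
            t.get("key") == MASTER_KEY and t.get("effect") == "NoSchedule"
            for t in taints
        )
        if not tainted:
            missing.append(meta.get("name", "<unnamed>"))
    return missing


# Made-up sample: one tainted master, one untainted master, one worker.
sample = {
    "items": [
        {
            "metadata": {"name": "master-0", "labels": {MASTER_KEY: ""}},
            "spec": {"taints": [{"key": MASTER_KEY, "effect": "NoSchedule"}]},
        },
        {"metadata": {"name": "master-1", "labels": {MASTER_KEY: ""}}, "spec": {}},
        {
            "metadata": {
                "name": "worker-0",
                "labels": {"node-role.kubernetes.io/worker": ""},
            },
            "spec": {},
        },
    ]
}
print(masters_missing_noschedule(sample))
```

Against a real cluster, the same function could be fed the parsed output of `oc get nodes -o json` (for example via `json.load(sys.stdin)`). An empty result would suggest the default taint is in place.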

Meanwhile, some other changes to the kubelet unit were not applied to
the baremetal version. I also checked the openstack and vsphere files
and found discrepancies there as well.

This is the simplest fix, which is to get these files in sync again.
The differences are very minor, so a better follow-up would be to get
back to a single kubelet unit, or at least share the duplicated
content somehow.
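
One way to catch this kind of drift is to compare the flags passed to the kubelet in each unit file. The sketch below assumes the units pass options as `--name` or `--name=value` tokens on an `ExecStart` line. The helper names and the two sample units are illustrative placeholders, not the actual templates from this repository.

```python
import re

# Matches --flag or --flag=value (the value runs up to the next whitespace).
FLAG_RE = re.compile(r"--[a-z][a-z0-9-]*(?:=\S+)?")


def kubelet_flags(unit_text):
    """Collect the command-line flags that appear in a unit file's text."""
    flags = set()
    for line in unit_text.splitlines():
        # Drop the trailing shell line continuation before matching.
        line = line.strip().rstrip("\\").strip()
        flags.update(FLAG_RE.findall(line))
    return flags


def flag_drift(base_text, platform_text):
    """Return (flags only in base, flags only in platform), each sorted."""
    base = kubelet_flags(base_text)
    platform = kubelet_flags(platform_text)
    return sorted(base - platform), sorted(platform - base)


# Hypothetical unit contents, for illustration only.
BASE_UNIT = r"""
ExecStart=/usr/bin/hyperkube \
    kubelet \
      --config=/etc/kubernetes/kubelet.conf \
      --register-with-taints=node-role.kubernetes.io/master=:NoSchedule \
      --v=3
"""

BAREMETAL_UNIT = r"""
ExecStart=/usr/bin/hyperkube \
    kubelet \
      --config=/etc/kubernetes/kubelet.conf \
      --address=${KUBELET_NODE_IP} \
      --v=3
"""

missing, extra = flag_drift(BASE_UNIT, BAREMETAL_UNIT)
print("missing from platform unit:", missing)
print("only in platform unit:", extra)
```

A check like this treats a changed value as one missing flag plus one extra flag, which is usually what a reviewer wants to see when deciding whether a difference is intentional.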

I'm leaving that further cleanup for a separate change, since the most
straightforward fix will be the simplest one to backport.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1828250

@russellb
Contributor Author

/cc @hardys @kikisdeliveryservice

@russellb
Contributor Author

/retitle Bug 1828250: Sync kublelet config across platforms

@openshift-ci-robot changed the title from "Sync kublelet config across platforms" to "Bug 1828250: Sync kublelet config across platforms" on Jun 11, 2020
@openshift-ci-robot added the labels bugzilla/severity-medium (referenced Bugzilla bug's severity is medium for the branch this PR is targeting) and bugzilla/valid-bug (referenced Bugzilla bug is valid for the branch this PR is targeting) on Jun 11, 2020
@openshift-ci-robot
Contributor

@russellb: This pull request references Bugzilla bug 1828250, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

Bug 1828250: Sync kublelet config across platforms

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@russellb
Contributor Author

/test e2e-openstack

@russellb
Contributor Author

/test e2e-vsphere

@kikisdeliveryservice
Contributor

kikisdeliveryservice left a comment

@openshift-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Jun 11, 2020
@kikisdeliveryservice
Contributor

we should cherrypick to 4.5 as well @russellb ?

@russellb
Contributor Author

> we should cherrypick to 4.5 as well @russellb ?

yes, and the 4.5 bug is 1846503

@russellb
Contributor Author

/cherry-pick release-4.5

@openshift-cherrypick-robot

@russellb: once the present PR merges, I will cherry-pick it on top of release-4.5 in a new PR and assign it to you.

@russellb
Contributor Author

/test e2e-openstack

@russellb
Contributor Author

/test e2e-aws

@cgwalters
Member

Yeah we desperately need kubelet to grow /etc/kubelet.conf.d or so.
/approve

@kikisdeliveryservice
Contributor

/retest

@runcom
Member

runcom commented Jun 16, 2020

@russellb what's the status here? can we go ahead and merge since this has a dependent 4.5 BZ

/approve

@runcom
Member

runcom commented Jun 16, 2020

/retest

@russellb
Contributor Author

> @russellb what's the status here? can we go ahead and merge since this has a dependent 4.5 BZ

yes, merge away!

@runcom
Member

runcom commented Jun 16, 2020

/lgtm

@openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) on Jun 16, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, kikisdeliveryservice, runcom, russellb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Contributor

openshift-ci-robot commented Jun 16, 2020

@russellb: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-aws-scaleup-rhel7 | d044c74 | link | /test e2e-aws-scaleup-rhel7 |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

@russellb
Contributor Author

/test e2e-gcp-upgrade

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot merged commit 64c4296 into openshift:master on Jun 16, 2020
@openshift-ci-robot
Contributor

@russellb: All pull requests linked via external trackers have merged: openshift/machine-config-operator#1817. Bugzilla bug 1828250 has been moved to the MODIFIED state.

@openshift-cherrypick-robot

@russellb: new pull request created: #1835

mandre added a commit to mandre/machine-config-operator that referenced this pull request Sep 1, 2020
The patch at openshift#1817
addressed the drift of kubelet configuration for control plane nodes but
didn't update the compute nodes' kubelet unit to catch up with changes
made to the main kubelet unit.

We're currently missing a number of fixes for the worker nodes on
Baremetal, OpenStack and vSphere platforms:
- https://bugzilla.redhat.com/show_bug.cgi?id=1823967
- https://bugzilla.redhat.com/show_bug.cgi?id=1806027
- https://bugzilla.redhat.com/show_bug.cgi?id=1828622

This change syncs the custom kubelet unit files across platforms.