Bug 1828250: Sync kublelet config across platforms #1817
This came up because of a bug report showing that workloads were scheduled on masters on the baremetal platform regardless of the schedulableMasters scheduler configuration. This is due to the NoSchedule taint not being applied by default on this platform. The baremetal platform-specific kubelet unit caused this behavior. We were going to remove the custom kubelet config once this behavior was configurable (see PR openshift#993), but it was never removed because we ended up needing to make some IPv6-related customizations in this file. We also forgot to re-add the default taint.

Meanwhile, some other changes to the kubelet unit were not applied to the baremetal version. I also checked the openstack and vsphere files and found discrepancies there as well.

This is the simplest fix, which is to get these files in sync again. The differences are very minor, so a better follow-up would be to get back to a single kubelet unit, or at least share the duplicated content somehow. I'm leaving that further cleanup for another change, since the most straightforward fix will be the simpler one to backport.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1828250
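For anyone who wants to check for this kind of drift locally, here is a rough sketch (not part of this PR) that compares the command-line flags found in each platform's kubelet unit against a baseline unit. The unit contents, the `/usr/bin/kubelet` path, the `--address=::` customization, and the helper names `kubelet_flags` and `report_drift` are all hypothetical and only stand in for the real template files; `--register-with-taints` is used as an assumed example of how the default taint could be set.

```python
import re

def kubelet_flags(unit_text: str) -> set[str]:
    """Return the names of all --flags that appear in a unit file's text."""
    # Join backslash-continued lines so a multi-line ExecStart reads as one line.
    joined = unit_text.replace("\\\n", " ")
    # A flag is "--name" preceded by whitespace (or the start of the text).
    return set(re.findall(r"(?<!\S)--([A-Za-z0-9-]+)", joined))

def report_drift(units: dict[str, str], baseline: str) -> dict[str, set[str]]:
    """For each non-baseline unit, list baseline flags that it is missing."""
    base = kubelet_flags(units[baseline])
    return {
        name: base - kubelet_flags(text)
        for name, text in units.items()
        if name != baseline
    }

# Hypothetical, heavily trimmed unit contents, for illustration only.
UNITS = {
    "default": r"""
[Service]
ExecStart=/usr/bin/kubelet \
  --config=/etc/kubernetes/kubelet.conf \
  --register-with-taints=node-role.kubernetes.io/master=:NoSchedule
""",
    "baremetal": r"""
[Service]
ExecStart=/usr/bin/kubelet \
  --config=/etc/kubernetes/kubelet.conf \
  --address=::
""",
}

if __name__ == "__main__":
    for platform, missing in report_drift(UNITS, "default").items():
        print(platform, "is missing:", sorted(missing))
```

This only compares flag names, so it would not catch a flag that is present with a different value or differences elsewhere in the unit; a plain `diff` of the files is still the more complete check.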
|
/retitle Bug 1828250: Sync kublelet config across platforms |
|
@russellb: This pull request references Bugzilla bug 1828250, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
|
/test e2e-openstack |
|
/test e2e-vsphere |
kikisdeliveryservice
left a comment
These changes make sense to me. Thanks for opening this @russellb !!!
|
we should cherrypick to 4.5 as well @russellb ? |
yes, and the 4.5 bug is 1846503 |
|
/cherry-pick release-4.5 |
|
@russellb: once the present PR merges, I will cherry-pick it on top of release-4.5 in a new PR and assign it to you. |
|
/test e2e-openstack |
|
/test e2e-aws |
|
Yeah we desperately need kubelet to grow |
|
/retest |
|
@russellb what's the status here? Can we go ahead and merge, since this has a dependent 4.5 BZ?
/approve |
|
/retest |
yes, merge away! |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cgwalters, kikisdeliveryservice, runcom, russellb |
|
/test e2e-gcp-upgrade |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
@russellb: All pull requests linked via external trackers have merged: openshift/machine-config-operator#1817. Bugzilla bug 1828250 has been moved to the MODIFIED state. |
|
@russellb: new pull request created: #1835 |
The patch at openshift#1817 addressed the drift of kubelet configuration for control plane nodes, but didn't update the compute nodes' kubelet units to catch up with changes made to the main kubelet unit. We're currently missing a number of fixes for the worker nodes on the Baremetal, OpenStack and vSphere platforms:
- https://bugzilla.redhat.com/show_bug.cgi?id=1823967
- https://bugzilla.redhat.com/show_bug.cgi?id=1806027
- https://bugzilla.redhat.com/show_bug.cgi?id=1828622
This change syncs the custom kubelet unit files across platforms.