Bug 2033751: Return Error when trying to use Scheduler Policy #390
Merged by openshift-merge-robot (1 commit into openshift:master).
Conversation
@damemi: This pull request references Bugzilla bug 2033751, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validations were run on this bug. Requesting review from QA contact.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/hold

/retest
@@ -301,7 +297,7 @@ func managePod_v311_00_to_latest(ctx context.Context, configMapsGetter corev1cli
	configMap.Data["version"] = version.Get().String()
	appliedConfigMap, changed, err := resourceapply.ApplyConfigMap(ctx, configMapsGetter, recorder, configMap)
	if changed && len(config.Spec.Policy.Name) > 0 {
With manageKubeSchedulerConfigMap_v311_00_to_latest returning an error when len(config.Spec.Policy.Name) > 0, the kube-scheduler container will keep crash-looping, since the configmap with the profile will be missing (assuming the scheduler/cluster object was created with .spec.policy.name set during cluster provisioning). Did you take this case into account? Would it make more sense to move the len(config.Spec.Policy.Name) > 0 check alongside the config, err := configSchedulerLister.Get("cluster") line at the beginning of managePod_v311_00_to_latest and return an error there as well, so the pod does not get created and left crash-looping until the policy field is cleared?
The case covers the bootstrapping phase, in which the installation fails when .spec.policy.name is not empty. It is therefore unlikely that an admin will update the scheduler/cluster object rather than re-running the installation, so the net benefit for a normal installation is quite low. On the other hand, in a hypershift topology (one cluster hosting many control planes), postponing the creation of the kube-scheduler pod might save the step of debugging why the pod is crash-looping (depending on how the control plane is provisioned).
My comment is more for debugging purposes than functional/conceptual ones. The operator will go degraded when the policy field is set.
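The early check discussed in this thread could be sketched as follows. This is a minimal sketch, not the operator's actual code: SchedulerSpec and validatePolicySupport are hypothetical stand-ins for the real types in openshift/api and the operator's controller logic.

```go
package main

import (
	"errors"
	"fmt"
)

// SchedulerSpec is a hypothetical stand-in for the relevant part of the
// scheduler/cluster object; the real type lives in openshift/api.
type SchedulerSpec struct {
	PolicyName string
}

// validatePolicySupport fails fast when the removed Policy API is
// referenced, so the static pod is never rendered and the
// kube-scheduler container cannot crash-loop on a missing configmap.
func validatePolicySupport(spec SchedulerSpec) error {
	if len(spec.PolicyName) > 0 {
		return errors.New("scheduler Policy API is no longer supported; clear spec.policy.name")
	}
	return nil
}

func main() {
	fmt.Println(validatePolicySupport(SchedulerSpec{PolicyName: "my-policy"}))
	fmt.Println(validatePolicySupport(SchedulerSpec{}))
}
```

Performing this validation before pod creation, as suggested above, surfaces the misconfiguration as a degraded operator condition rather than a crash-looping container.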
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: damemi, ingvagabund, soltysh. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/retest-required
/retest-required Please review the full test history for this PR and help us cut down flakes.
25 similar comments
@damemi: all tests passed! Full PR test history. Your PR dashboard.
@damemi: Bugzilla bug 2033751 is in an unrecognized state (VERIFIED) and will not be moved to the MODIFIED state.
The scheduler's Policy API was removed in 1.23; see kubernetes/kubernetes#105828.
As an alternative, we could also keep using the default (lownodeutilization) here, in order not to break the scheduler on upgrades. In that case, we would log a message that we are using the default profile instead. Open to ideas.
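That fallback alternative could look like the sketch below. This is an illustration under assumptions: resolveProfile is a hypothetical helper, not operator code, and the default profile name is taken from the comment above.

```go
package main

import (
	"fmt"
	"log"
)

// resolveProfile ignores the removed Policy field instead of returning
// an error: it logs a message and falls back to the default profile so
// the scheduler keeps working across upgrades.
func resolveProfile(policyName string) string {
	if len(policyName) > 0 {
		log.Printf("scheduler Policy %q is no longer supported (removed in 1.23); using the default profile instead", policyName)
	}
	return "LowNodeUtilization"
}

func main() {
	fmt.Println(resolveProfile("my-policy"))
}
```

The trade-off versus returning an error is visibility: the error approach (what this PR implements) makes the misconfiguration loud, while the fallback keeps upgraded clusters scheduling but risks the log message going unnoticed.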