Add priority value tests for default priorityClassNames #30268
CoreyCook8 wants to merge 9 commits into openshift:main
Conversation
Hi @CoreyCook8. Thanks for your PR. I'm waiting for an openshift member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test` on its own line. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: CoreyCook8. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment. The full list of commands accepted by this bot can be found here.
/ok-to-test
Job Failure Risk Analysis for sha: 753d94c
Risk analysis has seen new tests most likely introduced by this PR. New Test Risks for sha: 753d94c
New tests seen in this PR at sha: 753d94c
```go
It("system-node-critical=2000001000", func() {
	By("creating the pods")
	err := oc.Run("create").Args("-f", systemNodeCriticalPodFile).Execute()
```
Each pod needs to be cleaned up/deleted after it's used. Also, it would be better to create both pods in a temporary namespace.
Job Failure Risk Analysis for sha: 389b19a
Risk analysis has seen new tests most likely introduced by this PR. New Test Risks for sha: 389b19a
New tests seen in this PR at sha: 389b19a
Job Failure Risk Analysis for sha: 47ce8ab
Risk analysis has seen new tests most likely introduced by this PR. New tests seen in this PR at sha: 47ce8ab
```yaml
spec:
  containers:
  - name: busybox
    image: busybox
```
The busybox image has not been used in the tests before. Could you please try image-registry.openshift-image-registry.svc:5000/openshift/tools:latest instead, to reduce the number of pulled images? E.g. https://github.com/openshift/origin/blob/main/test/extended/testdata/cmd/test/cmd/testdata/rollingupdate-daemonset.yaml#L30-L31 runs the sleep command through tools:latest as well.
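A hedged sketch of the suggested swap, mirroring the sleep-through-tools pattern in the linked rollingupdate-daemonset.yaml; the container name and sleep duration are illustrative, only the image reference comes from the review:

```yaml
# Same pod spec, using the shared tools image instead of busybox.
spec:
  containers:
  - name: sleeper
    image: image-registry.openshift-image-registry.svc:5000/openshift/tools:latest
    command: ["sleep"]
    args: ["8h"]
```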
@CoreyCook8 other than that, this looks good to go. @bertinatto would you please take a look for the approval?
@ingvagabund @CoreyCook8 I’m deferring approvals in this repo to TRT so they’re aware of new tests being added. This makes it easier to track if the tests start failing.
Please do not include specific values in test names (2000000000 / 2000001000), as they're subject to change (even if extremely unlikely). You can include the value in the output instead. |
Job Failure Risk Analysis for sha: 84ad3df
Risk analysis has seen new tests most likely introduced by this PR. New tests seen in this PR at sha: 84ad3df
👋 @ingvagabund @stbenjam I believe I have addressed all of the comments, if you don't mind taking another look!
It seems like a good idea to have some end-to-end coverage of the priority admission plugin, but have you tried to submit these upstream first? It's unclear to me why we'd want to tie them to downstream. My understanding is that the mirror pods of OpenShift's static pods do get admitted normally and have the numeric priority set, it's just that Kubelet acts based on the static pod itself rather than its mirror. Is that right?
The purpose of these tests is to ensure that the priority-class-name does not diverge from the priority we are setting on the static pods. So it's not necessarily about whether the priority admission plugin works, but rather that the constant numbers we are assuming for these priority class names don't change, and if they do change, we catch that first in a test.
Will the mirror pods fail admission in that case? IIRC one of the behaviors of the priority admission plugin is to reject pods with both number and name set if the number doesn't match.
We could quickly test that by creating a proof PR similar to this one, setting the priority to a different value. If bootstrapping fails, then we don't need a test here, but it might still be worth proposing it upstream.

Edit: it seems like Ben is right: https://github.com/openshift/kubernetes/blob/72c39d96fb2f38c92a220bf269304e508ecaaac5/plugin/pkg/admission/priority/admission.go#L188

By the way, do we have a bug associated with this? Or is this proactively trying to avoid the problem described in the upstream issue?
If that's the case, perhaps we don't need any new test for this?

And I have a support case with RH support. But the gist of the issue we have is: for SNO OpenShift, Ceph doesn't shut down correctly. We have graceful shutdown enabled, but once the apiserver/etcd go down, obviously all of the apiserver calls fail and things don't end up shutting down gracefully. Then the host hangs while shutting down because of the leftover PVCs that never got properly unmounted. I believe it's a combination of apiserver calls and KCM state-of-the-world that causes this, but I was able to do a POC in a dev environment showing that setting the priority explicitly prevents the static pods from shutting down early and allowed us to shut down properly.
@benluddy @bertinatto What do you think? Should I close this PR?
@CoreyCook8: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:

Full PR test history. Your PR dashboard.
Job Failure Risk Analysis for sha: 9f18d74
Risk analysis has seen new tests most likely introduced by this PR. New tests seen in this PR at sha: 9f18d74
@benluddy @bertinatto Pinging here again. Any opinion on this? |
IMO we don't need this test, because admission should fail if the pod contains a priority that's different from the one from the priority class.
It seems that support case wouldn’t have happened if we had a test asserting that the priority of control-plane mirror pods was properly set. Now that you’ve fixed that, adding a test for it might be useful (but that's up to you).
I think the mirror pods were always setting the numeric priority via admission -- it's the static pod manifests themselves that needed it. IMO, we don't need to maintain a test like this downstream, since part of the behavior of the priority admission plugin is to reject pod creation when the numeric priority and the priority class name don't match. If the upstream coverage is poor for that behavior, then it would benefit everyone to test it upstream. |
Okay, I think we're on the same page here @benluddy @bertinatto. We are still waiting for approvals on these PRs to resolve the issue. @ingvagabund do you mind taking a look if you agree?
Because of the issue described here: kubernetes/kubernetes#133442
We are setting the priority field on static pods in the PRs:
openshift/cluster-etcd-operator#1476
openshift/cluster-kube-scheduler-operator#572
openshift/cluster-kube-apiserver-operator#1915
openshift/cluster-kube-controller-manager-operator#865
To do this, we need a test which verifies that the expected priority values are in line with the priorityClassName.