Modify how to check the status in PodRejectionStatus test #130097
Conversation
Signed-off-by: Ayato Tokubi <[email protected]>
Hi @bitoku. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/ok-to-test
/retest
I'm not familiar with the kubelet part, so I'm unsure whether this commit can resolve the issue. It relies on the pod event to get the old and new pod status. Can you test it in your environment? If it still fails, it is probably not feasible to compare the other fields, i.e. conditions, hostIPs, etc.
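For context, a rough Go sketch of the event-based approach described above, using client-go informers. The names `newPodStatusWatcher` and `onChange` are illustrative, not code from this PR:

```go
package podwatch

import (
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// newPodStatusWatcher sketches the event-based idea: an informer's UpdateFunc
// receives both the old and the new pod object, so both statuses can be
// captured and compared after the kubelet rejects the pod.
func newPodStatusWatcher(cs kubernetes.Interface, onChange func(oldStatus, newStatus v1.PodStatus)) cache.SharedIndexInformer {
	factory := informers.NewSharedInformerFactory(cs, 30*time.Second)
	inf := factory.Core().V1().Pods().Informer()
	inf.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			oldPod, okOld := oldObj.(*v1.Pod)
			newPod, okNew := newObj.(*v1.Pod)
			if okOld && okNew {
				onChange(oldPod.Status, newPod.Status)
			}
		},
	})
	return inf
}
```

The catch, as noted above, is that any field may differ between the two snapshots if something other than the kubelet also writes to the pod status.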
@carlory It didn't work, unfortunately. The purpose of this check is to verify that certain fields, such as QoSClass, are kept after rejection.
/triage accepted
/priority backlog

based on the issue priority
/assign @esotsal
@esotsal You can just put ok-to-test, or you can run it locally. This is how I test it locally:

```bash
export CGROUP_DRIVER=systemd
export CONTAINER_RUNTIME=remote
export CONTAINER_RUNTIME_ENDPOINT=unix:///var/run/crio/crio.sock
cd ~/kubernetes
./hack/local-up-cluster.sh

# in a different shell, fix coredns
export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
k edit cm -n kube-system coredns # delete the loop plugin
k delete pods -n kube-system $(k get pods -n kube-system -ojson | jq -r '.items[0].metadata.name')

# run tests
UUID=$(uuid -v4)
make all "WHAT=test/e2e/e2e.test cmd/kubectl vendor/github.com/onsi/ginkgo/v2/ginkgo"
mkdir -p "_rundir/$UUID"
ln -s $(pwd)/_output/bin/kubectl "_rundir/$UUID"
ln -s $(pwd)/_output/bin/e2e.test "_rundir/$UUID"
ln -s $(pwd)/_output/bin/ginkgo "_rundir/$UUID"
kubetest2 noop --run-id="$UUID" --test=ginkgo --kubeconfig /var/run/kubernetes/admin.kubeconfig -- --focus-regex=PodRejectionStatus --use-built-binaries
```
@bitoku The initial purpose of this e2e test is to detect whether any new fields in Status are dropped by pod rejection. This ensures that the kubelet's admission does not drop fields that are required to be present in the pod's Status. The failing result of this commit proves that the test is not working as expected: we cannot assume that other fields won't change when the latest status of a rejected pod is calculated. I'm not sure whether we still need this test.
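To make the failure mode concrete, here is a rough Go sketch, not the actual test code, of the strict approach being retired: zero the fields expected to change on rejection, then deep-compare the rest. The blanked-field list and the operator condition below are illustrative assumptions:

```go
package main

import (
	"fmt"
	"reflect"

	v1 "k8s.io/api/core/v1"
)

// strictStatusEqual blanks the fields that legitimately change when the
// kubelet rejects a pod, then requires everything else to be deeply equal.
// The set of blanked fields is illustrative, not the test's exact list.
func strictStatusEqual(oldStatus, newStatus v1.PodStatus) bool {
	for _, s := range []*v1.PodStatus{&oldStatus, &newStatus} {
		s.Phase = ""
		s.Reason = ""
		s.Message = ""
		s.StartTime = nil
	}
	return reflect.DeepEqual(oldStatus, newStatus)
}

func main() {
	oldStatus := v1.PodStatus{QOSClass: v1.PodQOSBurstable}
	newStatus := v1.PodStatus{
		Phase:    v1.PodFailed,
		QOSClass: v1.PodQOSBurstable,
		// A hypothetical condition added by an external operator: nothing
		// was dropped by the kubelet, yet the strict comparison fails.
		Conditions: []v1.PodCondition{{Type: "example.com/operator-condition"}},
	}
	fmt.Println(strictStatusEqual(oldStatus, newStatus)) // false
}
```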
LGTM. This PR is enough to fix the e2e failure.
I agree. Thanks for creating the demo PR @bitoku; checking the logs, it demonstrates the issue clearly.

/lgtm

@SergeyKanzhelev moving this to needs-approver, since I understood from last week's SIG Node that this is important to solve soon.
/lgtm
LGTM label has been added.

Git tree hash: 003af392396b06352d81552fe06386e16b9adae3
Pinging @SergeyKanzhelev for approval.
/assign @SergeyKanzhelev

PTAL when you can.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bitoku, mrunalp

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.
/test pull-kubernetes-cmd
Automated cherry pick of #130097: Modify how to check the status in PodRejectionStatus test (…97-upstream-release-1.32)
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Currently, the PodRejectionStatus test assumes that most fields of the new status are identical to the previous status, except for a few specific fields. However, this assumption breaks when an operator modifies the pod once it is bound to a node.
Operators with this behaviour are not rare; for example, ovn-kubernetes does this, and in such an environment the test always fails.
(cf. failure in the openshift org: https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_kubernetes/2189/pull-ci-openshift-kubernetes-release-4.19-k8s-e2e-gcp-ovn/1884690562456489984)
This PR changes the comparison and relaxes the restriction so that the e2e test succeeds in those environments.
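To illustrate the relaxed direction, a minimal Go sketch, assuming only the rejection outcome and a couple of kubelet-owned fields are asserted; the actual fields checked by this PR may differ:

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// relaxedRejectionCheck asserts only invariants that must hold for a rejected
// pod (a Failed phase, a reason, and a preserved QoS class), leaving fields
// that an operator may mutate unconstrained.
func relaxedRejectionCheck(oldStatus, newStatus v1.PodStatus) error {
	if newStatus.Phase != v1.PodFailed {
		return fmt.Errorf("expected phase %q, got %q", v1.PodFailed, newStatus.Phase)
	}
	if newStatus.Reason == "" {
		return fmt.Errorf("expected a rejection reason to be set")
	}
	if newStatus.QOSClass != oldStatus.QOSClass {
		return fmt.Errorf("QOSClass should be preserved: %q -> %q",
			oldStatus.QOSClass, newStatus.QOSClass)
	}
	return nil
}

func main() {
	oldStatus := v1.PodStatus{QOSClass: v1.PodQOSGuaranteed}
	newStatus := v1.PodStatus{
		Phase:    v1.PodFailed,
		Reason:   "OutOfcpu", // example rejection reason
		QOSClass: v1.PodQOSGuaranteed,
	}
	fmt.Println(relaxedRejectionCheck(oldStatus, newStatus)) // <nil>
}
```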
Which issue(s) this PR fixes:
Fixes #129056
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: