Skip to content

Conversation

@bitoku
Copy link
Contributor

@bitoku bitoku commented Feb 11, 2025

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Currently the PodRejectionStatus test assumes that most fields are the same as the previous status except for a few certain fields. However this assumption is sometimes wrong in case that there's an operator which modifies the pod when it is bound to a node.
Operator with this behaviour is not rare, for example ovn-kubernetes behaves like that, and in that environment, the test always fails.
(cf. failure in openshift org https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_kubernetes/2189/pull-ci-openshift-kubernetes-release-4.19-k8s-e2e-gcp-ovn/1884690562456489984)

This PR changes the way to compare and make the restriction more relaxed so that e2e test in those environment should success.

Which issue(s) this PR fixes:

Fixes #129056

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 11, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @bitoku. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/test sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 11, 2025
@kannon92
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 11, 2025
@kannon92
Copy link
Contributor

/cc @carlory @SergeyKanzhelev

@bitoku
Copy link
Contributor Author

bitoku commented Feb 11, 2025

/retest

@carlory
Copy link
Member

carlory commented Feb 12, 2025

I'm not familiar with the kubelet part. I'm unsure whether this commit is possible to resolve the issue. It is based on the pod event to get the old and new pod status. Can you test it in your environment? If it still fails, it seems that it is unlikely to compare other fields. i.e., conditions, hostIPs, etc.

@bitoku
Copy link
Contributor Author

bitoku commented Feb 12, 2025

@carlory It didn't work unfortunately.

The object of this check is to check if it keeps certain fields after rejection such as QoSClass.
If so, I think this PR is enough to check that behaviour.

@ffromani
Copy link
Contributor

/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 12, 2025
@ffromani
Copy link
Contributor

/priority backlog

based on the issue priority

@k8s-ci-robot k8s-ci-robot added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Feb 12, 2025
@SergeyKanzhelev
Copy link
Member

/assign @esotsal

@bitoku
Copy link
Contributor Author

bitoku commented Feb 12, 2025

@esotsal
I created a new PR to demonstrate it fails when there's a pod update at the same time as the binding.
#130117

You can just put ok-to-test or you can run it on local.

This is how I test it in local:

export CGROUP_DRIVER=systemd
export CONTAINER_RUNTIME=remote
export CONTAINER_RUNTIME_ENDPOINT=unix:///var/run/crio/crio.sock

cd ~/kubernetes
./hack/local-up-cluster.sh

# in different shell, fix coredns
export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
k edit cm -n kube-system coredns  # delete loop
k delete pods -n kube-system $(k get pods -n kube-system -ojson | jq -r '.items[0].metadata.name')

# run tests
UUID=$(uuid -v4)
make all "WHAT=test/e2e/e2e.test cmd/kubectl vendor/github.com/onsi/ginkgo/v2/ginkgo"
mkdir -p "_rundir/$UUID"
ln -s $(pwd)/_output/bin/kubectl "_rundir/$UUID"
ln -s $(pwd)/_output/bin/e2e.test "_rundir/$UUID"
ln -s $(pwd)/_output/bin/ginkgo "_rundir/$UUID"
kubetest2 noop --run-id="$UUID" --test=ginkgo --kubeconfig /var/run/kubernetes/admin.kubeconfig -- --focus-regex=PodRejectionStatus --use-built-binaries

@carlory
Copy link
Member

carlory commented Feb 13, 2025

@bitoku The initial purpose of this e2e test is to detect if there are any new fields in Status that were dropped by the pod rejection. This is to ensure that the kubelet's admission is not dropping any fields that are required by the kubelet to be present in the Status of the pod.

The failure result of this commit proves that the test is not working as expected. We can not say other fields wouldn't be changed when calculating the latest status of the pod if the pod were rejected. I'm not sure whether we still need this test.

@carlory
Copy link
Member

carlory commented Feb 13, 2025

If so, I think this PR is enough to check that behaviour.

LGTM. this PR is enough to fix the e2e failure.

@esotsal
Copy link
Contributor

esotsal commented Feb 16, 2025

If so, I think this PR is enough to check that behaviour.

LGTM. this PR is enough to fix the e2e failure.

I agree , thanks for creating the demo PR @bitoku , it demonstrates clearly the issue checking the logs

/lgtm

@SergeyKanzhelev moving this to need approver, since i understood from last weeks sig-node this is important to be solved soon.

@haircommander
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 18, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 003af392396b06352d81552fe06386e16b9adae3

@bitoku
Copy link
Contributor Author

bitoku commented Feb 19, 2025

pinging @SergeyKanzhelev for approval

@kannon92
Copy link
Contributor

/assign @SergeyKanzhelev

PTAL when you can.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bitoku, mrunalp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 1, 2025
@pacoxu
Copy link
Member

pacoxu commented Mar 3, 2025

/test pull-kubernetes-cmd

@k8s-ci-robot k8s-ci-robot merged commit 5d28e86 into kubernetes:master Mar 3, 2025
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.33 milestone Mar 3, 2025
@github-project-automation github-project-automation bot moved this from PRs - Needs Approver to Done in SIG Node CI/Test Board Mar 3, 2025
k8s-ci-robot added a commit that referenced this pull request Mar 6, 2025
…97-upstream-release-1.32

Automated cherry pick of #130097: Modify how to check the status in PodRejectionStatus test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/backlog Higher priority than priority/awaiting-more-evidence. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

kubelet: PodRejectionStatus Kubelet should reject pod when the node didn't have enough resource test error