Already API-evicted Pods do not get evicted by the kubelet eviction manager (memory pressure, ephemeral storage pressure) #122297
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
/sig node
Do you have a minimal example to reproduce this?
Yes, I was able to reproduce it with this minimal example. The nodes in my cluster had 16 GB of RAM; if your nodes are smaller or larger, you have to adjust the memory allocation in order to trigger the memory-pressure eviction.
Steps to reproduce
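A minimal sketch of the reproduction, assuming a node with roughly 16 GB of RAM; the pod name stress-pod, the polinux/stress image, and the allocation sizes are illustrative, not the original manifest:

```sh
# 1. Create a pod with a very long termination grace period whose main
#    process ignores SIGTERM and allocates most of the node's memory.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: stress-pod
  namespace: default
spec:
  terminationGracePeriodSeconds: 3600
  containers:
  - name: stress
    image: polinux/stress
    command: ["sh", "-c", "trap '' TERM; stress --vm 1 --vm-bytes 14G --vm-hang 0"]
EOF

# 2. Evict the pod via the Eviction API. The graceful deletion started here
#    respects the pod's 3600s terminationGracePeriodSeconds.
kubectl proxy --port=8001 &
curl -s -X POST http://localhost:8001/api/v1/namespaces/default/pods/stress-pod/eviction \
  -H 'Content-Type: application/json' \
  -d '{"apiVersion":"policy/v1","kind":"Eviction","metadata":{"name":"stress-pod","namespace":"default"}}'

# 3. Wait for the node to report MemoryPressure; the kubelet eviction manager
#    should now kill the pod with its evictionMaxPodGracePeriod, but does not.
kubectl describe node <node-name> | grep MemoryPressure
```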
Result: the pod will not get killed by the kubelet eviction manager. If you do exactly the same without API-evicting the pod in the second step, the kubelet eviction works.
Potential duplicate of #118172.
I am not sure this is really a duplicate. I was not able to observe any additional log lines such as "Killing container with a grace period ..." in the kubelet logs after the prior API eviction at all. So to me it looks like a different issue. But I guess it would be easy for someone with deeper kubelet knowledge to reproduce this and see whether there really is no second eviction attempt, or just a second eviction event with the wrong grace period.
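One way to check this, assuming a systemd-managed kubelet on the affected node (a rough sketch, commands illustrative):

```sh
# Follow the kubelet logs and filter for eviction-manager activity and the
# "Killing container with a grace period" message discussed above.
journalctl -u kubelet -f | grep -Ei 'eviction manager|killing container'
```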
I was able to reproduce this issue with k8s 1.27.x and 1.28.4.
/assign
This indeed looks like a duplicate of the mentioned issue, and likely of #122222 as well. I will close it as a dup for now; please comment if you do not agree.
/close
@SergeyKanzhelev: Closing this issue.

In response to this:

> I will close it as a dup for now; please comment if you do not agree.
> /close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
What happened?
When pods have a long terminationGracePeriod and get evicted via the API (due to downscaling or some other reason), an eviction is initiated that respects that long terminationGracePeriod. This works fine.
If the pod subsequently needs to be evicted due to memory pressure on the node, after the already-initiated API eviction, no further eviction is performed with the kubelet's configured evictionMaxPodGracePeriod. This means that a second eviction due to e.g. memory pressure can never succeed when the pod has a large terminationGracePeriod and there was a prior, unrelated API eviction.
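To make the sequence observable: after the API eviction the pod object carries a deletionTimestamp and the long deletionGracePeriodSeconds, which the kubelet then seems to treat as an already-running termination instead of re-killing the pod with the shorter grace period. A sketch for inspecting this (pod name illustrative):

```sh
# After the API eviction, the grace period recorded on the pod is its long
# terminationGracePeriod, not the kubelet's evictionMaxPodGracePeriod.
kubectl get pod stress-pod \
  -o jsonpath='{.metadata.deletionTimestamp}{" "}{.metadata.deletionGracePeriodSeconds}{"\n"}'
```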
What did you expect to happen?
The memory-pressure eviction triggered by the eviction manager should still be issued with the configured evictionMaxPodGracePeriod, even if there was a prior API-based eviction with a large terminationGracePeriod.
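For context, a minimal sketch of the relevant kubelet settings (threshold values illustrative); evictionMaxPodGracePeriod caps the grace period the eviction manager grants when a soft eviction threshold is met, regardless of the pod's own terminationGracePeriod:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Soft eviction: trigger when memory.available stays below 1Gi for 30s.
evictionSoft:
  memory.available: "1Gi"
evictionSoftGracePeriod:
  memory.available: "30s"
# Cap on the grace period used for eviction-manager-initiated kills.
evictionMaxPodGracePeriod: 60
```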
How can we reproduce it (as minimally and precisely as possible)?
See the minimal example under "Steps to reproduce" in the comments above.
Anything else we need to know?
The bug was not present in k8s 1.25.x.
Kubernetes version
1.26.7
Cloud provider
OS version
GardenLinux 934.10.0
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)