[BUG] Upgrade process gets stuck evicting pod when eviction request denied. #4720

Open
Aaron-ML opened this issue Dec 19, 2024 · 1 comment

Aaron-ML commented Dec 19, 2024

Describe the bug
We currently use the Strimzi operator to run Apache Kafka workloads on Azure AKS.

Part of this setup is a tool called Drain Cleaner, which uses a validating admission webhook to deny eviction requests directly and instead calls on the operator to roll the Kafka pods safely.
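
For readers unfamiliar with this pattern, the deny-and-annotate behavior can be sketched roughly as follows. This is a minimal illustration and not the Drain Cleaner's actual code: the function name `handle_eviction` is hypothetical, the annotation key `strimzi.io/manual-rolling-update` comes from the Strimzi documentation rather than from this issue, and the pod and review objects are plain dicts so the sketch runs without a cluster.

```python
import copy

# Annotation the Strimzi operator watches to roll a pod manually
# (taken from the Strimzi docs; treat it as an assumption here).
ROLL_ANNOTATION = "strimzi.io/manual-rolling-update"


def handle_eviction(review, pod):
    """Return (admission_response, updated_pod) for an eviction request.

    `review` is a parsed AdmissionReview request and `pod` is the parsed
    Pod that the eviction targets. Neither input is modified.
    """
    updated = copy.deepcopy(pod)
    annotations = updated.setdefault("metadata", {}).setdefault("annotations", {})
    if annotations.get(ROLL_ANNOTATION) != "true":
        # First request for this pod: ask the operator to roll it.
        annotations[ROLL_ANNOTATION] = "true"

    response = {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": review["request"]["uid"],
            # Always deny: the operator relocates the pod instead.
            "allowed": False,
            "status": {"message": "Eviction denied; pod will be rolled by the operator"},
        },
    }
    return response, updated
```

The point relevant to this bug is that every eviction request is denied, including repeated ones, so the caller has to notice on its own that the pod has since been moved.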

It appears that the AKS upgrade process does not handle denied evictions gracefully. During upgrades across multiple clusters, we have noticed that the eviction gets denied, and our Drain Cleaner service then calls on the operator to roll the Kafka pod to a new node, which it does successfully within minutes of the denied eviction.

However, the AKS upgrade cycle gets stuck in a loop and keeps requesting eviction of the same pod, even though the pod no longer lives on the node being drained, until the upgrade eventually times out.

To Reproduce
Steps to reproduce the behavior:

  1. Run a pod workload and deny evictions for it.
  2. Trigger an AKS upgrade to a new version.
  3. Manually move the blocking pod to a new node.
  4. Watch AKS continue to request eviction of the pod even though it is no longer blocking the node drain.

Expected behavior
I'd like the AKS upgrade process to check the status and location of a pod it requested eviction for before requesting further evictions. That way, if the pod is relocated outside of the upgrade process, the upgrade won't get stuck trying to recycle pods that no longer interfere with it.
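
To make the requested check concrete, here is a minimal sketch of the decision I have in mind. It is only an illustration under assumptions: the function name `still_blocks_drain` is hypothetical, the node name `other-node` in the usage lines is made up, I don't know how the AKS upgrade logic is actually structured, and the pod is a plain dict shaped like the Kubernetes Pod object. Note that a StatefulSet pod keeps its name when it is recreated, so the check has to look at the node rather than the pod name.

```python
def still_blocks_drain(draining_node, current_pod):
    """Return True only if the pod still runs on the node being drained.

    `current_pod` is the pod currently found under the same namespace and
    name as the one whose eviction was denied, or None if it no longer
    exists.
    """
    if current_pod is None:
        # Pod is gone entirely, so nothing is left to evict.
        return False
    # A recreated pod keeps its name but may be scheduled elsewhere,
    # so compare the node it is bound to, not its name.
    return current_pod.get("spec", {}).get("nodeName") == draining_node


# Usage: only retry the eviction while the pod is still on the drained node.
moved = {"metadata": {"name": "prod-kafka-1"}, "spec": {"nodeName": "other-node"}}
print(still_blocks_drain("aks-kafka-13950104-vmss00000n", moved))  # False
```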

Environment:

  • Kubernetes version: 1.30.9
  • Strimzi Operator: 0.42

Additional context
kubectl event log:

16m         Normal    Scheduled                  pod/prod-kafka-1                        Successfully assigned kafka/prod-kafka-1 to aks-kafka-13950104-vmss00000n
16m         Normal    SuccessfulAttachVolume     pod/prod-kafka-1                        AttachVolume.Attach succeeded for volume "pvc-5ef4f5a7-1c38-4809-b1f1-debdd9900e8c"
16m         Normal    Pulling                    pod/prod-kafka-1                        Pulling image "quay.io/strimzi/kafka:0.40.0-kafka-3.7.0"
16m         Normal    Pulled                     pod/prod-kafka-1                        Successfully pulled image "quay.io/strimzi/kafka:0.40.0-kafka-3.7.0" in 8.024s (8.024s including waiting)
16m         Normal    Created                    pod/prod-kafka-1                        Created container kafka
16m         Normal    Started                    pod/prod-kafka-1                        Started container kafka
15m         Warning   Unhealthy                  pod/prod-kafka-1                        Readiness probe failed:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current...
13m         Normal    Killing                    pod/prod-kafka-1                        Stopping container kafka
13m         Warning   Unhealthy                  pod/prod-kafka-1                        Readiness probe failed:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current...
13m         Normal    Scheduled                  pod/prod-kafka-1                        Successfully assigned kafka/prod-kafka-1 to aks-kafka-13950104-vmss00000n
13m         Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled
13m         Normal    Started                    pod/prod-kafka-1                        Started container kafka
13m         Normal    Created                    pod/prod-kafka-1                        Created container kafka
13m         Normal    Pulled                     pod/prod-kafka-1                        Container image "quay.io/strimzi/kafka:0.40.0-kafka-3.7.0" already present on machine
11m         Normal    Killing                    pod/prod-kafka-1                        Stopping container kafka
11m         Warning   Unhealthy                  pod/prod-kafka-1                        Readiness probe failed:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current...
11m         Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled
11m         Normal    Scheduled                  pod/prod-kafka-1                        Successfully assigned kafka/prod-kafka-1 to aks-kafka-13950104-vmss00000n
11m         Normal    Pulled                     pod/prod-kafka-1                        Container image "quay.io/strimzi/kafka:0.40.0-kafka-3.7.0" already present on machine
11m         Normal    Created                    pod/prod-kafka-1                        Created container kafka
11m         Normal    Started                    pod/prod-kafka-1                        Started container kafka
9m26s       Normal    Killing                    pod/prod-kafka-1                        Stopping container kafka
9m24s       Warning   Unhealthy                  pod/prod-kafka-1                        Readiness probe failed:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current...
9m18s       Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled
9m18s       Normal    Scheduled                  pod/prod-kafka-1                        Successfully assigned kafka/prod-kafka-1 to aks-kafka-13950104-vmss00000n
9m16s       Normal    Started                    pod/prod-kafka-1                        Started container kafka
9m16s       Normal    Created                    pod/prod-kafka-1                        Created container kafka
9m16s       Normal    Pulled                     pod/prod-kafka-1                        Container image "quay.io/strimzi/kafka:0.40.0-kafka-3.7.0" already present on machine
7m26s       Normal    Killing                    pod/prod-kafka-1                        Stopping container kafka
7m18s       Normal    Scheduled                  pod/prod-kafka-1                        Successfully assigned kafka/prod-kafka-1 to aks-kafka-13950104-vmss00000n
7m18s       Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled
7m16s       Normal    Pulled                     pod/prod-kafka-1                        Container image "quay.io/strimzi/kafka:0.40.0-kafka-3.7.0" already present on machine
7m16s       Normal    Created                    pod/prod-kafka-1                        Created container kafka
7m16s       Normal    Started                    pod/prod-kafka-1                        Started container kafka
5m26s       Normal    Killing                    pod/prod-kafka-1                        Stopping container kafka
5m16s       Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled
5m16s       Normal    Scheduled                  pod/prod-kafka-1                        Successfully assigned kafka/prod-kafka-1 to aks-kafka-13950104-vmss00000n
5m14s       Normal    Started                    pod/prod-kafka-1                        Started container kafka
5m14s       Normal    Created                    pod/prod-kafka-1                        Created container kafka
5m14s       Normal    Pulled                     pod/prod-kafka-1                        Container image "quay.io/strimzi/kafka:0.40.0-kafka-3.7.0" already present on machine
3m26s       Normal    Killing                    pod/prod-kafka-1                        Stopping container kafka
3m24s       Warning   Unhealthy                  pod/prod-kafka-1                        Readiness probe failed:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current...
3m17s       Normal    Scheduled                  pod/prod-kafka-1                        Successfully assigned kafka/prod-kafka-1 to aks-kafka-13950104-vmss00000n
3m17s       Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled
3m16s       Normal    Started                    pod/prod-kafka-1                        Started container kafka
3m16s       Normal    Created                    pod/prod-kafka-1                        Created container kafka
3m16s       Normal    Pulled                     pod/prod-kafka-1                        Container image "quay.io/strimzi/kafka:0.40.0-kafka-3.7.0" already present on machine
86s         Normal    Killing                    pod/prod-kafka-1                        Stopping container kafka
78s         Normal    Scheduled                  pod/prod-kafka-1                        Successfully assigned kafka/prod-kafka-1 to aks-kafka-13950104-vmss00000n
78s         Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled
76s         Normal    Started                    pod/prod-kafka-1                        Started container kafka
76s         Normal    Created                    pod/prod-kafka-1                        Created container kafka
76s         Normal    Pulled                     pod/prod-kafka-1                        Container image "quay.io/strimzi/kafka:0.40.0-kafka-3.7.0" already present on machine
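
The loop in the events above is easier to see as numbers. The following is a small sketch, with hypothetical helper names `age_to_seconds` and `roll_gaps`, that reads event lines in the column layout shown above (oldest first) and reports the gap in seconds between consecutive ManualRollingUpdate events. The sample lines are copied from the log above.

```python
import re

_UNITS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_to_seconds(age):
    """Convert a kubectl age such as '9m18s' or '86s' to seconds."""
    return sum(int(n) * _UNITS[u] for n, u in re.findall(r"(\d+)([dhms])", age))


def roll_gaps(event_lines, reason="ManualRollingUpdate"):
    """Return the seconds between consecutive events with the given reason.

    Lines are expected oldest first, with columns: age, type, reason, ...
    """
    ages = []
    for line in event_lines:
        fields = line.split(None, 4)
        if len(fields) >= 3 and fields[2] == reason:
            ages.append(age_to_seconds(fields[0]))
    return [older - newer for older, newer in zip(ages, ages[1:])]


sample = [
    "9m18s       Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled",
    "9m16s       Normal    Started                    pod/prod-kafka-1                        Started container kafka",
    "7m18s       Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled",
    "5m16s       Normal    ManualRollingUpdate        pod/prod-kafka-1                        Pod was manually annotated to be rolled",
]
print(roll_gaps(sample))  # [120, 122]
```

In other words, the pod is being rolled again roughly every two minutes for as long as the upgrade keeps requesting evictions.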

Logs from the Drain Cleaner pod, which denies evictions and calls on the operator to roll the pod safely:

2024-12-19 00:25:11,702 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:16,716 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:16,716 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:25:16,716 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:21,735 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:21,735 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:25:21,735 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:26,750 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:26,751 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:25:26,751 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:31,765 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:31,765 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka should be annotated for restart
2024-12-19 00:25:31,792 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka was patched
2024-12-19 00:25:31,792 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:36,806 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:36,806 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:25:36,806 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:41,834 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:41,835 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:25:41,835 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:46,848 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:46,848 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:25:46,848 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:51,887 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:51,887 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:25:51,887 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:56,901 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:25:56,901 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:25:56,901 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:01,916 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:01,916 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:01,916 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:06,929 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:06,929 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:06,929 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:11,946 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:11,946 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:11,946 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:16,960 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:16,960 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:16,960 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:21,978 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:21,978 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:21,978 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:26,992 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:26,992 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:26,992 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:32,008 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:32,008 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:32,008 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:37,022 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:37,023 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:37,023 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:42,037 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:42,037 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:42,037 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:47,050 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:47,050 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:47,050 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:52,067 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:52,067 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:52,067 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:57,089 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:26:57,089 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:26:57,089 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:02,103 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:02,103 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:02,103 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:07,117 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:07,118 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:07,118 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:12,137 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:12,137 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:12,137 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:17,149 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:17,149 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:17,149 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:22,165 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:22,165 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:22,165 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:27,178 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:27,178 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:27,178 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:32,195 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:32,195 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka should be annotated for restart
2024-12-19 00:27:32,222 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka was patched
2024-12-19 00:27:32,222 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:37,237 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:37,237 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:37,237 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:42,253 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:42,253 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:42,253 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:47,268 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:47,268 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:47,268 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:52,285 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:52,285 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:52,285 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:57,299 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:27:57,299 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:27:57,299 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:02,317 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:02,317 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:02,317 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:07,333 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:07,333 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:07,333 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:12,354 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:12,354 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:12,355 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:17,372 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:17,373 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:17,373 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:22,388 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:22,388 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:22,388 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:27,401 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:27,401 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:27,401 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:32,417 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:32,417 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:32,417 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:37,430 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:37,430 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:37,430 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:42,446 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:42,446 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:42,446 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:47,460 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:47,460 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:47,460 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:52,480 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:52,480 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:52,480 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:57,494 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:28:57,494 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:28:57,494 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:02,511 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:02,511 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:02,511 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:07,526 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:07,526 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:07,526 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:12,542 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:12,542 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:12,542 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:17,555 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:17,555 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:17,555 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:22,572 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:22,572 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:22,572 INFO  [io.str.ValidatingWebhook] (executor-thread-11) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:27,597 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:27,597 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:27,597 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:32,613 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:32,613 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka should be annotated for restart
2024-12-19 00:29:32,643 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka was patched
2024-12-19 00:29:32,643 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:37,657 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:37,657 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:37,657 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:42,673 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:42,674 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:42,674 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:47,687 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:47,687 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:47,687 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:52,702 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Received eviction webhook for Pod prod-kafka-1 in namespace kafka
2024-12-19 00:29:52,702 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Pod prod-kafka-1 in namespace kafka is already annotated for restart
2024-12-19 00:29:52,702 INFO  [io.str.ValidatingWebhook] (executor-thread-15) Denying request for eviction of Pod prod-kafka-1 in namespace kafka

You can see that drain cleaner keeps receiving requests to evict prod-kafka-1, roughly every 5 seconds. On the first request the pod lived on vmss00000j, which was being drained. The Strimzi operator took action and relocated prod-kafka-1 to vmss00000n, an already upgraded node. However, the AKS upgrade process continued to request evictions.

k get nodes -l agentpool=kafka
NAME                            STATUS                     ROLES    AGE    VERSION
aks-kafka-13950104-vmss00000j   Ready,SchedulingDisabled   agent    212d   v1.28.5
aks-kafka-13950104-vmss00000k   Ready                      agent    212d   v1.28.5
aks-kafka-13950104-vmss00000l   Ready                      agent    212d   v1.28.5
aks-kafka-13950104-vmss00000m   Ready                      agent    201d   v1.28.5
aks-kafka-13950104-vmss00000n   Ready                      <none>   31m    v1.29.10

Ultimately this causes the prod-kafka-1 pod to get restarted every few minutes, even though it isn't blocking the upgrade at all, since it now lives on a node that's already up to date.

This also slows down the upgrade process, as it only moves forward once the node timeouts are hit.
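
To make the requested behavior concrete, here is a rough sketch of the check I have in mind. This is not how AKS implements its drain loop (I have no visibility into that); `PodRef`, `should_retry_eviction`, and the `lookup_node` callback are hypothetical names used only for illustration. The idea is simply to look up where the pod currently runs before sending another eviction request.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class PodRef:
    """Hypothetical identifier for a pod (namespace + name)."""
    namespace: str
    name: str


def should_retry_eviction(
    pod: PodRef,
    draining_node: str,
    lookup_node: Callable[[PodRef], Optional[str]],
) -> bool:
    """Return True only if the pod is still scheduled on the node being drained.

    lookup_node is a hypothetical callback that returns the pod's current
    node name, or None if the pod no longer exists.
    """
    current_node = lookup_node(pod)
    return current_node == draining_node


# Example using the pod and node names from this report: the pod has already
# been moved to vmss00000n, while vmss00000j is the node being drained.
placements = {PodRef("kafka", "prod-kafka-1"): "aks-kafka-13950104-vmss00000n"}
pod = PodRef("kafka", "prod-kafka-1")
retry = should_retry_eviction(pod, "aks-kafka-13950104-vmss00000j", placements.get)
print(retry)  # False
```

With a check like this, a denied eviction would only be retried while the pod is still on the draining node. Once the operator has moved it elsewhere, the drain could proceed without waiting for a timeout.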

Contributor

@kaarthis, @sdesai345 would you be able to assist?
