ScaledJob Scaler does not take "Terminating" Pods into consideration #2910
Comments
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
hey,
Hi Jorge,
your use case makes sense, I'd say. Are you willing to contribute it?
Hi, yes, we could imagine contributing this feature. Some questions on our side:
Sorry again for the slow response. I have read your use case again, and I don't get the current problem. If you are using ScaledJob, your pods shouldn't receive the termination signal, at least not from KEDA. KEDA creates the jobs and monitors them, but KEDA doesn't kill them (correct me if I'm wrong, @zroubalik).
Yeah, you are right on this one @JorTurFer |
When KEDA is not configured to use the "gradual rollout" feature, KEDA deletes the existing jobs and creates new jobs instead. In this case Kubernetes sends a SIGTERM to the underlying pods, and those pods enter the "Terminating" state. As a consequence, the count of running pods is off by the number of "Terminating" pods that no longer have a parent job. In our case we actually want that SIGTERM in our pods, because logic running inside the pod can decide whether the pod should terminate itself or whether shutdown needs to be delayed (which is what causes the "Terminating" status).
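As an aside, a minimal Go sketch of the kind of in-pod shutdown logic described above (runLongCalculation and the surrounding wiring are illustrative, not part of KEDA):

```go
package main

import (
	"log"
	"os"
	"os/signal"
	"syscall"
)

func main() {
	// Listen for the SIGTERM that Kubernetes sends when the parent job is deleted.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM)

	done := make(chan struct{})
	go func() {
		runLongCalculation() // illustrative placeholder for the long-running work
		close(done)
	}()

	select {
	case <-sigs:
		// SIGTERM received, e.g. because the old job was deleted on rollout.
		// Instead of exiting immediately, keep running until the work is finished;
		// the pod stays in the "Terminating" state in the meantime.
		log.Println("SIGTERM received, delaying shutdown until work is done")
		<-done
	case <-done:
	}
}

func runLongCalculation() {
	// placeholder for the long-running computation described in the issue
}
```

For the pod to actually stay in "Terminating" long enough, the pod spec's terminationGracePeriodSeconds also has to be large enough to cover the remaining work.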
But in that case, KEDA doesn't know whether those pods belong to its jobs or not :(
Yes, you are right. This is what adds the complexity here :/... In order to achieve this, we would require some logic to find the pods that carry the according label.
I am currently doing some reading on ownerships in Kubernetes:
Scaled-Job Definition (Custom Resource) ----> Kubernetes Jobs ----> Kubernetes Pods
KEDA uses foreground ownership as a default. This means that whenever the parent job is deleted, the corresponding child pod is deleted first. All we have to achieve now is that the job deletion is delayed until the pod could be deleted as well.
According to the Kubernetes docs, the deletion of the parent resource (job) is delayed until the child (pod) has been deleted when setting ownerReference.blockOwnerDeletion=true.
KEDA defines this setting as a default, so to be honest I am curious why we observe the issue described here in the first place. Edit: It seems that background propagation is actually the default. We will do some further tests with foreground propagation.
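For reference, the shape of such an ownerReference, sketched in Go (the helper and its parameters are illustrative; typically the Job controller sets this on the Pods it creates):

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// ownPodByJob is an illustrative helper (not KEDA or kube-controller code).
// With blockOwnerDeletion=true and foreground propagation, deleting the owning
// Job is blocked until this Pod is gone.
func ownPodByJob(pod *corev1.Pod, jobName string, jobUID types.UID) {
	controller := true
	block := true
	pod.OwnerReferences = append(pod.OwnerReferences, metav1.OwnerReference{
		APIVersion:         "batch/v1",
		Kind:               "Job",
		Name:               jobName,
		UID:                jobUID,
		Controller:         &controller,
		BlockOwnerDeletion: &block,
	})
}
```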
I guess all we have to do is make the propagation policy configurable here: keda/controllers/keda/scaledjob_controller.go, line 190 in 2fc3796.
I will prepare a PR for this. This will solve the issue described here. |
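As a hedged sketch (not KEDA's actual controller code), deleting an old Job with a user-chosen propagation policy via the controller-runtime client could look roughly like this; the function and parameter names are illustrative:

```go
package main

import (
	"context"

	batchv1 "k8s.io/api/batch/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// deleteJob deletes a Job using the propagation policy chosen by the user
// (e.g. taken from a configurable field on the ScaledJob spec).
func deleteJob(ctx context.Context, c client.Client, job *batchv1.Job, policy metav1.DeletionPropagation) error {
	// With Foreground propagation the Job object only disappears after its
	// pods are gone; with Background the Job is removed immediately and the
	// pods are cleaned up asynchronously afterwards, which is what leaves the
	// orphan-looking "Terminating" pods described above.
	return c.Delete(ctx, job, client.PropagationPolicy(policy))
}
```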
nice investigation! |
Yes, we are currently preparing such a parameter. Do you prefer something like this:
or something like this:
We would prefer option 1, but this change would not be backward compatible.
I'd say that we can keep both. I mean, we can create another section named rollout. Something like:

rolloutStrategy: default # marking this as deprecated
rollout:
  strategy: default
  propagationPolicy: background
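If that second shape were adopted, the ScaledJob spec in Go might gain a small struct along these lines (a sketch under that assumption, not the actual KEDA types):

```go
package v1alpha1

// Rollout sketches how the proposed `rollout` section could be modeled in the
// ScaledJob spec; field names mirror the YAML proposal above.
type Rollout struct {
	// Strategy keeps the behavior of the existing rolloutStrategy field
	// (e.g. "default" or "gradual").
	Strategy string `json:"strategy,omitempty"`
	// PropagationPolicy controls how old jobs are deleted on rollout,
	// e.g. "background" or "foreground".
	PropagationPolicy string `json:"propagationPolicy,omitempty"`
}
```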
Report
When a new ScaledJob spec is rolled out, all existing Pods for those jobs receive a SIGTERM when the default rollout strategy is used. This is desired behavior for us and fits our use case.
Since we are doing long-running calculations in those pods, we ignore the SIGTERM and our pods finish their work. The status of those SIGTERM-ed Pods is "Terminating".
KEDA only seems to care about Pods that are either Pending or Running. As a consequence, KEDA does not count the "Terminating" Pods toward the already running and healthy Pods. This means the number of Jobs created by KEDA is off by the number of Pods that are in state "Terminating".
Is there a possibility to configure that "Terminating" Pods are counted as running Pods as well? We had a look at the PendingPodConditions, but they don't seem to work with the actual Pod status (Terminating).
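A pod that has received SIGTERM but has not exited yet keeps its phase (usually Running) and gets a non-nil deletionTimestamp, which is what kubectl renders as "Terminating". A minimal sketch of the kind of counting adjustment being asked for (this is not KEDA's actual scaler code; the function name is illustrative):

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
)

// countEffectivelyRunning counts pods that are Pending, Running, or
// "Terminating" (deletionTimestamp set but the pod has not yet finished).
// Treating terminating pods as still running would keep the number of newly
// created jobs from overshooting, as described in the report above.
func countEffectivelyRunning(pods []corev1.Pod) int {
	count := 0
	for _, p := range pods {
		running := p.Status.Phase == corev1.PodPending || p.Status.Phase == corev1.PodRunning
		terminating := p.DeletionTimestamp != nil &&
			p.Status.Phase != corev1.PodSucceeded &&
			p.Status.Phase != corev1.PodFailed
		if running || terminating {
			count++
		}
	}
	return count
}
```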
Expected Behavior
KEDA should create N pods minus the number of Terminating pods from the previous version.
Actual Behavior
KEDA will create N pods plus the number of Terminating pods from the previous version.
Steps to Reproduce the Problem
KEDA Version
2.5.0
Kubernetes Version
1.21
Platform
Microsoft Azure
Scaler Details
Redis
Anything else?
Slack-Discussion: https://kubernetes.slack.com/archives/CKZJ36A5D/p1649865034815469