Handle pod deletes outside workflow lifecycle #1414
Comments
I ran into the same issue with v2.6.0-rc1. To reproduce, just delete a node running one or several workflow pods. The whole workflow gets stuck in the Running state, because some pods show "pod deleted" and are never retried. The expected behaviour is that the pod is rescheduled with a retry. I found it works with 2.4.3.
@audriusrudalevicius - Could you give more detail about your case? Argo is able to handle a pod being deleted outside of the workflow lifecycle; in general, if the pod is |
I found the issue. The problem was in my workflow after I upgraded Argo from 2.4.3 to 2.6.0. I didn't change the retry parameters: before it was |
Closing, feel free to reopen if necessary.
The parameters is a field of StandardK8sTrigger based on this https://github.com/argoproj/argo-events/blob/master/api/sensor.md#standardk8strigger Signed-off-by: Tho Nguyen <[email protected]>
Possibly a suggestion/feature request. I ran into an issue similar to #893 when Kured restarted a node on which a pod was executing a workflow step. This triggered the handling mechanism here, which marked the workflow step as failed with the message `pod deleted`.

Can't this scenario be augmented with a pre-stop hook injected into the pod spec to notify `workflow-controller`, to better handle cases where a pod has been deleted outside of the workflow lifecycle? https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
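To make the pre-stop hook idea a bit more concrete, here is a rough sketch of what the injection might look like. It is not how Argo works today: `inject_pre_stop_hook` and the notification URL are hypothetical names made up for illustration, and it assumes the container image ships `sh` and `wget`. Only the `lifecycle.preStop.exec.command` structure comes from the Kubernetes container spec. Note that a preStop hook only runs on graceful termination (such as a node drain), so it would not help if a node disappears abruptly.

```python
import copy


def inject_pre_stop_hook(pod_spec, notify_url):
    """Return a copy of pod_spec in which every container that has no
    preStop hook gets one that calls notify_url.

    Hypothetical helper for illustration only; not part of Argo.
    """
    patched = copy.deepcopy(pod_spec)
    for container in patched.get("containers", []):
        lifecycle = container.setdefault("lifecycle", {})
        # Keep any preStop hook the user already defined.
        lifecycle.setdefault(
            "preStop",
            {
                "exec": {
                    # Best-effort notification; "|| true" keeps a failed
                    # call from being reported as a failed hook.
                    "command": ["sh", "-c", f"wget -q -O- {notify_url} || true"]
                }
            },
        )
    return patched


# The URL below is an assumed, made-up endpoint on the controller.
pod_spec = {"containers": [{"name": "main", "image": "alpine:3"}]}
patched = inject_pre_stop_hook(
    pod_spec, "http://workflow-controller.example/pod-stopping"
)
```

The controller would still need some way to tell "pod is being evicted, please retry" apart from a genuine failure, which this sketch does not address.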