Feature request: clean shutdown or helm operations as single pods / jobs #632

runningman84 · 2023-03-07T15:37:49Z

We have clusters in EKS where the workers are controlled by Karpenter. The worker nodes are spot instances, so the cluster is quite dynamic: nodes appear and disappear every few minutes.

Running helm controller on these nodes is risky: a helm controller pod might be interrupted in the middle of a long-running helm install operation.

It would be great if the helm controller would wait before shutting down (which may still be an issue when a spot node is terminated within a 2-minute window) or would ensure that the given helm release does not stay in progress.

Another idea could be to use jobs or pods to run each individual helm operation instead of doing everything in the main loop.

tl;dr: I would like to run helm controller on short-lived nodes without manual cleanups. Right now I run it on Fargate, which is quite expensive compared to spot instances.

hiddeco · 2023-03-07T15:41:03Z

See #149 (comment). In combination with a sensible retry configuration, this should ensure that, from the next release on, in-progress Helm releases terminate gracefully when the controller shuts down (by marking them as "failed") and are then retried once the controller is running on a new node.
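To make the behaviour described above concrete, here is a minimal, self-contained sketch of the general pattern: an interrupted operation is marked "failed" instead of being left pending, and a later controller instance retries it. This is not helm-controller's actual code. `Release`, `run_operation`, and `reconcile` are hypothetical names, and a `threading.Event` stands in for the shutdown signal.

```python
import threading
from dataclasses import dataclass


@dataclass
class Release:
    """Hypothetical stand-in for a Helm release record."""
    name: str
    status: str = "unknown"
    attempts: int = 0


def run_operation(release: Release, steps: int, stop: threading.Event) -> bool:
    """Simulate an install of `steps` units of work.

    If shutdown is requested mid-way, the release is marked "failed"
    rather than being left stuck in "pending-install".
    """
    release.attempts += 1
    release.status = "pending-install"
    for _ in range(steps):
        if stop.is_set():
            release.status = "failed"
            return False
        # Real work (talking to the cluster) would happen here.
    release.status = "deployed"
    return True


def reconcile(release: Release, steps: int, stop: threading.Event, retries: int) -> bool:
    """Try the operation up to `retries` + 1 times, giving up early on shutdown."""
    for _ in range(retries + 1):
        if run_operation(release, steps, stop):
            return True
        if stop.is_set():
            # This instance is shutting down; a new instance retries later.
            return False
    return False
```

In this sketch, a controller whose shutdown event is already set leaves the release in "failed" after one attempt. A fresh controller, standing in for a pod rescheduled on a new node, can then call `reconcile` again and bring it to "deployed".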

hiddeco · 2023-03-10T21:10:49Z

This should now happen in >=v0.31.0, see also #644.

hiddeco closed this as completed Mar 10, 2023
