Feature: Add lifecycle hooks to pods from jobs automatically #8006
Comments
If I understand correctly, lifecycle hooks can't actually do this. From https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution:
That is, lifecycle hooks don't apply when a container exits gracefully. They only apply when the kubelet decides to terminate a container; and if the kubelet is deciding to terminate the Job, the proxy will shut down gracefully anyway. I think the only real approach to solving this problem would be to write a controller that deletes jobs when the linkerd proxy is the only running container.
So I did look at this: https://itnext.io/three-ways-to-use-linkerd-with-kubernetes-jobs-c12ccc6d4c7c. I thought it was a neat approach, but the cleanest and easiest way would be to have a controller like you say, since if I have numerous jobs to configure in numerous places it just becomes tedious to manage. Having a controller constantly checking for pods to auto-clean up would be useful.
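(For reference, the approach discussed there amounts to wrapping the Job's command so it calls the proxy's shutdown endpoint once the work finishes. A minimal sketch is below; the job name, image, and script path are placeholders, and depending on the Linkerd version the proxy's shutdown endpoint may need to be explicitly enabled.)

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job                          # hypothetical name
spec:
  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: example.com/batch-worker:latest   # placeholder image
          command: ["/bin/sh", "-c"]
          args:
            # Run the actual workload, then ask the injected proxy to shut
            # down so the pod can complete instead of hanging in NotReady.
            - |
              /app/run-batch-work
              curl -X POST http://localhost:4191/shutdown
```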
Another option would be to make the linkerd proxy container (either the binary directly or another process therein) aware of the state of the pod by polling the Kubernetes API. Instead of having a central controller polling the state of the pods, each pod would poll its own state and terminate its own proxy. The advantage would be that no controller installation is needed. There could also be lower resource usage when there are no jobs running. I think it should also perform better when there are significantly fewer job pods than other pods, which is probably the more common case. I am not sure, though, whether the default service account has permissions for that or whether linkerd could inject those, but I believe it could.
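(The default service account would not normally be allowed to read Pod objects, so something along these lines would have to be created or injected for each pod to poll its own state. A minimal sketch only; all names are hypothetical.)

```yaml
# Minimal RBAC a job pod would need to poll its own Pod object.
# Role/RoleBinding/ServiceAccount names here are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: read-own-pod
  namespace: jobs
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-own-pod
  namespace: jobs
subjects:
  - kind: ServiceAccount
    name: job-runner          # the service account the job pods run as
    namespace: jobs
roleRef:
  kind: Role
  name: read-own-pod
  apiGroup: rbac.authorization.k8s.io
```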
Update: I came up with a proof of concept that I've been developing as an extension for Linkerd (and a personal side project). The POC works as-is, but it has two issues:
The basic idea for the POC is to have an admission controller that will modify the entrypoint of pods to call into
e.g.:
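(The concrete example is not reproduced here. A purely hypothetical sketch of what a mutated pod spec could look like; "entrypoint-shim" is an invented stand-in for the POC's wrapper, not its real name.)

```yaml
# Fragment of a pod spec after the admission controller has rewritten the
# entrypoint. Names, image, and flags are all illustrative.
containers:
  - name: worker
    image: example.com/batch-worker:latest
    command: ["/linkerd/entrypoint-shim"]    # injected wrapper becomes the entrypoint
    args:                                    # the original command/args are passed through
      - "/app/run-batch-work"
      - "--original-flag"
```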
While the idea works in practice, it raises a few questions around security implications. It also makes testing a bit of a nightmare, and it still involves changing manifests to expose commands and args. I also considered the approach suggested above (in the previous comment). This is not something I think we'd be open to doing in the proxy (it shouldn't have a Kubernetes client, or even the notion of running in Kubernetes), but I did think of having an "init system" that will
A different approach would be to have a controller that mutates pods so they run this init system and then watches the state of all pods. When a pod should be terminated, the controller signals the init system (via a network call), which kills the proxy's process. I haven't looked much into this alternative; it's probably the simplest since:
I'll wait for some feedback, but this is what I've come up with so far.
FYI, you may run into kubernetes/kubernetes#106896 if you are watching the API server. Although I recall there are some cases where it works, just not all; I don't think I tested jobs.
Hey, what is the current state of this issue?
In light of sidecars (actually) being added to k8s (read: https://buoyant.io/blog/kubernetes-1-28-revenge-of-the-sidecars), I believe this might become something that can be addressed once that lands.
Agreed with @jack1902's statement; sidecar containers in k8s are only at the alpha stage for now, waiting in particular for proper termination ordering to be implemented (see kubernetes/kubernetes#120620). When that implementation matures we'll prioritize its integration with linkerd to solve this issue.
We're using #11461 to track the work of implementing KEP-753.
Linkerd now supports native sidecars. |
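(For context, Kubernetes native sidecars (KEP-753, beta since 1.29) are declared as init containers with restartPolicy: Always; the kubelet shuts them down automatically after the regular containers exit, which is what resolves the Job hang. The sketch below shows the shape of that pattern only; names and images are illustrative, not what Linkerd actually injects.)

```yaml
# Native sidecar pattern: the proxy runs as an init container with
# restartPolicy: Always, so it is terminated once the job's main
# container has finished and the pod can complete.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  template:
    spec:
      restartPolicy: Never
      initContainers:
        - name: proxy
          image: example.com/proxy:latest
          restartPolicy: Always        # marks this init container as a sidecar
      containers:
        - name: worker
          image: example.com/batch-worker:latest
          command: ["/app/run-batch-work"]
```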
What problem are you trying to solve?
When using `linkerd` to inject everything inside a cluster, pods spawned from jobs fall into a `NotReady` state as the main container inside the pod has completed its task but the `proxy` runs forever.
Additionally, it is impossible to use `defaultAllowPolicy: "cluster-authenticated"` without injecting `jobs`, because they will not be able to communicate with the relevant things inside the mesh.
Slack Threads:
How should the problem be solved?
When a `pod` is spawned which belongs to a `job`/`cronjob`, the `pod` should have a lifecycleHook automatically injected to run `curl -X POST http://localhost:4191/shutdown` or equivalent, to ensure the container running the work terminates the `proxy`.
Additionally, it could be beneficial to have an annotation that could configure the lifecycleHook, for example:
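(The example that originally followed is not captured here. As a purely hypothetical illustration of the kind of annotations being proposed — none of these keys exist in Linkerd — it might look like:)

```yaml
# Hypothetical illustration only; these annotation keys are invented.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled
        # Imagined knobs for the proposed behaviour:
        config.linkerd.io/auto-shutdown: "enabled"        # hypothetical: toggle the injected hook
        config.linkerd.io/auto-shutdown-command: "wget"    # hypothetical: curl vs wget vs other
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: example.com/batch-worker:latest
          command: ["/app/run-batch-work"]
```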
Any alternatives you've considered?
Configuring a bunch of policies cluster-wide to enable `jobs` to work whilst 99% of other traffic is authed and goes through the mesh. Ideally, getting fresh clusters onboarded would be pretty quick and painless where possible for many users.
Additionally, I've considered adding the `hook` myself to my objects, but some of them are spawned via third-party charts which don't provide a clean interface to add these relevant hooks. I would have to resort to `kustomize` to add the lifecycle hook for each job within the cluster that needs to communicate with things on the mesh.
How would users interact with this feature?
They could configure it via `annotations` that are read by the injection webhook, which would vary the output slightly (curl vs wget vs other) and would be able to enable/disable the hook injection as well as the injection of the proxy.
Would you like to work on this feature?
No response