
Feature: Add lifecycle hooks to pods from jobs automatically #8006

Closed
jack1902 opened this issue Mar 4, 2022 · 10 comments

@jack1902

jack1902 commented Mar 4, 2022

What problem are you trying to solve?

When using linkerd to inject everything inside a cluster, pods spawned from jobs get stuck in a NotReady state: the main container inside the pod has completed its task, but the proxy keeps running indefinitely.

Additionally, it is impossible to use defaultAllowPolicy: "cluster-authenticated" without injecting jobs, because un-injected jobs cannot communicate with the meshed workloads they need to reach.

Slack Threads:

How should the problem be solved?

When a pod belonging to a Job / CronJob is spawned, the pod should have a lifecycle hook automatically injected that runs curl -X POST http://localhost:4191/shutdown (or equivalent) to ensure the container running the work terminates the proxy when it finishes.

Additionally, it could be beneficial to have an annotation that could configure the lifecycleHook, for example:

annotations:
  config.linkerd.io/lifecycle-hook-enabled: "true"
  config.linkerd.io/lifecycle-hook-binary: "wget" # could also be curl or others
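
For illustration, the kind of preStop hook the injector might add to the workload container could look roughly like this (container and image names are placeholders; 4191 is the proxy's admin port, which serves the /shutdown endpoint mentioned above):

  spec:
    containers:
      - name: job-worker                    # placeholder workload container, not the proxy
        image: example/job-worker:latest    # placeholder image
        lifecycle:
          preStop:
            exec:
              command: ["curl", "-X", "POST", "http://localhost:4191/shutdown"]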

Any alternatives you've considered?

Configuring a bunch of cluster-wide policies so that jobs can work while the other 99% of traffic is authenticated and goes through the mesh. Ideally, onboarding fresh clusters should be quick and painless for as many users as possible.

Additionally, I've considered adding the hook myself to my objects, but some of them are spawned via third-party charts that don't provide a clean interface for adding the relevant hooks. I would have to resort to kustomize to add the lifecycle hook for each job in the cluster that needs to communicate with workloads on the mesh.
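
To give a sense of the toil involved, each such job would need a kustomize patch along these lines (file and job names are placeholders):

  apiVersion: kustomize.config.k8s.io/v1beta1
  kind: Kustomization
  resources:
    - jobs.yaml                     # placeholder manifest file
  patches:
    - target:
        kind: Job
        name: my-job                # placeholder; repeated for every job that talks to the mesh
      patch: |-
        - op: add
          path: /spec/template/spec/containers/0/lifecycle
          value:
            preStop:
              exec:
                command: ["curl", "-X", "POST", "http://localhost:4191/shutdown"]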

How would users interact with this feature?

They could configure it via annotations read by the injection webhook, which would vary the output slightly (curl vs. wget vs. other). Users would be able to enable or disable the hook injection as well as the injection of the proxy.

Would you like to work on this feature?

No response

@olix0r
Member

olix0r commented Mar 8, 2022

If I understand correctly, lifecycle hooks can't actually do this. From https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution:

PreStop

This hook is called immediately before a container is terminated due to an API request or management event such as a liveness/startup probe failure, preemption, resource contention and others.

That is, lifecycle hooks don't apply when a container exits gracefully. They only apply when the kubelet decides to terminate a container; and if the kubelet is deciding to terminate the Job, the proxy will shut down gracefully.

I think the only real approach to solving this problem would be to write a controller that deletes jobs when the linkerd proxy is the only running container.

@jack1902
Author

So I did look at this: https://itnext.io/three-ways-to-use-linkerd-with-kubernetes-jobs-c12ccc6d4c7c

I thought it was kind of neat, but the cleanest and easiest way would be to have a controller like you say, since configuring numerous jobs in numerous places just becomes tedious to manage. Having a controller constantly checking and auto-cleaning up would be useful.

@mladedav

Another option would be to make the linkerd proxy container (either the binary directly or another process therein) aware of the state of the pod by polling the Kubernetes API.

Instead of having a central controller poll the state of all pods, each pod would poll its own state and terminate its own proxy.

The advantage would be that no controller installation is needed. There could also be less resource usage when there are no jobs running. It should also perform better when there are significantly fewer job pods than other pods, which is probably the more common case.

I'm not sure, though, whether the default service account has permission for that or whether linkerd could inject it, but I believe it could.
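
For reference, the default service account cannot read pods out of the box; a pod watching its own state would need something like the following Role and RoleBinding in its namespace (names and namespace below are placeholders):

  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    name: read-own-pod              # placeholder name
    namespace: my-namespace         # placeholder namespace
  rules:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["get", "watch"]
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  metadata:
    name: read-own-pod
    namespace: my-namespace
  subjects:
    - kind: ServiceAccount
      name: default                 # or the job's own service account
      namespace: my-namespace
  roleRef:
    kind: Role
    name: read-own-pod
    apiGroup: rbac.authorization.k8s.io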

@adleong adleong added this to the stable-2.13.0 milestone Sep 22, 2022
@adleong adleong removed this from the stable-2.13.0 milestone Jan 19, 2023
@mateiidavid
Member

Update:

I came up with a proof of concept that I've been developing as an extension for Linkerd (and a personal side project). The POC works as-is, but it has two issues:

  1. It is undocumented
  2. Progress has been slow. The idea is a bit too complicated and it introduces some security implications that I am not 100% sure I want to tackle.

The basic idea for the POC is to have an admission controller that will modify the entrypoint of pods to call into linkerd-await:

  1. Based on a label we figure out if a pod should be mutated.
  2. A pod needs to contain linkerd-proxy and at least one container with the command field exposed in the template spec.
  3. The controller will add an emptyDir volume and an init container with linkerd-await, and it will copy the binary from the init container into the emptyDir volume. The volume will be attached to any container that exposes the command field.
  4. The controller will then modify the command entrypoint to call into the await binary. If this is done for more than one container, then whichever finishes first will shut down the proxy.

e.g

# if I'd write it in pseudo-bash-code:
before: ./my-process
after: ./linkerd-await --shutdown -- '$@'  where '$@' is "./my-process" followed by whatever args were originally passed in.
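
A rough sketch of what the mutated pod spec could end up looking like (image, volume, and container names are illustrative, not necessarily what the extension actually emits):

  spec:
    volumes:
      - name: await-bin
        emptyDir: {}                              # shared scratch volume for the binary
    initContainers:
      - name: linkerd-await-init
        image: example/linkerd-await:latest       # hypothetical image carrying the binary
        command: ["cp", "/linkerd-await", "/await/linkerd-await"]
        volumeMounts:
          - name: await-bin
            mountPath: /await
    containers:
      - name: my-app
        image: example/my-app:latest              # placeholder workload image
        command: ["/await/linkerd-await", "--shutdown", "--", "./my-process"]
        volumeMounts:
          - name: await-bin
            mountPath: /await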

While the idea works in practice, it exposes us to a few questions around security implications. It also makes testing a bit of a nightmare, and it still involves changing manifests to expose commands and args.
This is the "safest" way I found to do it though. If you have feedback on it, or a better way to do it, I'd be interested in resuming the work. But as it stands, I'm not sure how useful it would be in its current form.

I also considered the approach suggested above (in the previous comment). This is not something I think we'd be open to doing in the proxy (it shouldn't have a Kubernetes client, or even the notion of running in Kubernetes), but I did think of having an "init system" that will fork() the proxy process and shut it down whenever the other container has finished. The problems are:

  1. If we want to do this at the container level, we need to set shareProcessNamespace, which I'm reluctant to suggest (again, based on security implications and configuration nightmares).
  2. If we were to do this at the pod level, we'd need a Kubernetes client for each pod, which is not necessarily a bad thing, but Linkerd (or the extension) would need to handle RBAC, which is not something I want to get into.

A different approach would be to have a controller that mutates pods to run this init system and then watches the state of all pods. When a pod should be terminated, the controller signals the init system (network call), which will kill the proxy's process. I haven't looked much into this alternative; it's probably the simplest, since:

  • RBAC can be a bit more centralised (what I mean by it is that only one deployment will be able to watch the resources)
  • If the proxy's image contains the binary, we don't need to mutate a bunch of pods, and mutations will also be a bit more deterministic.

Will wait for some feedback but this is what I've come up with so far.

@howardjohn
Contributor

FYI, you may run into kubernetes/kubernetes#106896 if you are watching the API server. Although I recall there are some cases where it works, just not all; I don't think I tested jobs.

@salvalcantara

Hey, what is the current state of this issue?

@jack1902
Author

In light of sidecars (actually) being added to k8s (read: https://buoyant.io/blog/kubernetes-1-28-revenge-of-the-sidecars), I believe this can be addressed once the feature reaches stable. Hopefully, once that lands, the answer here is simply to ensure that any job has linkerd injected as a true sidecar, which will ensure the proxy shuts down accordingly when the main container finishes.
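
For context, the underlying Kubernetes pattern (independent of how Linkerd ends up wiring it in) is roughly a Job whose proxy runs as an init container with restartPolicy: Always, which the kubelet terminates once the regular containers complete; the names and images below are placeholders:

  apiVersion: batch/v1
  kind: Job
  metadata:
    name: my-job                        # placeholder
  spec:
    template:
      spec:
        restartPolicy: Never
        initContainers:
          - name: proxy-sidecar         # stand-in for the injected linkerd-proxy
            image: example/proxy:latest
            restartPolicy: Always       # makes this a native sidecar (k8s 1.28+, SidecarContainers feature)
        containers:
          - name: worker
            image: example/worker:latest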

@adleong adleong assigned alpeb and unassigned mateiidavid Oct 5, 2023
@alpeb
Member

alpeb commented Oct 5, 2023

Agreed with @jack1902's statement; sidecar containers in k8s are only in alpha for now, waiting in particular for proper termination ordering to be implemented (see kubernetes/kubernetes#120620). When that implementation matures we'll prioritize its integration with linkerd to solve this issue.

@kflynn
Member

kflynn commented Oct 7, 2023

We're using #11461 to track the work of implementing KEP-753.

@olix0r
Member

olix0r commented Sep 10, 2024

Linkerd now supports native sidecars.

@olix0r olix0r closed this as completed Sep 10, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 11, 2024