
createpod doesn't delete dead pods, spawned by the old createpod instance #165

Closed
d-uzlov opened this issue May 27, 2021 · 2 comments · Fixed by #356
Labels
bug Something isn't working

Comments

@d-uzlov
Contributor

d-uzlov commented May 27, 2021

Expected Behavior

createpod monitors the state of all its pods. If a pod dies, the createpod element should delete it to prevent pod list pollution.

Current Behavior

createpod only monitors the pods that it has created during the current session. If a server with createpod dies and respawns, then all of the pods that were spawned by the old server remain in the pod list forever, until they are removed manually.

Steps to Reproduce

  1. Create a deployment with createpod element.
  2. Use the server with createpod to spawn a new pod.
  3. Restart the deployment.
  4. Wait until the spawned server dies.
    The pod it was running in will not be removed from the list by the server with the createpod element.

Solution

  1. Add a label to each of the pods that createpod spawns.
  2. Watch all pods with this label, regardless of whether they were spawned by this server or not, as shown in the sketch below.
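
For illustration, here is a minimal sketch of this approach using client-go. The label key, label value, and function names are hypothetical assumptions for this sketch, not the actual sdk-k8s implementation:

```go
// Package createpodsketch is a hypothetical illustration, not the sdk-k8s code.
package createpodsketch

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createdByLabel is a hypothetical label key; the real fix may use a different one.
const createdByLabel = "networkservicemesh.io/created-by"

// CreatePod spawns a pod tagged with the label so that any createpod instance,
// including a respawned one, can find it later.
func CreatePod(ctx context.Context, client kubernetes.Interface, namespace string, pod *corev1.Pod) (*corev1.Pod, error) {
	if pod.Labels == nil {
		pod.Labels = map[string]string{}
	}
	pod.Labels[createdByLabel] = "createpod"
	return client.CoreV1().Pods(namespace).Create(ctx, pod, metav1.CreateOptions{})
}

// CleanupDeadPods watches every pod carrying the label, regardless of which
// server created it, and deletes pods that have terminated.
func CleanupDeadPods(ctx context.Context, client kubernetes.Interface, namespace string) error {
	watcher, err := client.CoreV1().Pods(namespace).Watch(ctx, metav1.ListOptions{
		LabelSelector: createdByLabel + "=createpod",
	})
	if err != nil {
		return err
	}
	defer watcher.Stop()
	for event := range watcher.ResultChan() {
		pod, ok := event.Object.(*corev1.Pod)
		if !ok {
			continue
		}
		// A pod in the Succeeded or Failed phase is dead and will never run again.
		if pod.Status.Phase == corev1.PodSucceeded || pod.Status.Phase == corev1.PodFailed {
			log.Printf("deleting dead pod %s", pod.Name)
			if err := client.CoreV1().Pods(namespace).Delete(ctx, pod.Name, metav1.DeleteOptions{}); err != nil {
				log.Printf("failed to delete pod %s: %v", pod.Name, err)
			}
		}
	}
	return nil
}
```

With such a label, a respawned server watches the same set of pods as its predecessor, so pods created before the restart are cleaned up too.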
@d-uzlov d-uzlov changed the title Make createpod watch for pod events by label createpod doesn't delete dead pods, spawned by the old createpod instance May 27, 2021
@d-uzlov
Contributor Author

d-uzlov commented May 27, 2021

Other issues caused by createpod having transient state:

Issue 1: If a server with createpod creates a pod, then dies and respawns, and the new server gets a request, a new pod is created even though the old pod already exists.

Issue 2: If we accidentally or deliberately run several servers with createpod that create the same pod, then each of these servers will create one pod on a given node, so one node can end up with as many pods as there are servers spawning them.

Both of these issues can actually be solved in exactly the same way as the issue with dead pods remaining in the list forever.
We could restore the list of created pods on startup using the label, and then keep the list up to date via pod events, in order to synchronize with other servers that may exist and to minimize the chance of having several pods on one node (see the sketch below).
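
A minimal sketch of this recovery step, under the same assumptions (and the same hypothetical label key and imports) as the sketch above:

```go
// RestorePodList rebuilds the in-memory state on startup by listing all pods
// that carry the label, including pods created by a previous server instance
// or by other createpod servers.
func RestorePodList(ctx context.Context, client kubernetes.Interface, namespace string) (map[string]string, error) {
	list, err := client.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{
		LabelSelector: createdByLabel + "=createpod",
	})
	if err != nil {
		return nil, err
	}
	// Map node name -> pod name, so a respawned server knows which nodes
	// already host a pod and doesn't create a duplicate there.
	podsByNode := make(map[string]string, len(list.Items))
	for i := range list.Items {
		pod := &list.Items[i]
		podsByNode[pod.Spec.NodeName] = pod.Name
	}
	return podsByNode, nil
}
```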

@denis-tingaikin do you approve fixing this, or is the current behaviour desired?

@denis-tingaikin denis-tingaikin added the bug Something isn't working label May 27, 2021
@denis-tingaikin
Member

It is a good catch. Within this issue we would need to consider a scenario with a few suppliers on the cluster, so it is currently out of scope, and the issue doesn't break the core scenario (scale from zero).
Thus the issue will be considered after the release.

denis-tingaikin added a commit to denis-tingaikin/sdk-k8s that referenced this issue May 12, 2022
nsmbot pushed a commit that referenced this issue Aug 28, 2023