@mariash mariash commented Mar 30, 2023

What is this change about?

This PR improves how rep behaves when containers fail to be created or destroyed in garden:

  • do not release resources when destroying a container fails
  • clean up all containers on startup, including those that were only partially created

What problem is it trying to solve?

For example, when the networker is down, both garden create and garden destroy fail.

When we destroy a container, we want to make sure that a failed destroy does not release the container's resources back to the available pool. Releasing them would cause more work to be scheduled on that Diego cell, which would eventually hit other limits, such as the max container limit in garden.
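The accounting rule described above can be sketched as follows. This is a minimal illustration and not the rep or executor code: the `Cell`, `Resources`, and `destroyFunc` names are hypothetical, and the garden destroy call is modeled as a plain function so that failures are easy to simulate.

```go
package main

import (
	"errors"
	"fmt"
)

// Resources is a hypothetical, simplified view of what a cell tracks.
type Resources struct {
	MemoryMB   int
	Containers int
}

// destroyFunc is a hypothetical stand-in for the garden destroy call.
type destroyFunc func(handle string) error

var (
	ErrUnknownContainer      = errors.New("unknown container")
	ErrInsufficientResources = errors.New("insufficient resources")
)

// Cell is a hypothetical model of a Diego cell's resource bookkeeping.
type Cell struct {
	available Resources
	reserved  map[string]Resources
}

func NewCell(total Resources) *Cell {
	return &Cell{available: total, reserved: map[string]Resources{}}
}

// Available reports the resources that can still be scheduled.
func (c *Cell) Available() Resources { return c.available }

// Reserve sets aside resources for a container before it is created.
func (c *Cell) Reserve(handle string, r Resources) error {
	if r.MemoryMB > c.available.MemoryMB || r.Containers > c.available.Containers {
		return ErrInsufficientResources
	}
	c.available.MemoryMB -= r.MemoryMB
	c.available.Containers -= r.Containers
	c.reserved[handle] = r
	return nil
}

// Destroy releases the reservation only after the destroy call succeeds.
// On failure the reservation stays in place, so the cell does not
// advertise capacity that is still held by a container in garden.
func (c *Cell) Destroy(handle string, destroy destroyFunc) error {
	r, ok := c.reserved[handle]
	if !ok {
		return ErrUnknownContainer
	}
	if err := destroy(handle); err != nil {
		return fmt.Errorf("destroy %q: %w", handle, err)
	}
	delete(c.reserved, handle)
	c.available.MemoryMB += r.MemoryMB
	c.available.Containers += r.Containers
	return nil
}
```

With this ordering, a destroy that fails while the networker is down leaves `Available()` unchanged, and a later successful retry returns the resources to the pool.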

When rep is starting and performing cleanup, we want to account for containers that were only partially created (garden.state is not set to created).

What is the impact if the change is not made?

Containers that are not fully created will be leaked by rep. Eventually rep will overschedule work and hit limits such as the max container limit. These containers won't be cleaned up when rep is restarted.

Garden can be configured to clean up all containers, but that setting is optional.

How should this change be described in diego-release release notes?

Improve container tracking when containers fail to be created or destroyed in garden

Please provide any contextual information.

Related change in garden - cloudfoundry/guardian#371

The garden change is not required, since this change is backwards compatible. Setting garden.state in rep has no effect on an old garden, which overwrites garden.state with created.
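To illustrate the compatibility argument, here is a minimal sketch of listing containers for cleanup with a state filter. It does not use the real garden client: `fakeGarden`, `Properties`, `handlesToCleanUp`, and the `owner` property key are hypothetical, and the filtering rules are an assumption modeled on the behavior described in this PR (an old garden always forces garden.state to created, a new garden accepts `all`).

```go
package main

// Properties is a hypothetical stand-in for a garden property filter.
type Properties map[string]string

// Container is a hypothetical, simplified container record.
type Container struct {
	Handle     string
	Properties Properties
}

const (
	stateKey     = "garden.state"
	stateCreated = "created"
	stateAll     = "all"
)

// lister is the only capability the cleanup code needs from garden.
type lister interface {
	Containers(filter Properties) []Container
}

// fakeGarden models both garden versions for illustration only.
type fakeGarden struct {
	containers  []Container
	supportsAll bool // false models an old garden
}

func (g *fakeGarden) Containers(filter Properties) []Container {
	want := Properties{}
	for k, v := range filter {
		want[k] = v
	}
	if g.supportsAll && want[stateKey] == stateAll {
		// New garden: "all" means do not filter on state.
		delete(want, stateKey)
	} else {
		// Old garden (or no explicit "all"): state is forced to created.
		want[stateKey] = stateCreated
	}
	var out []Container
	for _, c := range g.containers {
		if matches(c.Properties, want) {
			out = append(out, c)
		}
	}
	return out
}

// matches reports whether props contains every key/value pair in want.
func matches(props, want Properties) bool {
	for k, v := range want {
		if props[k] != v {
			return false
		}
	}
	return true
}

// handlesToCleanUp asks for every container owned by this rep,
// including ones whose creation never finished.
func handlesToCleanUp(g lister, owner string) []string {
	filter := Properties{"owner": owner, stateKey: stateAll}
	var handles []string
	for _, c := range g.Containers(filter) {
		handles = append(handles, c.Handle)
	}
	return handles
}
```

Against the new-garden model, `handlesToCleanUp` also returns containers that never reached the created state. Against the old-garden model it returns the same set as before the change, which is the sense in which the rep-side change is backwards compatible.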

Tag your pair, your PM, and/or team!

@reneighbor

Thank you!

mariash and others added 2 commits March 30, 2023 22:18
Pass garden.state as `all`. This value is supported in guardian commit
cloudfoundry/guardian@65e01bd

This allows rep to pull all containers, even if they are not fully created,
e.g. if the networker was down when a container was being created.

This change is backwards compatible because garden previously always set
garden.state to created.

Signed-off-by: Maria Shaldybin <[email protected]>
@mariash mariash changed the title from "Gard 21" to "Fix container tracking when containers failing to be created/destroyed in garden" Mar 30, 2023
@mariash mariash changed the title from "Fix container tracking when containers failing to be created/destroyed in garden" to "Improve container tracking when containers failing to be created/destroyed in garden" Mar 30, 2023
@MarcPaquette MarcPaquette left a comment

LGTM

@geofffranks geofffranks merged commit 7ea2723 into main Mar 31, 2023