Improve container tracking when containers fail to be created/destroyed in garden #76
What is this change about?
This PR addresses the case where containers fail to be created or destroyed.
What problem is it trying to solve?
For example, when the networker is down, both garden create and destroy will fail.
When we destroy a container, we want to make sure that a failed destroy does not release the container's resources as available. Releasing them would cause more work to be scheduled on that Diego cell, which would eventually hit other limits, such as the max container limit in garden.
When rep is starting and performing cleanup, we want to account for containers that were partially created (`garden.state` is not set to `created`).
What is the impact if the change is not made?
Containers that are not fully created will be leaked by rep. Eventually rep will overschedule work and hit limits such as the max container limit. These containers won't be cleaned up when rep is restarted.
Garden can be configured to clean up all containers, but that setting is optional.
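To illustrate the startup accounting, here is a rough sketch of how partially created containers could be selected for cleanup when rep restarts. It is not the code in this PR: the `Container` type and the `partiallyCreated` helper are hypothetical, and the `creating` value in the example is an assumed placeholder. Only the `garden.state` property name and the `created` value come from the description above.

```go
package main

import "fmt"

// Container is a hypothetical view of a garden container:
// a handle plus its string properties.
type Container struct {
	Handle     string
	Properties map[string]string
}

const (
	statePropertyKey = "garden.state" // property name from the PR description
	stateCreated     = "created"      // value that marks a fully created container
)

// partiallyCreated returns the handles of containers whose garden.state
// property is missing or set to anything other than "created".
func partiallyCreated(containers []Container) []string {
	var handles []string
	for _, c := range containers {
		// Reading from a nil map yields "", so a missing property also counts.
		if c.Properties[statePropertyKey] != stateCreated {
			handles = append(handles, c.Handle)
		}
	}
	return handles
}

func main() {
	found := []Container{
		{Handle: "a", Properties: map[string]string{statePropertyKey: stateCreated}},
		// "creating" is an assumed placeholder value, not a documented state.
		{Handle: "b", Properties: map[string]string{statePropertyKey: "creating"}},
		{Handle: "c"}, // no properties at all
	}
	fmt.Println(partiallyCreated(found))
}
```

Treating a missing property the same as a non-`created` value errs on the side of cleaning up anything that cannot be shown to be fully created.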
How should this change be described in diego-release release notes?
Improve container tracking when containers fail to be created/destroyed in garden
Please provide any contextual information.
Related change in garden - cloudfoundry/guardian#371
The garden change is not required, since this change is backwards compatible. Setting `garden.state` in rep has no effect on an older garden, which overwrites `garden.state` with `created`.
Tag your pair, your PM, and/or team!
@reneighbor
Thank you!