Slowly delete stale pool clusters #1512
Conversation
/hold Needs UT & live test
Codecov Report
@@ Coverage Diff @@
## master #1512 +/- ##
==========================================
+ Coverage 41.47% 41.51% +0.03%
==========================================
Files 335 336 +1
Lines 30538 30569 +31
==========================================
+ Hits 12666 12691 +25
- Misses 16790 16794 +4
- Partials 1082 1084 +2
58cc192 to 2f12f8b (force-push)
LGTM so far.
			return reconcile.Result{}, err
		}
	// Special case for stale CDs: allow deleting one if all CDs are installed.
	case drift == 0 && len(cds.Installing()) == 0 && len(cds.Stale()) > 0:
Actually does this need a check to make sure none are deleting? Otherwise it kinda looks like it might immediately wipe them all out. (depending on how fast Installing clusters start showing up)
It shouldn't need that:
- On this reconcile we Delete one.
- The next reconcile will miss this case because we'll be one short of pool.size, so drift will be < 0. It'll add a CD via L382, which will go to Installing.
- The next reconcile (and subsequent ones for the next 40m) will see nonzero Installing clusters, so will skip this case.

Did I logic that right?
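For reference, the check asked about above would roughly amount to one more clause in the condition. A minimal sketch, assuming a hypothetical Deleting() bucket alongside the Installing()/Stale() helpers shown in the diff (counts are passed in directly to keep the example self-contained):

```go
package clusterpool // illustrative placement, not the actual file

// okToDeleteOneStale captures the condition being debated: at capacity
// (drift == 0), nothing installing, at least one stale CD -- plus the extra
// "nothing already being deleted" clause asked about above, which is an
// assumption and not part of this PR.
func okToDeleteOneStale(drift, installing, deleting, stale int) bool {
	return drift == 0 &&
		installing == 0 &&
		deleting == 0 && // the extra guard under discussion
		stale > 0
}
```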
You are probably right. There are cases where the cache isn't updated for new objects and may not see the new Installing cluster, which is where we have to use expectations. If I'm correct that might happen here, it's possible it could wipe out a couple at once before the installing clusters start showing up. I'm not super confident in any of this, so we can see how it goes.
Could we do a counter metric here so we can see the rate at which clusters are getting replaced due to pool changes?
use expectations
Then someone will have to explain to me what "expectations" are and how they work :P
a counter metric
Just a monotonic counter we increment every time we hit this branch? Or a gauge we reset once we're fully synced?
And the fancy rate calculation stuff happens in prometheus, right?
use expectations
Then someone will have to explain to me what "expectations" are and how they work :P
Best ref I've used for expectations was when Matthew first introduced them to Hive and why: #518. I don't necessarily think we need to do this here, but it may be an edge case we'll hit where a couple get deleted when we only expected one. I do think an unclaimed deleting check would likely rectify it, though: that should be reflected in the cache immediately, and by the time it's done deleting, the new cluster provisions should definitely be in the cache.
a counter metric
Just a monotonic counter we increment every time we hit this branch? Or a gauge we reset once we're fully synced?
Just a counter we increment, yes; it'll tell us how often this is happening.
And the fancy rate calculation stuff happens in prometheus, right?
Correct.
Summary from today's discussion re expectations: we don't need them here because the deletion hits the cache immediately, which would result in the next reconcile having a nonzero drift and would thus miss the new branch.
Digging into the metric a bit...
You suggested a Counter that would keep a running total of the number of stale CDs we've ever* deleted.
The other option is a Gauge that denotes how many stale clusters exist at any given time. Would that have value? Would it allow us to calculate the above by some prom magic? Or perhaps we could do both.
*Since the last time this controller started? Or like ever ever? I never really understood when data are in the controller and when they're in prometheus...
Added the Counter metric in a separate commit. LMK if the Gauge seems like a good idea.
Counters are since the process started, prometheus handles the past data and adds the two together. In this case it's not so much for the value of the counter, more so for the rate at which it's occurring.
A gauge does sound useful for the total number of stale at any given time, rather than the rate at which we're doing replacements.
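If the Gauge idea were pursued, a minimal sketch might look like the following; the metric and label names are guesses, not taken from the PR:

```go
package clusterpool // illustrative placement

import "github.com/prometheus/client_golang/prometheus"

// metricStaleCDsCurrent would report the number of unclaimed, stale CDs per
// pool right now, set on every reconcile -- unlike the Counter, which only
// ever goes up.
var metricStaleCDsCurrent = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "hive_clusterpool_stale_clusterdeployments",
		Help: "Current number of unclaimed ClusterDeployments whose pool version is stale.",
	},
	[]string{"clusterpool_name"},
)
```

After registering it like any other controller metric, the reconciler would call metricStaleCDsCurrent.WithLabelValues(pool.Name).Set(float64(len(cds.Stale()))) once the unclaimed CDs have been classified.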
2f12f8b to 5d89d37 (force-push)
(Still holding for test)
5d89d37 to 37fb6af (force-push)
3dc3c3c / openshift#1484 added setting and discovery of a version marker for ClusterPool and ClusterDeployment so we could tell whether they match. This commit goes a step further and uses that information to replace stale CDs such that the pool will eventually be consistent.

The algorithm we use is as follows: If the pool is in steady state -- i.e. we're at capacity and all unclaimed CDs are finished installing -- delete *one* stale CD. This triggers another reconcile wherein the deleted CD is replaced. We're no longer in steady state until (at least) that CD finishes installing, whereupon we're eligible to repeat.

Reasoning:
- We don't want to immediately trash all stale CDs, as this could result in significant downtime following a pool edit.
- We don't want to exceed capacity (size or maxSize) to do this rotation, so we go "down" instead of "up".
- It would have been nice to just wait for the *replacement* CDs to finish installing; but that would have entailed tracking those somehow as being distinct from CDs added through other code paths. That could get tricky, especially across multiple pool edits.

TODO:
- As written, we'll even wait for *stale* CDs to finish installing before we start deleting. We should figure out a way to discount those when checking whether we're eligible for a deletion.
- Consider exposing a knob allowing the consumer to tune how aggressively stale CDs are replaced.

HIVE-1058
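A condensed sketch of the branch this commit message describes, assuming helper names patterned on the diff above (Installing()/Stale()); the surrounding reconciler has more branches, and this is not the actual code:

```go
package clusterpool // illustrative placement

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// replaceOneStale illustrates the steady-state replacement described above.
// drift is the unclaimed-CD count relative to pool capacity; installing and
// stale are the unclaimed CDs in those states.
func replaceOneStale(ctx context.Context, c client.Client, drift int, installing, stale []client.Object) (reconcile.Result, error) {
	switch {
	case drift < 0:
		// Under capacity (including the reconcile right after we delete a
		// stale CD below): add clusters, which then show up as Installing.
	case drift > 0:
		// Over capacity (e.g. the pool size was lowered): delete extras.
	case drift == 0 && len(installing) == 0 && len(stale) > 0:
		// Steady state with at least one stale CD: delete exactly *one*.
		// The next reconcile sees drift < 0 and adds a replacement; no
		// further stale CD is deleted until that replacement finishes
		// installing.
		if err := c.Delete(ctx, stale[0]); err != nil {
			return reconcile.Result{}, err
		}
	}
	return reconcile.Result{}, nil
}
```

Going "down" (delete one, then backfill) rather than "up" keeps the pool within size/maxSize during the rotation, which matches the reasoning bullets above.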
37fb6af to c48acb3 (force-push)
UT is in. Still need to test live.
Add a CounterVec, labeled by pool name, incremented every time we delete a stale ClusterDeployment. Part of HIVE-1058
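A sketch of what such a CounterVec could look like, assuming registration via controller-runtime's metrics registry; the metric and label names are guesses rather than the ones in the commit:

```go
package clusterpool // illustrative placement

import (
	"github.com/prometheus/client_golang/prometheus"
	"sigs.k8s.io/controller-runtime/pkg/metrics"
)

// metricStaleCDsDeleted counts stale ClusterDeployments deleted to bring a
// pool in line with its current spec, one time series per pool.
var metricStaleCDsDeleted = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "hive_clusterpool_stale_clusterdeployments_deleted_total",
		Help: "Stale ClusterDeployments deleted so the ClusterPool converges on its current spec.",
	},
	[]string{"clusterpool_name"},
)

func init() {
	// Assumes metrics are registered with controller-runtime's registry.
	metrics.Registry.MustRegister(metricStaleCDsDeleted)
}
```

The deletion branch would then call metricStaleCDsDeleted.WithLabelValues(pool.Name).Inc(), and the rate calculation happens in Prometheus, e.g. rate(hive_clusterpool_stale_clusterdeployments_deleted_total[1h]).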
c48acb3 to f3e437e (force-push)
/test e2e
Tested live, works.
/hold cancel
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: 2uasimojo, dgoodwin
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
This was missed in openshift#1512 / f3e437e. HIVE-1058 HIVE-1645