Consolidate claim assignment into clusterpool_controller #1474
First pass at the code side. TODO:
9f27bcc to 8c0ecb5
Codecov Report
@@            Coverage Diff             @@
##           master    #1474      +/-   ##
==========================================
+ Coverage   41.23%   42.26%   +1.02%
==========================================
  Files         334      335       +1
  Lines       30265    31012     +747
==========================================
+ Hits        12481    13108     +627
- Misses      16722    16800      +78
- Partials     1062     1104      +42
8c0ecb5 to f1f7e59
2uasimojo
left a comment
Some notes for reviewers inline
/assign @abhinavdahiya
abhinavdahiya
left a comment
did a first pass.
- The collection feels a little weird, because side-effect functions leave the object stale, so it is no longer actionable afterwards.
- We are treating both claims and CDs as the source of truth, when for safety we should treat only the CDs as that (claims are user-editable).
- View functions update the collection (sorting), which makes them incompatible with concurrent calls.
// ByName returns the named ClusterDeployment from the cdLookup, or nil if no CD by that name exists.
ByName(string) *hivev1.ClusterDeployment
// Installing returns the list of ClusterDeployments in the process of being installed. These are
// not available for claim assignment.
These are not available for claim assignment.
doesn't seem like this function should say what is assignable?
That's totally its job. The implication is that you couldn't, for example, do:
for _, cd := range cds.Installing() {
	cd.Assign() // not allowed: Installing CDs are not available for claim assignment
}
I don't think so. Installing shouldn't say what you can do with the CDs that are in this state; what the user does with these is completely up to them. And in your example, the Assign() function defines what type of CD can be assigned. The Assign function should say that only assignable CDs should be passed to it.
These structs/methods are 100% designed to conform to and enforce the architecture of the clusterpool controller, and that architecture dictates that we don't assign CDs that are Installing. If future-me is reading, say, assignClustersToClaims and says, "Whoa, we're only using Assignable() CDs -- was that a mistake??" I will go look at the comments for Installing and at least see that it was done on purpose.
I've added some words to the docstrings for cdCollection.Assign and assignClustersToClaims. But I'd like to keep these words here.
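For readers following this thread, the rule under discussion can be sketched roughly as below: the collection partitions CDs into Installing and Assignable, and the assign operation itself refuses anything that is not assignable. Everything here is a hypothetical simplification (the `Installed` and `ClaimName` fields, the `Assign` signature, and the partitioning criteria are assumptions), not the PR's actual implementation.

```go
package main

import "fmt"

// clusterDeployment is a hypothetical stand-in for hivev1.ClusterDeployment.
type clusterDeployment struct {
	Name      string
	Installed bool   // assumption: true once installation has finished
	ClaimName string // assumption: empty while the CD is unclaimed
}

// cdCollection partitions CDs by what the controller may do with them.
type cdCollection struct {
	installing []*clusterDeployment
	assignable []*clusterDeployment
}

func newCDCollection(cds []*clusterDeployment) *cdCollection {
	c := &cdCollection{}
	for _, cd := range cds {
		switch {
		case cd.ClaimName != "":
			// Already claimed: neither installing nor assignable in this sketch.
		case !cd.Installed:
			c.installing = append(c.installing, cd)
		default:
			c.assignable = append(c.assignable, cd)
		}
	}
	return c
}

// Installing returns CDs still being installed. These are not available for
// claim assignment.
func (c *cdCollection) Installing() []*clusterDeployment { return c.installing }

// Assignable returns installed, unclaimed CDs.
func (c *cdCollection) Assignable() []*clusterDeployment { return c.assignable }

// Assign records claimName on cd. It only accepts CDs that are currently in
// the assignable set, so the rule is enforced in code as well as in comments.
func (c *cdCollection) Assign(cd *clusterDeployment, claimName string) error {
	for i, candidate := range c.assignable {
		if candidate != cd {
			continue
		}
		cd.ClaimName = claimName
		// Build a fresh slice so previously returned views are not disturbed.
		remaining := make([]*clusterDeployment, 0, len(c.assignable)-1)
		remaining = append(remaining, c.assignable[:i]...)
		remaining = append(remaining, c.assignable[i+1:]...)
		c.assignable = remaining
		return nil
	}
	return fmt.Errorf("ClusterDeployment %s is not assignable", cd.Name)
}
```

With this shape, looping over `Installing()` and calling `Assign` fails loudly instead of silently assigning a cluster that is not ready.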
// be assigned first to minimize claim response time.
// - Running CDs, oldest first
// - Resuming CDs, in order of least recently resumed (soonest likely to become Running)
// - Hibernating CDs, oldest first
// - Anything else, oldest first. (These should probably be skipped. TODO?)
This doesn't seem like a good place for the interface to define this specific behavior. The implementation should be the one in control; otherwise it's too strict.
The interfaces are gone. I'd like to keep this in the docstring of the impl itself.
// SyncCDAssignments makes sure each claim which purports to be assigned has the correct CD
// assigned to it, updating the CD and/or claim on the server as necessary.
func (claims *claimCollection) SyncCDAssignments(c client.Client, cds cdLookup, logger log.FieldLogger) {
why does this not return an error?
It's best-effort. It logs problems it encounters, but we don't want those problems to short-circuit the Reconcile. (I could have bubbled the errors up and ignored them in the caller, but didn't see the need.)
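The best-effort pattern described in this reply can be sketched as follows: each failure is logged and the loop moves on, and the function deliberately has no error return. The `updater` interface, `logFunc` type, `syncBestEffort` function, and `fakeClient` are hypothetical names for illustration; the PR's functions take a `client.Client` and a `log.FieldLogger` instead.

```go
package main

import "fmt"

// claim is a hypothetical stand-in for hivev1.ClusterClaim.
type claim struct {
	Name string
}

// updater stands in for the part of an API client that writes a claim.
type updater interface {
	Update(c *claim) error
}

// logFunc stands in for a logger method such as Warnf.
type logFunc func(format string, args ...interface{})

// syncBestEffort tries to update every claim. A failure is logged and
// skipped, so one bad object does not abort the remaining work or the
// caller's reconcile. It intentionally returns nothing.
func syncBestEffort(u updater, claims []*claim, logf logFunc) {
	for _, c := range claims {
		if err := u.Update(c); err != nil {
			logf("failed to sync claim %s: %v", c.Name, err)
		}
	}
}

// fakeClient is an in-memory updater for demonstration. It fails for any
// claim name listed in failFor and records the rest.
type fakeClient struct {
	failFor map[string]bool
	updated []string
}

func (f *fakeClient) Update(c *claim) error {
	if f.failFor[c.Name] {
		return fmt.Errorf("simulated conflict updating claim %s", c.Name)
	}
	f.updated = append(f.updated, c.Name)
	return nil
}
```

The trade-off is the one raised in the question: the caller cannot tell whether anything failed, so anything left unsynced has to be picked up on a later reconcile.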
// SyncClaimAssignments makes sure each ClusterDeployment which purports to be assigned has the
// correct claim assigned to it, updating the CD and/or claim on the server as necessary.
func (cds *cdCollection) SyncClaimAssignments(c client.Client, claims claimLookup, logger log.FieldLogger) {
why does this not return an error?
It's best-effort. It logs problems it encounters, but we don't want those problems to short-circuit the Reconcile. (I could have bubbled the errors up and ignored them in the caller, but didn't see the need.)
I don't think they become stale. I responded to one inline comment about this; were there other places this was of concern?
I think this was addressed in an inline comment, yes?
Moved sorting out of getters per this comment.
f1f7e59 to 2852c47
To avoid timing issues and properly count assigned ClusterDeployments and ClusterClaims, this commit moves the assignment of ClusterClaims (updating their reference to their assigned CD) from the clusterclaim_controller into the clusterpool_controller, right next to the assignment of ClusterDeployments (updating their reference to their assigned ClusterClaim).
It also adds code to double-check that assignments are in sync (claims assigned to the clusters that ref them, and vice versa) before proceeding with pool math and further assignments. This should help eliminate assignment conflicts, and "heal" partial updates (e.g. assigning the claim succeeds, but assigning the CD fails).
HIVE-1599
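As a rough illustration of the "double-check that assignments are in sync" step, the sketch below reconciles the two directions of reference in memory. It is not the PR's code: the types, the function `syncAssignments`, and especially the conflict policy (the CD's reference wins when the two disagree, following the review remark that claims are user-editable) are assumptions, and the real controller writes its changes to the API server.

```go
package main

// clusterDeployment and clusterClaim are hypothetical stand-ins for the
// hivev1 types, reduced to the cross-references that matter here.
type clusterDeployment struct {
	Name      string
	ClaimName string // claim this CD says it is assigned to, or ""
}

type clusterClaim struct {
	Name   string
	CDName string // CD this claim says it is assigned to, or ""
}

// syncAssignments makes CD-to-claim and claim-to-CD references agree and
// returns the number of objects it changed. Missing objects and two CDs
// naming the same claim are out of scope for this sketch.
func syncAssignments(cds map[string]*clusterDeployment, claims map[string]*clusterClaim) int {
	fixed := 0
	// Pass 1: every CD that names a claim should be named back by that claim.
	for _, cd := range cds {
		if cd.ClaimName == "" {
			continue
		}
		claim, ok := claims[cd.ClaimName]
		if !ok {
			continue
		}
		if claim.CDName != cd.Name {
			claim.CDName = cd.Name // heal: the CD update landed, the claim update did not
			fixed++
		}
	}
	// Pass 2: every claim that names a CD should be named back by that CD.
	for _, claim := range claims {
		if claim.CDName == "" {
			continue
		}
		cd, ok := cds[claim.CDName]
		if !ok {
			continue
		}
		switch {
		case cd.ClaimName == "":
			cd.ClaimName = claim.Name // heal: the claim update landed, the CD update did not
			fixed++
		case cd.ClaimName != claim.Name:
			claim.CDName = "" // assumption: the CD belongs to another claim, so the CD wins
			fixed++
		}
	}
	return fixed
}
```

Running a check like this before the pool math means the counts of assigned CDs and claims are taken from a consistent view, and a second run over already-consistent data changes nothing.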
2852c47 to d8180cb
Note: This is now stacked on #1489. Doesn't matter to me if we merge both commits here or separately.
Might want to rebase now that #1489 has merged?
abhinavdahiya
left a comment
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: 2uasimojo, abhinavdahiya
/retest-required
Please review the full test history for this PR and help us cut down flakes.