feat: prioritize emptiness over other consolidation methods #2180
Conversation
jonathan-innis
left a comment
/lgtm
/approve
lgtm I like this change, and I don't think the "cons" is a big concern. Emptiness is scaling down nodes and not replacing them, so if you need to patch an AMI for a security issue, scaling down nodes is still the desired behavior because it reduces the attack surface (fewer vulnerable nodes) faster than drift can.
Coveralls: Pull Request Test Coverage Report for Build 14846033812. Warning: this coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch, which means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
ExpectSingletonReconciled(ctx, queue)
Expect(len(ExpectNodeClaims(ctx, env.Client))).To(Equal(10))
})
It("should allow all nodes from each nodePool to be deleted", func() {
This would now happen in emptiness because the test used to have empty && drifted nodes. There is a test that covers disruptions across multiple nodepools.
Expect(ExpectNodes(ctx, env.Client)).To(HaveLen(1))
ExpectExists(ctx, env.Client, nodeClaim)
})
It("should ignore nodes with the drifted status condition set to false", func() {
Moved.
Expect(ExpectNodeClaims(ctx, env.Client)).To(HaveLen(1))
ExpectExists(ctx, env.Client, nodeClaim)
})
It("can delete drifted nodes", func() {
Covered by "should delete drifted nodes".
Expect(ExpectNodes(ctx, env.Client)).To(HaveLen(0))
Expect(ExpectNodeClaims(ctx, env.Client)).To(HaveLen(0))
})
It("can replace drifted nodes", func() {
Covered by the new "should replace drifted nodes" test.
ExpectExists(ctx, env.Client, nodeClaim)
ExpectExists(ctx, env.Client, node)
})
It("should delete nodes with the karpenter.sh/do-not-disrupt annotation set to false", func() {
Moved to the other annotation test in the same file.
Expect(ExpectNodeClaims(ctx, env.Client)).To(HaveLen(0))
Expect(ExpectNodes(ctx, env.Client)).To(HaveLen(0))
ExpectNotFound(ctx, env.Client, nodeClaim, node)
ExpectMetricGaugeValue(disruption.EligibleNodes, 1, map[string]string{
Validate via metrics that the disruption was due to emptiness.
Expect(ExpectNodes(ctx, env.Client)).To(HaveLen(0))
ExpectNotFound(ctx, env.Client, nodeClaim, node)
})
It("can delete empty and drifted nodes", func() {
Nice! 🎉
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cnmcavoy, jonathan-innis, rschalo. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Prioritizing Emptiness - Reordering Graceful Consolidation
Background:
Karpenter performs dataplane right-sizing for Kubernetes clusters. When performing consolidation, Karpenter considers whether a node should be deleted, replaced, or not treated as a candidate for disruption at all. Multiple criteria are used to determine a node's eligibility for disruption, and this PR proposes reordering them. Disruption via Expiration, Interruption, and Node Auto Repair are forceful consolidation methods and are out of scope for this change.
Goal:
Prioritize emptiness consolidation: it is fast to validate and is unique among the graceful consolidation methods in that it results only in a deletion operation, with no replacement.
Graceful Consolidation:
Karpenter evaluates four methods of graceful disruption. The four methods are evaluated sequentially: Drift, Emptiness, Multi-Node Consolidation, and Single-Node Consolidation.
If a valid disruption is found by any of these methods, the candidates are sent to the termination orchestrator, the loop exits, and the next pass begins by evaluating Drift again.
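A minimal sketch of that control flow, using simplified stand-in types (Command, Method, and disruptOnce are illustrative assumptions, not Karpenter's actual interfaces):

```go
package main

import (
	"context"
	"fmt"
)

// Command is a simplified disruption decision.
type Command struct {
	Reason string
	Nodes  []string
}

// Method is a simplified graceful disruption method, standing in for Drift,
// Emptiness, MultiNodeConsolidation, or SingleNodeConsolidation.
type Method interface {
	// ComputeCommand returns a command and true if a valid disruption was found.
	ComputeCommand(ctx context.Context) (Command, bool)
}

// disruptOnce walks the methods in order and executes the first valid command.
func disruptOnce(ctx context.Context, methods []Method, orchestrate func(Command)) bool {
	for _, m := range methods {
		if cmd, ok := m.ComputeCommand(ctx); ok {
			orchestrate(cmd) // hand candidates to the termination orchestrator
			return true      // exit; the next pass starts from methods[0] again
		}
	}
	return false
}

func main() {
	// With no methods configured, nothing is disrupted.
	fmt.Println(disruptOnce(context.Background(), nil, func(c Command) { fmt.Println(c.Reason) }))
}
```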
Consolidation Controls:
Disruption can be controlled by Node Disruption Budgets. These budgets are part of the NodePool spec and enable users to determine how many nodes may be disrupted for a given reason and during what time window. For example, a user may want to block all consolidation except during a 2-hour overnight maintenance window.
When multiple budgets are present on a NodePool, the most restrictive budget applies.
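A hedged sketch of the "most restrictive budget" rule, with simplified stand-in types (Budget and allowedDisruptions are illustrative assumptions, not the real karpenter.sh NodePool API):

```go
package main

import "fmt"

// Budget is a simplified stand-in for a NodePool disruption budget.
type Budget struct {
	Reasons []string // empty means the budget applies to all reasons
	Nodes   int      // max nodes that may be disrupted while the budget is active
	Active  bool     // whether the budget's schedule/duration window is open right now
}

// allowedDisruptions returns the number of nodes that may be disrupted for a
// reason: the minimum across every budget that is active and matches the reason.
func allowedDisruptions(budgets []Budget, reason string, totalNodes int) int {
	allowed := totalNodes // with no applicable budgets there is no extra restriction
	for _, b := range budgets {
		if !b.Active || !matches(b.Reasons, reason) {
			continue
		}
		if b.Nodes < allowed {
			allowed = b.Nodes
		}
	}
	return allowed
}

func matches(reasons []string, reason string) bool {
	if len(reasons) == 0 {
		return true
	}
	for _, r := range reasons {
		if r == reason {
			return true
		}
	}
	return false
}

func main() {
	budgets := []Budget{
		{Nodes: 0, Active: true},                                 // block all disruption by default
		{Nodes: 10, Active: false, Reasons: []string{"Drifted"}}, // maintenance window currently closed
	}
	fmt.Println(allowedDisruptions(budgets, "Drifted", 500)) // prints 0: the most restrictive budget wins
}
```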
While the semantics of how these budgets are applied are not wrong, this PR argues that it is unexpected for emptiness to be blocked by other disruption methods, and that emptiness should instead be the first graceful disruption performed.
Problem:
Today, when Karpenter runs Drift, it first checks for any nodes that are both empty and drifted.
These nodes are then sent to the termination orchestrator, after which Karpenter exits, returns to the top of the graceful consolidation loop, and starts evaluating Drift again. As a result, nodes that are Drifted && Empty take priority over nodes that are only Drifted, are counted against the Drift node disruption budget, and cause fewer nodes than expected to be sent for termination. Even more confusing for users, when Karpenter logs these disruptions the disruption reason is reported as Drifted in the emitted log line.
While this isn’t strictly wrong, it’s perhaps unexpected.
To repro this, a sleep was added (otherwise the 1 node disruption budget causes Karpenter to skip over Drift entirely) before evaluating consolidation methods for the following cluster:
Proposals:
1. Update Drift's Handling of Empty Nodes (not implemented)
Handle Empty nodes within Drift as being Empty: apply the relevant Node Disruption Budgets for Emptiness and correctly log the disruption reason as Empty. Alternatively, empty nodes could be skipped over when evaluating Drift and then handled by the Emptiness consolidation method.
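A hedged sketch of this attribution under simplified, assumed types (Candidate and reasonFor are illustrative, not an actual Karpenter function):

```go
package main

import "fmt"

// Candidate is a simplified stand-in for a disruption candidate.
type Candidate struct {
	Name    string
	Drifted bool
	Empty   bool
}

// reasonFor attributes a disruption reason per Proposal 1: an empty node is
// attributed to Emptiness even if it is also drifted.
func reasonFor(c Candidate) string {
	if c.Empty {
		return "Empty" // counted against the Emptiness budget and logged as Empty
	}
	if c.Drifted {
		return "Drifted"
	}
	return "" // not a candidate for graceful disruption here
}

func main() {
	for _, c := range []Candidate{
		{Name: "a", Drifted: true, Empty: true},
		{Name: "b", Drifted: true},
	} {
		fmt.Printf("%s -> %s\n", c.Name, reasonFor(c))
	}
}
```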
Pros:
Cons:
2. Reorder Graceful Consolidation and Skip Emptiness in Other Methods (this PR)
Update Drift so that, if it detects empty nodes, those nodes are skipped over, as is already done in the other disruption methods. These nodes are then picked up by Emptiness and deleted when empty nodes are evaluated. Additionally, Emptiness should be evaluated before Drift so that zero-simulation consolidation options are performed first. In the above repro, 500 nodes would be consolidated before Drift is evaluated for the remaining nodes.
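A hedged sketch of the reordering under the same simplified, assumed types (not the PR's actual diff): Emptiness is evaluated first, and Drift filters empty nodes out of its candidates so they are only ever disrupted with reason Empty.

```go
package main

import "fmt"

// Node is a simplified stand-in for a disruption candidate.
type Node struct {
	Name    string
	Drifted bool
	Empty   bool
}

// emptinessCandidates selects empty nodes for deletion-only disruption and is
// evaluated before any other graceful method.
func emptinessCandidates(nodes []Node) []Node {
	var out []Node
	for _, n := range nodes {
		if n.Empty {
			out = append(out, n)
		}
	}
	return out
}

// driftCandidates now skips empty nodes, mirroring the other disruption methods.
func driftCandidates(nodes []Node) []Node {
	var out []Node
	for _, n := range nodes {
		if n.Drifted && !n.Empty {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []Node{
		{Name: "a", Drifted: true, Empty: true},
		{Name: "b", Drifted: true},
		{Name: "c", Empty: true},
	}
	// Emptiness runs first and scales down "a" and "c" with no scheduling
	// simulation; Drift then only considers "b".
	fmt.Println("empty:", len(emptinessCandidates(nodes)), "drifted:", len(driftCandidates(nodes)))
}
```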
Pros:
Cons: