[Exploration] Introducing Priority to Kubelet Memory Eviction #846

dashpole · 2017-07-25T20:31:16Z

This proposal is an attempt to solicit feedback from the sig-node community on various methods for integrating priority into kubelet memory evictions.
This proposal focuses on memory eviction since this is the most complex case. Disk eviction is the same, except pods have requests of 0, or equal to their limit if they specify one.
The goal of this proposal is to reach consensus on a single policy within the community.

Feel free to propose alternatives, or to suggest that some solutions be ruled out.

@kubernetes/sig-node-proposals
@derekwaynecarr @vishh @dchen1107
@davidopp @bsalamat

derekwaynecarr · 2017-07-29T04:12:59Z

contributors/design-proposals/priority-eviction.md

+
+This proposal explores possible implementations for integrating priority with the kubelet process of eviction.  This integration should provide a method for cluster operators to provide greater availability to "high-priority" pods without needing to use the critical-pod annotation.  
+
+The current method for ranking pods for eviction is by QoS, then usage over requests.  This currently holds the invariant that a pod that does not exceed its requests is not evicted, since the sum of the requests on the node cannot exceed the allocatable memory on the node.


for clarity, is oom_score_adj behavior worth explaining here as i think they go hand-in-hand.

derekwaynecarr · 2017-07-29T04:13:34Z

contributors/design-proposals/priority-eviction.md

+ - Low Abuse.  High priority pods should not be able to intentionally, or unintentionally disrupt large numbers of well-behaved pods.
+ - Respect Priority.  Pods that have higher priority should be less likely to be evicted.
+
+The goal of this proposal is to build consensus on the general design of how priority is integrated with memory eviction.  This proposal itself will not be merged, but will result in a set of changes to the [kubelet eviction documentation](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/kubelet-eviction.md) after a broad design is settled on.


derekwaynecarr · 2017-07-29T04:18:40Z

contributors/design-proposals/priority-eviction.md

+### By Priority, QoS, then usage over requests
+This solution allows high-priority pods to consume up to their limits with very little chance of eviction, unless other high-priority pods are also present on the same node.  This solution would require using Quota controls on pod limits, rather than requests, as high-priority pods' requests make only a minor difference in their chance for eviction, and limits are a better indicator of what they can consume.  Evictions are easy for users to understand, as they can reason that a higher priority pod bursted, but does not provide a course of action to prevent evictions other than raising the priority of their own pod.
+
+### Function(priority, QoS, usage over requests)


i do not want to explain this to our users.

Why should users care about eviction policies? Unless a cluster is oversubscribed or a pod doesn't express memory limits, evictions shouldn't matter to users right?

derekwaynecarr · 2017-07-29T04:19:56Z

contributors/design-proposals/priority-eviction.md

+This solution specifies a mapping between priority, QoS, and usage - request.  For example, a possible implementation could specify that 100 points priority, 1 QoS level, and 100Mb over requests are equivalent, and then rank pods based on their "score".  It has the potential to be a solution that can balance prioritizing high-priority pods, and preventing abuse from high-priority pods.  However, it would require cluster administrators to understand this mapping in order to correctly specify priority levels, and would be prone to configuration errors.  Users would have little insight into why their pods are evicted.
+
+## Implementation Timeline:
+The integration of priority and evictions is targeted for kubernetes v1.8


this change will only take effect when running with the priority feature gate on, correct?

absent my ability to control who can create pods of a specific priority, and how many, i will not turn priority on for many of our clusters in 1.8.

Yes, Ill clarify that.

derekwaynecarr · 2017-07-29T04:48:25Z

contributors/design-proposals/priority-eviction.md

+
+The current method for ranking pods for eviction is by QoS, then usage over requests.  This currently holds the invariant that a pod that does not exceed its requests is not evicted, since the sum of the requests on the node cannot exceed the allocatable memory on the node.
+
+### Goals


i like how you laid this out.

derekwaynecarr · 2017-07-29T04:53:52Z

contributors/design-proposals/priority-eviction.md

+### By priority, then usage over requests, but only evicting pods where usage > requests
+This solution is similar in practice to the "By QoS, priority, then usage over requests" proposal, but preserves the current invariant that pods are guaranteed to be able to consume their requests without facing eviction.  Like the "By QoS, priority, then usage over requests" proposal, it exempts guaranteed pods from eviction, and thus lowers the potential for abuse by high priority pods.  Users can easily understand that their pods are evicted because they exceed their requests.  This solution provides additional availability for high-priority pods by giving priority access to remaining allocatable memory on the node and memory other pods request, but do not use. 
+
+### By Priority, QoS, then usage over requests


for me, this violates the low abuse goal.

derekwaynecarr · 2017-07-29T05:31:32Z

contributors/design-proposals/priority-eviction.md

+## Proposed Implementations
+This list is not expected to be exhaustive, but rather to explore options.  New options may be added if there is support for them.
+
+The following are proposals for how to rank pods when the node is under memory pressure.


i am aware of a minimum two priority bands

critical pods - deployed by a cluster operator to each node

normal pods - everything else

i am biased to evaluate the preferred ranking approach on that relationship.

I think we need to consider at least three -- critical pods (BTW that's not just one-per-node pods, but also one-or-a-small-number-per-cluster pods like DNS and Heapster), pods that are in the serving path for external user requests, and opportunistic work (e.g. batch jobs with a flexible deadline). When HPA kicks in to scale up that second category, you want it to push out the third category but not the first category. (Obviously this assumes you can't or don't want to increase the number of nodes, e.g. non-virtualized on-prem environment or a cloud environment where you don't want to pay more.)

derekwaynecarr · 2017-07-29T05:44:17Z

contributors/design-proposals/priority-eviction.md

+### By QoS, priority, then usage over requests
+This solution is closest to the current behavior since it only makes a small modification by considering priority before usage over requests.  Since this is a small change from the current implementation, it should be an easy transition for cluster admins and users to make.  High-priority pods are only able to able to disrupt pods in their QoS tier or below, which lowers the potential for abuse since users can run their workloads as guaranteed if they need to avoid evictions.  However, this means that burstable pods consuming less than their requests could be evicted if a different high priority burstable pod bursts.  High priority pods are given increased availability depending on the QoS of pods that they share the node with.  If the node has many guaranteed pods, it is still possible that the high priority pod could be evicted.  If the node does not have many guaranteed pods, then the high-priority pod is able to consume all memory not consumed by Guaranteed pods.
+
+### By priority, then usage over requests, but only evicting pods where usage > requests


the cluster operator that reserves resource for peak density may prefer this option.

pros:

easy to explain to normal users

maps clearly to how the scheduler works with allocatable

very transparent (the operator is required to state their reservation)

hard to see a clear abuse vector

cons:

requires more resources reserved for critical pods

i wonder if we had vertical pod autoscaling (in-place) if this option would work well for all cases; right now, usage of a daemonset doesn't really allow for variable pod sizing per node, but if we could somehow allow a pod to right-size its requests based on usage, it would feel easiest to explain.

derekwaynecarr · 2017-07-29T05:52:22Z

contributors/design-proposals/priority-eviction.md

+
+The following are proposals for how to rank pods when the node is under memory pressure.
+
+### By QoS, priority, then usage over requests


the cluster operator that does not reserve resources for peak density may prefer this option.

pros:

operator can size critical pods for an avg or target density

cluster operators may get better node utilization

cons:

i think transparency decreases as number of priorities increase

doesn't align as well with how we calculate oom_score_adj

Given that priority is being added to the APIs, wouldn't this option be deviating from the priority APIs?

derekwaynecarr · 2017-07-29T05:54:04Z

since we are not very prescriptive from what i can tell on priorities out the box, i am fine with allowing operator choice between two options:

"By QoS, priority, then usage over requests"
"By priority, then usage over requests, but only evicting pods where usage > requests"

what do others think?

derekwaynecarr · 2017-07-29T05:54:26Z

/cc @sjenning - curious on your perspective as well.

sjenning · 2017-07-31T01:56:50Z

I have many of the same questions @derekwaynecarr has so I'll just wait for response on those.

A few thoughts I had:

Why is the critical pod annotation bad/insufficient? How is this better?
This does not to provide a hard guarantee against pod eviction. If that is the functionality we desire, why not make it explicit rather than "well... it could be evicted in theory but, in practice, it can't be because insert-complicated-priority-scheme-here"
If we don't want to enable a per pod annotation for critical pod, we might just do it at the namespace level. Seems like any pod started in kube-system namespace, for example, could be assumed to be critical. Assuming that it takes a higher privilege level to set annotations on namespaces than on pods.

It does seem to be that priority would be a lot like oom_score_adj for eviction.

davidopp · 2017-07-31T05:05:08Z

Why is the critical pod annotation bad/insufficient? How is this better?

See my comment about the need for at least three bands. That's driven by cluster-level scheduling requirements not node-level eviction, but the two should play well together. See the priority design doc for more details.

Seems like any pod started in kube-system namespace, for example, could be assumed to be critical.

We have a notion of a global default priority if you don't set it explicitly, but per-namespace default is something we could add in the future (@bsalamat has considered it).

dashpole · 2017-07-31T19:14:46Z

@sjenning

Why is the critical pod annotation bad/insufficient? How is this better?

I did a poor job of explaining that. See the updated motivation section.

This does not to provide a hard guarantee against pod eviction. If that is the functionality we desire, why not make it explicit rather than "well... it could be evicted in theory but, in practice, it can't be because insert-complicated-priority-scheme-here"

I dont think we necessarily want a hard guarantee against pod eviction in all cases. I listed "low abuse" as a goal because we should be able to provide higher availability to important pods without completely sacrificing any availability guarantees we have for other pods. Generally, the reason high-priority pods wouldnt be evicted "in practice" is that they are reasonably well behaved "in practice" (they have reasonable limits set, and use somewhere around requests). But it is useful to have a mechanism for limiting how much disruption a pod can cause, even if only in theory.

bsalamat · 2017-07-31T19:37:48Z

I like the idea of eviction "by a function of priority and usage above request". It addresses some of the use-cases I have in mind better than the other solutions.
Here are a couple of important scenarios that we should address:

Scenario 1: Running a critical service that occasionally bursts: I would like to run a service that normally uses 10GB of memory, but could burst to 30GB only for a few minutes in 24 hours. I would like to run the service as burstable to allow other pods run when I have resources available on the nodes, but when I need the resources, I would like to kill anything else in favor of my important service. Those other pods should be evicted when the critical service needs memory no matter how much they are using. This is actually a common use-case in many large clusters. Looks like by giving a very high priority to the critical service and lower priorities to others, we can solve this problem, but see the next scenario why this solution is not enough.

Scenario 2: Prevent abuse: Admin has given me a tiny amount of quota at a very high priority. If I create a bunch of best-effort pods asking for no resources, they get scheduled on various nodes right away, but I shouldn't be able to evict everyone else as my memory footprint grows. So, priority by itself cannot solve the problem. In my opinion percentage of usage above request should have a significant weight on determining who is getting evicted first. The assumption here is that Kubelet cannot enforce quota on cluster-wide resource usage. So, resource quota is only enforced on "request" at the time of admission.

Given these scenarios, I think the score that kubelet gives to pods for eviction purposes should be something like "priority * request/usage". The lower the score, the higher the chance of getting evicted.
(This formula does not work for negative priorities and shouldn't be used in real implementation, but I wrote it just to provide an example of what I mean by a "function of priority and usage above request".)

vishh · 2017-08-02T02:28:38Z

contributors/design-proposals/priority-eviction.md

+This solution allows high-priority pods to consume up to their limits with very little chance of eviction, unless other high-priority pods are also present on the same node.  This solution would require using Quota controls on pod limits, rather than requests, as high-priority pods' requests make only a minor difference in their chance for eviction, and limits are a better indicator of what they can consume.  Evictions are easy for users to understand, as they can reason that a higher priority pod bursted, but does not provide a course of action to prevent evictions other than raising the priority of their own pod.
+
+### Function(priority, QoS, usage over requests)
+This solution specifies a mapping between priority, QoS, and usage - request.  For example, a possible implementation could specify that 100 points priority, 1 QoS level, and 100Mb over requests are equivalent, and then rank pods based on their "score".  It has the potential to be a solution that can balance prioritizing high-priority pods, and preventing abuse from high-priority pods.  However, it would require cluster administrators to understand this mapping in order to correctly specify priority levels, and would be prone to configuration errors.  Users would have little insight into why their pods are evicted.


This option would help in building a resource economy of some sorts for power users of Priority. Imagine each level of priority associated with a cost. The lower a user pays, the lesser SLOs they get.
On the other hand, regular users can ensure that higher priority bands aren't over subscribed, and reserve over subscription just for batch workloads which can tolerate evictions and (typically) do not care about why eviction happened.

For the sake of evaluation, did you have a specific function in mind, or a class of functions (e.g. priority/usage, priority - usage, etc)?
I think that in all options, there would be a resource economy for cluster resources. In some cases (e.g. usage < request, priority, usage - request), this resource economy would be primarily based around request quota, and priority would play a less prominent role. In others, (e.g. priority, QoS, usage-request), a resource economy would be entirely based around priority. In this case, it would be a combination of both priority and request quota.

My understanding is that priority is meant to take the front seat where the underlying capacity may be over subscribed. For example, at priority 10000, a user get's 10% of cluster capacity with many 9s of SLA. At priority 1000, a user get's 50% of cluster capacity with two 9s of SLA. At priority 0, a user get's access to 80% of cluster capacity with little to no SLA. The fact that usage is above request is only one of the facets considered while choosing a victim.

@bsalamat any thoughts? I see you raise similar points in #846 (comment)
For some reason, this option 3 has made the most sense to me as well since the beginning of this conversation.

I agree. As I've said in this comment, I like this option as well, however, I am not so sure if we actually need to add QoS class to the formula. IMO, percentage of usage above request is enough as QoS is implied in it. For example, best effort pods has an infinite percentage of usage above request which makes them the first candidates for eviction. Similarly, guaranteed pods get the lowest amount of usage above request and burstable pods get something in between.

at priority 10000, a user get's 10% of cluster capacity with many 9s of SLA.

What does it mean to "get" or "have access to" a portion of the cluster? What mechanism are we using to enforce that that a priority is actually limited to 10% of capacity, if not requests? Can you clarify what the SLA would mean in this case? It doesnt seem possible to provide an SLA without respect to usage.
The scenario you provided could be easily configured in other options using quota on requests (for the first two) or limits (for the last solution), and doesn't seem unique to the function solution.

Ill update the document for the third to exclude QoS for now.

dashpole · 2017-08-03T21:44:53Z

Since everyone is fairly split, a first item to find consensus on might be whether we want to preserve the behavior:

A pod where usage < requests is not evicted

If we decide that this is important to preserve, we would discuss options like:

(current) QoS, then usage - requests
usage < requests, then Function(priority, usage - requests)
usage < requests, priority, usage - requests
usage < requests, ... etc.

This behavior aligns well with the goals I set forward:

Transparency: Users can understand that pods are evicted for exceeding requests even if they don't understand the entire algorithm.
Low Abuse: A pod using less than its request is "protected" from high-priority pods.

bsalamat · 2017-08-04T01:03:50Z

@dashpole As we talked offline, I think I am fine with this approach. The only remaining concern is that scheduler preemption logic works differently. Scheduler does not have any knowledge of pod resource usage and preempts pods based on their priority.
So, we will not be able to tell users that their pods will not be killed as long as their usage < requests. If a higher priority pod shows up and scheduler does not find any node that fits the pod, scheduler will preempt one or more lower priority pods (regardless of their usage) to create room for the new pod.
If we combine the eviction and preemption behavior, we will get this rule: "A pod will keep running as long as it has the highest priority among other pods AND its usage is below request". I am a bit unsure how users will take this, particularly because it will not work in scenario 1 that I described in this comment.

davidopp · 2017-08-04T04:19:42Z

In your Scenario 1 I think it's fine to tell users that if they want the strongest guarantee, they have to buy enough high-priority quota to satisfy their maximum burst, so that they always run within their Request. I suspect this is probably what a cluster admin would want anyway, from the abuse-prevention perspective.

bsalamat · 2017-08-04T05:53:14Z

Yes, but having a high request blocks other pods (except best effort ones) from running on the node. This may not be desirable when the high priority pod needs so much resources only for a short period in say 24 hours and uses much less resources other times. Anyway, we discussed all these and I am convinced that @dashpole's proposal is fine, as other approaches may not protect well against abuse.

dashpole · 2017-08-08T00:06:23Z

I incorporated feedback from comments into a proposal. I plan to share this at the sig-node meeting tomorrow 8/8.
I hope to settle on a proposal, and produce a concrete set of changes by 8/15.

dashpole · 2017-08-15T17:27:25Z

I think I have spoken to everyone who has raised questions, or expressed interest, and I think we have agreement with the current proposed. If you have remaining questions or comments, feel free to add them. I will close this, and propose changes to the Eviction Documentation by the end of the week if there are no outstanding concerns.

dchen1107 · 2017-10-03T22:44:09Z

contributors/design-proposals/priority-eviction.md

+
+## Proposed Implementation
+
+### Only evict pods where usage > requests.  Then sort by Function(priority, usage - requests)


I thought we settled down with a simpler algorithms in the meeting:

Only evict pods where usage > requests. Among those pods in eviction pool, we first sort them by priority, then sort by ( usage - request ) as the tie-breaker for the pods at the same priority.

I liked the initial proposal way better than introducing a function to re-generate an internal priority for eviction because:

It is much more deterministic. If there is a resource starvation issue, the user expect the abusive pods (with usage > request) with lower-priority will be evicted before the higher priority ones.

It might introduce reverse-priority issue to the system.

It is hard to explain to the users upon a pod being evicted at a given time.

The same priority is used for preemption by the scheduler(s) / rescheduler(s) / controller(s). The current proposal might introduce a contradict decision for the eviction on the node.

cc/ @dashpole @derekwaynecarr @bsalamat @kubernetes/sig-node-proposals

@dchen1107

Automatic merge from submit-queue. Update priority eviction docs After discussion with @dchen1107 and @bsalamat I think it would be simpler to start with a tiered eviction sorting rather than by using a function. See #846 (review) for some of the rationale. This PR makes two changes: 1. Changes the release at which changes take effect to 1.9 (since implementation missed 1.8) 2. Changes the strategy from (usage > requests, func(priority, usage - requests)) to (usage > requests, priority, usage - requests) cc @dchen1107 @derekwaynecarr @vishh

@dchen1107

Automatic merge from submit-queue (batch tested with PRs 51840, 53542, 53857, 53831, 53702). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Kubelet Evictions take Priority into account Issue: #22212 This implements the eviction strategy documented here: kubernetes/community#1162, and discussed here: kubernetes/community#846. When priority is not enabled, all pods are treated as equal priority. This PR makes the following changes: 1. Changes the eviction ordering strategy to (usage < requests, priority, usage - requests) 2. Changes unit testing to account for this change in eviction strategy (including tests where priority is disabled). 3. Adds a node e2e test which tests the eviction ordering of pods with different priorities. /assign @dchen1107 @vishh cc @bsalamat @derekwaynecarr ```release-note Kubelet evictions take pod priority into account ```

@dchen1107

Automatic merge from submit-queue. Update priority eviction docs After discussion with @dchen1107 and @bsalamat I think it would be simpler to start with a tiered eviction sorting rather than by using a function. See kubernetes/community#846 (review) for some of the rationale. This PR makes two changes: 1. Changes the release at which changes take effect to 1.9 (since implementation missed 1.8) 2. Changes the strategy from (usage > requests, func(priority, usage - requests)) to (usage > requests, priority, usage - requests) cc @dchen1107 @derekwaynecarr @vishh

weijia32167 · 2019-12-26T12:44:12Z

Why not align the logic of OOM, preemption, and eviction.
All three of them work based on priority or QoS?
This commit resulted in preemption and eviction based on priority, but OOM is still based on QoS?

@dchen1107

Automatic merge from submit-queue. Update priority eviction docs After discussion with @dchen1107 and @bsalamat I think it would be simpler to start with a tiered eviction sorting rather than by using a function. See kubernetes/community#846 (review) for some of the rationale. This PR makes two changes: 1. Changes the release at which changes take effect to 1.9 (since implementation missed 1.8) 2. Changes the strategy from (usage > requests, func(priority, usage - requests)) to (usage > requests, priority, usage - requests) cc @dchen1107 @derekwaynecarr @vishh

@dchen1107

Automatic merge from submit-queue. Update priority eviction docs After discussion with @dchen1107 and @bsalamat I think it would be simpler to start with a tiered eviction sorting rather than by using a function. See kubernetes/community#846 (review) for some of the rationale. This PR makes two changes: 1. Changes the release at which changes take effect to 1.9 (since implementation missed 1.8) 2. Changes the strategy from (usage > requests, func(priority, usage - requests)) to (usage > requests, priority, usage - requests) cc @dchen1107 @derekwaynecarr @vishh

@dchen1107

Automatic merge from submit-queue. Update priority eviction docs After discussion with @dchen1107 and @bsalamat I think it would be simpler to start with a tiered eviction sorting rather than by using a function. See kubernetes/community#846 (review) for some of the rationale. This PR makes two changes: 1. Changes the release at which changes take effect to 1.9 (since implementation missed 1.8) 2. Changes the strategy from (usage > requests, func(priority, usage - requests)) to (usage > requests, priority, usage - requests) cc @dchen1107 @derekwaynecarr @vishh

@dchen1107

Automatic merge from submit-queue. Update priority eviction docs After discussion with @dchen1107 and @bsalamat I think it would be simpler to start with a tiered eviction sorting rather than by using a function. See kubernetes/community#846 (review) for some of the rationale. This PR makes two changes: 1. Changes the release at which changes take effect to 1.9 (since implementation missed 1.8) 2. Changes the strategy from (usage > requests, func(priority, usage - requests)) to (usage > requests, priority, usage - requests) cc @dchen1107 @derekwaynecarr @vishh

@dchen1107

Automatic merge from submit-queue. Update priority eviction docs After discussion with @dchen1107 and @bsalamat I think it would be simpler to start with a tiered eviction sorting rather than by using a function. See kubernetes/community#846 (review) for some of the rationale. This PR makes two changes: 1. Changes the release at which changes take effect to 1.9 (since implementation missed 1.8) 2. Changes the strategy from (usage > requests, func(priority, usage - requests)) to (usage > requests, priority, usage - requests) cc @dchen1107 @derekwaynecarr @vishh

@dchen1107

Automatic merge from submit-queue. Update priority eviction docs After discussion with @dchen1107 and @bsalamat I think it would be simpler to start with a tiered eviction sorting rather than by using a function. See kubernetes/community#846 (review) for some of the rationale. This PR makes two changes: 1. Changes the release at which changes take effect to 1.9 (since implementation missed 1.8) 2. Changes the strategy from (usage > requests, func(priority, usage - requests)) to (usage > requests, priority, usage - requests) cc @dchen1107 @derekwaynecarr @vishh

…rnetes#846)

exploration of priority eviction options

0be3224

k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. kind/design Categorizes issue or PR as related to design. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 25, 2017

derekwaynecarr reviewed Jul 29, 2017

View reviewed changes

update motivation and add oom

33bfcde

vishh reviewed Aug 2, 2017

View reviewed changes

update Function description to remove QoS

36bf396

propose solution

2f68945

k8s-github-robot assigned calebamiles Aug 15, 2017

k8s-github-robot assigned idvoretskyi Aug 15, 2017

k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 15, 2017

dashpole closed this Aug 18, 2017

dashpole mentioned this pull request Aug 18, 2017

Modify Eviction Strategy to take Priority into account #946

Merged

dchen1107 reviewed Oct 3, 2017

View reviewed changes

This was referenced Oct 6, 2017

Update priority eviction docs #1162

Merged

Kubelet Evictions take Priority into account kubernetes/kubernetes#53542

Merged

danehans pushed a commit to danehans/community that referenced this pull request Jul 18, 2023

Automator: update common-files@master in istio/community@master (kube…

34f245a

…rnetes#846)


		This proposal explores possible implementations for integrating priority with the kubelet process of eviction. This integration should provide a method for cluster operators to provide greater availability to "high-priority" pods without needing to use the critical-pod annotation.

		The current method for ranking pods for eviction is by QoS, then usage over requests. This currently holds the invariant that a pod that does not exceed its requests is not evicted, since the sum of the requests on the node cannot exceed the allocatable memory on the node.


		The current method for ranking pods for eviction is by QoS, then usage over requests. This currently holds the invariant that a pod that does not exceed its requests is not evicted, since the sum of the requests on the node cannot exceed the allocatable memory on the node.

		### Goals


		The following are proposals for how to rank pods when the node is under memory pressure.

		### By QoS, priority, then usage over requests


		## Proposed Implementation

		### Only evict pods where usage > requests. Then sort by Function(priority, usage - requests)

[Exploration] Introducing Priority to Kubelet Memory Eviction #846

[Exploration] Introducing Priority to Kubelet Memory Eviction #846

Conversation

dashpole commented Jul 25, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

derekwaynecarr Jul 29, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

derekwaynecarr commented Jul 29, 2017

derekwaynecarr commented Jul 29, 2017

sjenning commented Jul 31, 2017

davidopp commented Jul 31, 2017

dashpole commented Jul 31, 2017 • edited Loading

bsalamat commented Jul 31, 2017

Choose a reason for hiding this comment

dashpole Aug 2, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dashpole commented Aug 3, 2017 • edited Loading

A pod where usage < requests is not evicted

bsalamat commented Aug 4, 2017

davidopp commented Aug 4, 2017

bsalamat commented Aug 4, 2017

dashpole commented Aug 8, 2017

dashpole commented Aug 15, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

weijia32167 commented Dec 26, 2019

derekwaynecarr Jul 29, 2017 •

edited

Loading

dashpole commented Jul 31, 2017 •

edited

Loading

dashpole Aug 2, 2017 •

edited

Loading

dashpole commented Aug 3, 2017 •

edited

Loading