
Conversation

tallclair
Member

@tallclair tallclair commented Jan 24, 2025

InPlacePodVerticalScaling changes for v1.33, including:

/sig node
/milestone v1.33

@k8s-ci-robot k8s-ci-robot added this to the v1.33 milestone Jan 24, 2025
@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Jan 24, 2025
@tallclair
Member Author

/assign @dchen1107 @thockin

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 24, 2025
@tallclair
Member Author

/assign @vinaykul

@esotsal

esotsal commented Jan 25, 2025

/cc

@k8s-ci-robot k8s-ci-robot requested a review from esotsal January 25, 2025 00:12
@tallclair
Member Author

FYI - I just changed ResizePolicy to be immutable. We no longer require it to be mutable, since we no longer default the resize policy field, and a mutable policy prevents us from being able to allow memory limit decreases for containers with a RestartContainer policy (otherwise you could change the resize policy & decrease the limit, and then immediately change the resize policy back).
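A minimal sketch of what that immutability check could look like, using simplified stand-in types rather than the real core/v1 structs (hypothetical helper; the actual check would live in pod-update validation):

```go
package main

import (
	"fmt"
	"reflect"
)

// Simplified stand-ins for the core/v1 container fields discussed above.
type ContainerResizePolicy struct {
	ResourceName  string // "cpu" or "memory"
	RestartPolicy string // "NotRequired" or "RestartContainer"
}

type Container struct {
	Name         string
	ResizePolicy []ContainerResizePolicy
}

// validateResizePolicyImmutable rejects any change to resizePolicy on pod
// update. With the field immutable, "flip the policy, decrease the memory
// limit, flip it back" is no longer possible.
func validateResizePolicyImmutable(oldContainers, newContainers []Container) []error {
	oldByName := map[string][]ContainerResizePolicy{}
	for _, c := range oldContainers {
		oldByName[c.Name] = c.ResizePolicy
	}
	var errs []error
	for _, c := range newContainers {
		if prev, ok := oldByName[c.Name]; ok && !reflect.DeepEqual(prev, c.ResizePolicy) {
			errs = append(errs, fmt.Errorf("container %q: resizePolicy is immutable", c.Name))
		}
	}
	return errs
}

func main() {
	before := []Container{{Name: "app", ResizePolicy: []ContainerResizePolicy{{"memory", "RestartContainer"}}}}
	after := []Container{{Name: "app", ResizePolicy: []ContainerResizePolicy{{"memory", "NotRequired"}}}}
	fmt.Println(validateResizePolicyImmutable(before, after))
}
```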

See [`Alternatives: Allocated Resources`](#allocated-resources-1) for alternative APIs considered.

The allocated resources API should be reevaluated prior to GA.
The scheduler uses `max(spec...resources, status...allocatedResources, status...resources)` for fit
Member

A note to myself: This is a joint decision we made together, but in the worst case it could leave resources underutilized.
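For reference, a small sketch of the `max()` fit logic quoted above, with millicore integers standing in for the scheduler's real resource.Quantity handling:

```go
package main

import "fmt"

// podRequests mirrors the three places the scheduler can read requests from,
// per the KEP text: the desired spec, the kubelet-allocated resources, and
// the actually applied resources.
type podRequests struct {
	Spec      map[string]int64 // spec.containers[i].resources.requests
	Allocated map[string]int64 // status.containerStatuses[i].allocatedResources
	Actual    map[string]int64 // status.containerStatuses[i].resources.requests
}

// fitRequest returns max(spec, allocated, actual) for one resource, the value
// used for fit while a resize is in flight. Taking the max is deliberately
// conservative: it can leave a node underutilized (the concern noted above),
// but it never over-commits.
func fitRequest(r podRequests, resource string) int64 {
	m := r.Spec[resource]
	if v := r.Allocated[resource]; v > m {
		m = v
	}
	if v := r.Actual[resource]; v > m {
		m = v
	}
	return m
}

func main() {
	r := podRequests{
		Spec:      map[string]int64{"cpu": 1000}, // resize down to 1 CPU requested...
		Allocated: map[string]int64{"cpu": 2000}, // ...but 2 CPUs still allocated
		Actual:    map[string]int64{"cpu": 2000},
	}
	fmt.Println(fitRequest(r, "cpu")) // 2000 millicores
}
```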

- Restart after checkpointing: pod goes through admission using the allocated resources
1. Kubelet creates a container
- Resources acknowledged after CreateContainer call succeeds
- Restart before acknowledgement: Kubelet issues a superfluous UpdatePodResources request
Member

Just a note: cc/ @samuelkarp and @mikebrow to make sure the UpdatePodResources call is idempotent, since I vaguely remember there was a bug at the CRI layer around this recently.

Member

I'm going to continue to argue that kubelet should be periodically re-asserting the resources it wants, so this absolutely needs to be idempotent :)

Also, I think we should just re-assert at startup for all pods and containers.

Member

+1 to the above. We should address this, but it's not a blocker for this KEP.

Member Author

Idempotence is mentioned under the CRI changes (and actuating resources). Resync is under future enhancements.
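A sketch of what runtime-side idempotence means here, using hypothetical types and an applyCgroupLimits stub: re-sending the same resize (for example after a poorly timed kubelet restart) must be a harmless no-op:

```go
package main

import "fmt"

// Hypothetical runtime-side state; not the real CRI implementation.
type linuxResources struct {
	CPUShares   int64
	MemoryLimit int64
}

type containerState struct {
	applied linuxResources
}

// applyCgroupLimits is a stub; a real runtime would write cpu.weight /
// memory.max (cgroup v2) here.
func applyCgroupLimits(r linuxResources) error { return nil }

// UpdateContainerResources is idempotent: calling it repeatedly with the
// same desired values changes nothing after the first successful call.
func (c *containerState) UpdateContainerResources(desired linuxResources) error {
	if c.applied == desired {
		return nil // already at the desired values; a replayed request is a no-op
	}
	if err := applyCgroupLimits(desired); err != nil {
		return err
	}
	c.applied = desired
	return nil
}

func main() {
	c := &containerState{}
	want := linuxResources{CPUShares: 1024, MemoryLimit: 512 << 20}
	_ = c.UpdateContainerResources(want) // initial resize
	_ = c.UpdateContainerResources(want) // superfluous replay after restart: no-op
	fmt.Printf("%+v\n", c.applied)
}
```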

If a resize request does not succeed, the Kubelet will retry the resize on every subsequent pod
sync, until it succeeds or the container is terminated.

### Memory Limit Decreases
Member

A note to us: cc/ @haircommander @mrunalp, since you two are leading the conversation with the kernel community on the cgroup v2 limitation.

@dchen1107
Member

/lgtm overall with some minor comments.

Will leave it to the other reviewers and @thockin for a final review before approving the KEP.

Member

@thockin thockin left a comment

/approve for API

- `status.containerStatuses[0].resources.requests[cpu]` = 1.6
```
- `status.conditions[type==PodResizePending].type` = `"Infeasible"`
- actual CPU shares = 1638
Member

This timeline is great. So great that I think you should do more (non-blocking).

I'd love to see the situation where an NRI plugin mutates the request to round to 10 shares, just to demonstrate how actual and allocated are related (e.g. I assume the actual would show shares/1024, so a 1.5 request would translate to 1536 shares, which gets rounded to 1540, which comes back to actual as 1.504). At that point, the scheduler, with its max() logic, would consider 1.504, not 1.500. Right?

Member Author

Will address in a separate PR.
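As a quick sketch of the shares arithmetic walked through above, assuming the standard 1024-shares-per-CPU conversion and a hypothetical NRI plugin that rounds shares up to a multiple of 10:

```go
package main

import (
	"fmt"
	"math"
)

const sharesPerCPU = 1024 // standard cpu-to-shares conversion

func cpuToShares(cores float64) int64 { return int64(math.Round(cores * sharesPerCPU)) }

func sharesToCPU(shares int64) float64 { return float64(shares) / sharesPerCPU }

// roundUpToTen models the hypothetical NRI plugin from the comment above,
// rounding shares up to the next multiple of 10.
func roundUpToTen(shares int64) int64 { return ((shares + 9) / 10) * 10 }

func main() {
	requested := 1.5
	shares := cpuToShares(requested) // 1536
	mutated := roundUpToTen(shares)  // 1540
	actual := sharesToCPU(mutated)   // ~1.504
	fmt.Printf("%v CPU -> %d shares -> %d shares -> %.3f CPU\n", requested, shares, mutated, actual)
	// With the scheduler's max() logic, fit would then use 1.504, not 1.500.
}
```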


restarts. There is the possibility that a poorly timed restart could lead to a resize request being
repeated, so `UpdateContainerResources` must be idempotent.

When a resize CRI request succeeds, the pod will be marked for resync to read the latest resources. If
Member

Worth adding a section on exploring whether to actively re-assert periodically and at startup? I think we should demand that re-asserting be a low-impact operation in the already-correct case (make that the CRI's problem) and just not worry about re-asserting too often.

Contributor

I think this is worth exploring 👍

Member Author

Added it to the future enhancements section.
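A sketch of what that future enhancement could look like: at startup (or on a timer), the kubelet replays the desired resources for every container, relying on the runtime treating the already-correct case as a cheap no-op. Names and types here are hypothetical:

```go
package main

import "fmt"

// Hypothetical desired/applied state per container; not the real kubelet types.
type container struct {
	name    string
	desired string // desired resource configuration, e.g. "cpu=2,mem=1Gi"
	applied string // what the runtime currently has applied
}

type fakeRuntime struct{ updates int }

// UpdateContainerResources only acts when state differs, so re-asserting
// for every container is cheap when everything is already correct.
func (r *fakeRuntime) UpdateContainerResources(c *container) {
	if c.applied == c.desired {
		return
	}
	r.updates++
	c.applied = c.desired
}

// reassertAll replays desired resources for all containers, e.g. at kubelet
// startup or periodically.
func reassertAll(rt *fakeRuntime, containers []*container) {
	for _, c := range containers {
		rt.UpdateContainerResources(c)
	}
}

func main() {
	rt := &fakeRuntime{}
	containers := []*container{
		{name: "a", desired: "cpu=500m", applied: "cpu=500m"}, // already correct
		{name: "b", desired: "cpu=2", applied: "cpu=1"},       // stale after a restart
	}
	reassertAll(rt, containers)
	fmt.Printf("updates issued: %d; container b now %q\n", rt.updates, containers[1].applied)
}
```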

'cpu' and 'memory' as names. It supports the following restart policy values:
* NotRequired - default value; resize the Container without restart, if possible.
* RestartContainer - the container requires a restart to apply new resource values.
* `PreferNoRestart` - default value; resize the Container without restart, if possible.
Contributor

Can we clarify what "if possible" means here? Will it restart anyway, take no action, or be retried later indefinitely?

Member Author

This is going to depend somewhat on what we decide to do with memory limit decreases. I'll add a bit more detail though.
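For context, this is how the per-resource policy is attached to a container with the existing core/v1 Go types; the code below uses the released `NotRequired` constant, while `PreferNoRestart` in the diff above is the rename discussed in this KEP:

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

func main() {
	// CPU resizes apply in place; memory changes restart the container.
	c := v1.Container{
		Name: "app",
		ResizePolicy: []v1.ContainerResizePolicy{
			{ResourceName: v1.ResourceCPU, RestartPolicy: v1.NotRequired},
			{ResourceName: v1.ResourceMemory, RestartPolicy: v1.RestartContainer},
		},
	}
	fmt.Printf("%+v\n", c.ResizePolicy)
}
```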

@mrunalp
Contributor

mrunalp commented Feb 13, 2025

/approve

for sig-node. Thanks!
Will leave the final LGTM to @thockin.

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 13, 2025
Member

@vinaykul vinaykul left a comment

A few comments/suggestions.
Overall LGTM.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 13, 2025
@dchen1107
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 13, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dchen1107, mrunalp, tallclair, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
