From 42f5e852090433601c9510779c7e9695266d2107 Mon Sep 17 00:00:00 2001 From: Kevin Torres Date: Thu, 8 Jan 2026 22:01:08 +0000 Subject: [PATCH] Pod Level Resource Managers KEP --- .../README.md | 376 ++++++++++++++---- .../5526-pod-level-resource-managers/kep.yaml | 8 +- 2 files changed, 301 insertions(+), 83 deletions(-) diff --git a/keps/sig-node/5526-pod-level-resource-managers/README.md b/keps/sig-node/5526-pod-level-resource-managers/README.md index 7eaecf6508d9..7cbf9a1ec00b 100644 --- a/keps/sig-node/5526-pod-level-resource-managers/README.md +++ b/keps/sig-node/5526-pod-level-resource-managers/README.md @@ -86,6 +86,7 @@ tags, and then generate with `hack/update-toc.sh`. - [Release Signoff Checklist](#release-signoff-checklist) +- [Glossary](#glossary) - [Summary](#summary) - [Motivation](#motivation) - [Goals](#goals) @@ -154,27 +155,27 @@ Check these off as they are completed for the Release Team to track. These checklist items _must_ be updated for the enhancement to be released. --> -Items marked with (R) are required *prior to targeting to a milestone / +Items marked with `R` are required *prior to targeting to a milestone / release*. -- [ ](R) Enhancement issue in release milestone, which links to KEP dir in +- [x] `R` Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements](not the initial KEP PR) -- [ ](R) KEP approvers have approved the KEP status as `implementable` -- [ ](R) Design details are appropriately documented -- [ ](R) Test plan is in place, giving consideration to SIG Architecture and +- [x] `R` KEP approvers have approved the KEP status as `implementable` +- [x] `R` Design details are appropriately documented +- [ ] `R` Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) - [ ] e2e Tests for all Beta API Operations (endpoints) - - [ ](R) Ensure GA e2e tests meet requirements for + - [ ] `R` Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) - - [ ](R) Minimum Two Week Window for GA e2e tests to prove flake free -- [ ](R) Graduation criteria is in place - - [ ](R) + - [ ] `R` Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] `R` Graduation criteria is in place + - [ ] `R` [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) within one minor version of promotion to GA -- [ ](R) Production readiness review completed -- [ ](R) Production readiness review approved +- [x] `R` Production readiness review completed +- [x] `R` Production readiness review approved - [ ] "Implementation History" section is up-to-date for milestone - [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] @@ -190,13 +191,30 @@ release*. [kubernetes/kubernetes]: https://git.k8s.io/kubernetes [kubernetes/website]: https://git.k8s.io/website +## Glossary + +- **Pod Level Resources**: The resource budget defined at the pod level in + `pod.spec.resources`, which specifies the collective requests and limits for + the entire pod. +- **Guaranteed Container**: Within the context of this KEP, a container is + considered `Guaranteed` if it specifies resource requests equal to its + limits for both CPU and Memory. 
This status makes it eligible for exclusive + resource allocation from the resource managers, provided the topology scope + allows it. +- **Pod Shared Pool**: The subset of a pod's allocated resources that remains + after all exclusive slices have been reserved. These resources are shared by + all containers in the pod that do not receive an exclusive allocation. +- **Exclusive Slice**: A dedicated portion of resources (e.g., specific CPUs + or memory pages) allocated solely to a single container, ensuring isolation + from other containers. + ## Summary This KEP proposes extending the Kubelet's Topology, CPU, and Memory Managers to support pod-level resource specifications. This enhancement evolves the resource -managers from a strictly per-container allocation model to a pod-centric one. It -enables them to use `pod.spec.resources` to perform NUMA alignment for the pod -as a whole, and introduces a partitioning scheme to manage resources for +managers from a strictly per-container allocation model to a pod-centric one, +that enables them to use `pod.spec.resources` to perform NUMA alignment for the +pod as a whole, and introduces a partitioning scheme to manage resources for containers within that pod-level grouping. This change introduces a more flexible and powerful resource management model, particularly for performance-sensitive workloads. @@ -353,6 +371,10 @@ section. - This feature is dependent on the `PodLevelResources` feature gate being enabled. +- If the `PodLevelResourceManagers` feature gate is disabled, pods utilizing + pod-level resources remain eligible for admission; however, they will not + receive NUMA-aligned or exclusive resource allocations from the CPU and + Memory managers, even if the pod's QoS class is `Guaranteed`. - The functionality is only implemented for the `static` CPU Manager policy and the `Static` Memory Manager policy. - Other policies like `none` are unaffected as they do not perform @@ -455,9 +477,9 @@ can be aligned to a single NUMA node or a set of NUMA nodes. Pod-level resources | Container resources | CPU and Memory manager behavior :------------------ | :------------------ | :------------------------------ -Unset | Set | Current behavior (alignment and exclusive allocation based on container resources) -Set | Unset | Alignment and exclusive allocation for the overall pod. All containers share the resulting resource pool. -Set | Set | Alignment and exclusive allocation for the overall pod. The pod's resource pool is partitioned into exclusive slices for `Guaranteed` containers and a shared pool for the rest. +Unset | Set | Current behavior (alignment based on aggregated container resources, exclusive allocation based on individual container resources). +Set | Unset | Alignment and exclusive allocation for the overall pod based on pod-level resources. All containers share the resulting resource pool. +Set | Set | Alignment and exclusive allocation for the overall pod based on pod-level resources. The pod's resource pool is partitioned into exclusive slices for `Guaranteed` containers and a shared pool for the rest. When `pod.spec.resources` are defined, they are the sole determinant of the pod's QoS class. However, the resource managers will still check each @@ -482,10 +504,29 @@ New cases that need to be covered: allocated the remaining shared portion of the pool. Already existing logic for standard and restartable init containers is -compatible with the changes presented in this KEP. 
They will continue be -allocated exclusive slices from the pod's managed resource pool. The resources -from standard init containers will be made reusable by the app containers within -the pod's shared pool. +compatible with the changes presented in this KEP. Standard init containers that +are `Guaranteed` will be allocated exclusive slices from the pod's managed +resource pool. Because they run sequentially, their resources are made reusable +for subsequent app containers. Restartable init containers (sidecars) that are +`Guaranteed` will also be allocated exclusive slices. Since they are intended to +run for the entire duration of the pod's life, their resources are reserved for +the entire pod lifecycle and are not returned to the pool for reuse. This +results in a progressively decreasing `podSharedPool` as more sidecars with +exclusive resource requirements are defined. + +Additionally, if a standard init container does not specify resource requests +and limits (placing it in the shared pool), it is granted access to the entire +pod-level resource budget, minus only the resources exclusively allocated to any +`Guaranteed` sidecars running up to that point. This ensures that init +containers can burst to utilize maximum available resources during +initialization. + +In contrast, sidecars that run in the shared pool are assigned the final +`podSharedPool`. This pool consists of the resources remaining after all +exclusive allocations (for both `Guaranteed` sidecars and `Guaranteed` +application containers) have been reserved. This ensures that shared sidecars do +not interfere with the exclusive resources of application containers once they +start. Pod Scope | Spec | Hint Generation | Allocation :-------------------------------------------------------------------- | :--------------------------------------------------------------------------------- | :-------------- | :--------- @@ -711,7 +752,7 @@ bound by the pod's overall resource limit. Pod-level resources | Container resources | CPU and Memory manager behavior :------------------ | :------------------ | :------------------------------ -Unset | Set | Current behavior (alignment and exclusive allocation based on container resources) +Unset | Set | Current behavior (alignment and exclusive allocation based on individual container resources). Set | Unset | No alignment or exclusive allocation. Containers run in the node's shared pool. Set | Set | Alignment and exclusive allocation for containers that specify `Guaranteed` resources; pod-level resources are ignored for allocation decisions. @@ -876,32 +917,58 @@ based on their individual requirements. reserved resources while allowing all other containers to efficiently share the remainder of the pod's budget. -3. **Handling Init Containers:** The existing, well-established logic for - handling init container resources is reused and applied within the new - pod-level resource pool. The allocation is pre-calculated based on the +3. **[New Logic] Handling Init Containers:** The existing, well-established + logic for handling init container resources is reused and applied within the + new pod-level resource pool. The allocation is pre-calculated based on the predictable, sequential lifecycle of init containers. - - **Standard Init Containers:** Following the current logic, if a standard - init container is individually `Guaranteed`, it is allocated an - exclusive slice from the pod's total resource pool. 
At allocation time, - these resources are also added to a per-pod reusable set, making them - available for subsequent app containers (whether for exclusive - allocations or within the `podSharedPool`). - - **Restartable Init Containers (Sidecars):** Consistent with the existing - sidecar behavior, if a restartable init container is individually - `Guaranteed`, it is allocated an exclusive slice that is reserved for - the entire pod lifecycle and is *not* added to the reusable set. + - **Standard Init Containers:** If individually `Guaranteed`, they are + allocated an exclusive slice. These resources are added to a per-pod + reusable set upon completion, making them available for subsequent app + containers. If they are `non-Guaranteed` (shared), they are granted + access to the entire pod-level resource budget minus any resources + exclusively allocated to `Guaranteed` sidecars running at that time. + This allows standard init containers to burst during initialization. + - **Restartable Init Containers (Sidecars):** If individually + `Guaranteed`, they receive an exclusive slice reserved for the entire + pod lifecycle. If they are `non-Guaranteed` (shared), they are assigned + the final `podSharedPool`. This pool consists of resources remaining + after all exclusive allocations (for both sidecars and application + containers) have been carved out. This ensures that shared sidecars + never compete with the exclusive resources of application containers. + + This model results in a progressively decreasing shared resource set for + standard init containers as more `Guaranteed` sidecars are started, + eventually settling into the final `podSharedPool` used by all application + containers and shared sidecars. 4. **[New Logic] Pre-computation and State Update:** The manager pre-computes and writes the specific assignment for every container to the state file. `Guaranteed` containers are assigned their exclusive slices, and all other - containers are assigned the `podSharedPool`. + containers are assigned the `podSharedPool`. To ensure state consistency and + support accurate recovery after restarts, a new top-level property is added + to the state file to track the aggregate pod-level resource allocation. For + CPU Manager, this is `PodCPUAssignments` (a map of pod UIDs to `PodEntry` + containing the `CPUSet`). For Memory Manager, this is `PodMemoryAssignments` + (a map of pod UIDs to `PodEntry` containing the `MemoryBlocks`). This entry + serves as the source of truth for the pod's total NUMA-aligned resource + reservation. 5. **Container-Level Allocation:** For the current container (and all subsequent containers in the pod), the `Allocate` function reads that container's specific, pre-computed assignment from the state and applies it to the container's cgroup. +6. **[New Logic] Container and Pod Removal:** When a container is removed, its + individual assignment is deleted from the state. The pod-level resource + allocation persists to maintain alignment for any remaining containers. + Resources exclusively allocated to a container within a pod are not returned + to the pod's shared pool upon that container's termination; they remain + reserved for the pod until it is fully decommissioned. Only when the last + container of the pod is removed are the pod's resources released back to the + node's general pool and the aggregate pod-level entry is deleted from the + state. 
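+
+The flow above can be made concrete with a small, illustrative sketch. It is
+not the kubelet implementation: plain integer CPU IDs stand in for
+`cpuset.CPUSet`, the `guaranteed` flag stands in for the per-container
+requests-equal-limits check, and all names are hypothetical.
+
+```go
+package main
+
+import "fmt"
+
+type containerSpec struct {
+	name       string
+	guaranteed bool // requests == limits for CPU and memory
+	cpuRequest int  // whole CPUs requested
+}
+
+// partitionPodCPUs carves exclusive slices for individually Guaranteed
+// containers out of the pod-level allocation and leaves the remainder as the
+// podSharedPool. Illustrative sketch only.
+func partitionPodCPUs(podCPUs []int, containers []containerSpec) (map[string][]int, []int) {
+	exclusive := map[string][]int{}
+	shared := append([]int(nil), podCPUs...) // start from the full pod-level budget
+	for _, c := range containers {
+		if !c.guaranteed || c.cpuRequest == 0 || c.cpuRequest > len(shared) {
+			continue // container stays in the shared pool
+		}
+		exclusive[c.name] = shared[:c.cpuRequest] // reserve an exclusive slice
+		shared = shared[c.cpuRequest:]            // shrink the shared remainder
+	}
+	return exclusive, shared
+}
+
+func main() {
+	podCPUs := []int{8, 9, 10, 11, 12, 13} // NUMA-aligned pod-level allocation
+	containers := []containerSpec{
+		{name: "dpdk-worker", guaranteed: true, cpuRequest: 4},
+		{name: "log-sidecar", guaranteed: false},
+	}
+	exclusive, shared := partitionPodCPUs(podCPUs, containers)
+	fmt.Println("exclusive slices:", exclusive) // map[dpdk-worker:[8 9 10 11]]
+	fmt.Println("podSharedPool:", shared)       // [12 13]
+}
+```
+
+In this sketch the shared remainder plays the role of the `podSharedPool`; the
+real managers additionally persist both the per-container assignments and the
+aggregate pod-level reservation (`PodCPUAssignments` / `PodMemoryAssignments`)
+in the state file, as described above.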
+ ### CPU Manager The CPU Manager will be updated to support pod-level resources when its policy @@ -993,21 +1060,48 @@ resource-specific responsibilities include: #### State Management and Container Removal -The in-memory state and on-disk checkpoint file will be updated to be pod-aware. -For the `pod` scope, the state will now store the total managed resource set for -the pod and the individual assignments for each container (either an exclusive -slice or the `podSharedPool`). +The in-memory state and on-disk checkpoint file will be updated to facilitate +the transition from a container-only model to a pod-aware model. This is +achieved by introducing a new top-level field in the checkpoint file: +`PodCPUAssignments`. This map uses pod UIDs as keys and the new `PodEntry` +struct as values. + +The `PodEntry` struct is explicitly designed as an extensible container for all +state information associated with a pod-level resource allocation. While its +initial implementation only encapsulates the overall `CPUSet` assigned to the +pod (`CPUSet cpuset.CPUSet`), using a struct instead of a raw type is a +strategic choice for future-proofing. It allows for the seamless addition of new +fields—such as the original `TopologyHint` used for alignment, the original +resource request, or other metadata—without requiring breaking changes to the +state file schema or complex migration logic for existing checkpoints. + +The state management logic is designed to support upgrade state compatibility, +as detailed in the +[Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) +section. This ensures that the Kubelet can transparently migrate from +container-only checkpoints to the new pod-level aware format. + +```go +// PodCPUAssignments contains pod-level CPU assignments. +type PodCPUAssignments map[string]PodEntry + +// PodEntry represents pod-level CPU assignments for a pod +type PodEntry struct { + CPUSet cpuset.CPUSet `json:"cpuSet,omitempty"` +} +``` -The implementation must also consider the potential need to enhance the state +The implementation also considers the potential need to enhance the state representation to support features like the correct handling of CPU quotas, as -described in the "Interaction with CPU Quota Management" section. Any changes to -the on-disk state format will require a carefully designed migration and -rollback strategy. +described in the "Interaction with CPU Quota Management" section. -The `RemoveContainer` function will also be updated. When a container with an -exclusive slice is removed, its resources will be returned to the pod's shared -pool. The pod's entire managed resource set will only be released back to the -node's shared pool when the last container of that pod is removed. +When a container is removed, the `RemoveContainer` function first deletes the +container-level assignment. For pods using the `pod` scope, the manager then +checks if any containers from that pod remain in the state. Only when the last +container is removed does the manager release the pod's aggregate `CPUSet` back +to the node's default pool and delete the entry from `PodCPUAssignments`. This +ensures that the NUMA-aligned pool remains reserved for the pod as long as it is +running. #### Interaction with CPU Quota Management @@ -1052,6 +1146,66 @@ measure; it is a mandatory requirement to ensure that: 2. Containers within the new `podSharedPool` have their quotas correctly enforced to maintain resource fairness between them. 
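+
+As a concrete, hedged illustration of these two requirements, the quota value
+itself follows the kubelet's usual millicore-to-CFS-quota arithmetic. The
+helper below is a simplified stand-in (it omits the minimum-quota clamping the
+real code performs) and is not the actual implementation:
+
+```go
+// milliCPUToQuota converts a CPU limit in millicores into a CFS quota in
+// microseconds for the given period. For example, a pod-level limit of 4 CPUs
+// (4000m) with the default 100000us period yields a 400000us quota on the pod
+// cgroup. A container holding exclusive CPUs instead runs with the quota
+// disabled (cpu.cfs_quota_us = -1), while containers in the podSharedPool keep
+// the quota derived from their own limits.
+func milliCPUToQuota(milliCPU, periodUs int64) int64 {
+	if milliCPU == 0 {
+		return 0 // no limit specified: leave the quota unset
+	}
+	return milliCPU * periodUs / 1000
+}
+```
+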
+To achieve this, the CPU Manager performs a check to determine the +`ResourceIsolationLevel` for each container. This logic evaluates the +container's resource assignment and its eligibility for exclusivity to +distinguish between the following levels of isolation. Note that +`ResourceIsolationLevel` is an internal Kubelet concept used for resource +accounting and cgroup configuration; it is not part of the user-facing +Kubernetes API. + +- **`ResourceIsolationContainer`** + + - **Description:** Resources are exclusive to the container. + - **Quota Behavior:** The CFS quota is disabled to allow full, unthrottled + access to the exclusive CPUs. + - **Applicability:** Applies to any container that receives exclusive + resources, whether in a standard pod or a pod using pod-level resources + (in both `pod` and `container` scopes). + +- **`ResourceIsolationPod`** + + - **Description:** Resources are isolated from other pods but shared among + containers within the same NUMA-aligned pod budget (the + `podSharedPool`). + - **Quota Behavior:** The CFS quota corresponds to what the pod-level + request/limit define. However, if the `podSharedPool` is smaller than + the pod-level definition (due to other `Guaranteed` containers consuming + part of the budget), the containers are effectively constrained by the + CPU set of the `podSharedPool`. The CFS quota remains enabled to ensure + fairness and prevent noisy neighbors within the pod's shared pool. + - **Applicability:** Specific to `non-Guaranteed` containers in a + `Guaranteed` pod when `topology-manager-scope=pod`. + +- **`ResourceIsolationHost`** + + - **Description:** Resources are drawn from the node's general shared + pool. + - **Quota Behavior:** The CFS quota remains enabled as per standard + behavior. + - **Applicability:** Applies to `non-Guaranteed` containers in a + `Guaranteed` pod when `topology-manager-scope=container`, or any + container in a `non-Guaranteed` pod. + +The `ContainerHasExclusiveCPUs` and `PodHasExclusiveCPUs` checks are updated to +return `true` only when the calculated isolation level is +`ResourceIsolationContainer`, preventing the unintended disabling of quotas for +containers in the `podSharedPool`. + +```go +// ResourceIsolationLevel defines the level of isolation for a resource. +type ResourceIsolationLevel string + +const ( + // ResourceIsolationHost implies the resource is shared with other containers on the host. + ResourceIsolationHost ResourceIsolationLevel = "host" + // ResourceIsolationPod implies the resource is isolated from other pods but shared within the pod. + ResourceIsolationPod ResourceIsolationLevel = "pod" + // ResourceIsolationContainer implies the resource is exclusive to the container. + ResourceIsolationContainer ResourceIsolationLevel = "container" +) +``` + ### Memory Manager The Memory Manager will be updated to support pod-level resources when its @@ -1101,15 +1255,42 @@ resource-specific responsibilities include: #### State Management and Container Removal -The in-memory state and on-disk checkpoint file will be updated to be pod-aware. -For the `pod` scope, the state will now store the total managed resource set for -the pod and the individual assignments for each container (either an exclusive -slice or the `podSharedPool`). +The in-memory state and on-disk checkpoint file will be updated to facilitate +the transition from a container-only model to a pod-aware model. This is +achieved by introducing a new top-level field in the checkpoint file: +`PodMemoryAssignments`. 
This map uses pod UIDs as keys and the `PodEntry` struct +as values, mirroring the pattern established in the CPU Manager. + +The `PodEntry` struct encapsulates the pod-level memory assignment, currently +containing the overall `MemoryBlocks` slice assigned to the pod. As with the CPU +Manager, using a struct here is a forward-looking design choice. It allows for +the future inclusion of additional metadata—such as the original topology hint +or request details—without disrupting the state file schema or requiring complex +migrations. + +The state management logic is designed to support upgrade state compatibility, +as detailed in the +[Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) +section. This ensures that the Kubelet can transparently migrate from +container-only checkpoints to the new pod-level aware format. + +```go +// PodMemoryAssignments stores memory assignments of pods +type PodMemoryAssignments map[string]PodEntry + +// PodEntry represents pod-level memory assignments for a pod +type PodEntry struct { + MemoryBlocks []Block `json:"memoryBlocks,omitempty"` +} +``` -The `RemoveContainer` function will also be updated. When a container with an -exclusive slice is removed, its resources will be returned to the pod's shared -pool. The pod's entire managed resource set will only be released back to the -node's shared pool when the last container of that pod is removed. +When a container is removed, the `RemoveContainer` function first deletes the +container-level assignment. For pods using the `pod` scope, the manager then +checks if any containers from that pod remain in the state. Only when the last +container is removed does the manager release the pod's aggregate `MemoryBlocks` +back to the node's free pool and delete the entry from `PodMemoryAssignments`. +This ensures that the NUMA-aligned memory remains reserved for the pod as long +as it is running. ### Future Enhancements and Long-Term Vision @@ -1177,6 +1358,16 @@ the new partitioning logic and the updated `getPodRequestedResources` function. - `pkg/kubelet/allocation/state/state_mem_test.go`: - Add tests for the updated `GetContainerResources` logic. +Additionally, dedicated state restoration tests will be implemented in: + +- `pkg/kubelet/cm/cpumanager/policy_static_restore_test.go` +- `pkg/kubelet/cm/memorymanager/policy_static_restore_test.go` + +These tests will validate that pod-level resource assignments +(`PodCPUAssignments` and `PodMemoryAssignments`) will be correctly persisted to +the checkpoint file and restored accurately after a Kubelet restart, ensuring +alignment consistency across the pod lifecycle. + ##### Integration tests N/A. The functionality is confined to the Kubelet and does not require @@ -1212,6 +1403,32 @@ end-to-end behavior. These tests will: - Verify that resources are correctly aligned and allocated by inspecting the cgroup filesystem on the node (e.g., `CPU set.cpus`). +Additional e2e tests will be implemented to validate the correctness of +pod-level resource accounting and the lifecycle of aggregate resource +allocations. + +- **Pod-Level Accounting for Shared Containers:** These tests will create + multiple pods, each requesting a significant portion of the node's + allocatable resources (CPU or Memory) at the pod level, but containing + multiple `non-Guaranteed` containers. These tests verify that accounting is + correctly performed against the pod-level budget. 
If the system incorrectly + summed container-level resources (which are unset), the pods would be + rejected; instead, the tests will ensure both pods are admitted and run + concurrently based on their aggregate pod-level specifications. +- **Pod Shared Pool Resource Cleanup:** These tests will verify that resources + will be correctly released from the pod shared pool, even in cases where no + containers are explicitly linked to it at the time of removal. This + demonstrates that the container manager correctly identifies and cleans up + the pod-level aggregate allocation from the state when the pod's container + lifecycle ends. +- **Sequential Resource Recycle:** These tests will demonstrate that pod-level + resources will be correctly released and made available for subsequent + workloads. It will run two `Guaranteed` pods sequentially, each requesting a + large portion of the node's resources. The test will verify that the second + pod will be successfully allocated the same physical resources as the first + pod, proving that the managers correctly clean up and return the pod-level + bubble to the node's assignable pool once the pod terminates. + ###### Specific Test Scenarios - **Pod Scope, Shared Only:** A `Guaranteed` pod with only pod-level resources @@ -1472,22 +1689,23 @@ back up if an issue is present. ignored by the resource managers. After enabling the feature, new or restarted pods will be correctly aligned, and new entries will be added to the state file. - * **Mitigation (Forward Compatibility):** The new Kubelet version will be + * **Mitigation (Backward Compatibility):** The new Kubelet version will be implemented to transparently read the old state file format. When it admits a new pod using this feature, it will write the new pod-aware entries, ensuring a seamless upgrade path without manual intervention. * **Rollback (Disabling the Feature):** This is a higher-risk operation. - * **Failure Mode:** The main risk is state file incompatibility. An older - Kubelet may fail to parse the new state file format, leading to a crash - loop or preventing any Guaranteed pods from being admitted on the node. - * **Mitigation (Backward Compatibility):** The new state file will be - structured to maintain backward compatibility. New pod-level fields will - be added in a way that allows older Kubelet versions to safely ignore - them. However, to completely eliminate any risk of parsing errors, the - safest and recommended rollback procedure involves draining the node and - deleting the state file. + * **Failure Mode:** Forward compatibility is not supported. An older + Kubelet will fail to validate the state file due to the presence of the + new pod-level fields, which will cause it to mark the checkpoint as + corrupted. This leads to a crash loop or a failure to admit Guaranteed + pods. + * **Mitigation (Forward Compatibility):** The changes to the state file + are additive (new fields only), preserving the overall format. However, + because older Kubelets cannot gracefully handle these new fields, the + only safe rollback path involves draining the node and deleting the + state file before restarting the older Kubelet version. **Recommended Operator Process for Rollout and Rollback** @@ -1690,27 +1908,25 @@ pod-level resource management. 
- Metric name: `resource_manager_allocations_total` - Type: `CounterVec` - - Labels: `resource_name` (e.g., "cpu", "memory"), `source` (e.g., - "pod", "node") + - Labels: `resource_name` ("cpu", "memory"), `source` ("pod", "node") - Description: Counts the total number of exclusive resource allocations performed by a manager. The `source` label distinguishes between allocations drawn from the node-level pool (`node`) versus a pre-allocated pod-level pool (`pod`). - Metric name: `resource_manager_allocation_errors_total` - Type: `CounterVec` - - Labels: `resource_name` (e.g., "cpu", "memory"), `source` (e.g., - "pod", "node") + - Labels: `resource_name` ("cpu", "memory"), `source` ("pod", "node") - Description: Counts errors encountered during exclusive resource allocation, distinguished by the intended allocation source. - Metric name: `resource_manager_container_assignments` - - Type: `GaugeVec` - - Labels: `resource_name` (e.g., "cpu", "memory"), `assignment_type` - (e.g., "exclusive_from_node", "exclusive_from_pod", - "shared_from_pod") - - Description: Tracks the current number of containers with a specific - type of resource assignment. This provides visibility into how many - containers are running with exclusive resources (from the node or - pod pool) versus the pod-level shared pool. + - Type: `CounterVec` + - Labels: `resource_name` ("cpu", "memory"), `assignment_type` + ("node_exclusive", "pod_exclusive", "pod_shared") + - Description: Counts the total number of containers that will be + granted a specific type of resource assignment. This provides + visibility into how many containers are running with exclusive + resources (from the node or pod pool) versus the pod-level shared + pool. - [X] Already existing Metrics @@ -1718,6 +1934,8 @@ pod-level resource management. - Aggregation method: `sum()` - Metric name: `topology_manager_admission_errors_total` - Aggregation method: `sum()` + - Metric name: `topology_manager_admission_duration_ms` + - Aggregation method: `histogram_quantile` - Metric name: `cpu_manager_pinning_errors_total` - Aggregation method: `sum()` - Metric name: `memory_manager_pinning_errors_total` @@ -1867,7 +2085,7 @@ Major milestones might include: - when the KEP was retired or superseded --> -- **v1.35**: Target for initial alpha implementation. +- **v1.36**: Target for initial alpha implementation. ## Drawbacks diff --git a/keps/sig-node/5526-pod-level-resource-managers/kep.yaml b/keps/sig-node/5526-pod-level-resource-managers/kep.yaml index df4586303ce0..bd73f313f223 100644 --- a/keps/sig-node/5526-pod-level-resource-managers/kep.yaml +++ b/keps/sig-node/5526-pod-level-resource-managers/kep.yaml @@ -27,13 +27,13 @@ stage: alpha # The most recent milestone for which work toward delivery of this KEP has been # done. This can be the current (upcoming) milestone, if it is being actively # worked on. -latest-milestone: "v1.35" +latest-milestone: "v1.36" # The milestone at which this feature was, or is targeted to be, at each stage. milestone: - alpha: "v1.35" - beta: "v1.36" - stable: "v1.38" + alpha: "v1.36" + beta: "v1.37" + stable: "v1.39" # The following PRR answers are required at alpha release # List the feature gate name and the components for which it must be enabled