From f6998b4f4e4d7024429779d5c45b0e537aa44937 Mon Sep 17 00:00:00 2001
From: Swati Sehgal
Date: Wed, 8 Feb 2023 10:40:37 +0000
Subject: [PATCH] node: topologymgr: address PRR review comments (2)

Signed-off-by: Swati Sehgal
---
 keps/sig-node/693-topology-manager/README.md | 22 +++++++++++---------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/keps/sig-node/693-topology-manager/README.md b/keps/sig-node/693-topology-manager/README.md
index 3c6ea58abcea..b995d35c0221 100644
--- a/keps/sig-node/693-topology-manager/README.md
+++ b/keps/sig-node/693-topology-manager/README.md
@@ -710,7 +710,7 @@ This feature is kubelet specific, so version skew strategy is N/A.

 - [X] Feature gate (also fill in values in `kep.yaml`)
   - Feature gate name: TopologyManager
-  - Components depending on the feature gate: Topology Manager
+  - Components depending on the feature gate: kubelet

 Kubelet Flag for the Topology Manager Policy, which is described above.
 The `none` policy will be the default policy.
@@ -743,15 +743,7 @@ Memory Manager and Device Manager to either admit a pod to the node or reject it

 ###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

-Yes, this feature can be disabled by specifying `TopologyManager` feature gate
-in the kubelet configuration. Note that disabling the feature gate requires
-kubelet restart for the changes to take effect. In case no pods consuming
-resources aligned by Topology Manager are running on the node, disabling
-feature gate won't cause any issue.
-
-If the feature gate is being disabled on a node where such pods are running,
-it is the responsibliity of the cluster admin to ensure that the node is
-appropriately drained.
+Since going to stable in 1.27, the feature gate is locked on as is the standard practice in Kubernetes.

 ###### What happens if we reenable the feature if it was previously rolled back?

@@ -816,6 +808,9 @@ configured.
 "topology_manager_admission_duration_seconds" (which will be added as this release) can be used to determine
 if the resource alignment logic performed at pod admission time is taking longer than expected.

+Measurements haven't been performed to determine the latency as this metric will be introduced in 1.27
+development cycle but the duration is expected to be very short most likely in the ballpark of 50-100 ms.
+
 ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?

 - [X] Metrics
@@ -871,6 +866,13 @@ Also, the resource alignment logic is executed at pod admission time which is pr

 No reported or known increase in resource usage.

+###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
+
+No.
+
+The feature is only responsble for alignment of resources. It does not use node resources like PIDs, sockets, inodes, etc.
+for running its alignment algorithm.
+
 ### Troubleshooting

 ###### How does this feature react if the API server and/or etcd is unavailable?