@@ -213,6 +213,7 @@ for these devices:
service PodResourcesLister {
rpc List(ListPodResourcesRequest) returns (ListPodResourcesResponse) {}
rpc GetAllocatableResources(AllocatableResourcesRequest) returns (AllocatableResourcesResponse) {}
rpc Get(GetPodResourcesRequest) returns (GetPodResourcesResponse) {}
}
```

@@ -223,6 +224,14 @@ id of exclusively allocated CPUs, device id as it was reported by device plugins
the NUMA node where these devices are allocated. Also, for NUMA-based machines, it contains the
information about memory and hugepages reserved for a container.

Starting from Kubernetes v1.27, the `List` endpoint can provide information on resources
of running pods allocated in `ResourceClaims` by the `DynamicResourceAllocation` API. To enable
this feature, `kubelet` must be started with the following flags:

```
--feature-gates=DynamicResourceAllocation=true,KubeletPodResourcesDynamicResources=true
```

```gRPC
// ListPodResourcesResponse is the response returned by List function
message ListPodResourcesResponse {
@@ -242,6 +251,7 @@ message ContainerResources {
repeated ContainerDevices devices = 2;
repeated int64 cpu_ids = 3;
repeated ContainerMemory memory = 4;
repeated DynamicResource dynamic_resources = 5;
}

// ContainerMemory contains information about memory and hugepages assigned to a container
@@ -267,6 +277,28 @@ message ContainerDevices {
repeated string device_ids = 2;
TopologyInfo topology = 3;
}

// DynamicResource contains information about the devices assigned to a container by Dynamic Resource Allocation
message DynamicResource {
string class_name = 1;
string claim_name = 2;
string claim_namespace = 3;
repeated ClaimResource claim_resources = 4;
}

// ClaimResource contains per-plugin resource information
message ClaimResource {
repeated CDIDevice cdi_devices = 1 [(gogoproto.customname) = "CDIDevices"];
}

// CDIDevice specifies a CDI device information
message CDIDevice {
// Fully qualified CDI device name
// for example: vendor.com/gpu=gpudevice1
// see more details in the CDI specification:
// https://github.com/container-orchestrated-devices/container-device-interface/blob/main/SPEC.md
string name = 1;
}
```
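
As a rough illustration of how a monitoring client might walk the `dynamic_resources` field, the sketch below models the messages above as plain Python dataclasses. These classes are hypothetical stand-ins, not the stubs a real client would generate from the kubelet's `.proto` file, and `parse_cdi_name` and `devices_by_claim` are illustrative helper names. The parser assumes the `vendor/class=device` form shown in the `CDIDevice` comment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


# Hypothetical plain-Python stand-ins for the protobuf messages above.
# A real client would use stubs generated from the kubelet's .proto file.
@dataclass
class CDIDevice:
    name: str  # fully qualified, for example "vendor.com/gpu=gpudevice1"


@dataclass
class ClaimResource:
    cdi_devices: List[CDIDevice] = field(default_factory=list)


@dataclass
class DynamicResource:
    class_name: str
    claim_name: str
    claim_namespace: str
    claim_resources: List[ClaimResource] = field(default_factory=list)


def parse_cdi_name(qualified: str) -> Tuple[str, str, str]:
    """Split "vendor/class=device" into (vendor, class, device)."""
    kind, sep, device = qualified.partition("=")
    vendor, slash, dev_class = kind.rpartition("/")
    if not (sep and slash and vendor and dev_class and device):
        raise ValueError(f"not a fully qualified CDI name: {qualified!r}")
    return vendor, dev_class, device


def devices_by_claim(resources: List[DynamicResource]) -> Dict[str, List[str]]:
    """Map "namespace/claim" to the CDI device names reported for that claim."""
    result: Dict[str, List[str]] = {}
    for res in resources:
        key = f"{res.claim_namespace}/{res.claim_name}"
        names = result.setdefault(key, [])
        for claim_resource in res.claim_resources:
            names.extend(dev.name for dev in claim_resource.cdi_devices)
    return result
```

Keying the result by namespace and claim name together reflects that a `DynamicResource` carries both `claim_namespace` and `claim_name`. The claim name alone is not treated as unique here.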
{{< note >}}
cpu_ids in the `ContainerResources` in the `List` endpoint correspond to exclusive CPUs allocated
@@ -333,6 +365,36 @@ Support for the `PodResourcesLister service` requires `KubeletPodResources`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled.
It is enabled by default starting with Kubernetes 1.15 and is v1 since Kubernetes 1.20.

### `Get` gRPC endpoint {#grpc-endpoint-get}

{{< feature-state state="alpha" for_k8s_version="v1.27" >}}

The `Get` endpoint provides information on the resources of a single running Pod. It exposes
information similar to that described for the `List` endpoint. The `Get` endpoint requires
the `PodName` and `PodNamespace` of the running Pod.

```gRPC
// GetPodResourcesRequest contains information about the pod
message GetPodResourcesRequest {
string pod_name = 1;
string pod_namespace = 2;
}
```
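
One way to picture why the request carries both fields: Pod names are only unique within a namespace, so a lookup by name alone could be ambiguous. The sketch below is a conceptual illustration, not the kubelet's implementation. It treats a `List`-style result as a list of dictionaries with `name` and `namespace` keys (an assumed shape) and selects the single matching entry. `find_pod_resources` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Optional


def find_pod_resources(
    pod_resources: List[Dict[str, Any]], pod_name: str, pod_namespace: str
) -> Optional[Dict[str, Any]]:
    """Return the entry matching both name and namespace, or None."""
    for pod in pod_resources:
        # Both fields must match: the same Pod name may exist in
        # several namespaces on one node.
        if pod.get("name") == pod_name and pod.get("namespace") == pod_namespace:
            return pod
    return None
```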

To enable this feature, you must start your kubelet services with the following flag:

```
--feature-gates=KubeletPodResourcesGet=true
```

The `Get` endpoint can provide Pod information related to dynamic resources
allocated by the dynamic resource allocation API. To enable this feature, you must
ensure your kubelet services are started with the following flags:

```
--feature-gates=KubeletPodResourcesGet=true,DynamicResourceAllocation=true,KubeletPodResourcesDynamicResources=true
```
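
When the kubelet command line is generated by tooling, the comma-separated `--feature-gates` value can be assembled from a mapping rather than written by hand, which helps avoid misspelled gate names. This is a minimal sketch. `feature_gates_flag` is a hypothetical helper, and the gate names are the ones listed in the feature gates table.

```python
from typing import Dict


def feature_gates_flag(gates: Dict[str, bool]) -> str:
    """Render a kubelet --feature-gates flag from a name -> enabled mapping."""
    # Each gate is written as Name=true or Name=false, joined by commas.
    pairs = ",".join(
        f"{name}={str(enabled).lower()}" for name, enabled in gates.items()
    )
    return f"--feature-gates={pairs}"


# Gates needed for the Get endpoint to report dynamic resources.
GET_WITH_DRA = {
    "KubeletPodResourcesGet": True,
    "DynamicResourceAllocation": True,
    "KubeletPodResourcesDynamicResources": True,
}
```

Calling `feature_gates_flag(GET_WITH_DRA)` produces a flag equivalent to the one shown above, with the gates in insertion order.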

## Device plugin integration with the Topology Manager

{{< feature-state for_k8s_version="v1.18" state="beta" >}}
@@ -162,6 +162,12 @@ gets scheduled onto one node and then cannot run there, which is bad because
such a pending Pod also blocks all other resources like RAM or CPU that were
set aside for it.

## Monitoring resources

The kubelet provides a gRPC service to enable discovery of dynamic resources of
running Pods. For more information on the gRPC endpoints, see
[resource allocation reporting](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources).

## Limitations

The scheduler plugin must be involved in scheduling Pods which use
@@ -125,8 +125,10 @@ For a reference to old feature gates that are removed, please refer to
| `KubeletInUserNamespace` | `false` | Alpha | 1.22 | |
| `KubeletPodResources` | `false` | Alpha | 1.13 | 1.14 |
| `KubeletPodResources` | `true` | Beta | 1.15 | |
| `KubeletPodResourcesGet` | `false` | Alpha | 1.27 | |
| `KubeletPodResourcesGetAllocatable` | `false` | Alpha | 1.21 | 1.22 |
| `KubeletPodResourcesGetAllocatable` | `true` | Beta | 1.23 | |
| `KubeletPodResourcesDynamicResources` | `false` | Alpha | 1.27 | |
| `KubeletTracing` | `false` | Alpha | 1.25 | |
| `LegacyServiceAccountTokenTracking` | `false` | Alpha | 1.25 | |
| `LocalStorageCapacityIsolationFSQuotaMonitoring` | `false` | Alpha | 1.15 | - |
@@ -578,9 +580,14 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `KubeletPodResources`: Enable the kubelet's pod resources gRPC endpoint. See
[Support Device Monitoring](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/606-compute-device-assignment/README.md)
for more details.
- `KubeletPodResourcesGet`: Enable the `Get` gRPC endpoint on the kubelet for Pod resources.
This API augments the [resource allocation reporting](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources).
- `KubeletPodResourcesGetAllocatable`: Enable the kubelet's pod resources
`GetAllocatableResources` functionality. This API augments the
[resource allocation reporting](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources)
with information about the allocatable resources, enabling clients to properly
track the free compute resources on a node.
- `KubeletPodResourcesDynamicResources`: Extend the kubelet's pod resources gRPC endpoint
to include resources allocated in `ResourceClaims` via the `DynamicResourceAllocation` API.
See [resource allocation reporting](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources) for more details.
- `KubeletTracing`: Add support for distributed tracing in the kubelet.