Add documentation for CRI container metrics #742

Merged 1 commit on Jun 28, 2017
121 changes: 121 additions & 0 deletions contributors/devel/cri-container-stats.md
@@ -0,0 +1,121 @@
# Container Runtime Interface: Container Metrics

[Container runtime interface
(CRI)](https://github.com/kubernetes/community/blob/master/contributors/devel/container-runtime-interface.md)
provides an abstraction for container runtimes to integrate with Kubernetes.
CRI expects the runtime to provide resource usage statistics for the
containers.

## Background

Historically, Kubelet relied on the [cAdvisor](https://github.com/google/cadvisor)
library, an open-source project hosted in a separate repository, to retrieve
container metrics such as CPU and memory usage. These metrics are then aggregated
and exposed through Kubelet's [Summary
API](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/stats/v1alpha1/types.go)
for the monitoring pipeline (and other components) to consume. Any container
runtime (e.g., Docker and rkt) integrated with Kubernetes needed to add a
corresponding package in cAdvisor to support tracking container and image
filesystem metrics.

With CRI being the new abstraction for integration, it was a natural
progression to extend CRI to serve container metrics, which eliminates the
separate integration point in cAdvisor.

*See the [core metrics design
proposal](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/core-metrics-pipeline.md)
for more information on metrics exposed by Kubelet, and [monitoring
architecture](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/monitoring_architecture.md)
for the evolving monitoring pipeline in Kubernetes.*

## Container Metrics

Kubelet is responsible for creating pod-level cgroups based on the Quality of
Service class to which the pod belongs. It passes the pod's cgroup to the
runtime as a parent cgroup so that all resources used by the pod (e.g., pod
sandbox, containers) are charged to that cgroup. Kubelet can therefore track
resource usage at the pod level on its own (using the built-in cAdvisor), so
the API enhancement focuses on container-level metrics.


We include only the set of metrics that are necessary to fulfill the needs of
Kubelet. As the requirements evolve over time, we may extend the API to support
more metrics. Below is the API with the metrics supported today.

```protobuf
// ContainerStats returns stats of the container. If the container does not
// exist, the call returns an error.
rpc ContainerStats(ContainerStatsRequest) returns (ContainerStatsResponse) {}
// ListContainerStats returns stats of all running containers.
rpc ListContainerStats(ListContainerStatsRequest) returns (ListContainerStatsResponse) {}
```

```protobuf
// ContainerStats provides the resource usage statistics for a container.
message ContainerStats {
    // Information of the container.
    ContainerAttributes attributes = 1;
    // CPU usage gathered from the container.
    CpuUsage cpu = 2;
    // Memory usage gathered from the container.
    MemoryUsage memory = 3;
    // Usage of the writable layer.
    FilesystemUsage writable_layer = 4;
}

// CpuUsage provides the CPU usage information.
message CpuUsage {
    // Timestamp in nanoseconds at which the information was collected. Must be > 0.
    int64 timestamp = 1;
    // Cumulative CPU usage (sum across all cores) since object creation.
    UInt64Value usage_core_nano_seconds = 2;
}

// MemoryUsage provides the memory usage information.
message MemoryUsage {
    // Timestamp in nanoseconds at which the information was collected. Must be > 0.
    int64 timestamp = 1;
    // The amount of working set memory in bytes.
    UInt64Value working_set_bytes = 2;
}

// FilesystemUsage provides the filesystem usage information.
message FilesystemUsage {
    // Timestamp in nanoseconds at which the information was collected. Must be > 0.
    int64 timestamp = 1;
    // The underlying storage of the filesystem.
    StorageIdentifier storage_id = 2;
    // UsedBytes represents the bytes used for images on the filesystem.
    // This may differ from the total bytes used on the filesystem and may not
    // equal CapacityBytes - AvailableBytes.
    UInt64Value used_bytes = 3;
    // InodesUsed represents the inodes used by the images.
    // This may not equal InodesCapacity - InodesAvailable because the underlying
    // filesystem may also be used for purposes other than storing images.
    UInt64Value inodes_used = 4;
}
```
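
To illustrate the call semantics described in the API comments (an error for a
container that does not exist, and a list covering all running containers),
here is a minimal in-memory sketch. It is not the generated gRPC code: the
`fakeRuntime` type, the trimmed `ContainerStats` struct, and the string
container ID are hypothetical stand-ins chosen for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// ContainerStats is a trimmed, hand-written stand-in for the CRI message of
// the same name. It is an assumption for this sketch, not the generated type.
type ContainerStats struct {
	ContainerID          string // stands in for ContainerAttributes
	UsageCoreNanoSeconds uint64 // cumulative CPU usage
	WorkingSetBytes      uint64 // working set memory
}

// fakeRuntime is a hypothetical in-memory runtime keyed by container ID.
type fakeRuntime struct {
	running map[string]ContainerStats
}

// ContainerStats returns the stats of one container, or an error if the
// container does not exist.
func (r *fakeRuntime) ContainerStats(id string) (ContainerStats, error) {
	s, ok := r.running[id]
	if !ok {
		return ContainerStats{}, fmt.Errorf("container %q not found", id)
	}
	return s, nil
}

// ListContainerStats returns the stats of all running containers, sorted by
// ID so that the result is deterministic.
func (r *fakeRuntime) ListContainerStats() []ContainerStats {
	out := make([]ContainerStats, 0, len(r.running))
	for _, s := range r.running {
		out = append(out, s)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ContainerID < out[j].ContainerID })
	return out
}

func main() {
	rt := &fakeRuntime{running: map[string]ContainerStats{
		"a": {ContainerID: "a", UsageCoreNanoSeconds: 5000000000, WorkingSetBytes: 64 << 20},
		"b": {ContainerID: "b", UsageCoreNanoSeconds: 1000000000, WorkingSetBytes: 16 << 20},
	}}
	for _, s := range rt.ListContainerStats() {
		fmt.Println(s.ContainerID, s.WorkingSetBytes)
	}
	if _, err := rt.ContainerStats("missing"); err != nil {
		fmt.Println("error:", err)
	}
}
```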

There are three categories of resources: CPU, memory, and filesystem. Each
resource usage message includes a timestamp to indicate when the usage
statistics were collected. This is necessary because usage statistics for some
resources (e.g., filesystem) are inherently more expensive to collect and may be
updated less frequently than others. The timestamp lets the consumer know how
stale or fresh the data is, while giving the runtime the flexibility to adjust
its collection frequency.
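
Because `usage_core_nano_seconds` is cumulative, a consumer needs the
timestamps of two samples to turn it into a usage rate. The sketch below shows
one way to do that; the `cpuSample` type and the `cpuCores` helper are
hypothetical names for this example and are not part of CRI.

```go
package main

import (
	"errors"
	"fmt"
)

// cpuSample mirrors the two fields of the CpuUsage message. It is a
// hand-written stand-in, not the generated type.
type cpuSample struct {
	Timestamp            int64  // collection time in nanoseconds
	UsageCoreNanoSeconds uint64 // cumulative CPU usage across all cores
}

// cpuCores returns the average number of cores used between two samples:
// the CPU time consumed divided by the wall-clock time elapsed.
func cpuCores(prev, cur cpuSample) (float64, error) {
	if cur.Timestamp <= prev.Timestamp {
		return 0, errors.New("samples are not in increasing time order")
	}
	if cur.UsageCoreNanoSeconds < prev.UsageCoreNanoSeconds {
		return 0, errors.New("cumulative usage decreased; the counter was likely reset")
	}
	used := float64(cur.UsageCoreNanoSeconds - prev.UsageCoreNanoSeconds)
	elapsed := float64(cur.Timestamp - prev.Timestamp)
	return used / elapsed, nil
}

func main() {
	// 5 seconds of CPU time consumed over a 10 second window.
	prev := cpuSample{Timestamp: 10000000000, UsageCoreNanoSeconds: 2000000000}
	cur := cpuSample{Timestamp: 20000000000, UsageCoreNanoSeconds: 7000000000}
	cores, err := cpuCores(prev, cur)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%.2f cores\n", cores) // prints "0.50 cores"
}
```

Using the timestamps reported with the samples, rather than the time at which
the consumer polled, keeps the rate correct even when the runtime serves cached
values.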

Although CRI does not dictate the frequency of the stats update, Kubelet needs
a minimum guarantee of freshness of the stats for certain resources so that it
can reclaim them in a timely manner when under pressure. We will formulate the
requirements for such resources and include them in CRI in the near future.


*For more details on why we request cached stats with timestamps as opposed to
requesting stats on demand, see the [rationale](https://github.com/kubernetes/kubernetes/pull/45614#issuecomment-302258090)
behind it.*

## Status

The container metrics calls were added to CRI in Kubernetes 1.7, but Kubelet does
not yet use them to gather metrics from the runtime. We plan to enable Kubelet to
optionally consume the container metrics from the API in 1.8.