diff --git a/hpa-v2.md b/hpa-v2.md
new file mode 100644
index 00000000000..8d48dbce948
--- /dev/null
+++ b/hpa-v2.md
@@ -0,0 +1,291 @@

Horizontal Pod Autoscaler with Arbitrary Metrics
================================================

The current Horizontal Pod Autoscaler object only has support for CPU as
a percentage of requested CPU.  While this is certainly a common case, one
of the most frequently sought-after features for the HPA is the ability to
scale on different metrics (be they custom metrics, memory, etc).

The current HPA controller supports targeting "custom" metrics (metrics
with a name prefixed with "custom/") via an annotation, but this is
suboptimal for a number of reasons: it does not allow for arbitrary
"non-custom" metrics (e.g. memory), it does not allow for metrics
describing other objects (e.g. scaling based on metrics on services), and
it carries the various downsides of annotations (not being typed or
validated, being hard for a user to hand-construct, etc).

Object Design
-------------

### Requirements ###

This proposal describes a new version of the Horizontal Pod Autoscaler
object with the following requirements kept in mind:

1. The HPA should continue to support scaling based on percentage of CPU
   request

2. The HPA should support scaling on arbitrary metrics associated with
   pods

3. The HPA should support scaling on arbitrary metrics associated with
   other Kubernetes objects in the same namespace as the HPA (and the
   namespace itself)

4. The HPA should make scaling on multiple metrics in a single HPA
   possible and explicit (splitting metrics across multiple HPAs leads to
   the possibility of fighting between HPAs)

### Specification ###

```go
type HorizontalPodAutoscalerSpec struct {
    // the target scalable object to autoscale
    ScaleTargetRef CrossVersionObjectReference `json:"scaleTargetRef"`

    // the minimum number of replicas to which the autoscaler may scale
    // +optional
    MinReplicas *int32 `json:"minReplicas,omitempty"`
    // the maximum number of replicas to which the autoscaler may scale
    MaxReplicas int32 `json:"maxReplicas"`

    // the metrics to use to calculate the desired replica count (the
    // maximum replica count across all metrics will be used).  The
    // desired replica count is calculated by multiplying the ratio of
    // the current metric value to the target value by the current number
    // of pods.  Ergo, metrics used must decrease as the pod count is
    // increased, and vice-versa.  See the individual metric source
    // types for more information about how each type of metric
    // must respond.
    // +optional
    Metrics []MetricSpec `json:"metrics,omitempty"`
}

// a type of metric source
type MetricSourceType string

const (
    // a metric describing a kubernetes object (for example, hits-per-second on an Ingress object)
    ObjectSourceType MetricSourceType = "Object"
    // a metric describing each pod in the current scale target (for example, transactions-processed-per-second).
    // The values will be averaged together before being compared to the target value
    PodsSourceType MetricSourceType = "Pods"
    // a resource metric known to Kubernetes, as specified in requests and limits, describing each pod
    // in the current scale target (e.g. CPU or memory).  Such metrics are built in to Kubernetes,
    // and have special scaling options on top of those available to normal per-pod metrics (the "pods" source)
    ResourceSourceType MetricSourceType = "Resource"
)

// a specification for how to scale based on a single metric
// (only `type` and one other matching field should be set at once)
type MetricSpec struct {
    // the type of metric source (should match one of the fields below)
    Type MetricSourceType `json:"type"`

    // a metric describing a single kubernetes object (for example, hits-per-second on an Ingress object)
    Object *ObjectMetricSource `json:"object,omitempty"`
    // a metric describing each pod in the current scale target (for example, transactions-processed-per-second).
    // The values will be averaged together before being compared to the target value
    Pods *PodsMetricSource `json:"pods,omitempty"`
    // a resource metric (such as those specified in requests and limits) known to Kubernetes
    // describing each pod in the current scale target (e.g. CPU or memory).  Such metrics are
    // built in to Kubernetes, and have special scaling options on top of those available to
    // normal per-pod metrics using the "pods" source.
    Resource *ResourceMetricSource `json:"resource,omitempty"`
}

// a metric describing a single kubernetes object (for example, hits-per-second on an Ingress object)
type ObjectMetricSource struct {
    // the described Kubernetes object
    Target CrossVersionObjectReference `json:"target"`

    // the name of the metric in question
    MetricName string `json:"metricName"`
    // the target value of the metric (as a quantity)
    TargetValue resource.Quantity `json:"targetValue"`
}

// a metric describing each pod in the current scale target (for example, transactions-processed-per-second).
// The values will be averaged together before being compared to the target value
type PodsMetricSource struct {
    // the name of the metric in question
    MetricName string `json:"metricName"`
    // the target value of the metric (as a quantity)
    TargetAverageValue resource.Quantity `json:"targetAverageValue"`
}

// a resource metric known to Kubernetes, as specified in requests and limits, describing each pod
// in the current scale target (e.g. CPU or memory).  The values will be averaged together before
// being compared to the target.  Such metrics are built in to Kubernetes, and have special
// scaling options on top of those available to normal per-pod metrics using the "pods" source.
// Only one "target" type should be set.
type ResourceMetricSource struct {
    // the name of the resource in question
    Name api.ResourceName `json:"name"`
    // the target value of the resource metric, represented as
    // a percentage of the requested value of the resource on the pods.
    // +optional
    TargetAverageUtilization *int32 `json:"targetAverageUtilization,omitempty"`
    // the target value of the resource metric as a raw value, similar
    // to the "pods" metric source type.
    // +optional
    TargetAverageValue *resource.Quantity `json:"targetAverageValue,omitempty"`
}

type HorizontalPodAutoscalerStatus struct {
    // most recent generation observed by this autoscaler
    ObservedGeneration *int64 `json:"observedGeneration,omitempty"`
    // last time the autoscaler scaled the number of pods;
    // used by the autoscaler to control how often the number of pods is changed
    LastScaleTime *unversioned.Time `json:"lastScaleTime,omitempty"`

    // the last observed number of replicas from the target object
    CurrentReplicas int32 `json:"currentReplicas"`
    // the desired number of replicas as last computed by the autoscaler
    DesiredReplicas int32 `json:"desiredReplicas"`

    // the last read state of the metrics used by this autoscaler
    CurrentMetrics []MetricStatus `json:"currentMetrics"`
}

// the status of a single metric
type MetricStatus struct {
    // the type of metric source
    Type MetricSourceType `json:"type"`

    // a metric describing a single kubernetes object (for example, hits-per-second on an Ingress object)
    Object *ObjectMetricStatus `json:"object,omitempty"`
    // a metric describing each pod in the current scale target (for example, transactions-processed-per-second).
    // The values will be averaged together before being compared to the target value
    Pods *PodsMetricStatus `json:"pods,omitempty"`
    // a resource metric known to Kubernetes, as specified in requests and limits, describing each pod
    // in the current scale target (e.g. CPU or memory).  Such metrics are built in to Kubernetes,
    // and have special scaling options on top of those available to normal per-pod metrics using the "pods" source.
    Resource *ResourceMetricStatus `json:"resource,omitempty"`
}

// a metric describing a single kubernetes object (for example, hits-per-second on an Ingress object)
type ObjectMetricStatus struct {
    // the described Kubernetes object
    Target CrossVersionObjectReference `json:"target"`

    // the name of the metric in question
    MetricName string `json:"metricName"`
    // the current value of the metric (as a quantity)
    CurrentValue resource.Quantity `json:"currentValue"`
}

// a metric describing each pod in the current scale target (for example, transactions-processed-per-second).
// The values will be averaged together before being compared to the target value
type PodsMetricStatus struct {
    // the name of the metric in question
    MetricName string `json:"metricName"`
    // the current value of the metric (as a quantity)
    CurrentAverageValue resource.Quantity `json:"currentAverageValue"`
}

// a resource metric known to Kubernetes, as specified in requests and limits, describing each pod
// in the current scale target (e.g. CPU or memory).  The values will be averaged together before
// being compared to the target.  Such metrics are built in to Kubernetes, and have special
// scaling options on top of those available to normal per-pod metrics using the "pods" source.
// Only one "target" type should be set.  Note that the current raw value is always displayed
// (even when the current utilization of the request is also displayed).
type ResourceMetricStatus struct {
    // the name of the resource in question
    Name api.ResourceName `json:"name"`
    // the current value of the resource metric, represented as
    // a percentage of the requested value of the resource on the pods
    // (only populated if the corresponding request target was set)
    // +optional
    CurrentAverageUtilization *int32 `json:"currentAverageUtilization,omitempty"`
    // the current value of the resource metric as a raw value
    CurrentAverageValue resource.Quantity `json:"currentAverageValue"`
}
```
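
To make the replica calculation above concrete, here is a minimal,
self-contained sketch of the per-metric rule described in the `Metrics`
comment.  The function and the sample values are purely illustrative and
not part of the proposed API; the real controller would additionally clamp
the result to the `minReplicas`/`maxReplicas` bounds.

```go
package main

import (
    "fmt"
    "math"
)

// desiredReplicas scales the current replica count by the ratio of the
// current metric value to the target value, rounding up.  Because usage
// is expected to drop as pods are added, a current value above the target
// yields a ratio above 1 and thus a scale-up, and vice-versa.
func desiredReplicas(currentReplicas int32, currentValue, targetValue float64) int32 {
    return int32(math.Ceil(currentValue / targetValue * float64(currentReplicas)))
}

func main() {
    // e.g. 5 pods averaging 100% of requested CPU against an 80% target
    fmt.Println(desiredReplicas(5, 100, 80)) // prints 7
}
```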

### Example ###

In this example, we scale based on the `hits-per-second` value recorded as
describing a service in our namespace, plus the CPU usage of the pods in
the ReplicationController being autoscaled.

```yaml
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2alpha1
spec:
  scaleTargetRef:
    kind: ReplicationController
    name: web-frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
  - type: Object
    object:
      target:
        kind: Service
        name: frontend
      metricName: hits-per-second
      targetValue: 1k
```
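
To see how multiple metrics interact, suppose (purely hypothetically) that
the target currently runs 5 replicas, average CPU utilization is observed
at 96% of request, and the `frontend` Service is receiving 1500 hits per
second.  The resource metric yields ceil(5 × 96/80) = 6 replicas, the
object metric yields ceil(5 × 1500/1000) = 8, and the autoscaler therefore
scales to max(6, 8) = 8 replicas, which falls within the configured bounds
of 2 to 10.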

### Alternatives and Future Considerations ###

Since the new design mirrors volume plugins (and similar APIs), it makes
it relatively easy to introduce new fields in a backwards-compatible way:
we simply introduce a new field in `MetricSpec` as a new "metric type".

#### External ####

Adding a source type of `External`, with a single opaque metric field and
target value, was discussed.  This would indicate that the HPA was under
the control of an external autoscaler, which would allow external
autoscalers to be present in the cluster while still indicating to tooling
that autoscaling is taking place.

However, since this raises a number of questions and complications about
interaction with the existing autoscaler, it was decided to exclude this
feature.  We may reconsider in the future.

#### Limit Percentages ####

In cluster environments where requests are automatically set for
scheduling purposes, it is advantageous to be able to autoscale on the
percentage of limit for resource metrics.  We may wish to consider adding
a `targetPercentageOfLimit` field to the `ResourceMetricSource` type.

#### Referring to the current Namespace ####

It would be beneficial to be able to refer to a metric on the current
namespace, similarly to the `ObjectMetricSource` source type, but without
an explicit name.  Because of the similarity to `ObjectMetricSource`, it
may simply be sufficient to allow specifying a `kind` of "Namespace"
without a name.  Alternatively, a source type similar to
`PodsMetricSource` could be used.

#### Calculating Final Desired Replica Count ####

Since we have multiple replica counts (one from each metric), we must have
a way to aggregate them into a final replica count.  In this iteration of
the proposal, we simply take the maximum of all the computed replica
counts.  However, in certain cases, it could be useful to allow the user
to specify that they wanted the minimum or average instead (see the sketch
below).

In the general case, maximum should be sufficient, but if the need arises,
it should be fairly easy to add such a field.
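
As a rough illustration, the sketch below layers a hypothetical
`aggregation` selector on top of the per-metric counts.  Both the type and
its values are invented here for illustration; only the "Max" behavior is
actually proposed.

```go
package main

import "fmt"

// Aggregation is a hypothetical selector for combining per-metric replica
// counts; it is not part of this proposal.
type Aggregation string

const (
    AggregationMax Aggregation = "Max" // the behavior proposed above
    AggregationMin Aggregation = "Min"
)

// finalReplicas folds the replica counts computed for each metric into
// a single final count, according to the selected aggregation.
func finalReplicas(perMetric []int32, agg Aggregation) int32 {
    result := perMetric[0]
    for _, count := range perMetric[1:] {
        if (agg == AggregationMin && count < result) ||
            (agg != AggregationMin && count > result) {
            result = count
        }
    }
    return result
}

func main() {
    // the replica counts computed for the example's two metrics
    fmt.Println(finalReplicas([]int32{6, 8}, AggregationMax)) // prints 8
}
```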

Mechanical Concerns
-------------------

The HPA will derive metrics from two sources: resource metrics (e.g. CPU
request percentage) will come from the
[master metrics API](resource-metrics-api.md), while other metrics will
come from the custom metrics API (currently proposed as #34586), which is
an adapter API that sources metrics directly from the monitoring
pipeline.
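
To make this split concrete, here is a self-contained sketch of how the
controller might dispatch a metric source to its backing API.  The types
are minimal stand-ins for the spec above, and the dispatch is an
illustration only; the actual client surface is defined by the respective
API proposals.

```go
package main

import "fmt"

// minimal stand-in for the proposal's MetricSourceType
type MetricSourceType string

const (
    ObjectSourceType   MetricSourceType = "Object"
    PodsSourceType     MetricSourceType = "Pods"
    ResourceSourceType MetricSourceType = "Resource"
)

// apiFor shows the dispatch described above: resource metrics are served
// by the master metrics API, while per-pod and per-object custom metrics
// are served by the custom metrics adapter API.
func apiFor(sourceType MetricSourceType) (string, error) {
    switch sourceType {
    case ResourceSourceType:
        return "master metrics API", nil
    case PodsSourceType, ObjectSourceType:
        return "custom metrics API", nil
    default:
        return "", fmt.Errorf("unknown metric source type %q", sourceType)
    }
}

func main() {
    for _, t := range []MetricSourceType{ResourceSourceType, PodsSourceType, ObjectSourceType} {
        api, _ := apiFor(t)
        fmt.Printf("%s -> %s\n", t, api)
    }
}
```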