Merged
@@ -0,0 +1,14 @@
---
title: HPAConfigurableTolerance
content_type: feature_gate
_build:
  list: never
  render: false

stages:
  - stage: alpha
    defaultValue: false
    fromVersion: "1.33"
---
Enables setting a [tolerance threshold](/docs/tasks/run-application/horizontal-pod-autoscale.md#tolerance)
for HorizontalPodAutoscaler metrics.
37 changes: 32 additions & 5 deletions content/en/docs/tasks/run-application/horizontal-pod-autoscale.md
@@ -143,8 +143,8 @@ is `100m`, the number of replicas will be doubled, since
\\( { 200.0 \div 100.0 } = 2.0 \\).
If the current value is instead `50m`, you'll halve the number of
replicas, since \\( { 50.0 \div 100.0 } = 0.5 \\). The control plane skips any scaling
action if the ratio is sufficiently close to 1.0 (within a globally-configurable
tolerance, 0.1 by default).
action if the ratio is sufficiently close to 1.0 (within a
[configurable tolerance](#tolerance), 0.1 by default).
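
As a rough illustration of the ratio check described above, the following Python sketch computes a desired replica count and skips scaling when the ratio is within the tolerance. The function name `desired_replicas` and its parameters are hypothetical and not part of any Kubernetes API. The real controller also handles missing metrics, Pods that are not ready, and minimum and maximum replica bounds, all of which are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1) -> int:
    """Simplified sketch of the ratio rule; not the controller's actual code."""
    ratio = current_value / target_value
    # Skip any scaling action when the ratio is sufficiently close to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up to a whole Pod.
    return math.ceil(current_replicas * ratio)


print(desired_replicas(4, 200.0, 100.0))  # ratio 2.0: replicas double
print(desired_replicas(4, 50.0, 100.0))   # ratio 0.5: replicas halve
print(desired_replicas(4, 105.0, 100.0))  # ratio 1.05: within tolerance
```

With four replicas, a ratio of 2.0 gives eight, a ratio of 0.5 gives two, and a ratio of 1.05 leaves the count unchanged under the default tolerance of 0.1.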

When a `targetAverageValue` or `targetAverageUtilization` is specified,
the `currentMetricValue` is computed by taking the average of the given
@@ -388,9 +388,10 @@ to configure separate scale-up and scale-down behaviors.
You specify these behaviours by setting `scaleUp` and / or `scaleDown`
under the `behavior` field.

You can specify a _stabilization window_ that prevents [flapping](#flapping)
the replica count for a scaling target. Scaling policies also let you control the
rate of change of replicas while scaling.
Scaling policies let you control the rate of change of replicas while scaling.
Also two settings can be used to prevent [flapping](#flapping): you can specify a
_stabilization window_ for smoothing replica counts, and a tolerance to ignore
minor metric fluctuations below a specified threshold.

### Scaling policies

@@ -452,6 +453,32 @@ interval. In the above example, all desired states from the past 5 minutes will
This approximates a rolling maximum, and avoids having the scaling algorithm frequently
remove Pods only to trigger recreating an equivalent Pod just moments later.
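
The rolling-maximum idea can be sketched in a few lines of Python. This is only an illustration of the concept, assuming a 300-second window to match the 5 minutes mentioned above. The class `ScaleDownStabilizer` and its method names are hypothetical and do not mirror the controller's implementation.

```python
from collections import deque


class ScaleDownStabilizer:
    """Keeps recent desired replica counts and returns the largest one."""

    def __init__(self, window_seconds: float = 300.0):
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp, desired_replicas) pairs

    def recommend(self, now: float, desired: int) -> int:
        self._history.append((now, desired))
        # Drop recommendations that fell out of the stabilization window.
        while self._history[0][0] < now - self.window_seconds:
            self._history.popleft()
        # Acting on the window's maximum delays scale-down until the
        # lower value has persisted for the whole window.
        return max(count for _, count in self._history)


stabilizer = ScaleDownStabilizer()
stabilizer.recommend(now=0, desired=10)
print(stabilizer.recommend(now=60, desired=4))   # older value of 10 still in window
print(stabilizer.recommend(now=301, desired=4))  # value of 10 has expired
```

A brief dip in the desired count therefore does not remove Pods immediately; the lower count takes effect only once every higher recommendation has aged out of the window.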

### Tolerance {#tolerance}

{{< feature-state feature_gate_name="HPAConfigurableTolerance" >}}

The `tolerance` field configures a threshold for metric variations, preventing the
autoscaler from scaling for changes below that value.

This tolerance is defined as the amount of variation around the desired metric value under
which no scaling will occur. For example, consider a HorizontalPodAutoscaler configured
with a target memory consumption of 100MiB and a scale-up tolerance of 5%:

```yaml
behavior:
  scaleUp:
    tolerance: 0.05 # 5% tolerance for scale up
```

With this configuration, the HPA algorithm will only consider scaling up if the memory
consumption is higher than 105MiB (that is: 5% above the target).

If you don't set this field, the HPA applies the default cluster-wide tolerance of 10%. This
default can be updated for both scale-up and scale-down using the
[kube-controller-manager](/docs/reference/command-line-tools-reference/kube-controller-manager/)
`--horizontal-pod-autoscaler-tolerance` command line argument. (You can't use the Kubernetes API
to configure this default value.)
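
The interaction between a per-direction tolerance and the cluster-wide default can be sketched as follows. This is a simplified illustration, not the controller's code: the function `scaling_direction` and its parameters are hypothetical, and it assumes that an unset tolerance falls back to the 10% default for that direction only.

```python
from typing import Optional

DEFAULT_TOLERANCE = 0.10  # cluster-wide default described above


def scaling_direction(current_value: float, target_value: float,
                      scale_up_tolerance: Optional[float] = None,
                      scale_down_tolerance: Optional[float] = None) -> str:
    """Return "up", "down", or "none" for a single metric (sketch only)."""
    up = DEFAULT_TOLERANCE if scale_up_tolerance is None else scale_up_tolerance
    down = DEFAULT_TOLERANCE if scale_down_tolerance is None else scale_down_tolerance
    ratio = current_value / target_value
    if ratio > 1.0 + up:
        return "up"
    if ratio < 1.0 - down:
        return "down"
    return "none"


# Target of 100 MiB with a 5% scale-up tolerance, as in the example above.
print(scaling_direction(104.0, 100.0, scale_up_tolerance=0.05))  # below 105 MiB
print(scaling_direction(106.0, 100.0, scale_up_tolerance=0.05))  # above 105 MiB
print(scaling_direction(106.0, 100.0))                           # default 10% applies
```

With the 5% scale-up tolerance, 104 MiB triggers no scaling while 106 MiB does; without the field, 106 MiB stays inside the default 10% band and nothing happens.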

### Default Behavior

To use the custom scaling not all fields have to be specified. Only values which need to be