Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
64 changes: 32 additions & 32 deletions docs/root/intro/arch_overview/upstream/outlier.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,43 +10,43 @@ such as consecutive failures, temporal success rate, temporal latency, etc. Outl
form of *passive* health checking. Envoy also supports :ref:`active health checking
<arch_overview_health_checking>`. *Passive* and *active* health checking can be enabled together or
independently, and form the basis for an overall upstream health checking solution.
Outlier detection is part of :ref:`cluster configuration <envoy_v3_api_msg_config.cluster.v3.OutlierDetection>`
and it needs filters to report errors, timeouts, resets. Currently the following filters support
Outlier detection is part of the :ref:`cluster configuration <envoy_v3_api_msg_config.cluster.v3.OutlierDetection>`
and it needs filters to report errors, timeouts, and resets. Currently, the following filters support
outlier detection: :ref:`http router <config_http_filters_router>`,
:ref:`tcp proxy <config_network_filters_tcp_proxy>` and :ref:`redis proxy <config_network_filters_redis_proxy>`.

Detected errors fall into two categories: externally and locally originated errors. Externally generated errors
are transaction specific and occur on the upstream server in response to the received request. For example, HTTP server returning error code 500 or redis server returning payload which cannot be decoded. Those errors are generated on the upstream host after Envoy has successfully connected to it.
Locally originated errors are generated by Envoy in response to an event which interrupted or prevented communication with the upstream host. Examples of locally originated errors are timeout, TCP reset, inability to connect to a specified port, etc.
are transaction specific and occur on the upstream server in response to the received request. For example, an HTTP server returning error code 500 or a redis server returning a payload which cannot be decoded. Those errors are generated on the upstream host after Envoy has connected to it successfully.
Locally originated errors are generated by Envoy in response to an event which interrupted or prevented communication with the upstream host. Examples of locally originated errors are timeout, TCP reset, inability to connect to a specified port, etc.

Type of detected errors depends on filter type. :ref:`http router <config_http_filters_router>` filter for example
The type of detected errors depends on the filter type. The :ref:`http router <config_http_filters_router>` filter, for example,
detects locally originated errors (timeouts, resets - errors related to connection to upstream host) and because it
also understands HTTP protocol it reports
errors returned by HTTP server (externally generated errors). In such scenario, even when connection to upstream HTTP server is successful,
transaction with the server may fail.
On the contrary, :ref:`tcp proxy <config_network_filters_tcp_proxy>` filter does not understand any protocol above
TCP layer and reports only locally originated errors.

In default configuration (:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` is *false*)
locally originated errors are not distinguished from externally generated (transaction) errors and all end up
in the same bucket and are compared against
also understands the HTTP protocol it reports
errors returned by the HTTP server (externally generated errors). In such a scenario, even when the connection to the upstream HTTP server is successful,
the transaction with the server may fail.
By contrast, the :ref:`tcp proxy <config_network_filters_tcp_proxy>` filter does not understand any protocol above
the TCP layer and reports only locally originated errors.

Under the default configuration (:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` is *false*)
locally originated errors are not distinguished from externally generated (transaction) errors, all end up
in the same bucket, and are compared against the
:ref:`outlier_detection.consecutive_5xx<envoy_v3_api_field_config.cluster.v3.OutlierDetection.consecutive_5xx>`,
:ref:`outlier_detection.consecutive_gateway_failure<envoy_v3_api_field_config.cluster.v3.OutlierDetection.consecutive_gateway_failure>` and
:ref:`outlier_detection.success_rate_stdev_factor<envoy_v3_api_field_config.cluster.v3.OutlierDetection.success_rate_stdev_factor>`
configuration items. For example, if connection to an upstream HTTP server fails twice because of timeout and
then, after successful connection, the server returns error code 500, the total error count will be 3.
then, after successful connection establishment, the server returns error code 500 then the total error count will be 3.

Outlier detection may also be configured to distinguish locally originated errors from externally originated (transaction) errors.
It is done via
:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` configuration item.
It is done via the
:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` configuration item.
In that mode locally originated errors are tracked by separate counters than externally originated
(transaction) errors and
the outlier detector may be configured to react to locally originated errors and ignore externally originated errors
or vice-versa.

It is important to understand that a cluster may be shared among several filter chains. If one filter chain
ejects a host based on its outlier detection type, other filter chains will be also affected even though their
outlier detection type would not eject that host.
outlier detection type would not have ejected that host.

Ejection algorithm
------------------
Expand Down Expand Up @@ -79,16 +79,16 @@ Envoy supports the following outlier detection types:
Consecutive 5xx
^^^^^^^^^^^^^^^

In default mode (:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` is *false*) this detection type takes into account all generated errors: locally
originated and externally originated (transaction) type of errors.
In the default mode (:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` is *false*) this detection type takes into account all generated errors: locally
originated and externally originated (transaction) errors.
Errors generated by non-HTTP filters, like :ref:`tcp proxy <config_network_filters_tcp_proxy>` or
:ref:`redis proxy <config_network_filters_redis_proxy>` are internally mapped to HTTP 5xx codes and treated as such.

In split mode (:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` is *true*) this detection type takes into account only externally originated (transaction) errors ignoring locally originated errors.
If an upstream host is HTTP-server, only 5xx types of error are taken into account (see :ref:`Consecutive Gateway Failure<consecutive_gateway_failure>` for exceptions).
In split mode (:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` is *true*) this detection type takes into account only externally originated (transaction) errors, ignoring locally originated errors.
If an upstream host is an HTTP-server, only 5xx types of error are taken into account (see :ref:`Consecutive Gateway Failure<consecutive_gateway_failure>` for exceptions).
For redis servers, served via
:ref:`redis proxy <config_network_filters_redis_proxy>` only malformed responses from the server are taken into account.
Properly formatted responses, even when they carry operational error (like index not found, access denied) are not taken into account.
Properly formatted responses, even when they carry an operational error (like index not found, access denied) are not taken into account.

If an upstream host returns some number of errors which are treated as consecutive 5xx type errors, it will be ejected.
The number of consecutive 5xx required for ejection is controlled by
Expand All @@ -99,8 +99,8 @@ the :ref:`outlier_detection.consecutive_5xx<envoy_v3_api_field_config.cluster.v3
Consecutive Gateway Failure
^^^^^^^^^^^^^^^^^^^^^^^^^^^

This detection type takes into account subset of 5xx errors, called "gateway errors" (502, 503 or 504 status code)
and is supported only by :ref:`http router <config_http_filters_router>`.
This detection type takes into account a subset of 5xx errors, called "gateway errors" (502, 503 or 504 status code)
and is supported only by the :ref:`http router <config_http_filters_router>`.

If an upstream host returns some number of consecutive "gateway errors" (502, 503 or 504 status
code), it will be ejected.
Expand All @@ -123,17 +123,17 @@ This detection type is supported by :ref:`http router <config_http_filters_route
Success Rate
^^^^^^^^^^^^

Success Rate based outlier ejection aggregates success rate data from every host in a cluster. Then at given
intervals ejects hosts based on statistical outlier detection. Success Rate outlier ejection will not be
Success Rate based outlier detection aggregates success rate data from every host in a cluster. Then at given
intervals ejects hosts based on statistical outlier detection. Success Rate outlier detection will not be
calculated for a host if its request volume over the aggregation interval is less than the
:ref:`outlier_detection.success_rate_request_volume<envoy_v3_api_field_config.cluster.v3.OutlierDetection.success_rate_request_volume>`
value. Moreover, detection will not be performed for a cluster if the number of hosts
with the minimum required request volume in an interval is less than the
:ref:`outlier_detection.success_rate_minimum_hosts<envoy_v3_api_field_config.cluster.v3.OutlierDetection.success_rate_minimum_hosts>`
value.

In default configuration mode (:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` is *false*)
this detection type takes into account all type of errors: locally and externally originated.
In the default configuration mode (:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` is *false*)
this detection type takes into account all types of errors: locally and externally originated. The
:ref:`outlier_detection.enforcing_local_origin_success<envoy_v3_api_field_config.cluster.v3.OutlierDetection.enforcing_local_origin_success_rate>` config item is ignored.

In split mode (:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>` is *true*),
Expand All @@ -150,15 +150,15 @@ to externally originated errors only and :ref:`outlier_detection.enforcing_local
Failure Percentage
^^^^^^^^^^^^^^^^^^

Failure Percentage based outlier ejection functions similarly to the success rate detecion type, in
Failure Percentage based outlier detection functions similarly to success rate detection, in
that it relies on success rate data from each host in a cluster. However, rather than compare those
values to the mean success rate of the cluster as a whole, they are compared to a flat
user-configured threshold. This threshold is configured via the
:ref:`outlier_detection.failure_percentage_threshold<envoy_v3_api_field_config.cluster.v3.OutlierDetection.failure_percentage_threshold>`
field.

The other configuration fields for failure percentage based ejection are similar to the fields for
success rate ejection. Failure percentage based ejection also obeys
The other configuration fields for failure percentage based detection are similar to the fields for
success rate detection. Failure percentage based detection also obeys
:ref:`outlier_detection.split_external_local_origin_errors<envoy_v3_api_field_config.cluster.v3.OutlierDetection.split_external_local_origin_errors>`;
the enforcement percentages for externally- and locally-originated errors are controlled by
:ref:`outlier_detection.enforcing_failure_percentage<envoy_v3_api_field_config.cluster.v3.OutlierDetection.enforcing_failure_percentage>`
Expand Down