-
Notifications
You must be signed in to change notification settings - Fork 5.4k
gRPC retries, assuming gRPC status code is in the returned headers #1067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 11 commits
66f18e9
9a3f9a6
11281eb
05c6dba
8fb9bbe
e46dbc1
7ed0e35
2775929
9fec1d8
6f613d8
872fad0
a5aac27
0130379
a4e13ff
ab6c25a
bc4f8de
65566c9
e5e7f53
39c9606
f0cfd1c
b0643f2
d8a7a28
3d33288
bbba9a8
3dec86d
1dcc41e
36aae46
5df0a7e
6692aef
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -51,8 +51,9 @@ x-envoy-max-retries | |
| If a :ref:`retry policy <config_http_conn_man_route_table_route_retry>` is in place, Envoy will default to retrying one | ||
| time unless explicitly specified. The number of retries can be explicitly set in the | ||
| :ref:`route retry config <config_http_conn_man_route_table_route_retry>` or by using this header. | ||
| If a :ref:`retry policy <config_http_conn_man_route_table_route_retry>` is not configured or a | ||
| :ref:`config_http_filters_router_x-envoy-retry-on` header is not specified, Envoy will not retry a failed request. | ||
| If a :ref:`retry policy <config_http_conn_man_route_table_route_retry>` is not configured and | ||
| :ref:`config_http_filters_router_x-envoy-retry-on` or | ||
| :ref:`config_http_filters_router_x-envoy-grpc-retry-on` headers are not specified, Envoy will not retry a failed request. | ||
|
|
||
| A few notes on how Envoy does retries: | ||
|
|
||
|
|
@@ -120,6 +121,32 @@ Note that retry policies can also be applied at the :ref:`route level | |
|
|
||
| By default, Envoy will *not* perform retries unless you've configured them per above. | ||
|
|
||
| .. _config_http_filters_router_x-envoy-grpc-retry-on: | ||
|
|
||
| x-envoy-grpc-retry-on | ||
| ^^^^^^^^^^^^^^^^^^^^^ | ||
| Setting this header on egress requests will cause Envoy to attempt to retry failed requests (number of | ||
| retries defaults to 1, and is configured as x-envoy-grpc-retry is, above). gRPC retries are currently | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "x-envoy-grpc-retry" I think is a typo? (Not sure what you meant here) |
||
| only supported for gRPC status codes in response headers. gRPC status codes in trailers will not trigger | ||
| retry logic. One or more policies can be specified using a ',' delimited list. The supported policies are: | ||
|
|
||
| cancelled | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Cancelled are returned typically when the call are cancelled by the caller, not sure a proxy should retry. http://www.grpc.io/docs/guides/concepts.html#cancelling-rpcs
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
| Envoy will attempt a retry if the gRPC status code in the response headers is "cancelled" (1) | ||
|
|
||
| deadline-exceeded | ||
| Envoy will attempt a retry if the gRPC status code in the response headers is "deadline-exceeded" (4) | ||
|
|
||
| resource-exhausted | ||
| Envoy will attempt a retry if the gRPC status code in the response headers is "resource-exhausted" (8) | ||
|
|
||
| As with x-envoy-grpc-retry, the number of retries can be controlled via the | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same here? |
||
| :ref:`config_http_filters_router_x-envoy-max-retries` header | ||
|
|
||
| Note that retry policies can also be applied at the :ref:`route level | ||
| <config_http_conn_man_route_table_route_retry>`. | ||
|
|
||
| By default, Envoy will *not* perform retries unless you've configured them per above. | ||
|
|
||
| x-envoy-upstream-alt-stat-name | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -17,3 +17,8 @@ envoy_cc_library( | |
| "//include/envoy/http:header_map_interface", | ||
| ], | ||
| ) | ||
|
|
||
| envoy_cc_library( | ||
| name = "status_interface", | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: Could just call this |
||
| hdrs = ["status.h"], | ||
| ) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| #pragma once | ||
|
|
||
| namespace Envoy { | ||
| namespace Grpc { | ||
|
|
||
| class Status { | ||
| public: | ||
| enum GrpcStatus { | ||
| // The RPC completed successfully. | ||
| Ok = 0, | ||
| // The RPC was canceled. | ||
| Canceled = 1, | ||
| // Some unknown error occurred. | ||
| Unknown = 2, | ||
| // An argument to the RPC was invalid. | ||
| InvalidArgument = 3, | ||
| // The deadline for the RPC expired before the RPC completed. | ||
| DeadlineExceeded = 4, | ||
| // Some resource for the RPC was not found. | ||
| NotFound = 5, | ||
| // A resource the RPC attempted to create already exists. | ||
| AlreadyExists = 6, | ||
| // Permission was denied for the RPC. | ||
| PermissionDenied = 7, | ||
| // The RPC does not have required credentials for the RPC to succeed. | ||
| Unauthenticated = 16, | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you reorder this so that this is after |
||
| // Some resource is exhausted, resulting in RPC failure. | ||
| ResourceExhausted = 8, | ||
| // Some precondition for the RPC failed. | ||
| FailedPrecondition = 9, | ||
| // The RPC was aborted. | ||
| Aborted = 10, | ||
| // Some operation was requested outside of a legal range. | ||
| OutOfRange = 11, | ||
| // The RPC requested was not implemented. | ||
| Unimplemented = 12, | ||
| // Some internal error occurred. | ||
| Internal = 13, | ||
| // The RPC endpoint is current unavailable. | ||
| Unavailable = 14, | ||
| // There was some data loss resulting in RPC failure. | ||
| DataLoss = 15, | ||
|
|
||
| // This is a non-GRPC error code, indicating the status code in gRPC headers | ||
| // was missing or empty. | ||
| MissingStatus = -1, | ||
| // This is a non-GRPC error code, indicating the status code in gRPC headers | ||
| // was invalid. | ||
| InvalidCode = -2, | ||
| }; | ||
| }; | ||
|
|
||
| } // Grpc | ||
| } // Envoy | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -24,6 +24,22 @@ namespace Grpc { | |
|
|
||
| const std::string Common::GRPC_CONTENT_TYPE{"application/grpc"}; | ||
|
|
||
| Status::GrpcStatus Common::getGrpcStatus(const Http::HeaderMap& trailers) { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could have
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure. It's a minor behavior change for out-of-bounds error codes. If we care we could split returns from getGrpcStatus into "missing" status and "out of bounds" status but either way we'll throw an exception so I assume it's moot. |
||
| const Http::HeaderEntry* grpc_status_header = trailers.GrpcStatus(); | ||
|
|
||
| uint64_t grpc_status_code; | ||
| if (!grpc_status_header || grpc_status_header->value().empty()) { | ||
| return Status::GrpcStatus::MissingStatus; | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm inclined to make this an optional return type, since having a |
||
| } | ||
| if (!StringUtil::atoul(grpc_status_header->value().c_str(), grpc_status_code)) { | ||
| return Status::GrpcStatus::InvalidCode; | ||
| } | ||
| if (grpc_status_code > Status::GrpcStatus::DataLoss) { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
| return Status::GrpcStatus::InvalidCode; | ||
| } | ||
| return static_cast<Status::GrpcStatus>(grpc_status_code); | ||
| } | ||
|
|
||
| void Common::chargeStat(const Upstream::ClusterInfo& cluster, const std::string& grpc_service, | ||
| const std::string& grpc_method, bool success) { | ||
| cluster.statsScope() | ||
|
|
@@ -73,13 +89,12 @@ Http::MessagePtr Common::prepareHeaders(const std::string& upstream_cluster, | |
|
|
||
| void Common::checkForHeaderOnlyError(Http::Message& http_response) { | ||
| // First check for grpc-status in headers. If it is here, we have an error. | ||
| const Http::HeaderEntry* grpc_status_header = http_response.headers().GrpcStatus(); | ||
| if (!grpc_status_header) { | ||
| Status::GrpcStatus grpc_status_code = Common::getGrpcStatus(http_response.headers()); | ||
| if (grpc_status_code == Status::GrpcStatus::MissingStatus) { | ||
| return; | ||
| } | ||
|
|
||
| uint64_t grpc_status_code; | ||
| if (!StringUtil::atoul(grpc_status_header->value().c_str(), grpc_status_code)) { | ||
| if (grpc_status_code == Status::GrpcStatus::InvalidCode) { | ||
| throw Exception(Optional<uint64_t>(), "bad grpc-status header"); | ||
| } | ||
|
|
||
|
|
@@ -100,10 +115,8 @@ void Common::validateResponse(Http::Message& http_response) { | |
| throw Exception(Optional<uint64_t>(), "no response trailers"); | ||
| } | ||
|
|
||
| const Http::HeaderEntry* grpc_status_header = http_response.trailers()->GrpcStatus(); | ||
| uint64_t grpc_status_code; | ||
| if (!grpc_status_header || | ||
| !StringUtil::atoul(grpc_status_header->value().c_str(), grpc_status_code)) { | ||
| Status::GrpcStatus grpc_status_code = Common::getGrpcStatus(*http_response.trailers()); | ||
| if (grpc_status_code < 0) { | ||
| throw Exception(Optional<uint64_t>(), "bad grpc-status trailer"); | ||
| } | ||
|
|
||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a bit odd to me that in the route, we stick HTTP/gRPC retry policies into a single field, but we have 2 different headers. I don't really feel strongly about this, but thought I would mention it in case anyone else has any strong opinions.