Merged
29 commits
66f18e9
gRPC retries, assuming status code in returned headers
alyssawilk Jun 8, 2017
9a3f9a6
gRPC retries, now with (partial) documentation
alyssawilk Jun 8, 2017
11281eb
minor changes
alyssawilk Jun 9, 2017
05c6dba
allowing gRPC retries to be configured using current router configs, …
alyssawilk Jun 9, 2017
8fb9bbe
gRPC retries, assuming status code in returned headers
alyssawilk Jun 8, 2017
e46dbc1
gRPC retries, now with (partial) documentation
alyssawilk Jun 8, 2017
7ed0e35
minor changes
alyssawilk Jun 9, 2017
2775929
allowing gRPC retries to be configured using current router configs, …
alyssawilk Jun 9, 2017
9fec1d8
doc fix and clang_cleanup
alyssawilk Jun 9, 2017
6f613d8
clang_cleanup
alyssawilk Jun 9, 2017
872fad0
clang_cleanup
alyssawilk Jun 9, 2017
a5aac27
Merge branch 'grpc' of github.com:alyssawilk/envoy into grpc
alyssawilk Jun 9, 2017
0130379
gRPC retries, assuming status code in returned headers
alyssawilk Jun 8, 2017
a4e13ff
gRPC retries, now with (partial) documentation
alyssawilk Jun 8, 2017
ab6c25a
minor changes
alyssawilk Jun 9, 2017
bc4f8de
allowing gRPC retries to be configured using current router configs, …
alyssawilk Jun 9, 2017
65566c9
gRPC retries, assuming status code in returned headers
alyssawilk Jun 8, 2017
e5e7f53
gRPC retries, now with (partial) documentation
alyssawilk Jun 8, 2017
39c9606
minor changes
alyssawilk Jun 9, 2017
f0cfd1c
allowing gRPC retries to be configured using current router configs, …
alyssawilk Jun 9, 2017
b0643f2
doc fix and clang_cleanup
alyssawilk Jun 9, 2017
d8a7a28
trying to come up with clean diffs
alyssawilk Jun 9, 2017
3d33288
reusing getGrpcStatus in trailer validation
alyssawilk Jun 9, 2017
bbba9a8
Making gRPC status errors negative, reusing getGrpcStatus in checkFor…
alyssawilk Jun 9, 2017
3dec86d
Merge remote-tracking branch 'refs/remotes/origin/grpc' into grpc
alyssawilk Jun 12, 2017
1dcc41e
Merge remote-tracking branch 'upstream/master' into grpc
alyssawilk Jun 12, 2017
36aae46
Making getGrpcStatus optional
alyssawilk Jun 12, 2017
5df0a7e
allowing all gRPC codes
alyssawilk Jun 12, 2017
6692aef
hopefully clarifying docs
alyssawilk Jun 12, 2017
3 changes: 2 additions & 1 deletion docs/configuration/http_conn_man/route_config/route.rst
Expand Up @@ -216,7 +216,8 @@ HTTP retry :ref:`architecture overview <arch_overview_http_routing_retry>`.

retry_on
*(required, string)* specifies the conditions under which retry takes place. These are the same
conditions documented for :ref:`config_http_filters_router_x-envoy-retry-on`.
conditions documented for :ref:`config_http_filters_router_x-envoy-retry-on` and
Member

@mattklein123 mattklein123 Jun 12, 2017


It seems a bit odd to me that in the route, we stick HTTP/gRPC retry policies into a single field, but we have 2 different headers. I don't really feel strongly about this, but thought I would mention it in case anyone else has any strong opinions.

:ref:`config_http_filters_router_x-envoy-grpc-retry-on`.

num_retries
*(optional, integer)* specifies the allowed number of retries. This parameter is optional and
Expand Down
34 changes: 32 additions & 2 deletions docs/configuration/http_filters/router_filter.rst
Expand Up @@ -51,8 +51,9 @@ x-envoy-max-retries
If a :ref:`retry policy <config_http_conn_man_route_table_route_retry>` is in place, Envoy will default to retrying one
time unless explicitly specified. The number of retries can be explicitly set in the
:ref:`route retry config <config_http_conn_man_route_table_route_retry>` or by using this header.
If a :ref:`retry policy <config_http_conn_man_route_table_route_retry>` is not configured or a
:ref:`config_http_filters_router_x-envoy-retry-on` header is not specified, Envoy will not retry a failed request.
If a :ref:`retry policy <config_http_conn_man_route_table_route_retry>` is not configured and
:ref:`config_http_filters_router_x-envoy-retry-on` or
:ref:`config_http_filters_router_x-envoy-grpc-retry-on` headers are not specified, Envoy will not retry a failed request.

A few notes on how Envoy does retries:

Expand Down Expand Up @@ -120,6 +121,35 @@ Note that retry policies can also be applied at the :ref:`route level

By default, Envoy will *not* perform retries unless you've configured them per above.

.. _config_http_filters_router_x-envoy-grpc-retry-on:

x-envoy-retry-grpc-on
^^^^^^^^^^^^^^^^^^^^^
Setting this header on egress requests will cause Envoy to attempt to retry failed requests (the number of
retries defaults to 1 and can be controlled by the
:ref:`x-envoy-max-retries <config_http_filters_router_x-envoy-max-retries>`
header or the :ref:`route config retry policy <config_http_conn_man_route_table_route_retry>`).
gRPC retries are currently only supported for gRPC status codes in response headers. gRPC status codes in
trailers will not trigger retry logic. One or more policies can be specified using a ',' delimited
list. The supported policies are:

cancelled
Envoy will attempt a retry if the gRPC status code in the response headers is "cancelled" (1).

deadline-exceeded
Envoy will attempt a retry if the gRPC status code in the response headers is "deadline-exceeded" (4).

resource-exhausted
Envoy will attempt a retry if the gRPC status code in the response headers is "resource-exhausted" (8).

As with the x-envoy-retry-on header, the number of retries can be controlled via the
:ref:`config_http_filters_router_x-envoy-max-retries` header.

Note that retry policies can also be applied at the :ref:`route level
<config_http_conn_man_route_table_route_retry>`.

By default, Envoy will *not* perform retries unless you've configured them per above.
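
The ',' delimited policy list described above can be turned into a set of flags with a simple split-and-match loop. The following is a minimal standalone sketch, not the code from this change: the function name `parseGrpcRetryPolicies` and the `k...` constants are hypothetical, the flag values are assumed to be one bit per policy, and no whitespace trimming is attempted.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Illustrative flag values (assumption: one bit per gRPC retry policy).
constexpr uint32_t kRetryOnGrpcCancelled = 0x10;
constexpr uint32_t kRetryOnGrpcDeadlineExceeded = 0x20;
constexpr uint32_t kRetryOnGrpcResourceExhausted = 0x40;

// Splits a ','-delimited policy list and ORs together the flag for each
// recognized policy name. Unrecognized names are silently ignored.
uint32_t parseGrpcRetryPolicies(const std::string& header_value) {
  uint32_t flags = 0;
  std::istringstream stream(header_value);
  std::string policy;
  while (std::getline(stream, policy, ',')) {
    if (policy == "cancelled") {
      flags |= kRetryOnGrpcCancelled;
    } else if (policy == "deadline-exceeded") {
      flags |= kRetryOnGrpcDeadlineExceeded;
    } else if (policy == "resource-exhausted") {
      flags |= kRetryOnGrpcResourceExhausted;
    }
  }
  return flags;
}

// Example usage: two policies combine into one bitmask; unknown names yield 0.
void exampleParse() {
  assert(parseGrpcRetryPolicies("cancelled,resource-exhausted") ==
         (kRetryOnGrpcCancelled | kRetryOnGrpcResourceExhausted));
  assert(parseGrpcRetryPolicies("unknown-policy") == 0);
}
```

Because unknown names are ignored, a header that lists only unrecognized policies produces an empty mask, which matches the "no retries by default" behavior stated above.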

x-envoy-upstream-alt-stat-name
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Expand Down
5 changes: 5 additions & 0 deletions include/envoy/grpc/BUILD
Expand Up @@ -17,3 +17,8 @@ envoy_cc_library(
"//include/envoy/http:header_map_interface",
],
)

envoy_cc_library(
name = "status",
hdrs = ["status.h"],
)
51 changes: 51 additions & 0 deletions include/envoy/grpc/status.h
@@ -0,0 +1,51 @@
#pragma once

namespace Envoy {
namespace Grpc {

class Status {
public:
enum GrpcStatus {
// The RPC completed successfully.
Ok = 0,
// The RPC was canceled.
Canceled = 1,
// Some unknown error occurred.
Unknown = 2,
// An argument to the RPC was invalid.
InvalidArgument = 3,
// The deadline for the RPC expired before the RPC completed.
DeadlineExceeded = 4,
// Some resource for the RPC was not found.
NotFound = 5,
// A resource the RPC attempted to create already exists.
AlreadyExists = 6,
// Permission was denied for the RPC.
PermissionDenied = 7,
// Some resource is exhausted, resulting in RPC failure.
ResourceExhausted = 8,
// Some precondition for the RPC failed.
FailedPrecondition = 9,
// The RPC was aborted.
Aborted = 10,
// Some operation was requested outside of a legal range.
OutOfRange = 11,
// The RPC requested was not implemented.
Unimplemented = 12,
// Some internal error occurred.
Internal = 13,
// The RPC endpoint is currently unavailable.
Unavailable = 14,
// There was some data loss resulting in RPC failure.
DataLoss = 15,
// The RPC does not have required credentials for the RPC to succeed.
Unauthenticated = 16,

// This is a non-gRPC error code, indicating the status code in gRPC headers
// was invalid.
InvalidCode = -1,
};
};

} // Grpc
} // Envoy
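
As a rough illustration of how a raw `grpc-status` value might be mapped onto an enum like the one above, here is a standalone sketch. It is an assumption-laden stand-in rather than Envoy code: `std::optional` (C++17) replaces Envoy's `Optional`, the enum is abbreviated to a few codes, and the function name `parseGrpcStatus` is hypothetical. An empty value yields no result, while a malformed or out-of-range value yields `InvalidCode`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Abbreviated copy of the status codes, for illustration only.
enum class GrpcStatus : int {
  Ok = 0,
  Canceled = 1,
  DeadlineExceeded = 4,
  ResourceExhausted = 8,
  Unauthenticated = 16,
  InvalidCode = -1,
};

// Returns nullopt for an empty value, InvalidCode for a non-numeric or
// out-of-range value, and the matching status code otherwise.
std::optional<GrpcStatus> parseGrpcStatus(const std::string& value) {
  if (value.empty()) {
    return std::nullopt;
  }
  uint64_t code = 0;
  for (char c : value) {
    if (c < '0' || c > '9') {
      return GrpcStatus::InvalidCode;
    }
    code = code * 10 + static_cast<uint64_t>(c - '0');
    // Bail out as soon as the value exceeds the largest known code; this
    // also keeps the accumulator from overflowing on very long inputs.
    if (code > static_cast<uint64_t>(GrpcStatus::Unauthenticated)) {
      return GrpcStatus::InvalidCode;
    }
  }
  return static_cast<GrpcStatus>(code);
}

// Example usage.
void exampleStatus() {
  assert(parseGrpcStatus("8") == GrpcStatus::ResourceExhausted);
  assert(!parseGrpcStatus("").has_value());
}
```

Separating "absent" from "invalid" lets a caller treat a missing status as "nothing to report" while still rejecting a malformed one.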
1 change: 1 addition & 0 deletions include/envoy/http/header_map.h
Expand Up @@ -205,6 +205,7 @@ class HeaderEntry {
HEADER_FUNC(EnvoyMaxRetries) \
HEADER_FUNC(EnvoyOriginalPath) \
HEADER_FUNC(EnvoyRetryOn) \
HEADER_FUNC(EnvoyRetryGrpcOn) \
HEADER_FUNC(EnvoyUpstreamAltStatName) \
HEADER_FUNC(EnvoyUpstreamCanary) \
HEADER_FUNC(EnvoyUpstreamHealthCheckedCluster) \
Expand Down
11 changes: 7 additions & 4 deletions include/envoy/router/router.h
Expand Up @@ -37,10 +37,13 @@ class RedirectEntry {
class RetryPolicy {
public:
// clang-format off
static const uint32_t RETRY_ON_5XX = 0x1;
static const uint32_t RETRY_ON_CONNECT_FAILURE = 0x2;
static const uint32_t RETRY_ON_RETRIABLE_4XX = 0x4;
static const uint32_t RETRY_ON_REFUSED_STREAM = 0x8;
static const uint32_t RETRY_ON_5XX = 0x1;
static const uint32_t RETRY_ON_CONNECT_FAILURE = 0x2;
static const uint32_t RETRY_ON_RETRIABLE_4XX = 0x4;
static const uint32_t RETRY_ON_REFUSED_STREAM = 0x8;
static const uint32_t RETRY_ON_GRPC_CANCELLED = 0x10;
static const uint32_t RETRY_ON_GRPC_DEADLINE_EXCEEDED = 0x20;
static const uint32_t RETRY_ON_GRPC_RESOURCE_EXHAUSTED = 0x40;
// clang-format on

virtual ~RetryPolicy() {}
Expand Down
1 change: 1 addition & 0 deletions source/common/grpc/BUILD
Expand Up @@ -25,6 +25,7 @@ envoy_cc_library(
external_deps = ["protobuf"],
deps = [
"//include/envoy/common:optional",
"//include/envoy/grpc:status",
"//include/envoy/http:header_map_interface",
"//include/envoy/http:message_interface",
"//include/envoy/stats:stats_interface",
Expand Down
33 changes: 22 additions & 11 deletions source/common/grpc/common.cc
Expand Up @@ -24,6 +24,20 @@ namespace Grpc {

const std::string Common::GRPC_CONTENT_TYPE{"application/grpc"};

Optional<Status::GrpcStatus> Common::getGrpcStatus(const Http::HeaderMap& trailers) {
const Http::HeaderEntry* grpc_status_header = trailers.GrpcStatus();

uint64_t grpc_status_code;
if (!grpc_status_header || grpc_status_header->value().empty()) {
return Optional<Status::GrpcStatus>();
}
if (!StringUtil::atoul(grpc_status_header->value().c_str(), grpc_status_code) ||
grpc_status_code > Status::GrpcStatus::Unauthenticated) {
return Optional<Status::GrpcStatus>(Status::GrpcStatus::InvalidCode);
}
return Optional<Status::GrpcStatus>(static_cast<Status::GrpcStatus>(grpc_status_code));
}

void Common::chargeStat(const Upstream::ClusterInfo& cluster, const std::string& grpc_service,
const std::string& grpc_method, bool success) {
cluster.statsScope()
Expand Down Expand Up @@ -73,18 +87,17 @@ Http::MessagePtr Common::prepareHeaders(const std::string& upstream_cluster,

void Common::checkForHeaderOnlyError(Http::Message& http_response) {
// First check for grpc-status in headers. If it is here, we have an error.
const Http::HeaderEntry* grpc_status_header = http_response.headers().GrpcStatus();
if (!grpc_status_header) {
Optional<Status::GrpcStatus> grpc_status_code = Common::getGrpcStatus(http_response.headers());
if (!grpc_status_code.valid()) {
return;
}

uint64_t grpc_status_code;
if (!StringUtil::atoul(grpc_status_header->value().c_str(), grpc_status_code)) {
if (grpc_status_code.value() == Status::GrpcStatus::InvalidCode) {
throw Exception(Optional<uint64_t>(), "bad grpc-status header");
}

const Http::HeaderEntry* grpc_status_message = http_response.headers().GrpcMessage();
throw Exception(grpc_status_code,
throw Exception(grpc_status_code.value(),
grpc_status_message ? grpc_status_message->value().c_str() : EMPTY_STRING);
}

Expand All @@ -100,16 +113,14 @@ void Common::validateResponse(Http::Message& http_response) {
throw Exception(Optional<uint64_t>(), "no response trailers");
}

const Http::HeaderEntry* grpc_status_header = http_response.trailers()->GrpcStatus();
uint64_t grpc_status_code;
if (!grpc_status_header ||
!StringUtil::atoul(grpc_status_header->value().c_str(), grpc_status_code)) {
Optional<Status::GrpcStatus> grpc_status_code = Common::getGrpcStatus(*http_response.trailers());
if (!grpc_status_code.valid() || grpc_status_code.value() < 0) {
throw Exception(Optional<uint64_t>(), "bad grpc-status trailer");
}

if (grpc_status_code != 0) {
if (grpc_status_code.value() != 0) {
const Http::HeaderEntry* grpc_status_message = http_response.trailers()->GrpcMessage();
throw Exception(grpc_status_code,
throw Exception(grpc_status_code.value(),
grpc_status_message ? grpc_status_message->value().c_str() : EMPTY_STRING);
}
}
Expand Down
8 changes: 8 additions & 0 deletions source/common/grpc/common.h
Expand Up @@ -5,6 +5,7 @@

#include "envoy/common/exception.h"
#include "envoy/common/optional.h"
#include "envoy/grpc/status.h"
#include "envoy/http/header_map.h"
#include "envoy/http/message.h"
#include "envoy/stats/stats.h"
Expand All @@ -25,6 +26,13 @@ class Exception : public EnvoyException {

class Common {
public:
/**
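* Returns the GrpcStatus code from a given set of headers, if present.
* @param headers the headers to parse.
* @return the parsed status code, InvalidCode if the grpc-status value is not a valid gRPC
* status code, or an empty Optional if the grpc-status header is absent or empty.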
*/
static Optional<Status::GrpcStatus> getGrpcStatus(const Http::HeaderMap& headers);

/**
* Charge a success/failure stat to a cluster/service/method.
* @param cluster supplies the target cluster.
Expand Down
7 changes: 7 additions & 0 deletions source/common/http/headers.h
Expand Up @@ -30,6 +30,7 @@ class HeaderValues {
const LowerCaseString EnvoyMaxRetries{"x-envoy-max-retries"};
const LowerCaseString EnvoyOriginalPath{"x-envoy-original-path"};
const LowerCaseString EnvoyRetryOn{"x-envoy-retry-on"};
const LowerCaseString EnvoyRetryGrpcOn{"x-envoy-retry-grpc-on"};
const LowerCaseString EnvoyUpstreamAltStatName{"x-envoy-upstream-alt-stat-name"};
const LowerCaseString EnvoyUpstreamCanary{"x-envoy-upstream-canary"};
const LowerCaseString EnvoyUpstreamRequestTimeoutAltResponse{
Expand Down Expand Up @@ -92,6 +93,12 @@ class HeaderValues {
const std::string Retriable4xx{"retriable-4xx"};
} EnvoyRetryOnValues;

struct {
const std::string Cancelled{"cancelled"};
const std::string DeadlineExceeded{"deadline-exceeded"};
const std::string ResourceExhausted{"resource-exhausted"};
} EnvoyRetryOnGrpcValues;

struct {
const std::string _100Continue{"100-continue"};
} ExpectValues;
Expand Down
1 change: 1 addition & 0 deletions source/common/router/BUILD
Expand Up @@ -79,6 +79,7 @@ envoy_cc_library(
"//include/envoy/upstream:upstream_interface",
"//source/common/common:assert_lib",
"//source/common/common:utility_lib",
"//source/common/grpc:common_lib",
"//source/common/http:codes_lib",
"//source/common/http:headers_lib",
"//source/common/http:utility_lib",
Expand Down
2 changes: 2 additions & 0 deletions source/common/router/config_impl.cc
Expand Up @@ -40,6 +40,8 @@ RetryPolicyImpl::RetryPolicyImpl(const Json::Object& config) {
config.getObject("retry_policy")->getInteger("per_try_timeout_ms", 0));
num_retries_ = config.getObject("retry_policy")->getInteger("num_retries", 1);
retry_on_ = RetryStateImpl::parseRetryOn(config.getObject("retry_policy")->getString("retry_on"));
retry_on_ |=
RetryStateImpl::parseRetryGrpcOn(config.getObject("retry_policy")->getString("retry_on"));
}

ShadowPolicyImpl::ShadowPolicyImpl(const Json::Object& config) {
Expand Down
55 changes: 48 additions & 7 deletions source/common/router/retry_state_impl.cc
Expand Up @@ -7,6 +7,7 @@

#include "common/common/assert.h"
#include "common/common/utility.h"
#include "common/grpc/common.h"
#include "common/http/codes.h"
#include "common/http/headers.h"
#include "common/http/utility.h"
Expand All @@ -19,6 +20,9 @@ namespace Router {
const uint32_t RetryPolicy::RETRY_ON_5XX;
const uint32_t RetryPolicy::RETRY_ON_CONNECT_FAILURE;
const uint32_t RetryPolicy::RETRY_ON_RETRIABLE_4XX;
const uint32_t RetryPolicy::RETRY_ON_GRPC_CANCELLED;
const uint32_t RetryPolicy::RETRY_ON_GRPC_DEADLINE_EXCEEDED;
const uint32_t RetryPolicy::RETRY_ON_GRPC_RESOURCE_EXHAUSTED;

RetryStatePtr RetryStateImpl::create(const RetryPolicy& route_policy,
Http::HeaderMap& request_headers,
Expand All @@ -29,7 +33,8 @@ RetryStatePtr RetryStateImpl::create(const RetryPolicy& route_policy,
RetryStatePtr ret;

// We short circuit here and do not bother with an allocation if there is no chance we will retry.
if (request_headers.EnvoyRetryOn() || route_policy.retryOn()) {
if (request_headers.EnvoyRetryOn() || request_headers.EnvoyRetryGrpcOn() ||
route_policy.retryOn()) {
ret.reset(new RetryStateImpl(route_policy, request_headers, cluster, runtime, random,
dispatcher, priority));
}
Expand All @@ -48,12 +53,15 @@ RetryStateImpl::RetryStateImpl(const RetryPolicy& route_policy, Http::HeaderMap&

if (request_headers.EnvoyRetryOn()) {
retry_on_ = parseRetryOn(request_headers.EnvoyRetryOn()->value().c_str());
if (retry_on_ != 0 && request_headers.EnvoyMaxRetries()) {
const char* max_retries = request_headers.EnvoyMaxRetries()->value().c_str();
uint64_t temp;
if (StringUtil::atoul(max_retries, temp)) {
retries_remaining_ = temp;
}
}
if (request_headers.EnvoyRetryGrpcOn()) {
retry_on_ |= parseRetryGrpcOn(request_headers.EnvoyRetryGrpcOn()->value().c_str());
}
if (retry_on_ != 0 && request_headers.EnvoyMaxRetries()) {
const char* max_retries = request_headers.EnvoyMaxRetries()->value().c_str();
uint64_t temp;
if (StringUtil::atoul(max_retries, temp)) {
retries_remaining_ = temp;
}
}

Expand Down Expand Up @@ -96,6 +104,22 @@ uint32_t RetryStateImpl::parseRetryOn(const std::string& config) {
return ret;
}

uint32_t RetryStateImpl::parseRetryGrpcOn(const std::string& retry_grpc_on_header) {
uint32_t ret = 0;
std::vector<std::string> retry_on_list = StringUtil::split(retry_grpc_on_header, ',');
for (const std::string& retry_on : retry_on_list) {
if (retry_on == Http::Headers::get().EnvoyRetryOnGrpcValues.Cancelled) {
ret |= RetryPolicy::RETRY_ON_GRPC_CANCELLED;
} else if (retry_on == Http::Headers::get().EnvoyRetryOnGrpcValues.DeadlineExceeded) {
ret |= RetryPolicy::RETRY_ON_GRPC_DEADLINE_EXCEEDED;
} else if (retry_on == Http::Headers::get().EnvoyRetryOnGrpcValues.ResourceExhausted) {
ret |= RetryPolicy::RETRY_ON_GRPC_RESOURCE_EXHAUSTED;
}
}

return ret;
}

void RetryStateImpl::resetRetry() {
if (callback_) {
cluster_.resourceManager(priority_).retries().dec();
Expand Down Expand Up @@ -169,6 +193,23 @@ bool RetryStateImpl::wouldRetry(const Http::HeaderMap* response_headers,
}
}

if (retry_on_ &
(RetryPolicy::RETRY_ON_GRPC_CANCELLED | RetryPolicy::RETRY_ON_GRPC_DEADLINE_EXCEEDED |
RetryPolicy::RETRY_ON_GRPC_RESOURCE_EXHAUSTED) &&
response_headers) {
Optional<Grpc::Status::GrpcStatus> status = Grpc::Common::getGrpcStatus(*response_headers);
if (status.valid()) {
if ((status.value() == Grpc::Status::Canceled &&
(retry_on_ & RetryPolicy::RETRY_ON_GRPC_CANCELLED)) ||
(status.value() == Grpc::Status::DeadlineExceeded &&
(retry_on_ & RetryPolicy::RETRY_ON_GRPC_DEADLINE_EXCEEDED)) ||
(status.value() == Grpc::Status::ResourceExhausted &&
(retry_on_ & RetryPolicy::RETRY_ON_GRPC_RESOURCE_EXHAUSTED))) {
return true;
}
}
}

return false;
}
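
The gRPC branch of the retry decision above boils down to "does the response status map to a flag that is set in the policy mask?". The sketch below restates that check in standalone form under stated assumptions: it uses plain integers for the status codes, hypothetical names (`flagForGrpcStatus`, `wouldRetryOnGrpcStatus`, the `k...` constants), and a status-to-flag lookup instead of the chained condition in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative flag values (assumption: one bit per gRPC retry policy).
constexpr uint32_t kRetryOnGrpcCancelled = 0x10;
constexpr uint32_t kRetryOnGrpcDeadlineExceeded = 0x20;
constexpr uint32_t kRetryOnGrpcResourceExhausted = 0x40;

// Maps a numeric gRPC status code to the policy flag that covers it, or 0
// when no retry policy applies to that status.
uint32_t flagForGrpcStatus(int grpc_status) {
  switch (grpc_status) {
  case 1: // cancelled
    return kRetryOnGrpcCancelled;
  case 4: // deadline-exceeded
    return kRetryOnGrpcDeadlineExceeded;
  case 8: // resource-exhausted
    return kRetryOnGrpcResourceExhausted;
  default:
    return 0;
  }
}

// True when the configured policy mask includes the flag for this status.
bool wouldRetryOnGrpcStatus(uint32_t retry_on, int grpc_status) {
  return (retry_on & flagForGrpcStatus(grpc_status)) != 0;
}

// Example usage.
void exampleDecision() {
  const uint32_t policy = kRetryOnGrpcCancelled | kRetryOnGrpcResourceExhausted;
  assert(wouldRetryOnGrpcStatus(policy, 8));  // resource-exhausted is enabled
  assert(!wouldRetryOnGrpcStatus(policy, 4)); // deadline-exceeded is not
}
```

A lookup of this shape keeps the decision to a single mask test, at the cost of one extra helper compared with the inline condition.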

Expand Down
3 changes: 3 additions & 0 deletions source/common/router/retry_state_impl.h
Expand Up @@ -27,6 +27,9 @@ class RetryStateImpl : public RetryState {

static uint32_t parseRetryOn(const std::string& config);

// Returns the RetryPolicy flags parsed from a ','-delimited list of gRPC retry policies,
// such as the value of the x-envoy-retry-grpc-on header.
static uint32_t parseRetryGrpcOn(const std::string& retry_grpc_on_header);

// Router::RetryState
bool enabled() override { return retry_on_ != 0; }
bool shouldRetry(const Http::HeaderMap* response_headers,
Expand Down
1 change: 1 addition & 0 deletions test/common/grpc/BUILD
Expand Up @@ -26,6 +26,7 @@ envoy_cc_test(
"//source/common/http:headers_lib",
"//test/mocks/upstream:upstream_mocks",
"//test/proto:helloworld_proto",
"//test/test_common:utility_lib",
],
)

Expand Down