Skip to content

http: rc details for main Envoy workflow#6560

Merged
alyssawilk merged 8 commits intoenvoyproxy:masterfrom
alyssawilk:sendLocalReply
Apr 30, 2019
Merged

http: rc details for main Envoy workflow#6560
alyssawilk merged 8 commits intoenvoyproxy:masterfrom
alyssawilk:sendLocalReply

Conversation

@alyssawilk
Copy link
Contributor

@alyssawilk alyssawilk commented Apr 11, 2019

Adding response code details to for router and http connection manager local replies.

Risk Level: low (adding strings to access logger)
Testing: unit tests
Docs Changes: n/a
Release Notes: n/a
Part of #6542

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk alyssawilk added the no stalebot Disables stalebot from closing an issue label Apr 11, 2019
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
// sufficient? But maybe there's cases where we want multiple levels to add
// their information.
std::string details =
absl::StrCat("upstream_reset{", reset_reason, ",", transport_failure_reason, "}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the reset reason and the transport failure reason are already available in the StreamInfo so why include them also in the rc details string? Seems simpler to just have the rc details by a fixed string like "upstream reset" which can be easily checked for and then access logs can look at the other fields if they want more specifics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's totally doable. What I don't like about that is I feel like it encourages a paradigm of "look elsewhere for the underlying failure cause"

So today we have have "rc details" which might point us to "upstream failure" which would require us to know which of transport failure and/or reset reason to debug. But then we could have a generic "cache failure" which could tell you to look at the cache details (was the entry not present? expired? etc) and "WASM failure" with a bunch of reasons the code failed to execute.

I want all of the information to be communicated in access logs and headers and I'd mildly prefer to combine the data in the string as it's created rather than to encourage each complicated filter to split out layers of reasons.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. It does make it a bit difficult to programmatically check if the response code details from a StreamInfo matches this "constant" but perhaps that is not a big deal - we could add some kind of regex matching if we really want that.

LastFlag = StreamIdleTimeout
};

struct ResponseCodeDetailValues {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those look great, thanks, but you're effectively rewriting even the existing Proxy-Status types, so what's your plan in regards to that header? Using corresponding standard types and then using those values in the details field or completely ignoring standard types and only using http_{request,response}_error with those values in details? In either case, I think you might need a std::pair (or perhaps even std::tuple) here, instead of only strings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when we implement proxy-status these could be extended. I could do a one off with today's proxy-status codes but I really don't want to sign up for keeping it up to date and I'd prefer to not add code which is going to bitrot as the IETF makes forward progress. I don't think creating this without the tuple creates more work for when we implement proxy-status, and it might reduce it (by filling in the details field)

const std::string PATH_NORMALIZATION_FAILED = "path_normalization_failed";
// The request was rejected because it attempted an unsupported upgrade.

// TODO(reviewers) do we want a standard of filter etc. prefix or unique IDs enough?
Copy link
Contributor

@PiotrSikora PiotrSikora Apr 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we simply use filter's well know name? Unique ID might make debugging harder.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

// TODO(#6542): add values for sendLocalReply use-cases
// configured limits.
const std::string REQUEST_PAYLOAD_TOO_LARGE = "request_payload_too_large";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think the preference is to use PascalCase for these constants.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PascalCase for enums for sure. For consts "In the Envoy codebase we use ConstantVar or CONSTANT_VAR."

That said I thought ALL_CAPS was strictly prefered and I prefer PascalCase so I'll switch nonetheless :-P

Copy link
Contributor

@jmarantz jmarantz Apr 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had argued for a slight preference for PascalCase for constants to avoid collisions with macros in files we don't control. That's codified mildly in https://github.com/envoyproxy/envoy/blob/master/STYLE.md

The Google C++ style guide points out that constant vars should be named kConstantVar. In the Envoy codebase we use ConstantVar or CONSTANT_VAR. If you pick CONSTANT_VAR, please be certain the name is globally significant to avoid potential conflicts with #defines, which are not namespace-scoped, and may appear in externally controlled header files.

// sufficient? But maybe there's cases where we want multiple levels to add
// their information.
std::string details =
absl::StrCat("upstream_reset{", reset_reason, ",", transport_failure_reason, "}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. It does make it a bit difficult to programmatically check if the response code details from a StreamInfo matches this "constant" but perhaps that is not a big deal - we could add some kind of regex matching if we really want that.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk alyssawilk changed the title WIP: rc details for main Envoy workflow http: rc details for main Envoy workflow Apr 24, 2019
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a few nits

// 408 direct response after timeout.
EXPECT_CALL(response_encoder_, encodeHeaders(_, false))
.WillOnce(Invoke([](const HeaderMap& headers, bool) -> void {
.WillOnce(Invoke([&](const HeaderMap& headers, bool) -> void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed? or is this lambda missing an expectation?

EXPECT_CALL(*filter, encodeHeaders(_, true));
EXPECT_CALL(response_encoder_, encodeHeaders(_, true))
.WillOnce(Invoke([](const HeaderMap& headers, bool) -> void {
.WillOnce(Invoke([filter](const HeaderMap& headers, bool) -> void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: should we be consisting in how we pass the filter to the lamba? its implicitly passed by & elsewhere

const std::string RouteNotFound = "route_not_found";
// A direct response was generated by the router filter.
const std::string DirectResponse = "direct_response";
// The request was rejected by the router filter because there was no route found.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "because the cluster in the selected route does not exist"?

const std::string& basic_details =
downstream_response_started_ ? StreamInfo::ResponseCodeDetails::get().LateUpstreamReset
: StreamInfo::ResponseCodeDetails::get().EarlyUpstreamReset;
std::string details = absl::StrCat(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const

std::function<void(HeaderMap& headers)> modify_headers,
const absl::optional<Grpc::Status::GrpcStatus> grpc_status) PURE;
const absl::optional<Grpc::Status::GrpcStatus> grpc_status,
absl::string_view details = "") PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default arguments is only allowed for non-virtual methods
https://google.github.io/styleguide/cppguide.html#Default_Arguments

// The upstream connection was reset before a response was started. This
// will generally be accompanied by details about why the reset occurred.
const std::string EarlyUpstreamReset = "upstream_reset_before_response_started";
// The upstream connection was reset before a response was started. This
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: before -> after

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Copy link
Contributor Author

Oops, fixed one last comment. Any other requests?

const std::string ViaUpstream = "via_upstream";

// TODO(#6542): add values for sendLocalReply use-cases
// configured limits.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this comment is cut off.

const std::string LowVersion = "low_version";
// The request was rejected due to the Host: or :authority field missing
const std::string MissingHost = "missing_host_header";
// The request was rejected due to the Host: or :authority field missing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy paste mistake for this comment

Copy link
Contributor

@eziskind eziskind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks!

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@alyssawilk alyssawilk merged commit e866717 into envoyproxy:master Apr 30, 2019
mpuncel added a commit to mpuncel/envoy that referenced this pull request May 1, 2019
* master: (35 commits)
  Revert "api: Add total_issued_requests to Upstream Locality and Endpoint Stats. (envoyproxy#6692)" (envoyproxy#6761)
  Add test for the SocketOptionFactory::buildLiteralOptions() method. (envoyproxy#6724)
  Add test of parsing weighted_cluster route configuration to improve test coverage. (envoyproxy#6711)
  test: reducing H2 test permutations, increasing coverage time (envoyproxy#6753)
  Support gRPC-JSON translate without the google.api.http option. (envoyproxy#6731)
  quiche: implement QuicEpollClock (envoyproxy#6745)
  http: rc details for main Envoy workflow (envoyproxy#6560)
  quiche: implement QuicSystemEventLoopImpl (envoyproxy#6723)
  http: tracking 100s from upstream in stats (envoyproxy#6746)
  coverage: run without deprecated  option (envoyproxy#6752)
  quiche: Implement spdy_test_helpers_impl. (envoyproxy#6741)
  [test] convert listener test stubs to v2 API (envoyproxy#6735)
  api: Add total_issued_requests to Upstream Locality and Endpoint Stats. (envoyproxy#6692)
  quiche: Implement http2_reconstruct_object_impl.h. (envoyproxy#6717)
  build: patch protobuf for UBSAN issue. (envoyproxy#6721)
  router: scoped rds (2a): scoped routing configuration protos (envoyproxy#6675)
  tap: use move semantics for submitTrace (envoyproxy#6709)
  quiche: add epoll_server for testing (envoyproxy#6650)
  Increase timeout of the coverage test run to 3000 seconds as it is now bumping in the current 2000s limit causing coverage run to abort sometimes. (envoyproxy#6722)
  quiche: Update tarball to commit 43a1c0f10f2855c3cd142f500e8d19ac6d6f5a8c (envoyproxy#6718)
  ...

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
jeffpiazza-google pushed a commit to jeffpiazza-google/envoy that referenced this pull request May 3, 2019
Adding response code details to for router and http connection manager local replies.

Risk Level: low (adding strings to access logger)
Testing: unit tests
Docs Changes: n/a
Release Notes: n/a
Part of envoyproxy#6542

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>

Signed-off-by: Jeff Piazza <jeffpiazza@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no stalebot Disables stalebot from closing an issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants