Skip to content

http: add request header timer#13341

Merged
mattklein123 merged 23 commits intoenvoyproxy:masterfrom
akonradi:http-request-header-timer
Nov 9, 2020
Merged

http: add request header timer#13341
mattklein123 merged 23 commits intoenvoyproxy:masterfrom
akonradi:http-request-header-timer

Conversation

@akonradi
Copy link
Contributor

Commit Message: Add request header timeout to HTTP connection manager
Additional Description:
Add a timer and config option to enforce a maximum allowed duration between when a downstream starts a new stream and when it finishes sending headers on that stream.
Risk Level: low - behavior is disabled unless the config option is specified
Testing: added tests
Docs Changes: autogenerated proto docs
Release Notes: documented new config field

Towards #11427

Add a new config field for an additional timeout that will cancel
streams that take too long to send headers, and implement it in the
HTTP Connection Manager.

Signed-off-by: Alex Konradi <akonradi@google.com>
@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #13341 was opened by akonradi.

see: more, trace.

…er-timer

Signed-off-by: Alex Konradi <akonradi@google.com>
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool to see work towards more fine grained (and maybe adaptive) timeouts!

Starting out with a few questions since I want to make sure I know what we're aiming for both here and in the long run

// when the request is initiated, and is disarmed when the last byte of the headers is sent
// upstream (i.e. all decoding filters have processed the headers). If not specified or set to 0,
// this timeout is disabled.
google.protobuf.Duration request_headers_timeout = 41
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this sounds like a different timeout than described in the issue. I think issue is about the client sending headers (not hogging resources with a trickle attack). Going through the filter chain seems to mix Envoy processing time with potentially malicious clients. Did we want the latter as well?

Also, do we think this is a fixed timer long term, or do we hope it will evolve into the ranged timeout in the linked isssue? If so I think we want to start with a different API and hide it until the range timeout is implemented.

I'm going to tag antonio for first pass here as I think he has a bit more context than I do.

Copy link
Member

@mattklein123 mattklein123 Oct 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I would not recommend making this involve filters at all. I think we want a timer that starts at stream init (1st header block), and ends after all blocks/continuations are complete?

Given ^ it might be easier to configure this at the codec/protocol level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the upstreams bit was a mistake on my part. I think the existing implementation that disables the timer in onHeadersReceived does what we want.

google.protobuf.Duration request_timeout = 28
[(udpa.annotations.security).configure_for_untrusted_downstream = true];

// The amount of time that Envoy will wait for the headers to be received. The timer is activated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the request headers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Signed-off-by: Alex Konradi <akonradi@google.com>
…er-timer

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Call disableTimer() before setting deleting for testing

Signed-off-by: Alex Konradi <akonradi@google.com>
…er-timer

Signed-off-by: Alex Konradi <akonradi@google.com>
@akonradi
Copy link
Contributor Author

Windows failure looks to be unrelated.

// The amount of time that Envoy will wait for the request headers to be received. The timer is
// activated when the request is initiated, and is disarmed when the last byte of the headers has
// been received (i.e. all decoding filters have processed the headers). If not specified or set
// to 0, this timeout is disabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if this duration is configured to a negative number? Does a proto validator to ensure that durations are >= 0 exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, fixed.

[(udpa.annotations.security).configure_for_untrusted_downstream = true];

// The amount of time that Envoy will wait for the request headers to be received. The timer is
// activated when the request is initiated, and is disarmed when the last byte of the headers has
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"when the request is initiated" -> "when the first byte of the headers is received"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

EXPECT_CALL(*codec_, dispatch(_)).WillOnce(Invoke([&](Buffer::Instance&) -> Http::Status {
Event::MockTimer* request_header_timer = setUpTimer();
EXPECT_CALL(*request_header_timer, enableTimer(request_headers_timeout_, _));
EXPECT_CALL(*request_header_timer, disableTimer());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What triggers this disable? the request not parsing and triggering an error reply?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was the actual issue. I made the test more restrictive to verify that the correct behavior is being checked.

return Http::okStatus();
}));

Buffer::OwnedImpl fake_input("1234");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this input be a partial valid request to avoid potential for header parse error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this test is more realistic now.


conn_manager_->newStream(response_encoder_);
EXPECT_CALL(filter_callbacks_.connection_.dispatcher_, setTrackedObject(_)).Times(2);
request_header_timer->invokeCallback();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invoking this callback from within dispatch doesn't seem like something that would normally happen. Please call this callback outside codec dispatch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

new TestRequestHeaderMapImpl{{":authority", "host"}, {":path", "/"}, {":method", "GET"}}};

// the second parameter 'false' leaves the stream open
decoder->decodeHeaders(std::move(headers), false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's just too much mocking happening on these "tests". I don't follow what you're testing here or how it relates to the call to conn_manager_->onData below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's reasonable, because the tests were bad. PTAL at the cleaned-up version.

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Event::TimerPtr stream_idle_timer_;
// Per-stream request timeout.
// Per-stream request timeout. This timer is enabled when the stream is created and disabled
// when the downstream closes the connection. If triggered, it will close the stream.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some errors in this comment. This seems to be a stream timeout, but there are references to downstream closing the connection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.

This is what was decided on in the PR and I forgot to change it back
before now.

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
…er-timer

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
…er-timer

Signed-off-by: Alex Konradi <akonradi@google.com>
Signed-off-by: Alex Konradi <akonradi@google.com>
@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #13341 (comment) was created by @akonradi.

see: more, trace.

@akonradi
Copy link
Contributor Author

@alyssawilk did you want to take another pass at this?

Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall! I've added a few comments but I'm going to throw it over to snow for the non-googler pass


// The amount of time that Envoy will wait for the request headers to be received. The timer is
// activated when the first byte of the headers is received, and is disarmed when the last byte of
// the headers has been received (i.e. all decoding filters have processed the headers). If not
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the info in parens is not accurate. Maybe just remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

// stream.
Event::TimerPtr request_header_timer_;
// Per-stream alive duration. This timer is enabled once when the stream is created and, if
// triggered, will close the stream.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional, think it's worth mentioning which of these try to send a reply?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to punt on this since whether or not a reply is sent for some of these is dependent on a runtime override.

request_headers_timeout->set_seconds(1);
request_headers_timeout->set_nanos(0);
});
setDownstreamProtocol(Http::CodecClient::Type::HTTP1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be inclined to return if the protocol isn't HTTP/1, so we don't run this twice (and the H2 test confusingly runs HTTP/1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// Track locally queued bytes, to make sure the outbound client queue doesn't back up.
uint64_t bytes_to_send = send_buffer.length();
raw_connection->addBytesSentCallback([&](uint64_t bytes) { bytes_to_send -= bytes; });
raw_connection->write(send_buffer, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can replace most of this with a call to
sendRawHttpAndWaitForResponse()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test needs to do stuff to the connection while the dispatcher would otherwise be waiting, but I cribbed the use of RawConnectionDriver and that simplified things.

Signed-off-by: Alex Konradi <akonradi@google.com>
…er-timer

Signed-off-by: Alex Konradi <akonradi@google.com>
@akonradi
Copy link
Contributor Author

akonradi commented Nov 3, 2020

@snowp can I get a review on this?

@alyssawilk
Copy link
Contributor

Oops - snow is out today - I'll update the maintainer calendar and hopefully he can take a pass tomorrow.

Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this seems good to me, just one test suggestion

Comment on lines +542 to +543
EXPECT_TRUE(connection_driver->closed());
EXPECT_THAT(response, HasSubstr("408"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe check for the error text as well to make sure that we're hitting the timeout we think we are here? Or check stats

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Signed-off-by: Alex Konradi <akonradi@google.com>
@akonradi akonradi requested a review from ggreenway as a code owner November 4, 2020 15:46
snowp
snowp previously approved these changes Nov 4, 2020
Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@akonradi
Copy link
Contributor Author

akonradi commented Nov 4, 2020

@envoyproxy/api-shepherds review needed for the new timer config

@mattklein123 mattklein123 self-assigned this Nov 4, 2020
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API LGTM. Can you please update https://www.envoyproxy.io/docs/envoy/latest/faq/configuration/timeouts? Thank you.

/wait

Signed-off-by: Alex Konradi <akonradi@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@repokitteh-read-only repokitteh-read-only bot removed the api label Nov 9, 2020
@mattklein123 mattklein123 merged commit e1c138b into envoyproxy:master Nov 9, 2020
@akonradi akonradi deleted the http-request-header-timer branch November 9, 2020 19:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants