Skip to content

http: global connection manager per-stream idle timeouts.#3879

Merged
htuch merged 8 commits intoenvoyproxy:masterfrom
htuch:global-idle-timeout
Jul 23, 2018
Merged

http: global connection manager per-stream idle timeouts.#3879
htuch merged 8 commits intoenvoyproxy:masterfrom
htuch:global-idle-timeout

Conversation

@htuch
Copy link
Member

@htuch htuch commented Jul 17, 2018

This is a followup to #3841, where we introduce HCM-wide stream idle timeouts. This has two effects:

  1. We can now timeout immediately after stream creation, potentially before receiving request
    headers and routing.

  2. A default timeout can be configured across all routes. This is overridable on a per-route basis.

The default and overriding semantics are explained in the docs. Also added as a bonus some docs
about how timeouts interact more generally in Envoy.

Fixes #3853.

Risk Level: Low. While there is some change to the per-route vs. HCM wide semantics for stream idle
timeouts, it's not anticipated this feature is in common use yet (it's only a couple of days since
landing), and the caveats in #3841 with the new 5 minute default timeout should already apply.
Testing: Unit/integration tests added.

Signed-off-by: Harvey Tuch htuch@google.com

This is a followup to envoyproxy#3841, where we introduce HCM-wide stream idle timeouts. This has two effects:

1. We can now timeout immediately after stream creation, potentially before receiving request
   headers and routing.

2. A default timeout can be configured across all routes. This is overridable on a per-route basis.

The default and overriding semantics are explained in the docs. Also added as a bonus some docs
about how timeouts interact more generally in Envoy.

Fixes envoyproxy#3853.

Risk Level: Low. While there is some change to the per-route vs. HCM wide semantics for stream idle
  timeouts, it's not anticipated this feature is in common use yet (it's only a couple of days since
  landing), and the caveats in envoyproxy#3841 with the new 5 minute default timeout should already apply.
Testing: Unit/integration tests added.

Signed-off-by: Harvey Tuch <htuch@google.com>
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this one - it's great to see our HTTP idle timeouts getting better and better!

google.protobuf.Duration idle_timeout = 11 [(gogoproto.stdduration) = true];

// The stream idle timeout for connections managed by the connection manager.
// If not specified, this defaults to 5 minutes. The default value was select
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> selected

// If not specified, this defaults to 5 minutes. The default value was select
// so as not to interfere with any smaller configured timeouts that may have
// existed in configurations prior to the introduction of this feature, while
// introducing robustness to TCP connections that terminate without FIN.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a FIN?

// headers, the :ref:`stream_idle_timeout
// <envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.stream_idle_timeout>`
// applies. Each time an encode/decode event for headers or data is processed
// for the stream, the timer will be reset. If the timeout fires, the stream
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing of note - I don't believe there's any encode/decode events between initial receipt of headers and completed headers. While I've not seen giant request headers in prod (due to restricted size limits), I've seen users send 100k+ response headers, so I assume there's folks out there who do large headers on both ends. If headers are broken into many frames and shipped over a small connection, I think the connection could still idle out, which is unfortunate since it is making forward progress. I think documenting this may cause more confusion than it's worth but at least calling it out here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add something on the potential mismatch between wire traffic and observed events.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good point @alyssawilk. I agree we probably don't need to worry about that now but good to keep in the back of our minds.

Timeouts
--------

Various configurable timeouts apply to an HTTP connection and its constituent streams:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

if (idle_timer_ == nullptr) {
idle_timer_ = connection_manager_.read_callbacks_->connection().dispatcher().createTimer(
[this]() -> void { onIdleTimeout(); });
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a test for setting a general timeout and overriding with 0? It looks to me like we'd set idle_timeout_ms_ = 0, then resetIdleTimer and arm it for 'immediately'. Maybe resetIdleTimer should do the "no idle timeout set" check?

Copy link
Member Author

@htuch htuch Jul 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good catch, we also need to change the documented per-route override semantics for this to make it possible to disable at the route level.

htuch added 4 commits July 18, 2018 12:28
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
Signed-off-by: Harvey Tuch <htuch@google.com>
alyssawilk
alyssawilk previously approved these changes Jul 18, 2018
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM though if @mattklein123 has time to do a pass today that'd be great.

@mattklein123
Copy link
Member

mattklein123 commented Jul 18, 2018 via email

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one small comment request, also needs a master merge. Thank you for adding this!

// headers, the :ref:`stream_idle_timeout
// <envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.stream_idle_timeout>`
// applies. Each time an encode/decode event for headers or data is processed
// for the stream, the timer will be reset. If the timeout fires, the stream
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good point @alyssawilk. I agree we probably don't need to worry about that now but good to keep in the back of our minds.

Timeouts
--------

Various configurable timeouts apply to an HTTP connection and its constituent streams:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

idle_timer_ = connection_manager_.read_callbacks_->connection().dispatcher().createTimer(
[this]() -> void { onIdleTimeout(); });
}
} else if (idle_timer_ != nullptr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind adding a few comments here since this logic block is so important and might not be immediately obvious to the reader? (Basically describing in code that the route told us to disable the idle timeout so we are killing the connection level one if it exists).

Signed-off-by: Harvey Tuch <htuch@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants