Skip to content

thrift: don't close the downstream on an upstream overflow #19133

Merged
zuercher merged 5 commits intoenvoyproxy:mainfrom
rgs1:thrift-handle-upstream-connection-overflow
Dec 2, 2021
Merged

thrift: don't close the downstream on an upstream overflow #19133
zuercher merged 5 commits intoenvoyproxy:mainfrom
rgs1:thrift-handle-upstream-connection-overflow

Conversation

@rgs1
Copy link
Copy Markdown
Member

@rgs1 rgs1 commented Nov 29, 2021

When we fail to get an upstream connection (e.g.: PoolFailureReason::Overflow)
there's no need to close the downstream connection, since the request never
made it through. So we keep it open and avoid an issue that happens when
closing remote connections after a local response - see below.

There is a potentially separate issue which isn't addressed in this change:
when sendLocalReply() is called with end_stream=true the
local_response_sent_ marker is not set:

https://github.com/envoyproxy/envoy/blob/main/source/extensions/filters/network/thrift_proxy/conn_manager.cc#L702

Because using end_stream=true will call close with Network::ConnectionCloseType::FlushWrite
we end with a delayed close() which means that the subsequent calls
to applyDecoderFilters() might be racy (e.g.: downstream socket might
or might not be closed). This can generate a crash.

A possible solution is setting local_response_sent_=true regardless
of end_stream=true, given that I can't see any reasons why this would
be problematic.

I'll follow-up with more unit tests and an integration test
illustrating the issue with the delayed close() calls.

Signed-off-by: Raul Gutierrez Segales rgs@pinterest.com

…rors

When we fail to get an upstream connection (e.g.: PoolFailureReason::Overflow)
there's no need to close the downstream connection, since the request never
made it through. So we keep it open and avoid an issue that happens when
closing remote connections after a local response - see below.

There is a potentially separate issue which isn't addressed in this change:
when sendLocalReply() is called with `end_stream=true` the
`local_response_sent_` marker is not set:

https://github.com/envoyproxy/envoy/blob/main/source/extensions/filters/network/thrift_proxy/conn_manager.cc#L702

Because using `end_stream=true` will call close with `Network::ConnectionCloseType::FlushWrite`
we end with a delayed close() which means that the subsequent calls
to `applyDecoderFilters()` might be racy (e.g.: downstream socket might
or might not be closed). This can generate a crash.

A possible solution is setting `local_response_sent_=true` regardless
of `end_stream=true`, given that I can't see any reasons why this would
be problematic.

I'll follow-up with more unit tests and an integration test
illustrating the issue with the delayed close() calls.

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
@rgs1 rgs1 requested a review from zuercher as a code owner November 29, 2021 23:47
@rgs1
Copy link
Copy Markdown
Member Author

rgs1 commented Nov 29, 2021

cc: @fishcakez

Raul Gutierrez Segales added 2 commits November 29, 2021 19:50
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
zuercher
zuercher previously approved these changes Dec 1, 2021
Copy link
Copy Markdown
Member

@zuercher zuercher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to do this.

Raul Gutierrez Segales added 2 commits December 2, 2021 13:44
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
@zuercher zuercher enabled auto-merge (squash) December 2, 2021 18:54
@zuercher zuercher merged commit 2e0ad0b into envoyproxy:main Dec 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants