-
Notifications
You must be signed in to change notification settings - Fork 5.3k
connection: propagate I/O errors to network filters. #13941
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
232c0ff
b62641f
8f58ce7
848af7d
ae3fc5b
4593733
8c5a5da
c46d41e
6e04b67
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,9 +8,11 @@ namespace Network { | |
| */ | ||
| enum class PostIoAction { | ||
| // Close the connection. | ||
| Close, | ||
| CloseGraceful, | ||
| // Keep the connection open. | ||
| KeepOpen | ||
| KeepOpen, | ||
| // Close the connection because of an error. | ||
| CloseError, | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems that a majority of Close cases are being renamed CloseError. Should we also rename Close to CloseGraceful or similar as a way to improve our chances of catching all cases that need to be adjusted?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
| }; | ||
|
|
||
| } // namespace Network | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -25,8 +25,12 @@ namespace { | |
|
|
||
| constexpr absl::string_view kTransportSocketConnectTimeoutTerminationDetails = | ||
| "transport socket timeout was reached"; | ||
| constexpr absl::string_view kDownstreamConnectionTerminationDetails = | ||
| "downstream connection was terminated"; | ||
| constexpr absl::string_view kUpstreamConnectionTerminationDetails = | ||
| "upstream connection was terminated"; | ||
|
|
||
| } | ||
| } // namespace | ||
|
|
||
| void ConnectionImplUtility::updateBufferStats(uint64_t delta, uint64_t new_total, | ||
| uint64_t& previous_total, Stats::Counter& stat_total, | ||
|
|
@@ -585,7 +589,20 @@ void ConnectionImpl::onReadReady() { | |
| // a connection close. | ||
| if ((!enable_half_close_ && result.end_stream_read_)) { | ||
| result.end_stream_read_ = false; | ||
| result.action_ = PostIoAction::Close; | ||
| result.action_ = PostIoAction::CloseGraceful; | ||
| } | ||
|
|
||
| if (result.action_ == PostIoAction::CloseError) { | ||
| if (dynamic_cast<ServerConnectionImpl*>(this)) { | ||
| stream_info_.setConnectionTerminationDetails(kDownstreamConnectionTerminationDetails); | ||
| stream_info_.setResponseFlag(StreamInfo::ResponseFlag::DownstreamConnectionTermination); | ||
| } else { | ||
| stream_info_.setUpstreamTransportFailureReason(kUpstreamConnectionTerminationDetails); | ||
| stream_info_.setConnectionTerminationDetails(kUpstreamConnectionTerminationDetails); | ||
| stream_info_.setResponseFlag(StreamInfo::ResponseFlag::UpstreamConnectionTermination); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: delegate stats update to virtual method so you can avoid the dynamic cast. |
||
| } | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Those values are set on the upstream connection's stream info, and they are never propagated to downstream connection's stream info or access logs. I'm not sure what's the best way to solve that.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could the high level request object query the connection as part of the termination process to determine if termination was graceful or abrupt? |
||
| // Force "end_stream" so that filters can process this error. | ||
| result.end_stream_read_ = true; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I usually associate end_stream_read_ with graceful termination. Is it appropriate to set this? Do we need alternate signaling to notify filters about abrupt terminations via either additional arguments to push pipeline or a new virtual method which is called on abrupt termination.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree; I think of end_stream as always a graceful operation. I think a different signal for error is more appropriate.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The issue is that there is no different signal right now, and if we don't set this, then network filters won't be executed at all. Note that this PR is meant to be a temporary fix until we have a proper solution for #13940, but that's going to require a lot more changes and time.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This seems like a massive hack. How much effort would it be to implement the real fix?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That depends on what do you consider a real fix. I'm not too familiar with that part of the codebase, but if we want to propagate upstream connection events to network filters, then I imagine it's quite a lot of changes, so I think it would be better if one of maintainers could work on this.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't know what to suggest. It would be good to figure out how to do a catch-all cleanup of WASM resources on abnormal connection termination. Could WASM detect connection termination based on an L4 filter that does the relevant signaling on destruction?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FWIW, we have a workaround in #13836 that works for Wasm. |
||
| } | ||
|
|
||
| read_end_stream_ |= result.end_stream_read_; | ||
|
|
@@ -598,7 +615,7 @@ void ConnectionImpl::onReadReady() { | |
| } | ||
|
|
||
| // The read callback may have already closed the connection. | ||
| if (result.action_ == PostIoAction::Close || bothSidesHalfClosed()) { | ||
| if (result.action_ != PostIoAction::KeepOpen || bothSidesHalfClosed()) { | ||
| ENVOY_CONN_LOG(debug, "remote close", *this); | ||
| closeSocket(ConnectionEvent::RemoteClose); | ||
| } | ||
|
|
@@ -651,11 +668,22 @@ void ConnectionImpl::onWriteReady() { | |
| uint64_t new_buffer_size = write_buffer_->length(); | ||
| updateWriteBufferStats(result.bytes_processed_, new_buffer_size); | ||
|
|
||
| if (result.action_ == PostIoAction::CloseError) { | ||
| if (dynamic_cast<ServerConnectionImpl*>(this)) { | ||
| stream_info_.setConnectionTerminationDetails(kDownstreamConnectionTerminationDetails); | ||
| stream_info_.setResponseFlag(StreamInfo::ResponseFlag::DownstreamConnectionTermination); | ||
| } else { | ||
| stream_info_.setUpstreamTransportFailureReason(kUpstreamConnectionTerminationDetails); | ||
| stream_info_.setConnectionTerminationDetails(kUpstreamConnectionTerminationDetails); | ||
| stream_info_.setResponseFlag(StreamInfo::ResponseFlag::UpstreamConnectionTermination); | ||
| } | ||
| } | ||
|
|
||
| // NOTE: If the delayed_close_timer_ is set, it must only trigger after a delayed_close_timeout_ | ||
| // period of inactivity from the last write event. Therefore, the timer must be reset to its | ||
| // original timeout value unless the socket is going to be closed as a result of the doWrite(). | ||
|
|
||
| if (result.action_ == PostIoAction::Close) { | ||
| if (result.action_ != PostIoAction::KeepOpen) { | ||
| // It is possible (though unlikely) for the connection to have already been closed during the | ||
| // write callback. This can happen if we manage to complete the SSL handshake in the write | ||
| // callback, raise a connected event, and close the connection. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -83,7 +83,7 @@ Network::PostIoAction TsiSocket::doHandshakeNextDone(NextResultPtr&& next_result | |
|
|
||
| if (status != TSI_INCOMPLETE_DATA && status != TSI_OK) { | ||
| ENVOY_CONN_LOG(debug, "TSI: Handshake failed: status: {}", callbacks_->connection(), status); | ||
| return Network::PostIoAction::Close; | ||
| return Network::PostIoAction::CloseError; | ||
| } | ||
|
|
||
| if (next_result->to_send_->length() > 0) { | ||
|
|
@@ -110,7 +110,7 @@ Network::PostIoAction TsiSocket::doHandshakeNextDone(NextResultPtr&& next_result | |
| } else { | ||
| ENVOY_CONN_LOG(debug, "TSI: Handshake validation failed: {}", callbacks_->connection(), | ||
| err); | ||
| return Network::PostIoAction::Close; | ||
| return Network::PostIoAction::CloseGraceful; | ||
| } | ||
| } else { | ||
| ENVOY_CONN_LOG(debug, "TSI: Handshake validation skipped.", callbacks_->connection()); | ||
|
|
@@ -145,7 +145,7 @@ Network::PostIoAction TsiSocket::doHandshakeNextDone(NextResultPtr&& next_result | |
| if (read_error_ || (!handshake_complete_ && end_stream_read_)) { | ||
| ENVOY_CONN_LOG(debug, "TSI: Handshake failed: end of stream without enough data", | ||
| callbacks_->connection()); | ||
| return Network::PostIoAction::Close; | ||
| return Network::PostIoAction::CloseError; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are 2 remaining uses of Network::PostIoAction::Close; in this file when finding a read 0 during handshakes. Would it make sense to also consider those cases CloseError?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I changed one to
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess the question is wherever or it is worth having the extra complexity that comes from having 2 close types when out of the 3 remaining uses 2 of them are handshake errors. I guess it's fine to use CloseGraceful for the 3 remaining cases. |
||
| } | ||
|
|
||
| if (raw_read_buffer_.length() > 0) { | ||
|
|
@@ -167,21 +167,21 @@ Network::IoResult TsiSocket::doRead(Buffer::Instance& buffer) { | |
| ENVOY_CONN_LOG(debug, "TSI: raw read result action {} bytes {} end_stream {}", | ||
| callbacks_->connection(), enumToInt(result.action_), result.bytes_processed_, | ||
| result.end_stream_read_); | ||
| if (result.action_ == Network::PostIoAction::Close && result.bytes_processed_ == 0) { | ||
| if (result.action_ != Network::PostIoAction::KeepOpen && result.bytes_processed_ == 0) { | ||
| return result; | ||
| } | ||
|
|
||
| if (!handshake_complete_ && result.end_stream_read_ && result.bytes_processed_ == 0) { | ||
| return {Network::PostIoAction::Close, result.bytes_processed_, result.end_stream_read_}; | ||
| return {Network::PostIoAction::CloseError, result.bytes_processed_, result.end_stream_read_}; | ||
| } | ||
|
|
||
| end_stream_read_ = result.end_stream_read_; | ||
| read_error_ = result.action_ == Network::PostIoAction::Close; | ||
| read_error_ = result.action_ == Network::PostIoAction::CloseError; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should this be: or consider renaming read_error_ to got_close_error_ I guess one thing to keep in mind is that as of this PR RawBufferSocket can only return KeepOpen or CloseError from doRead. Should we consider all Close PostIoActions as errors? |
||
| } | ||
|
|
||
| if (!handshake_complete_) { | ||
| Network::PostIoAction action = doHandshake(); | ||
| if (action == Network::PostIoAction::Close || !handshake_complete_) { | ||
| if (action != Network::PostIoAction::KeepOpen || !handshake_complete_) { | ||
| return {action, 0, false}; | ||
| } | ||
| } | ||
|
|
@@ -244,7 +244,7 @@ void TsiSocket::onNextDone(NextResultPtr&& result) { | |
| handshaker_next_calling_ = false; | ||
|
|
||
| Network::PostIoAction action = doHandshakeNextDone(std::move(result)); | ||
| if (action == Network::PostIoAction::Close) { | ||
| if (action != Network::PostIoAction::KeepOpen) { | ||
| callbacks_->connection().close(Network::ConnectionCloseType::NoFlush); | ||
| } | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -230,7 +230,7 @@ Network::PostIoAction SslHandshakerImpl::doHandshake() { | |
| // It's possible that we closed during the handshake callback. | ||
| return handshake_callbacks_->connection().state() == Network::Connection::State::Open | ||
| ? PostIoAction::KeepOpen | ||
| : PostIoAction::Close; | ||
| : PostIoAction::CloseGraceful; | ||
| } else { | ||
| int err = SSL_get_error(ssl(), rc); | ||
| switch (err) { | ||
|
|
@@ -242,7 +242,7 @@ Network::PostIoAction SslHandshakerImpl::doHandshake() { | |
| return PostIoAction::KeepOpen; | ||
| default: | ||
| handshake_callbacks_->onFailure(); | ||
| return PostIoAction::Close; | ||
| return PostIoAction::CloseError; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does the PostIoAction::Close on line 233 above a graceful or error close?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Graceful. |
||
| } | ||
| } | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -36,9 +36,11 @@ class NotReadySslSocket : public Network::TransportSocket { | |
| absl::string_view failureReason() const override { return NotReadyReason; } | ||
| bool canFlushClose() override { return true; } | ||
| void closeSocket(Network::ConnectionEvent) override {} | ||
| Network::IoResult doRead(Buffer::Instance&) override { return {PostIoAction::Close, 0, false}; } | ||
| Network::IoResult doRead(Buffer::Instance&) override { | ||
| return {PostIoAction::CloseError, 0, false}; | ||
| } | ||
| Network::IoResult doWrite(Buffer::Instance&, bool) override { | ||
| return {PostIoAction::Close, 0, false}; | ||
| return {PostIoAction::CloseError, 0, false}; | ||
| } | ||
| void onConnected() override {} | ||
| Ssl::ConnectionInfoConstSharedPtr ssl() const override { return nullptr; } | ||
|
|
@@ -110,7 +112,7 @@ Network::IoResult SslSocket::doRead(Buffer::Instance& read_buffer) { | |
| if (info_->state() != Ssl::SocketState::HandshakeComplete && | ||
| info_->state() != Ssl::SocketState::ShutdownSent) { | ||
| PostIoAction action = doHandshake(); | ||
| if (action == PostIoAction::Close || info_->state() != Ssl::SocketState::HandshakeComplete) { | ||
| if (action != PostIoAction::KeepOpen || info_->state() != Ssl::SocketState::HandshakeComplete) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also change post actions in NotReadySslSocket to CloseError
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
| // end_stream is false because either a hard error occurred (action == Close) or | ||
| // the handshake isn't complete, so a half-close cannot occur yet. | ||
| return {action, 0, false}; | ||
|
|
@@ -154,7 +156,7 @@ Network::IoResult SslSocket::doRead(Buffer::Instance& read_buffer) { | |
| // Renegotiation has started. We don't handle renegotiation so just fall through. | ||
| default: | ||
| drainErrorQueue(); | ||
| action = PostIoAction::Close; | ||
| action = PostIoAction::CloseError; | ||
| break; | ||
| } | ||
|
|
||
|
|
@@ -182,7 +184,7 @@ void SslSocket::onPrivateKeyMethodComplete() { | |
|
|
||
| // Resume handshake. | ||
| PostIoAction action = doHandshake(); | ||
| if (action == PostIoAction::Close) { | ||
| if (action != PostIoAction::KeepOpen) { | ||
| ENVOY_CONN_LOG(debug, "async handshake completion error", callbacks_->connection()); | ||
| callbacks_->connection().close(Network::ConnectionCloseType::FlushWrite); | ||
| } | ||
|
|
@@ -231,7 +233,7 @@ Network::IoResult SslSocket::doWrite(Buffer::Instance& write_buffer, bool end_st | |
| if (info_->state() != Ssl::SocketState::HandshakeComplete && | ||
| info_->state() != Ssl::SocketState::ShutdownSent) { | ||
| PostIoAction action = doHandshake(); | ||
| if (action == PostIoAction::Close || info_->state() != Ssl::SocketState::HandshakeComplete) { | ||
| if (action != PostIoAction::KeepOpen || info_->state() != Ssl::SocketState::HandshakeComplete) { | ||
| return {action, 0, false}; | ||
| } | ||
| } | ||
|
|
@@ -270,7 +272,7 @@ Network::IoResult SslSocket::doWrite(Buffer::Instance& write_buffer, bool end_st | |
| // Renegotiation has started. We don't handle renegotiation so just fall through. | ||
| default: | ||
| drainErrorQueue(); | ||
| return {PostIoAction::Close, total_bytes_written, false}; | ||
| return {PostIoAction::CloseError, total_bytes_written, false}; | ||
| } | ||
|
|
||
| break; | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.