-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Websocket should only call async_close once #4848
Conversation
@@ -348,6 +356,8 @@ BaseWSPeer<Handler, Impl>::on_write_fin(error_code const& ec) | |||
return fail(ec, "write_fin"); | |||
wq_.pop_front(); | |||
if (do_close_) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should do_close_
be reset to false
here? What if on_write_fin
is called again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it doesn't look like it's needed since write()
is not going to be called in on_write_fin()
if do_close_
is true and send()
is not going to call write()
because closing_
is true
. Then I don't think two flags are needed. Can keep do_close_
and have the check on line 264 and remove do_close_
on line 280. The check on line 264 prevents from initiating async_close
again in close()
. And do_close_
set to true
will allow async_close
in on_write_fin()
if the queue was not empty when close()
was called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question. I don't think that setting do_close_
to false
is a good idea, though. Then, if wq_
is not empty we'd call on_write()
. We're trying to shut things down.
If it's a problem that we need to address, and it feels like maybe it is, we could add yet another data member bool closed_
and set it true
just as soon as we're inside this if. That would make this if
look like...
if (do_close_ && !closed)
{
closed = true;
assert(closing_);
...
}
else if (!wq_.empty() && !closed)
But I don't like the idea of adding yet another bool
. Perhaps it would work better with an enum
for a state machine?
enum class Finish {
not_closing,
closing,
do_close,
closed
};
Just a thought...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback. Coincidentally, I had the realization about not needing closing_
while having dinner, and came in to fix it before I even saw your comments. 😀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Suggested commit message:
|
note: 694fc8a is mistaken and would need review |
* upstream/develop: Set version to 2.0.0 Set version to 2.0.0-rc7 Set version to 2.0.0-rc6 Revert "Apply transaction batches in periodic intervals (4504)" (4852) Revert "Add ProtocolStart and GracefulClose P2P protocol messages (3839)" (4850)
2f20878
to
4029ca2
Compare
Branch is reverted back to the version that was approved by reviewers, and merged with latest |
LGTM! QA Approved. |
Prevent WebSocket connections from trying to close twice. The issue only occurs in debug builds (assertions are disabled in release builds, including published packages), and when the WebSocket connections are unprivileged. The assert (and WRN log) occurs when a client drives up the resource balance enough to be forcibly disconnected while there are still messages pending to be sent. Thanks to @lathanbritz for discovering this issue in XRPLF#4822.
High Level Overview of Change
Prevents Websocket connections from trying to close twice.
Context of Change
While testing #4822, @lathanbritz discovered that
rippled
would assert under heavy websocket client load. Later testing determined that this issue was unrelated to that PR, and in fact, exists at least as far back as 1.12.0. In fact, the relevant code appears to be 7 years old. It's entirely possible that the code was fine as originally written, and changes in the underlying Boost library uncovered it.The issue only occurs in debug builds (assertions are disabled in release builds, including published packages), and when the websocket connections are unprivileged.
More specifically, the issue seems to occur when a client drives up the resource balance enough to be forcibly disconnected while there are still messages pending to be sent. Relevant log messages will look like:
Type of Change
Test Plan
Running the following python script several times in parallel reliably recreates the issue, usually within a few seconds on a debug build from a non-admin host.