Fix channel close procedure when the peer dies or our handler goes down #9103
base: maint
CT Test Results: 2 files, 29 suites, 20m 11s. Results for commit ca10880. This comment has been updated with latest results.
To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally.
// Erlang/OTP Github Action Bot
There is a possibility of a crash if the channel handler goes down before receiving 'channel-open-confirmation' from the peer. I have an update to fix that but need some time to write a test.
What is expected from me? I don't see any comments; have I missed something?
You wrote:
I thought you wanted to add some test case code ...
The test is updated by the latest commit.
Account for the case where the user's channel handler goes down before the channel opening procedure is completed: if a channel open confirmation is received for such a channel, the channel is automatically closed. Add a test for this scenario.
557de4f to 704973a
If the peer fails to respond to ssh_msg_channel_close, the corresponding channel entry is removed from the cache after a timeout (assuming the connection is still alive, possibly with other channels open).
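The timeout-based cleanup described in this commit message can be illustrated with a gen_statem generic timeout keyed by channel id. This is only a minimal sketch, not the code from the PR: the module name `close_timer_sketch`, its API, the map used as a cache, and the 100 ms timeout are all assumptions made for illustration.

```erlang
-module(close_timer_sketch).
-behaviour(gen_statem).

%% Hypothetical API, for illustration only.
-export([start_link/0, close/2, channels/1]).
-export([init/1, callback_mode/0, handle_event/4]).

start_link() -> gen_statem:start_link(?MODULE, [], []).

%% Pretend we sent a channel close for Id and now wait for the peer.
close(Pid, Id) -> gen_statem:call(Pid, {close, Id}).

%% List the channel ids still present in the (toy) cache.
channels(Pid) -> gen_statem:call(Pid, channels).

callback_mode() -> handle_event_function.

init([]) -> {ok, connected, #{}}.

handle_event({call, From}, {close, Id}, _State, Cache) ->
    %% Mark the channel as closing and arm a per-channel generic timeout.
    %% The 100 ms value is an assumption chosen to keep the example fast.
    {keep_state, Cache#{Id => closing},
     [{reply, From, ok},
      {{timeout, {channel_close, Id}}, 100, none}]};
handle_event({call, From}, channels, _State, Cache) ->
    {keep_state_and_data, [{reply, From, maps:keys(Cache)}]};
handle_event({timeout, {channel_close, Id}}, none, _State, Cache) ->
    %% The peer never answered: drop the entry so it does not leak.
    {keep_state, maps:remove(Id, Cache)}.
```

Because the timeout name includes the channel id, each closing channel gets its own independent timer, and the connection process stays alive for any other channels.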
704973a to ca10880
Thanks for this PR and digging into this problem!
Please check comments inline.
@@ -2052,6 +2090,11 @@ cond_set_idle_timer(D) ->
        _ -> {{timeout,idle_time}, infinity, none}
    end.

channel_close_timer(D, ChannelId) ->
    %{{timeout, {channel_close, ChannelId}}, 3000, none}. %?GET_OPT(idle_time, (D#data.ssh_params)#ssh.opts), none}.
Please remove the comment. I think it is not needed ...?
        local_id=Id}, Acc) when U == ChannelPid ->
    ssh_client_channel:cache_delete(Cache, Id),
    Acc;
%% Here we first collect the list of channel id's handled by the process
Please fix the indentation for this function. Emacs formatting is typically used in our repo.
    ]
),
{ok, Channel0} = ssh_connection:session_channel(ConnRef, 50000),
{ok, Channel1} = ssh_connection:session_channel(ConnRef, 50000),
Please add _ so that the compilation warning is avoided.

%% connect to it with a regular Erlang SSH client:
ChannelCloseTimeout = 3000,
{ok, ConnRef} = std_connect(HostPort, Config,
please fix indentation
        Parent ! {self(), Result}
    end),
try
    TestResult = receive
Please fix indentation. Try to avoid more than 100 chars per line.
do_ensure_channels(_ConnRef, NumExpected, NumExpected) ->
    ok;
do_ensure_channels(ConnRef, NumExpected, _ChannelListLen) ->
    receive after 100 -> ok end,
Use ct:sleep?
        ssh:stop_daemon(Pid)
    end.

ensure_channels(ConnRef, Expected) ->
I'm not sure I follow the ensure_channels/2 and do_ensure_channels/3 functions.
- Why have two functions for this? Maybe not needed in this simple case ...
- What is the exit criterion if ChannelList does not reach the expected number? Shouldn't we have a counter and do only a certain number of iterations?
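One way to address both questions is a single polling helper with an explicit retry counter. The following is a hedged sketch rather than the code from the PR: the module name `wait_sketch` and the function `wait_until/3` are hypothetical, and the check is passed in as a fun so the sketch does not assume how the channel list is obtained.

```erlang
-module(wait_sketch).
-export([wait_until/3]).

%% wait_until(Check, Retries, SleepMs): poll Check() until it returns true.
%% Returns ok on success, or {error, timeout} once Retries attempts have failed,
%% so the loop always terminates.
wait_until(_Check, 0, _SleepMs) ->
    {error, timeout};
wait_until(Check, Retries, SleepMs) when Retries > 0 ->
    case Check() of
        true ->
            ok;
        false ->
            %% Inside a Common Test suite, ct:sleep/1 would be the natural choice.
            timer:sleep(SleepMs),
            wait_until(Check, Retries - 1, SleepMs)
    end.
```

In the test case, Check would compare the current number of channels on the connection with the expected number, and a `{error, timeout}` result would fail the test instead of looping forever.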
@@ -1943,6 +1945,134 @@ max_channels_option(Config) when is_list(Config) ->
    ssh:close(ConnectionRef),
    ssh:stop_daemon(Pid).

handler_down_before_open(Config) ->
Does this test case work?
It could be an environment issue, but it also passed when I reverted the PR modifications made in the src folder.
I was expecting it to fail ...
Does it fail for you? If so, what is the error and the associated stack trace?
ct:log("~p:~p open incomplete channel done - should not have happened",[?MODULE,?LINE]),
Parent ! {self(), {fail, "Unexpected channel success"}}
Are those 2 lines dead code? If so, maybe replace them with a code comment?
Fix proposal for issue #9102.