new mptcp connections stuck in SYN-SENT #431
Comments
Note that the SYN_SENT TCP subflow will time out (close) quite some time after the above event on the other end. The exact time depends on net.ipv4.tcp_syn_retries and net.ipv4.tcp_syn_linear_timeouts, but I can't extract a simple expression to compute the exact value off the top of my head. With the default settings it should be very roughly ~1 minute. Also, all the TCP SYN transmissions need to be dropped in between to really experience the TCP-level timeout. I think the same scenario will happen whenever the TCP SYN (and its retransmissions) are dropped in between by whatever means/cause (e.g. a firewall, or conntrack exceeding its max entries).
@daire-byrne: I posted a couple of patches that should address the above (and possibly #430, as I think they are related/almost the same): https://patchwork.kernel.org/project/mptcp/list/?series=779909 Could you please give them a run in your testbed?
Yeah, it is worth mentioning that I have not seen the SYN-SENT hung rsync processes today with production workloads, and I think I would have expected to see one by now. It does seem likely that this is fixed by the patches for #430.
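For a rough feel of how the two sysctls combine, here is a minimal Python sketch. It is not taken from the kernel source. It assumes the behaviour described in the kernel's ip-sysctl documentation: an initial RTO of 1 second, `tcp_syn_linear_timeouts` extra SYN retransmissions sent at that fixed RTO before exponential backoff starts, and a 120 second cap on the RTO. The function name and its parameters are made up for illustration.

```python
def syn_timeout_estimate(syn_retries=6, syn_linear_timeouts=4,
                         initial_rto=1.0, rto_max=120.0):
    """Estimate (time of last SYN retransmission, time the connect gives up).

    Assumed model: the SYN is retransmitted syn_retries + syn_linear_timeouts
    times; the RTO stays at initial_rto until more than syn_linear_timeouts
    timeouts have fired, then doubles on every timeout up to rto_max.
    """
    total_retransmits = syn_retries + syn_linear_timeouts
    elapsed = 0.0
    rto = initial_rto
    last_retransmit_at = 0.0
    # One extra iteration: the final timer expiry that aborts the attempt.
    for i in range(total_retransmits + 1):
        elapsed += rto
        if i == total_retransmits - 1:
            last_retransmit_at = elapsed
        # Switch to exponential backoff once the linear phase is used up.
        if i + 1 > syn_linear_timeouts:
            rto = min(rto * 2, rto_max)
    return last_retransmit_at, elapsed


# Defaults (tcp_syn_retries=6, tcp_syn_linear_timeouts=4)
print(syn_timeout_estimate())        # (67.0, 131.0)
# No linear phase, as on kernels without tcp_syn_linear_timeouts
print(syn_timeout_estimate(6, 0))    # (63.0, 127.0)
```

Under these assumptions the last SYN goes out a little over a minute after the first one, and the connect attempt is only abandoned about a minute after that, provided every SYN in between is dropped.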
Opening a new ticket in my ongoing series of "why do rsync transfers hang when using mptcp"... :)
So as we patch, filter and better understand the causes of hanging rsync commands (thanks!), this issue seems to be recurring with our production workloads (if not yet reproducible) but is likely not connected to the previous issues?
My observation is that these "stuck" rsyncs hang in connect and often occur in timed flurries and mostly have the remote "server" (rsyncd) in common.
So I have seen 3 hang within the same minute on serverA and 2 in that same minute on serverB, all trying to connect to serverC. I am not able to ascertain if any TCP connections were ever started, but there are certainly no signs of it on the clients (serverA, serverB) or the server (serverC).
The hung rsync client processes have to be manually killed as they never time out.
I have noticed the odd SYN flood message in the logs of various servers, but the timing of these never lines up with the rsync hangs.
I have also increased some sysctls and have not seen the flood messages since (but still see SYN-SENT hangs).
My gut feeling is that this is also something related to v6.3+ as I would have thought I would have noticed the frequency of this before (although I said that about #429 too...). Out of maybe 50,000 rsync connections per day, I'm seeing around 3-5 hanging like this.
Despite a few connections hanging like this around the same time, the subsequent connections all seem to work fine again so whatever causes it seems pretty fleeting.
And maybe this would still happen with normal TCP + rsync? Although I don't think I've ever seen it happen in the wild.
I'll attach more info as I have it...