Race condition in BufferedIoOutputStream #263

Closed
tomaswolf opened this issue Nov 2, 2022 · 6 comments · Fixed by #264


tomaswolf commented Nov 2, 2022

SFTP tests are flaky. It turns out that they are running into:

23:13:59.123 | ERROR | MinaProcessor-25 | o.a.s.c.c.BufferedIoOutputStream | org.apache.sshd.common.channel.BufferedIoOutputStream             219 | finishWrite(sftp-out@0)[ChannelAsyncOutputStream[ChannelSession[id=0, recipient=0]-ServerSessionImpl[testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE 65,536]@/127.0.0.1:59555]] cmd=SSH_MSG_CHANNEL_DATA] - pending byte counts underflow (-64525) after 18070131 bytes
23:13:59.123 | ERROR | m-42976-thread-1 | o.a.s.s.s.SftpSubsystem          | org.apache.sshd.common.util.logging.LoggingUtils                  693 | run(ServerSessionImpl[testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE 65,536]@/127.0.0.1:59555]) SshChannelBufferedOutputException caught in SFTP subsystem after 863 buffers: Pending byte counts underflow
23:23:59.498 | INFO  | )-timer-thread-1 | o.a.s.s.s.ServerSessionImpl      | org.apache.sshd.common.session.helpers.SessionHelper             1183 | Disconnecting(ServerSessionImpl[testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE 65,536]@/127.0.0.1:59555]): SSH2_DISCONNECT_PROTOCOL_ERROR - Detected IdleTimeout after 600375/600000 ms.

Finished org.apache.sshd.sftp.client.SftpTransferTest:testTransferIntegrityWithBufferLargerThanPacket[REKEY_BLOCK_SIZE 65,536] in 610192 ms

Let's ignore the fact that the test hangs for 10 minutes until that timeout expires; that's a different, minor problem. The real problem is the "pending byte counts underflow", which indicates that the BufferedIoOutputStream somehow managed to write more data than was passed to it.

This is a race condition between BufferedIoOutputStream.startWriting() and BufferedIoOutputStream.finishWrite(). The following sequence of events is possible when a write that was in progress just finished and the user of the stream just initiates a new write:

  1. Thread 1: startWriting gets the head of the write queue -- this is still the future/buffer that was just written
  2. Thread 2: finishWrite removes that write from the write queue
  3. Thread 2: finishWrite clears the "current" write
  4. Thread 1: startWriting sets the "current" write and happily writes the future/buffer again.

Combined with the fact that the underlying ChannelAsyncOutputStream may in some cases copy the buffer and actually write that copy, but doesn't consume the original buffer in that case, this leads to the same data being written again, and when that second write finishes, this byte count gets subtracted again from the number of total bytes to be written. So the count is added once but subtracted twice, and eventually this "pending byte counts underflow" is hit.
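
To make the interleaving concrete, here is a deliberately simplified model of the pattern described above. It is not the actual BufferedIoOutputStream code; the class, fields, and method bodies are illustrative only, with each queued write modelled as a CompletableFuture that completes when the underlying write finishes. The comments mark the four steps from the list above.

```java
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

// Simplified illustration of the race; not the real implementation.
class RacyBufferedStream {

    private final Queue<CompletableFuture<Void>> writeQueue = new ConcurrentLinkedQueue<>();
    private final AtomicReference<CompletableFuture<Void>> current = new AtomicReference<>();

    // Thread 1: pick the head of the queue and try to become the writer.
    void startWriting() {
        CompletableFuture<Void> head = writeQueue.peek(); // (1) may still see the write that just finished
        if (head == null) {
            return;
        }
        if (!current.compareAndSet(null, head)) {         // (4) succeeds because finishWrite() cleared "current" in (3)
            return;
        }
        doWrite(head);                                    // the already-written buffer is written a second time
    }

    // Thread 2: runs when the in-flight write completes.
    void finishWrite(CompletableFuture<Void> finished) {
        writeQueue.remove(finished);                      // (2) remove the completed write from the queue
        current.set(null);                                // (3) clear the "current" write
        startWriting();                                   // kick off the next pending write, if any
    }

    private void doWrite(CompletableFuture<Void> write) {
        // Hand the associated buffer to the underlying stream (omitted);
        // finishWrite(write) runs once that write completes.
    }
}
```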

This race condition in BufferedIoOutputStream needs to be fixed. Optionally ChannelAsyncOutputStream should consume the original buffer if it makes a copy; though I'm undecided on that: if it did, it might have hidden this bug (because then the second write of that buffer would have been a zero-byte write).
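
Coming back to the race itself: a fix in the spirit of the commit referenced below (skip a head-of-queue write whose future is already done, and move on to the next one) could look roughly like the following drop-in replacement for startWriting() in the sketch above. This is only a sketch of the idea, not the actual patch.

```java
// Sketch of the fix idea: never write a head-of-queue future that is already done.
void startWriting() {
    for (;;) {
        CompletableFuture<Void> head = writeQueue.peek();
        if (head == null) {
            return;                      // nothing left to write
        }
        if (head.isDone()) {
            writeQueue.remove(head);     // already written; skip it instead of writing it twice
            continue;                    // try the next queued write, if any
        }
        if (!current.compareAndSet(null, head)) {
            return;                      // another thread is already writing
        }
        doWrite(head);
        return;
    }
}
```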

(The problem is not specific to the MINA transport back-end; it also occurs with NIO2 from time to time. CI builds are green because the race condition appears to be hit infrequently, and on a re-run the test usually succeeds.)

@tomaswolf tomaswolf added the bug An issue describing a bug in the code label Nov 2, 2022
@tomaswolf tomaswolf self-assigned this Nov 2, 2022
tomaswolf added a commit to tomaswolf/mina-sshd that referenced this issue Nov 2, 2022
Don't try to write a future that's already done in startWrite(). Just
skip it and try the next one, if any.

Fixes apache#263.

Bug: apache#263
@tomaswolf tomaswolf added this to the 2.9.2 milestone Nov 2, 2022
tomaswolf added a commit that referenced this issue Nov 2, 2022

116-7 commented Aug 4, 2024

@tomaswolf sorry to bring this up in this old thread, but I have run into an issue with an older version of mina-sshd (2.8.0) and I am wondering if this fix would resolve it. Unfortunately I cannot just update the version myself: the dependency is deep within a piece of software whose source code I do not control, and which is also behind multiple layers of corporate bureaucracy, let's say.

The gist of the issue is that when attempting to establish an ssh session to a mina-sshd 2.8.0 instance there appears to be a race condition that shows up intermittently as a "banner exchange: Connection to xxx.xxx.xxx.xxx port 22: invalid format" error.

The sequence appears to be the following:

  1. The client opens the connection and the initial TCP handshake completes successfully
  2. The server's first data packet, with a relative TCP sequence number of 1 (i.e. the first packet after the SYN/ACK), is the key exchange init rather than the protocol negotiation.
  3. OpenSSH sends a TCP RST and the connection attempt fails because it panics if the first input on the socket isn't the protocol negotiation packet.

At step 2, for a successful connection, the protocol negotiation packet is sent first and the connection opens as expected; but it seems that around 1 in 20 attempts to connect result in the key exchange init being sent first.

Java, and especially multi-threaded Java, is not my area of expertise, so I'm not sure if this patch would also fix the above issue. I had a look around the project's code to see if I could spot anything; it seems like there are some futures and whatnot at play during session initialisation, but I couldn't tell whether the key exchange waits for the protocol negotiation to complete. I'm also not sure whether your patch, being at the level of the output-stream writing phase, would side-step the need for the key exchange to wait for the protocol negotiation future to complete.

@tomaswolf
Member Author

@116-7 I've never seen this in any Apache MINA sshd version. In any case it would be unrelated to the race condition mentioned in this issue.

We did have one report of what you describe: SSHD-1204. But that one involved an unknown server behind a Palo Alto Networks firewall appliance that identified itself as "SSH2.0-PaloAltoNetworks_0.2", and an Apache MINA sshd client. With a packet trace obtained via Wireshark we were able to prove that this server did send its initial key exchange proposal first, and only afterwards the protocol negotiation, i.e., its identification string. (The packet traces are attached to SSHD-1204.) The firewall appeared to be a decrypting SSH proxy, so it's most likely the fault of that Palo Alto firewall, not of whatever server was behind it.

Other people also had trouble with that firewall.

If you can provide a Wireshark packet trace of the first few packets, which are unencrypted, I could take a look to see if your case is similar.


116-7 commented Aug 5, 2024

@tomaswolf providing the Wireshark capture files will probably not be possible given the aforementioned bureaucracy. For context, it's probably safe to assume that there are VPN tunnels, firewalls, and probably proxies between the client and the server.

The server is a Bitbucket server, and the clients are OpenSSH via the git command.

Given that the issue only occurs about 1 in 20 times, even when attempting from the same client within the space of about 2 minutes and even when testing during times of low network activity (e.g. very early hours of the morning), my gut feeling is that it may be mina-sshd related. The error does seem to become more frequent during times of server load, e.g. towards the end of the day when more people are committing their code.

It's possible that some network path is terminating and re-writing the packets incorrectly, but given all the possible intermediate network hardware, as well as the general opaqueness of the network/vlan/vpn/etc., it's not possible to trace the issue to such a device.

I do have some screenshots from Wireshark though.

  1. In the unsuccessful case, the TCP handshake completes and the client sends its protocol negotiation, but the server responds with its key exchange init, which makes its way to the application layer, which then panics when it sees the null characters at the start of the key exchange packet.
  2. In the successful case, the server's protocol negotiation comes through as expected with a relative sequence number of 1, and the following server key exchange init has a sequence number of 28.

@tomaswolf
Member Author

@116-7 Looks like your problem is different from SSHD-1204. That TCP packet for the server's KEX_INIT is given as having 992 bytes, which matches the length recorded inside the packet (0x3dc = 988 bytes, plus 4 bytes for the length itself). So you're not getting a single TCP message with the KEX_INIT and the server ident swapped, as happened in SSHD-1204.

I do not see anything obvious in Apache MINA sshd that would explain how a server could send the KEX_INIT without having sent its identification first. The identification string is sent in the constructor of the server session. Now "sent" actually means the message is placed into the write queue to be sent by the underlying transport (NIO2, MINA, or Netty).

In all three cases I don't see how the server's KEX_INIT could overtake that identification message in that queue; nor do I see any way the server could put its KEX_INIT message into that queue first (so that the identification message would be the second message by mistake). For NIO2, there is an explicit message queue. For MINA, there is an explicit message queue inside the MINA library. For Netty, message order is ensured by chaining futures and always using writeAndFlush. All three implementations appear correct. Even if I go back to source code revisions from before Apache MINA sshd 2.8.0, the message handling at that transport level appears to be fine in that respect.

I've not succeeded in writing a unit test that reproduces the issue you describe.

On top of that, I don't know which transport the Bitbucket server is using. It could be any of the three we provide, or they might have written their own.

We could add code to make sure that the server's KEX_INIT explicitly waits until that identification message has been sent (as opposed to merely being queued for sending). It's a bit more complicated to do than I'd like, but it is doable, and it might perhaps be a good idea in any case. It won't help with your particular problem, though, since apparently you cannot control the server-side software.
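
For what it's worth, the general shape of such a change could be to chain the KEX_INIT send onto the completion of the identification write instead of queuing both back to back. The sketch below is purely hypothetical: sendIdentification() and sendKexInit() are placeholders, not actual Apache MINA sshd APIs, and the real code would have to use the session's own write futures.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch only; the method names are placeholders.
class OrderedKexStart {

    // Queue the KEX_INIT only once the identification string has actually been written.
    CompletableFuture<Void> start() {
        return sendIdentification()
            .thenCompose(written -> sendKexInit());
    }

    CompletableFuture<Void> sendIdentification() {
        return CompletableFuture.completedFuture(null); // placeholder for the real write
    }

    CompletableFuture<Void> sendKexInit() {
        return CompletableFuture.completedFuture(null); // placeholder for the real write
    }
}
```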


116-7 commented Aug 5, 2024

@tomaswolf thank you for looking into the issue.

I will persevere on my end to try to get to the bottom of what is happening. It will probably take some time due to the corporate processes I will have to deal with, but I will update this thread once I do.

As you mentioned, it might be any number of problems, and adding more complicated logic to mina-sshd when it might not be the root cause might just be adding complication for complication's sake.

I will get back to you.

@tomaswolf
Member Author

@116-7

I will update this thread once I get to the bottom of the issue

If you do, and there is something we should do in Apache MINA sshd, please open a new issue. We can always reference this discussion, or copy it across, but having a separate issue for this makes tracking easier.
