The server response does not contain an SSH identification string. #1107

rkreisel · 2023-03-31T14:33:56Z

When testing from a developer machine, the connection to the remote server is successful. But when deployed to an Azure Function, I get this error upon executing the Connect() method. The referenced IETF document is "Greek" to me.

Renci.SshNet.Common.SshConnectionException: The server response does not contain an SSH identification string. The connection to the remote server was closed before any data was received. More information on the Protocol Version Exchange is available here: https://tools.ietf.org/html/rfc4253#section-4.2
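
For readers who find the RFC hard going, here is a rough Python sketch (not SSH.NET code) of what the error message refers to. Per RFC 4253 section 4.2, the server must send a line of the form `SSH-protoversion-softwareversion SP comments CR LF`, and it may send other lines before it. The exception above means the connection was closed before any such line arrived. The function names and the limit on preamble lines are assumptions made for this illustration.

```python
import socket

MAX_LINE = 255            # RFC 4253 4.2: max length of the identification string, incl. CR LF
MAX_PREAMBLE_LINES = 50   # arbitrary limit chosen for this sketch, not from the RFC


def parse_identification(line: bytes) -> dict:
    """Split an identification line into protocol version, software version, comments."""
    text = line.rstrip(b"\r\n").decode("utf-8", errors="replace")
    if not text.startswith("SSH-"):
        raise ValueError(f"not an SSH identification string: {text!r}")
    # Layout: SSH-protoversion-softwareversion[ SP comments]
    head, _, comments = text.partition(" ")
    parts = head.split("-", 2)
    if len(parts) != 3 or not parts[1] or not parts[2]:
        raise ValueError(f"malformed SSH identification string: {text!r}")
    return {"proto": parts[1], "software": parts[2], "comments": comments}


def read_server_identification(host: str, port: int = 22, timeout: float = 10.0) -> bytes:
    """Connect and return the server's identification line.

    Raises ConnectionError if the peer closes the connection first,
    which is the situation the exception above describes.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("rb")
        for _ in range(MAX_PREAMBLE_LINES):
            line = reader.readline(MAX_LINE)
            if not line:
                raise ConnectionError(
                    "connection closed before an SSH identification string was received"
                )
            if line.startswith(b"SSH-"):
                return line
            # Any other line is preamble text that the server is allowed to send first.
        raise ConnectionError("no SSH identification string within the preamble limit")
```

Running `read_server_identification` from both the developer machine and the deployed environment against the same host is a quick way to see whether the server answers at all, independently of the SSH library in use.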

WojciechNagorski · 2023-11-22T09:15:16Z

I've reproduced this problem here:
https://ci.appveyor.com/project/drieseng/ssh-net/builds/48584754

@Rob-Hague do you have any idea what might have happened?

Rob-Hague · 2023-11-22T09:36:54Z

The failure in #1250 looks the same as the one in #1220 (comment).

I had guessed that that was related to the connections being re-established too quickly. I started looking at SO_REUSEADDR and SO_REUSEPORT, but I have quite a lot of learning to do there. I'm not sure whether @rkreisel's problem would have the same cause or whether it is something different.

Probably the best thing would be to get a packet capture by running tcpdump on the docker instance... but I wouldn't know how to do that either 🙂

raimana · 2024-01-15T23:29:45Z

FWIW I've been troubleshooting a similar problem, i.e. it works without issues locally and fails intermittently in Azure.

The problem was that Azure Functions and App Services (except the ASE/Isolated tier, which costs an arm and a leg) have a list of outbound IP addresses that they can "pick" from for outbound connections.
The IP selected by Azure can change between Function executions. The actual issue was that some of those IPs were allowed and some were blacklisted (the company hosting the SFTP server was unaware that some Azure IPs were blocked until I showed them the packet capture).
These IPs could have been used by other tenants engaging in "suspicious" activities before being assigned to your Function App.
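
To illustrate why this shows up as an intermittent failure, here is a small Python sketch that compares a Function App's possible outbound IPs against the server's allow list. The function name is hypothetical and the addresses are placeholders from the documentation ranges, not real Azure IPs. In practice you would paste in the outbound IP list shown for your app and the list the SFTP host actually permits.

```python
import ipaddress


def blocked_outbound_ips(possible_outbound, allowed_networks):
    """Return the outbound IPs that are not covered by any allowed network."""
    networks = [ipaddress.ip_network(n) for n in allowed_networks]
    return [
        ip
        for ip in possible_outbound
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Placeholder data (documentation address ranges), assumed for illustration only.
possible_outbound = ["203.0.113.10", "203.0.113.11", "198.51.100.7"]
allowed_networks = ["203.0.113.0/24"]

blocked = blocked_outbound_ips(possible_outbound, allowed_networks)
print(blocked)  # ['198.51.100.7']
```

Any address in the result is one that Azure may pick for a given execution and that the remote side would reject, so connections fail only on the executions that happen to use it.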

Because the issue was intermittent and similar tickets pointed to SSH.NET potentially not handling connections properly(?), I initially looked into the SSH.NET code. After debugging it extensively I came to the conclusion that it had nothing to do with it, and only then did I start to look at the network (I should have started there).

This manifested itself as a FIN/ACK packet sent by the remote site immediately after the TCP handshake.
The client starts the SSH protocol version exchange unaware that the server is initiating the TCP connection termination (screenshot below).
That is why the server never returns its identification string: it is already closing the connection.
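
The same symptom can be reproduced locally without any SSH library. The following Python sketch is an illustration under assumed names, not the actual server behavior from the capture: it starts a listener on localhost that accepts a connection and closes it straight away, then connects as a client, sends an identification string, and tries to read the server's reply.

```python
import socket
import threading


def accept_and_close(listener: socket.socket) -> None:
    """Stand-in for the remote site: complete the handshake, then close at once."""
    conn, _ = listener.accept()
    conn.close()


def probe(host: str, port: int, timeout: float = 5.0) -> bytes:
    """Send a client identification string and return whatever the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            # "probe_0.1" is a made-up software version for this sketch.
            sock.sendall(b"SSH-2.0-probe_0.1\r\n")
            return sock.recv(255)
        except OSError:
            # The peer may answer with a reset instead of an orderly close.
            return b""


def run_demo() -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=accept_and_close, args=(listener,))
        worker.start()
        data = probe("127.0.0.1", port)
        worker.join()
    return data


if __name__ == "__main__":
    print(run_demo())  # b'' : the connection ended before any identification string
```

An empty read at this point is what an SSH client sees when the remote side drops the connection right after the handshake, which is why the error originates in the client library even though the cause is on the network side.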

You can run a packet capture from Azure by upgrading to a premium plan temporarily (if running on a consumption plan).

Go to "Change App Service Plan" -> Select "Function Premium"
Go to "Diagnose and solve problems" -> "Collect Network Trace"
Use Wireshark or similar to review the packet capture

A few options to solve this particular problem are:

Whitelist the function apps' data center IPs, if you control the SFTP server or can convince the vendor (it's a long list)

https://learn.microsoft.com/en-us/azure/azure-functions/ip-addresses?tabs=portal#data-center-outbound-ip-addresses

Route traffic from your Function App to a network appliance with a static IP (NAT gateway, outbound load balancer etc.)

https://learn.microsoft.com/en-us/azure/nat-gateway/nat-overview

others...

sundman · 2024-10-31T14:20:07Z

After updating to 2024.1 we started to notice the same error message when trying to connect to an AWS SFTP server, but only about 50% of the time, seemingly at random.

After a lot of digging in stuff I really don't understand that well, my current understanding is that it seems like there is some race condition when the protocol version exchange is sent very close in time after an ACK related to the initial connection:

Here we see Wireshark logs of first a failed connection attempt at about 11:53:50, which ends up in a loop of retransmission requests until the server gives up on us. I know too little about the insides of TCP connections to know who is to blame for the two sides no longer agreeing, but somehow they end up talking past each other.

The connection attempt at 11:58:50 ends up working though, and everything is fine...

After some tinkering I found that this small change "fixes" the problem:

I'm not suggesting that this is a long term solution for anyone, but perhaps this sheds enough light into the problem so that someone actually can solve the bug before we need to upgrade to a new version.

Until then this hack seems to have resolved our immediate issues.

Rob-Hague · 2024-11-09T17:51:38Z

Packet No. 7906 in your trace seems strange to me: the Ack number from the server is 100 less than it should be, i.e. it is 969_232_960 but I would have thought it should be 969_233_060. No idea how that could happen.

raimana mentioned this issue Jan 16, 2024

SSHException Channel was closed intermittent behavior #511

Open
