Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indefinite blocks on NativeCrypto.SSL_do_handshake #864

Open
juneoh opened this issue Jul 21, 2020 · 9 comments · Fixed by apache/beam#23300
Open

Indefinite blocks on NativeCrypto.SSL_do_handshake #864

juneoh opened this issue Jul 21, 2020 · 9 comments · Fixed by apache/beam#23300

Comments

@juneoh
Copy link

juneoh commented Jul 21, 2020

We have user reports from Google Cloud Dataproc that threads/tasks would intermittently hang on NativeCrypto.SSL_do_handshake. There were some Java and native thread dumps shared in GoogleCloudDataproc/hadoop-connectors/issues/153, but we were unable to produce a solid reproduction so far.

Dataproc currently uses Conscrypt 1.4.2 on java-8-openjdk-amd64, and on the OS side we have heard from both Debian 9 (Debian-provided OpenJDK) and Debian 10 (AdoptOpenJDK). One user shared with GCP Support that they still saw the same hang even after manually updating to Conscrypt 2.4.0.

Below is an example of the thread dump.

"gcsfs-batch-helper-3912" #4130 daemon prio=5 os_prio=0 tid=0x00007f92a4040800 nid=0x3130 runnable [0x00007f9330815000]
   java.lang.Thread.State: RUNNABLE
	at org.conscrypt.NativeCrypto.SSL_do_handshake(Native Method)
	at org.conscrypt.NativeSsl.doHandshake(NativeSsl.java:392)
	at org.conscrypt.ConscryptFileDescriptorSocket.startHandshake(ConscryptFileDescriptorSocket.java:225)
	at org.conscrypt.ConscryptFileDescriptorSocket.waitForHandshake(ConscryptFileDescriptorSocket.java:474)
	at org.conscrypt.ConscryptFileDescriptorSocket.getOutputStream(ConscryptFileDescriptorSocket.java:461)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:465)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
	- locked <0x0000000651a1d8a8> (a sun.net.www.protocol.https.HttpsClient)
	at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
	at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
	at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:162)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:104)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.BatchHelper.execute(BatchHelper.java:175)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.BatchHelper.lambda$queue$0(BatchHelper.java:163)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.BatchHelper$$Lambda$96/631118845.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
@prbprbprb
Copy link
Collaborator

Looking at the native stack trace in the referenced bug, I'm a bit surprised that it literally just calls poll, there ought to be some intermediate calls although maybe this code has just changed a lot since 1.4.2. Not important but it would be nice to pin down exactly where it's hanging. The upshot though is that Conscrypt is waiting for data from the server which isn't arriving and is hanging forever.

The fact you're not seeing this with other TLS implementations and people are still seeing it with 2.4.0 does suggest it's a Conscrypt bug. However we're actively migrating away from this implementation of SSLSocket (mainly due to the complexity of the native code) so unless you have a reliable repro, there's not a lot we can do.

It would be great if you guys could test against the new implementation, but that would mean updating to 2.4.0 or later, and the dataproc bug suggests you're blocked by #864. On the plus side, we have someone actively working on that. And for testing purposes you could probably use 2.4.0 with some code to install your own TrustManager or ConscryptHostnameVerifier.

@medb
Copy link
Member

medb commented Jul 24, 2020

May you elaborate on how to test a new socket implementation instead of SSLSocket in Conscrypt 2.4.0?

Maybe the bug here could be as simple as that Conscrypt does not set a read timeout (or some other timeout) on SSLSocket?

@prbprbprb
Copy link
Collaborator

prbprbprb commented Jul 24, 2020

Ah sorry, to test the engine-based implementation, call Conscrypt.setUseEngineSocketByDefault(true); before creating any sockets or socket factories or you can set the Java property org.conscrypt.useEngineSocketByDefault to true on the command line. Note that we plan to switch the default in 2.5.0 anyway which was targeted for this week but is currently blocked by an unrelated bug.

By default I believe the socket timeouts are 0 (i.e. indefinite). There's some complexity here between the implementations...

Both implementations support read timeouts using the standard Socket.setSoTimeout() API. If your app is setting this timeout and you're still seeing input hangs then that's a definite bug.

ConscryptFileDescriptorSocket implements a separate write timeout. ConscryptEngineSocket currently does not. There is an unsupported reflexive API for setting it but it's discouraged (and is a no-op on ConscryptEngineSocket).

ConscryptFileDescriptorSocket also has the notion of a handshake timeout which overrides the read and write timeouts for the duration of the TLS handshake. There is, however, no public API for setting this and no plans to add one. If you need a different timeout for the handshake just call setSoTimeout() again once the connection is established.

@medb
Copy link
Member

medb commented Sep 18, 2020

Dataproc was updated to use Conscrypt 2.5.1 by default, I think that we can close this issue until we will have an evidence that it re-occurs on Conscrypt 2.5.1+.

@medb
Copy link
Member

medb commented Jan 16, 2021

We got reoccurrence of the same hanging with Conscrypt 2.5.1 that uses new ConscryptEngineSocket by default:

08      gcsfs-batch-helper-151  RUNNABLE        Monitor(java.lang.Object@1728667131}), Monitor(sun.net.www.protocol.https.HttpsURLConnectionImpl@525072463}), Monitor(sun.net.www.protocol.https.HttpsClient@575051682}), Monitor(sun.net.www.protocol.https.DelegateHttpsURLConnection@389579666}), Lock(java.util.concurrent.ThreadPoolExecutor$Worker@2076466387})
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
java.net.SocketInputStream.read(SocketInputStream.java:171)
java.net.SocketInputStream.read(SocketInputStream.java:141)
org.conscrypt.ConscryptEngineSocket$SSLInputStream.readFromSocket(ConscryptEngineSocket.java:920)
org.conscrypt.ConscryptEngineSocket$SSLInputStream.processDataFromSocket(ConscryptEngineSocket.java:884)
org.conscrypt.ConscryptEngineSocket$SSLInputStream.access$100(ConscryptEngineSocket.java:706)
org.conscrypt.ConscryptEngineSocket.doHandshake(ConscryptEngineSocket.java:230)
org.conscrypt.ConscryptEngineSocket.startHandshake(ConscryptEngineSocket.java:209) => holding Monitor(java.lang.Object@1728667131})
org.conscrypt.ConscryptEngineSocket.waitForHandshake(ConscryptEngineSocket.java:547)
org.conscrypt.ConscryptEngineSocket.getOutputStream(ConscryptEngineSocket.java:290)
sun.net.www.http.HttpClient.openServer(HttpClient.java:465)
sun.net.www.http.HttpClient.openServer(HttpClient.java:558) => holding Monitor(sun.net.www.protocol.https.HttpsClient@575051682})
sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1340) => holding Monitor(sun.net.www.protocol.https.DelegateHttpsURLConnection@389579666})
sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1315) => holding Monitor(sun.net.www.protocol.https.DelegateHttpsURLConnection@389579666})
sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:264) => holding Monitor(sun.net.www.protocol.https.HttpsURLConnectionImpl@525072463})
com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:113)
com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1011)
com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.batch.BatchRequest.execute(BatchRequest.java:241)
com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.BatchHelper.lambda$flushPendingRequests$2(BatchHelper.java:224)
com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.BatchHelper$$Lambda$228/726710777.call(Unknown Source)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

@samwiise
Copy link

Issue is still there in Conscrypt 2.5.1

"pool-7-thread-4" #33 prio=5 os_prio=0 tid=0x00007f3b383f3800 nid=0x35fe runnable [0x00007f3b3ca4e000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.conscrypt.ConscryptEngineSocket$SSLInputStream.readFromSocket(ConscryptEngineSocket.java:920)
at org.conscrypt.ConscryptEngineSocket$SSLInputStream.processDataFromSocket(ConscryptEngineSocket.java:884)
at org.conscrypt.ConscryptEngineSocket$SSLInputStream.access$100(ConscryptEngineSocket.java:706)
at org.conscrypt.ConscryptEngineSocket.doHandshake(ConscryptEngineSocket.java:230)
at org.conscrypt.ConscryptEngineSocket.startHandshake(ConscryptEngineSocket.java:209)
- locked <0x00000000bc87ad60> (a java.lang.Object)
at org.conscrypt.ConscryptEngineSocket.waitForHandshake(ConscryptEngineSocket.java:547)
at org.conscrypt.ConscryptEngineSocket.getOutputStream(ConscryptEngineSocket.java:290)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:465)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
- locked <0x00000000bc87ad70> (a sun.net.www.protocol.https.HttpsClient)
at sun.net.www.protocol.https.HttpsClient.(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:167)
at net.elastica.discoveryutil.ServiceDiscovery.getHttpsConnection(ServiceDiscovery.java:169)
at net.elastica.discoveryutil.ServiceDiscovery.getServiceFromIP(ServiceDiscovery.java:183)
at net.elastica.audit.ServiceIpToServiceIdResolverJob$ServiceMap$1.run(ServiceIpToServiceIdResolverJob.java:297)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

@medb
Copy link
Member

medb commented Feb 4, 2021

@prbprbprb Maybe this hang is caused by the JDK-8238579 issue?

@cnauroth
Copy link

I think I've found the root cause. It requires a careful reading of the sun.net.www JDK classes and how they interact with Conscrypt classes. For reference, here is the Java version used in my tests:

openjdk version "1.8.0_332"
OpenJDK Runtime Environment (Temurin)(build 1.8.0_332-b09)
OpenJDK 64-Bit Server VM (Temurin)(build 25.332-b09, mixed mode)

All of the following stack traces and code walkthrough will refer to OpenJDK 8u322 and Conscrypt 2.5.2.

Following are stack traces from running a simple test harness that uses HttpsURLConnection with a 20-second read timeout. To simulate a network fault, I'm applying an iptables rule on connections to a GCS IP address and then using /etc/hosts to route all GCS connections to that IP address:

IP=$(nslookup storage.googleapis.com | grep -E 'Address: [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -1 | awk '{ print $NF }')

sudo iptables -A OUTPUT -p tcp --source 10.240.1.19 --destination "${IP}" -m state --state ESTABLISHED -j DROP

cnauroth@cnauroth-conscrypt-on-m:~$ cat /etc/hosts
127.0.0.1	localhost
::1		localhost ip6-localhost ip6-loopback
ff02::1		ip6-allnodes
ff02::2		ip6-allrouters

108.177.111.128 storage.googleapis.com

10.240.0.199 cnauroth-conscrypt-on-m.us-central1-c.c.hadoop-cloud-dev.google.com.internal cnauroth-conscrypt-on-m  # Added by Google
169.254.169.254 metadata.google.internal  # Added by Google

The iptables rule allows for the initial syn/ack to get the connection established, but then all subsequent inbound packets are dropped before reaching the application layer. We'd expect to see the first socket read during the TLS handshake to timeout after 20 seconds. Instead, the read hangs indefinitely.

First, here is the test using Conscrypt, and running jstack on the process after it hangs:

"main" #1 prio=5 os_prio=0 tid=0x00007fc3f800a800 nid=0x54ac runnable [0x00007fc3ff14a000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at org.conscrypt.ConscryptEngineSocket$SSLInputStream.readFromSocket(ConscryptEngineSocket.java:920)
	at org.conscrypt.ConscryptEngineSocket$SSLInputStream.processDataFromSocket(ConscryptEngineSocket.java:884)
	at org.conscrypt.ConscryptEngineSocket$SSLInputStream.access$100(ConscryptEngineSocket.java:706)
	at org.conscrypt.ConscryptEngineSocket.doHandshake(ConscryptEngineSocket.java:230)
	at org.conscrypt.ConscryptEngineSocket.startHandshake(ConscryptEngineSocket.java:209)
	- locked <0x0000000772a610a0> (a java.lang.Object)
	at org.conscrypt.ConscryptEngineSocket.waitForHandshake(ConscryptEngineSocket.java:547)
	at org.conscrypt.ConscryptEngineSocket.getOutputStream(ConscryptEngineSocket.java:290)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:465)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
	- locked <0x0000000772a3da00> (a sun.net.www.protocol.https.HttpsClient)
	at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
	at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:203)
	at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:189)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1572)
	- locked <0x0000000772a39df8> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1500)
	- locked <0x0000000772a39df8> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
	- locked <0x0000000771eadb40> (a sun.net.www.protocol.https.HttpsURLConnectionImpl)
	at TestHttpsURLConnection.main(TestHttpsURLConnection.java:17)

Second, here is the test using the default security provider, resulting in an exception after the 20-second read timeout:

Exception in thread "main" javax.net.ssl.SSLException: Read timed out
	at sun.security.ssl.Alert.createSSLException(Alert.java:127)
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
	at sun.security.ssl.SSLTransport.decode(SSLTransport.java:138)
	at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1397)
	at sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1305)
	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:440)
	at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:197)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1572)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1500)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
	at TestHttpsURLConnection.main(TestHttpsURLConnection.java:18)
Caused by: java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464)
	at sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:165)
	at sun.security.ssl.SSLTransport.decode(SSLTransport.java:109)
	... 9 more

It's interesting to note that both processes are performing TLS handshake, but they are doing the handshake from different points of execution in AbstractDelegateHttpsURLConnection#connect. With Conscrypt, we are on line 189, invoking HttpURLConnection#plainConnect. With the default security provider, we are past that and instead down to line 197, calling HttpsClient#afterConnect.

Execution with Conscrypt proceeds into HttpURLConnection#plainConnect0 line 1162. However, note here that the read timeout is not applied until line 1163, after this completes!

                                http = getNewHttpClient(url, p, connectTimeout);
                                http.setReadTimeout(readTimeout);

Thus, it appears that Conscrypt, unlike the default security provider, is performing the TLS handshake at a time when the read timeout has not yet been applied. It seems like the sun.net.www classes have a hidden assumption that TLS handshake will not be performed until HttpsClient#afterConnect. Conscrypt, instead, triggers the TLS handshake earlier in ConscryptEngineSocket#getOutputStream. The default provider's SSLSocketImpl#getOutputStream does not trigger TLS handshake by side effect like this.

To summarize:

  • The Sun HttpClient seems to assume that it can call getOutputStream() before completing TLS handshake.
  • This assumption is safe with the default security provider.
  • With Conscrypt, the call to getOutputStream() starts the TLS handshake by side effect.
  • HttpURLConnection does not apply the socket read timeout until after this point of execution.
  • Therefore, Conscrypt is performing TLS handshake without the timeout enforced.

It doesn't appear that there is any Java upgrade path that would resolve this issue. Reviewing git diff jdk8u332-ga..master, I don't see any recent changes in sun.net.www that would change this behavior. Similarly, more recent JVMs retain the same behavior, e.g. Java 17:

https://github.com/openjdk/jdk17u/blob/c1e17197222932a03a04d3b8d9c0d7f94be07947/src/java.base/share/classes/sun/net/www/protocol/http/HttpURLConnection.java#L1242

cnauroth added a commit to cnauroth/hadoop-connectors that referenced this issue Aug 23, 2022
Set socket read timeout (`fs.gs.http.read-timeout`) as early as possible
on new sockets returned from the custom `SSLSocketFactory`. This
guarantees the timeout is enforced during TLS handshakes when using
Conscrypt as the security provider.

See also google/conscrypt#864 .
cnauroth added a commit to GoogleCloudDataproc/hadoop-connectors that referenced this issue Aug 30, 2022
Set socket read timeout (`fs.gs.http.read-timeout`) as early as possible
on new sockets returned from the custom `SSLSocketFactory`. This
guarantees the timeout is enforced during TLS handshakes when using
Conscrypt as the security provider.

See also google/conscrypt#864 .
cnauroth added a commit to cnauroth/hadoop-connectors that referenced this issue Aug 30, 2022
Set socket read timeout (`fs.gs.http.read-timeout`) as early as possible
on new sockets returned from the custom `SSLSocketFactory`. This
guarantees the timeout is enforced during TLS handshakes when using
Conscrypt as the security provider.

See also google/conscrypt#864 .

(cherry picked from commit 61f8629)
cnauroth added a commit to GoogleCloudDataproc/hadoop-connectors that referenced this issue Aug 30, 2022
Set socket read timeout (`fs.gs.http.read-timeout`) as early as possible
on new sockets returned from the custom `SSLSocketFactory`. This
guarantees the timeout is enforced during TLS handshakes when using
Conscrypt as the security provider.

See also google/conscrypt#864 .

(cherry picked from commit 61f8629)
cnauroth added a commit to cnauroth/hadoop-connectors that referenced this issue Aug 30, 2022
Set socket read timeout (`fs.gs.http.read-timeout`) as early as possible
on new sockets returned from the custom `SSLSocketFactory`. This
guarantees the timeout is enforced during TLS handshakes when using
Conscrypt as the security provider.

See also google/conscrypt#864 .

(cherry picked from commit 61f8629)
(cherry picked from commit 667836b)
cnauroth added a commit to GoogleCloudDataproc/hadoop-connectors that referenced this issue Sep 6, 2022
Set socket read timeout (`fs.gs.http.read-timeout`) as early as possible
on new sockets returned from the custom `SSLSocketFactory`. This
guarantees the timeout is enforced during TLS handshakes when using
Conscrypt as the security provider.

See also google/conscrypt#864 .

(cherry picked from commit 61f8629)
(cherry picked from commit 667836b)
@Xavron
Copy link

Xavron commented May 13, 2024

Am seeing FX File Explorer (network) client use on Android stuck during org.conscrypt.NativeCrypto.SSL_do_handshake(Native Method).

Samsung Files (network) client also fails here.

For my use case, its showing up with Android to Android and is happening on Android 8 test device. Android 14 test device isn't seeing this (NativeCrypto.SSL_do_handshake(Native Method)) with the exact same code pushed out to it.

If SocketInputStream.socketRead0 is still the same issue then that's what I'm seeing on Android 14. I didn't look to see how that's connected.

Update 1: I randomly noticed today that SocketInputStream.socketRead0 can simply show from handshake timeout being set and happening sslSocket.setSoTimeout(). Where the client side can overcome this as 2 different clients have 2 different results in my case (one works and the other doesn't and so for that one the timeout occurs). For whatever reason, I didn't notice this until now. I don't yet know if this should be definitely be a client problem or not, but in my usage, clients can deal with it.

Update 2: Two clients now working and SocketInputStream.socketRead0 eliminated as a thing from the 2nd. Still have it for some other clients that I now need to look into. I'm definitely so far only seeing this as a client software issue and not as a Conscrypt or Java one. There's going to be a hang if the client causes one :)

Final update: NativeCrypto.SSL_do_handshake(Native Method) also seen fixed by same client problem as in update 2 for the rest of the clients. Should be fixed via client (in my case). Although, I'm using server side workaround to fix the clients. This is not a workaround to post, as the situation I see is different in code and won't be of any help at all. At the same time, both sides need to communicate properly, so if one side expects something and the other expects something and neither give, then this is what I'm seeing cause both of these two to show up. It could be different problem from the issue or the same. Either way, that would be they're the same or there's something else involved and not the title problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants