Resteasy Reactive first POST with InputStream hang #15479
Comments
/cc @FroMage, @geoand, @stuartwdouglas |
I added a new endpoint that calls the POST using a JAX-RS client and simulates the problem. Just launch the application in debug mode and call this first:
you will see Vert.x complaining about a blocked thread. |
Thanks for the all information. I'll take a look soon. |
I just tried the sample project and I was not able to reproduce the issue. Is there anything special I need to know? |
Hi Georgios, I'm pleased that you took charge of this issue. In the real application it is very easy to reproduce: you can build the jar and run it, and it happens not only on my workstation but for everyone on the team. The reproducer project is a little trickier: you need to start it in debug mode (I added the suspended configuration to the plugin) and then attach the debugger to let the application start up. Doing this, I'm able to reproduce it almost always. I'll now try to add some context by giving you some dumps taken during debugging. |
I tried again multiple times and could not reproduce the issue. Is it possible that in the real world your application is receiving traffic before Quarkus has completely started? |
Well, we actually reproduced it by starting the application on our workstation and sending a POST
after the Quarkus started message.
|
@stuartwdouglas can you try and reproduce this please? |
Thanks @geoand, I'll now ask a colleague who has no context on this whether he can reproduce it. |
I debugged it, and when the problem occurs this method raises the exception:
but it is not true that the request has been fulfilled; the business method has not been called yet. Then the exception handler inside the org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run() method doesn't print anything, the thread is ready to process the next task, but the socket is still open! |
More context: adding the @Blocking annotation resolves the problem for the test case and also for the real application. I don't want to make it blocking, so I'll debug a little more to understand what is going on. |
Hi guys, I pushed a better reproducer. Now you can run the generated quarkus-run.jar directly and call the reproducer URL: GET https://localhost:9081/resources/test. You no longer need to attach a debugger to trigger the problem. I built this by observing the problem in production: it happens only when the POST is called after a GET (the GET, of course, has no body). I'll check further to see whether I can understand what is going on, but I admit that I don't understand well what the reactive part is doing under the hood. |
Thanks, I'll take a look at the updated reproducer soon |
I was able to reproduce the problem (it only happens if you hit |
Good @geoand, that is a "fake" resource that simulates what I do in production to trigger it: a GET and then a POST with a body. Fingers crossed. I'm worried about the stress test, I mean whether this really happens only the first time or whether it can also happen under heavy load. I'll let you know. |
There seems to be some kind of race condition in Vert.x or Vert.x + Quarkus because I can reproduce the same problem (sometimes) with plain reactive routes (just add the following class to the reproducer above, and add

```java
import io.quarkus.vertx.web.Body;
import io.quarkus.vertx.web.Route;
import io.quarkus.vertx.web.RouteBase;
import io.smallrye.mutiny.Uni;
import io.vertx.ext.web.RoutingContext;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;

import java.io.IOException;
import java.util.Collections;
import java.util.List;

import static io.vertx.core.http.HttpMethod.GET;
import static io.vertx.core.http.HttpMethod.POST;

@RouteBase(path = "/test2", produces = "application/json")
public class GreetingRoute {

    @Route(methods = GET, path = "/1")
    public List<TestDTO> getAll() {
        System.out.println("Returning empty");
        return Collections.emptyList();
    }

    @Route(methods = POST, path = "/2")
    public Uni<TestDTO> save(@Body TestDTO n) {
        System.out.println("Received input");
        Uni<TestDTO> res = Uni.createFrom().item(n);
        System.out.println("Returning uni");
        return res;
    }

    @Route(methods = GET, path = "")
    public void call(RoutingContext rc) throws IOException {
        OkHttpClient client = new OkHttpClient();

        System.out.println("Performing GET call");
        Request request = new Request.Builder()
                .url("http://localhost:8082/resources/test2/1")
                .header("Authorization", "Basic YWRtaW46YWRtaW4=")
                .header("Accept", "application/json")
                .get()
                .build();
        client.newCall(request).execute();
        System.out.println("Done with GET call");

        System.out.println("Performing POST call");
        request = new Request.Builder()
                .url("http://localhost:8082/resources/test2/2")
                .header("Authorization", "Basic YWRtaW46YWRtaW4=")
                .header("Accept", "application/json")
                .post(RequestBody.create(okhttp3.MediaType.get("application/json"), "{}"))
                .build();
        client.newCall(request).execute();
        System.out.println("Done with POST call");

        rc.response().end();
    }
}
```

Then with the I am not really sure what to make of this, @stuartwdouglas, @cescoffier if you have any ideas, I would be grateful :) |
@geoand in your example, you are blocking the event loop, as |
@cescoffier yes I know. With But now that I think about it, what I am seeing might be normal... Because I am blocking the event loop, the request to the application (over the network) could be attempted to be handled by the same event-loop thread - which is why it sometimes works (when not using the same thread) and sometimes doesn't (when the attempt is made to handle it from the same event loop thread). |
@masini does the reproducer represent the actual scenario you are seeing? |
@geoand exactly, if you block the event loop, it can't make progress. The other endpoint may be called on the same event loop or not, that's random. |
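The "same event loop or not" point can be illustrated with a small model in plain JDK code. This is only a sketch, not Vert.x itself: each single-threaded executor stands in for one event loop, and the class and method names (`EventLoopBlockingModel`, `nestedCallCompletes`) as well as the 500 ms timeout are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model: one single-threaded executor per "event loop".
class EventLoopBlockingModel {

    /**
     * Runs an outer handler on loop {@code callerLoop} that blocks while waiting
     * for a nested request dispatched to loop {@code targetLoop}.
     * Returns true if the nested request completed before the timeout.
     */
    static boolean nestedCallCompletes(int loops, int callerLoop, int targetLoop) throws Exception {
        List<ExecutorService> eventLoops = new ArrayList<>();
        for (int i = 0; i < loops; i++) {
            eventLoops.add(Executors.newSingleThreadExecutor());
        }
        try {
            Future<Boolean> outer = eventLoops.get(callerLoop).submit(() -> {
                // The nested request is queued on the target loop...
                Future<String> inner = eventLoops.get(targetLoop).submit(() -> "response");
                try {
                    // ...and the outer handler blocks its own loop thread while waiting.
                    inner.get(500, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    // The target loop never got a chance to run the nested request.
                    return false;
                }
            });
            return outer.get();
        } finally {
            for (ExecutorService loop : eventLoops) {
                loop.shutdownNow();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("same loop completes:      " + nestedCallCompletes(2, 0, 0));
        System.out.println("different loop completes: " + nestedCallCompletes(2, 0, 1));
    }
}
```

In this model the nested call only times out when it is assigned to the loop that is already blocked, which matches the "sometimes works, sometimes doesn't" behaviour described above.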
Yeah, I was just surprised that because there are multiple event loop threads, I didn't expect to run into the problem easily. |
@geoand in the production scenario the HTTP call comes from an IoT device, so maybe the "test" endpoint is not a good model. But I'm able to reproduce it by calling "test/1" and "test/2" in sequence, without the HTTP client. |
Yes, this is exactly my point. That the use of the client is problematic in itself. |
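One way to see why a blocking client inside the handler is the problematic part is to move the blocking call off the event loop thread. The following is a rough JDK-only sketch under stated assumptions: the single-threaded executor stands in for one event loop, the fixed pool stands in for a worker pool, and all names (`WorkerOffloadModel`, `blockingNestedCall`, `handleWithOffload`) are hypothetical. It is not how Quarkus or Vert.x implement worker offloading.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical model of offloading a blocking call to a worker pool.
class WorkerOffloadModel {

    // Stand-in for a blocking HTTP call back into the same application:
    // the nested request is queued on the event loop and we wait for the answer.
    static String blockingNestedCall(ExecutorService eventLoop) {
        try {
            return eventLoop.submit(() -> "response").get(500, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("nested call did not complete", e);
        }
    }

    static String handleWithOffload() throws Exception {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        ExecutorService workers = Executors.newFixedThreadPool(2);
        try {
            // The outer handler runs on the event loop but only schedules the
            // blocking work on a worker thread and returns immediately.
            Callable<CompletableFuture<String>> outerHandler =
                    () -> CompletableFuture.supplyAsync(() -> blockingNestedCall(eventLoop), workers);
            CompletableFuture<String> pending = eventLoop.submit(outerHandler).get();
            // The event loop is free again, so it can serve the nested request.
            return pending.get(2, TimeUnit.SECONDS);
        } finally {
            workers.shutdownNow();
            eventLoop.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handleWithOffload());
    }
}
```

Even with a single event loop, the nested request completes here because the loop thread never waits on it.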
How can I help you? This is a real problem on our side. |
I understand, but unless we have a reliable and properly representative way to reproduce the problem, there is little we can do |
Hi @geoand, I also cloned and reproduced the error by following these steps:
|
I'll try it again, but that is what I was doing and could never reproduce the problem |
Here is what I got when running your script:
|
Thank you @geoand for trying. I retried just now and got a lot of timeouts, but I notice that your environment is a lot faster than my dev workstation. What are you using? I'll try on Linux, since the production server runs on that and not on Mac OS X. |
It uses a Ryzen 3950x CPU
👌 |
Describe the bug
Using resteasy-reactive, the first POST with an input stream hangs forever
Expected behavior
Return the payload
Actual behavior
The socket stays connected forever if the client doesn't implement a timeout
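As a client-side safeguard (it does not address the server-side hang), a caller can bound how long it waits. Below is a minimal sketch using the JDK 11 `java.net.http.HttpClient`; the class name, the `/hang` path and the tiny local server that never answers are assumptions made only so the example is self-contained.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

// Hypothetical sketch: a POST that gives up instead of waiting forever.
class ClientTimeoutSketch {

    /** POSTs {@code body}; returns the status code, or -1 if no response arrived in time. */
    static int postWithTimeout(URI uri, String body, Duration timeout) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(timeout)   // bounds connection establishment
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(timeout)          // bounds the wait for the response
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        try {
            return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
        } catch (HttpTimeoutException e) {
            return -1;
        }
    }

    /** Stand-in for the stuck endpoint: accepts the request and never responds. */
    static HttpServer startHangingServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/hang", exchange -> { /* intentionally no response */ });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startHangingServer();
        try {
            URI uri = URI.create("http://localhost:" + server.getAddress().getPort() + "/hang");
            System.out.println("status: " + postWithTimeout(uri, "{}", Duration.ofMillis(300)));
        } finally {
            server.stop(0);
        }
    }
}
```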
To Reproduce
Clone this repo:
https://github.com/masini/test-case-resteasy-reactive-first-post-bug
Steps to reproduce the behavior:
Configuration
Environment (please complete the following information):
uname -a:
Darwin WMICTLM1P.lan 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64
java -version:
openjdk version "11.0.10" 2021-01-19
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.10+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.10+9, mixed mode)
mvnw --version:
Maven home: /Users/ictlm1/.m2/wrapper/dists/apache-maven-3.6.3-bin/1iopthnavndlasol9gbrbg6bf2/apache-maven-3.6.3
Java version: 11.0.10, vendor: AdoptOpenJDK, runtime: /Applications/sviluppo/java/jdk-11.0.10+9/Contents/Home
Default locale: en_GB, platform encoding: UTF-8
OS name: "mac os x", version: "10.15.7", arch: "x86_64", family: "mac"
Additional context
This is a reproducer of a real problem in a real application.
In the real application this also happens without a debugger and with the quarkus-run.jar; the reproducer, instead, reproduces it only with a debugger attached, and not always. I think it's a matter of timing, but I'm not sure; I need to debug more.