WebSockets Next: performance improvements #39148
Comments
#40183 is related to this issue. |
For this one, we'll need some benchmarks (executed automatically). Ideally, to compare WS next with CC @franz1981 |
I'm not aware of websocket benchmark suites sadly... |
Me neither. We'll need to come up with something... ;-) |
Let's talk this week in a call and we can think about something |
NOTE: This extension contains a bunch of lambdas. We should consider rewriting those lambdas to anonymous classes. |
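For illustration only, here is a minimal sketch of what such a rewrite looks like. The class and field names are hypothetical and not taken from the extension. A lambda is linked at runtime through `invokedynamic`, while an anonymous class is compiled to a regular class file; both behave the same for the caller.

```java
import java.util.function.Function;

public class LambdaVsAnonymous {

    // Lambda form: the call site is linked lazily at runtime (invokedynamic).
    static final Function<String, String> ECHO_LAMBDA = msg -> "echo: " + msg;

    // Anonymous-class form: compiled ahead of time to its own class file.
    static final Function<String, String> ECHO_ANONYMOUS = new Function<String, String>() {
        @Override
        public String apply(String msg) {
            return "echo: " + msg;
        }
    };

    public static void main(String[] args) {
        // Both variants produce the same result for the same input.
        System.out.println(ECHO_LAMBDA.apply("hi"));
        System.out.println(ECHO_ANONYMOUS.apply("hi"));
    }
}
```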
We need to think about scenarios for testing performance. Response time is not a meaningful metric here; the number of messages and connections is more relevant in this case. (Of course, memory is important too.) |
Yeah, although if you can achieve 10K msg/sec but with a single outlier at 10 seconds of latency, that is something you'd want to know. I have contacted the Jetty team (Simone Bordet), and they have rolled out a coordinated-omission-free, distributed (if required) load generator for websocket; let's see what we can do in Hyperfoil or by reusing that tool, which is used for this exact purpose |
Oh, that would be good! |
And clearly it does not cover websocket, see https://github.com/jetty-project/jetty-load-generator :"( Which means that we should prioritize supporting websockets in Hyperfoil or find a different benchmarking tool (one that is coordinated-omission free, which is not easy) |
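To make the coordinated-omission point concrete, here is a small sketch with made-up numbers (the class, method names and timings are assumptions for illustration, not part of any tool mentioned here). A load generator that waits for each response before sending the next one measures latency from the actual send time, which hides the queueing caused by a stall. Measuring from the intended send time at a fixed rate exposes it.

```java
public class CoordinatedOmission {

    /** Latency measured from the moment each request was actually sent. */
    static long[] naiveLatencies(long[] sentAt, long[] completedAt) {
        long[] out = new long[sentAt.length];
        for (int i = 0; i < sentAt.length; i++) {
            out[i] = completedAt[i] - sentAt[i];
        }
        return out;
    }

    /** Latency measured from the moment each request was scheduled at a fixed rate. */
    static long[] correctedLatencies(long startAt, long intervalMillis, long[] completedAt) {
        long[] out = new long[completedAt.length];
        for (int i = 0; i < completedAt.length; i++) {
            long intendedSend = startAt + i * intervalMillis;
            out[i] = completedAt[i] - intendedSend;
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical run: target rate is one request every 100 ms.
        // Request 0 stalls for 1000 ms; the following ones take 10 ms each,
        // but could only be sent once the previous response had arrived.
        long[] sentAt = {0, 1000, 1010, 1020};
        long[] completedAt = {1000, 1010, 1020, 1030};

        System.out.println(java.util.Arrays.toString(naiveLatencies(sentAt, completedAt)));
        System.out.println(java.util.Arrays.toString(correctedLatencies(0, 100, completedAt)));
    }
}
```

In this example the naive view reports one slow request and three fast ones, while the corrected view shows that every request scheduled during the stall was delayed.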
For load tests where we don't care about coordinated-omission and throughput, we could try to use the Gatling WebSocket protocol or even a simple Vert.x client. |
I used Gatling in the past. |
So I started to play with a simple Vertx client, Gatling, etc. here: https://github.com/mkouba/ws-next-perf. And it seems that at moderate load the performance of Apparently the biggest problem is switching to a worker thread because the tested @franz1981 Could you pls take a look at the attached flamegraph? |
I see there's a fun problem with important note:
The last point is very key to understand that if users are making use of the worker thread pool they are supposed to perform blocking operations (in the form of 10/100 ms work each), this will guarantee 2 effects:
As usual, I love that you're so proactive and quick to react @mkouba, thanks again for taking both the time and effort for the test + collecting data: this will make it so much easier for me to help! ❤ |
Yes, I noticed this part as well.
That would be great.
It depends. I don't think that all callbacks with a blocking signature will execute code that would block the thread. But for sure, we need more scenarios that would cover all common use cases. Currently, we only call Thanks Franz! |
FYI I've just noticed the following sentence in the javadoc of And also "The internal state is protected using the synchronized keyword. If always used on the same event loop, then we benefit from biased locking which makes the overhead of synchronized near zero.". So obviously, it's not optimized for the blocking use case ;-). |
@mkouba Yep, and ideally this could be improved in Vertx 5, but there is still some low-hanging fruit in Vertx 4 which we can easily explore if it is worthwhile, i.e. franz1981/vert.x@3ca72f8. What it does is fairly simple, and it is based on the analysis I've performed for https://github.com/franz1981/java-puzzles/blob/583d468a58a6ecaa5e7c7c300895392638f688dd/src/main/java/red/hat/puzzles/concurrent/LockCoarsening.java#L76-L85 which is the motivation behind the vertx 5 changes in this regard. |
FYI : this part in Vertx 5 has been rewritten, so this analysis does not hold for it |
Unfortunately, it does not seem to be an easy task to switch the @cescoffier @vietj Any tip how to try this out? |
Hey Julien, do you have some benchmarks in Vert.x to test the performance of WebSockets server/client? |
What I would do is cherry-pick the commit onto the right vertx tag, run mvn install, and either replace the jar in the lib of quarkus or hope that the local mvn repo will do the right thing(TM). I have found another good improvement to fix the buffer copies too, which I can send to vertx 5 regardless |
Ah, ofc. This worked. And quick and dirty results seem to be much better, comparable to |
@mkouba ok, so this seems a painless change if @vietj and @cescoffier agree and you see benefits. |
Do you have a link to the commit to cherry-pick? |
|
The commit looks good. It avoids entering synchronized blocks. I'm not sure of the various assertions. Let's see what @vietj says. |
@cescoffier What committee? 😆 |
Yep @cescoffier, the checks on asserts should be enabled in both the quarkus and vertx maven surefire tests to make sure the new methods are not misused, while still not impacting the hot path at runtime (asserts are fully removed) |
I have created franz1981/vert.x@9a0f516 to fix the buffer problem mentioned a few comments earlier, too |
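A rough sketch of the assert-guarded pattern described above, using hypothetical names (this is not the code from the Vert.x commit). The thread check is only evaluated when the JVM runs with assertions enabled (`-ea`); Java disables assertions by default, so the unsynchronized fast path pays nothing for the check in production.

```java
public class OwnerThreadCounter {

    private final Thread owner;
    // Deliberately not synchronized: only the owner thread may touch it.
    private long count;

    OwnerThreadCounter(Thread owner) {
        this.owner = owner;
    }

    boolean inOwnerThread() {
        return Thread.currentThread() == owner;
    }

    /** Lock-free increment; misuse from another thread is caught only when run with -ea. */
    long unsafeIncrement() {
        assert inOwnerThread() : "called from " + Thread.currentThread().getName();
        return ++count;
    }

    public static void main(String[] args) {
        OwnerThreadCounter counter = new OwnerThreadCounter(Thread.currentThread());
        counter.unsafeIncrement();
        System.out.println(counter.unsafeIncrement());
    }
}
```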
FYI I'm working on a pull request to disable CDI request context activation for endpoint callbacks unless really needed, i.e. an endpoint has a |
Description
Follow-up of #39142.
Implementation ideas
AtomicLongFieldUpdater in the ConcurrencyLimiter: Initial version of the new declarative WebSocket server API #39142 (comment)
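As a sketch of the AtomicLongFieldUpdater idea, assuming a simple in-flight counter (the `SimpleLimiter` class below is hypothetical and is not the actual ConcurrencyLimiter): the updater is a single static object operating on a plain `volatile long` field, so each instance avoids allocating a separate `AtomicLong`.

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;

public class SimpleLimiter {

    // One shared updater for all instances instead of one AtomicLong per instance.
    private static final AtomicLongFieldUpdater<SimpleLimiter> IN_FLIGHT =
            AtomicLongFieldUpdater.newUpdater(SimpleLimiter.class, "inFlight");

    // Must be a non-static volatile long for the field updater to work.
    private volatile long inFlight;
    private final long max;

    SimpleLimiter(long max) {
        this.max = max;
    }

    /** Tries to reserve a slot; returns false when the limit is reached. */
    boolean tryAcquire() {
        for (;;) {
            long current = inFlight;
            if (current >= max) {
                return false;
            }
            if (IN_FLIGHT.compareAndSet(this, current, current + 1)) {
                return true;
            }
        }
    }

    /** Frees a previously acquired slot. */
    void release() {
        IN_FLIGHT.decrementAndGet(this);
    }

    long inFlight() {
        return inFlight;
    }

    public static void main(String[] args) {
        SimpleLimiter limiter = new SimpleLimiter(1);
        System.out.println(limiter.tryAcquire());
        System.out.println(limiter.tryAcquire());
    }
}
```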