Dispatched events are not scheduled as tasks #2159

Closed · 1 task done
OrKoN opened this issue Aug 16, 2023 · 18 comments · Fixed by #2160

Comments

OrKoN commented Aug 16, 2023

Is there an existing issue for this?

  • I've searched for any related issues and avoided creating a duplicate issue.

Description

I am not sure whether this is a bug or rather a topic for discussion. I have noticed that there is a subtle difference between ws and the WebSockets standard. The standard says that when a message is received, the agent has to queue a new task. This has an important side effect: the microtask queue is processed after the task is over and before the next task is picked from the queue. This behaviour matters in cases where messages need to be processed in a specific order.

This can be demonstrated by the following server code:

import { WebSocketServer } from 'ws';

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', function connection(ws) {
  ws.send('something1');
  ws.send('something2');
});

And the client code:

import { WebSocket } from 'ws'; // in the browser, omit this line.

const ws = new WebSocket('ws://127.0.0.1:8080');

ws.addEventListener('message', function message(event) {
  console.log('received: %s', event.data);
  new Promise(resolve => {
    resolve();
  }).then(() => {
    console.log('microtask', event.data.toString());
  });
});

If you run this code in Node and in the browser, you should observe that the order of the console messages differs. It's probably not feasible to change the behaviour of ws, as it could be a breaking change, but I wonder if there could be some compatibility mode for this? I believe it's possible to work around it using setTimeout or setImmediate, but the downside is that the same code cannot be used across platforms. If I have overlooked an existing issue about this topic, I apologize.
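
For illustration, here is a minimal sketch of such a workaround in Node (deferring the handler body with setImmediate gives each message its own macrotask, so pending microtasks drain in between):

import { WebSocket } from 'ws';

const ws = new WebSocket('ws://127.0.0.1:8080');

ws.addEventListener('message', function message(event) {
  // Defer handling to a new macrotask; Node drains the microtask queue
  // after each setImmediate callback, restoring the browser ordering.
  setImmediate(() => {
    console.log('received: %s', event.data);
    new Promise(resolve => {
      resolve();
    }).then(() => {
      console.log('microtask', event.data.toString());
    });
  });
});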

ws version

8.13.0

Node.js Version

v20.3.1

System

System:
OS: macOS 13.5
CPU: (10) arm64 Apple M1 Max
Memory: 696.63 MB / 64.00 GB
Shell: 5.9 - /bin/zsh

Expected result

The order of console messages is the same in Node and in the browser and matches the behaviour defined by the standard:

received: something1
microtask something1
received: something2
microtask something2 

Actual result

The order of console messages in Node is different (microtasks are processed after all messages):

received: something1
received: something2
microtask something1
microtask something2

Attachments

No response

lpinca (Member) commented Aug 17, 2023

Yes, ws does not strictly follow the WHATWG specification. I don't think this is a bug and, to be honest, I find the ws behavior less surprising.

To match the browser behavior, only a single 'message' event should be emitted per microtask. We could do this at the parser level, for example by using a Transform stream instead of a Writable stream, pausing it after each 'data' event and resuming it after the queueMicrotask() callback is called. However, this should be thought through carefully, as it might affect performance and have side effects, and I'm not sure it's worth it. I think there is not much code that depends on this behavior.
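
Conceptually, the idea is something like the following sketch (illustrative only, not the actual Receiver code; a hypothetical Transform that delivers at most one message per microtask):

import { Transform } from 'stream';

// Hypothetical sketch: push one parsed message, then defer the callback
// so that microtasks queued by the 'data' listener run before the next
// message is processed.
class OneMessagePerMicrotask extends Transform {
  _transform(message, encoding, callback) {
    this.push(message);
    queueMicrotask(callback);
  }
}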

lpinca (Member) commented Aug 18, 2023

@KhafraDev FYI, the WebSocket implementation in https://github.com/nodejs/undici has the same non-spec-compliant behavior.

lpinca added a commit that referenced this issue Aug 19, 2023
To improve compatibility with the WHATWG standard, emit at most one of
`'message'`, `'ping'`, and `'pong'` events per tick.

Fixes #2159

lpinca (Member) commented Aug 19, 2023

@OrKoN #2160 should fix this, but I did not run any benchmarks and have not thought about possible side effects yet.

OrKoN (Author) commented Aug 21, 2023

@lpinca thanks!

OrKoN (Author) commented Aug 23, 2023

In puppeteer, we worked around the problem by scheduling all event handlers triggered by ws using setImmediate. We have not noticed any performance regressions (though we don't handle a lot of traffic).
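
A simplified sketch of that kind of wrapper (a hypothetical helper, not the actual puppeteer code):

import { WebSocket } from 'ws';

// Every handler registered through this helper runs in its own
// setImmediate callback, so microtasks drain between messages.
function addDeferredListener(ws, type, handler) {
  ws.addEventListener(type, (event) => {
    setImmediate(() => handler(event));
  });
}

const ws = new WebSocket('ws://127.0.0.1:8080');
addDeferredListener(ws, 'message', (event) => {
  console.log('received: %s', event.data);
});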

lpinca (Member) commented Aug 23, 2023

Here are some benchmark results run under WSL 2.

20-byte messages - websockets/ws#8.13.0

$ ./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 59225.000000
Msg/sec: 60525.500000
Msg/sec: 59986.500000
Msg/sec: 59474.250000
Msg/sec: 59477.250000

20-byte messages - websockets/ws#fix/issue-2159

$ ./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 52816.500000
Msg/sec: 54566.250000
Msg/sec: 54582.500000
Msg/sec: 54458.750000
Msg/sec: 54205.500000

1024-byte messages - websockets/ws#8.13.0

$ ./load_test 32 127.0.0.1 8080 0 0 1024
Using message size of 1024 bytes
Running benchmark now...
Msg/sec: 49375.750000
Msg/sec: 49289.250000
Msg/sec: 49235.000000
Msg/sec: 48787.750000
Msg/sec: 49216.750000

1024-byte messages - websockets/ws#fix/issue-2159

$ ./load_test 32 127.0.0.1 8080 0 0 1024
Using message size of 1024 bytes
Running benchmark now...
Msg/sec: 47471.500000
Msg/sec: 46715.250000
Msg/sec: 46860.250000
Msg/sec: 47072.250000
Msg/sec: 47157.500000

load_test is this tool: https://github.com/uNetworking/uWebSockets/blob/master/benchmarks/load_test.c. The server is a simple echo server.

import { WebSocketServer } from 'ws';

const server = new WebSocketServer({ port: 8080 }, function () {
  console.log('Server listening on *:8080');
});

server.on('connection', function (ws) {
  ws.on('message', function (data, isBinary) {
    ws.send(data, { binary: isBinary });
  });
});

The impact for small messages (20 bytes) is noticeable (~10%).

lpinca (Member) commented Aug 23, 2023

On macOS with 250 clients I see no significant overhead. If you are using Linux on bare metal, may I ask you to run some benchmarks like the ones above (increasing the number of clients up to 1000 might make sense)? Thank you.

OrKoN (Author) commented Aug 24, 2023

Unfortunately, I don't have Linux on bare metal.

OrKoN (Author) commented Aug 24, 2023

These are the numbers I get on an Intel MacBook (Node 20):


# 8.13.0

./load_test 32 127.0.0.1 8080 0 0 1024
Using message size of 1024 bytes
Running benchmark now...
Msg/sec: 34488.250000
Msg/sec: 34700.750000
Msg/sec: 34885.500000
Msg/sec: 34716.750000
Msg/sec: 31545.000000
Msg/sec: 34272.500000
Msg/sec: 34531.500000

# fix

./load_test 32 127.0.0.1 8080 0 0 1024
Using message size of 1024 bytes
Running benchmark now...
Msg/sec: 36548.500000
Msg/sec: 33161.750000
Msg/sec: 32867.000000
Msg/sec: 32287.500000
Msg/sec: 32432.750000
Msg/sec: 31692.500000
Msg/sec: 32399.750000
Msg/sec: 32129.500000

# 8.13.0

./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 46923.750000
Msg/sec: 42516.000000
Msg/sec: 41810.500000
Msg/sec: 42102.500000
Msg/sec: 42115.000000
Msg/sec: 42091.000000

# fix

./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 38408.500000
Msg/sec: 38996.250000
Msg/sec: 39031.250000
Msg/sec: 39019.750000
Msg/sec: 38887.500000
Msg/sec: 37871.500000
Msg/sec: 38127.000000

lpinca (Member) commented Aug 24, 2023

It seems to be consistent with my results (~10% impact).

We could make this behavior optional via an option that can be passed to the server and the client, like the maxPayload option, but:

  1. I don't like it.
  2. I have no idea what to call the option. Suggestions are welcome.

Another possibility is an environment variable like WS_NO_BUFFER_UTIL and WS_NO_UTF_8_VALIDATE, but I'm not very happy with that either.

OrKoN (Author) commented Aug 24, 2023

My opinion would be that it's better to match the spec by default and have performance optimisations as an opt-in. Also, I think the change does not have to apply to the server implementation as only the client-side events are spec'ed. I assume that clients handle fewer messages than servers, so it might have a lower impact on performance and would match browsers' implementations. Maybe if the change applies only to clients, there would be no need to make this behaviour configurable?

lpinca (Member) commented Aug 24, 2023

My opinion would be that it's better to match the spec by default and have performance optimisations as an opt-in.

I'm not sure I agree. The current behavior is not a bug and there are other areas where we diverge from the WHATWG spec. A client in another language with a completely different interface is still a valid client as long as it follows RFC 6455, RFC 7692, etc. We try to be compatible with the WHATWG spec for convenience.

Also, I think the change does not have to apply to the server implementation as only the client-side events are spec'ed.

The same WebSocket class is used by both the server and the client. We can make only WebSocket clients have this behavior but it does not make much sense in my opinion. It would be inconsistent and confusing.

OrKoN (Author) commented Aug 24, 2023

I'm not sure I agree. The current behavior is not a bug and there are other areas where we diverge from the WHATWG spec. A client in another language with a completely different interface is still a valid client as long as it follows RFC 6455, RFC 7692, etc. We try to be compatible with the WHATWG spec for convenience.

That makes sense to me too.

The same WebSocket class is used by both the server and the client. We can make only WebSocket clients have this behavior but it does not make much sense in my opinion. It would be inconsistent and confusing.

Yeah, it might be confusing if both client and server are used (especially within one codebase). So far we have only needed it for the client.

lpinca (Member) commented Aug 24, 2023

Yes, that is basically my point. In all these years, this is the first time this has been reported, and as written in my first comment, I don't think there is much code that depends on this behavior. This is why I would prefer to make it opt-in. On the other hand, making it opt-in kind of defeats the compatibility goal, as the code would not be portable. I wonder if there is some way to reduce the performance impact.

lpinca (Member) commented Aug 24, 2023

Surprisingly, this

diff --git a/lib/receiver.js b/lib/receiver.js
index c7dcf8a..f4497b4 100644
--- a/lib/receiver.js
+++ b/lib/receiver.js
@@ -165,17 +165,12 @@ class Receiver extends Writable {
           // `WAIT_MICROTASK`.
           this._loop = false;
 
-          const next = () => {
+          // `queueMicrotask()` is a little slower and is not available in
+          // Node.js < 11.
+          Promise.resolve().then(() => {
             this._state = GET_INFO;
             this.startLoop(cb);
-          };
-
-          if (typeof queueMicrotask === 'function') {
-            queueMicrotask(next);
-          } else {
-            // `queueMicrotask()` is not available in Node.js < 11.
-            Promise.resolve().then(next);
-          }
+          });
 
           return;
         }

is a little faster. I'll keep investigating.
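
For a rough comparison of the two deferral primitives in isolation, something like this micro-benchmark can be used (illustrative only, not the load_test numbers above; results vary across Node.js versions):

// Measures how long it takes to defer and await `iterations` callbacks
// with a given scheduling primitive.
async function bench(label, defer, iterations = 1e6) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    await new Promise(resolve => defer(resolve));
  }
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(0)} ms`);
}

await bench('queueMicrotask', queueMicrotask);
await bench('Promise.resolve().then()', fn => Promise.resolve().then(fn));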

OrKoN (Author) commented Aug 25, 2023

Some numbers from a Linux VM:

# main
./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 40121.250000
Msg/sec: 44508.750000
Msg/sec: 43910.750000
Msg/sec: 41611.250000
Msg/sec: 44852.750000
Msg/sec: 37529.750000
Msg/sec: 37837.500000
Msg/sec: 35302.250000
Msg/sec: 35914.750000
Msg/sec: 37568.250000

# https://github.com/websockets/ws/pull/2160
./load_test 32 127.0.0.1 8080 0 0
Using message size of 20 bytes
Running benchmark now...
Msg/sec: 42166.250000
Msg/sec: 43883.000000
Msg/sec: 39499.750000
Msg/sec: 40256.250000
Msg/sec: 44213.000000
Msg/sec: 40567.500000
Msg/sec: 36763.500000
Msg/sec: 36296.000000
Msg/sec: 38419.750000
Msg/sec: 35596.750000

OrKoN (Author) commented Aug 25, 2023

I quickly tried wrapping all emit calls in setImmediate to schedule them for the next event loop cycle, and it seems to perform better (with a caveat: I am not 100% sure I did it correctly, and I only did it for emit('message')).

lpinca (Member) commented Aug 25, 2023

Simply deferring the 'message' event to the next iteration of the event loop does not stop the parser, and this makes me a bit uncomfortable, as it might have unintended consequences. Anyway, I noticed that the impact shrinks as the number of clients grows. If it is in the order of ~5%, it is acceptable.

lpinca added a commit that referenced this issue Aug 26, 2023
To improve compatibility with the WHATWG standard, emit at most one of
`'message'`, `'ping'`, and `'pong'` events per tick.

Fixes #2159
lpinca added a commit that referenced this issue Aug 28, 2023
To improve compatibility with the WHATWG standard, emit at most one of
`'message'`, `'ping'`, and `'pong'` events per tick.

Fixes #2159