Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does net.Socket write() method always fire callback in next tick of even loop? #1504

Closed
ORESoftware opened this issue Sep 18, 2018 · 6 comments

Comments

@ORESoftware
Copy link

ORESoftware commented Sep 18, 2018

Say we create a new socket connection:

const s = net.createConnection({port});

and we call write:

s.write('foo\n', () => {
   // does this callback always fire in the next tick of the event loop?
});

I am wondering if the "success" callback always fires in the next tick of the event loop? Wondering if it's always async, I would assume so, but no 100% sure. I am also wondering why there is never an error passed to the callback? Looks like the API for the callback has no arguments.

@addaleax
Copy link
Member

The callback can be called later, if data has queued up and can’t be sent yet. So, if you write a lot of data in a close loop without waiting for the callback first, this can definitely occur. (Windows buffers a lot more data internally than other OS do, so it will take longer for this to happen then, at the cost of increased memory usage.)

I think the callback can be called with an error if a write failed, but I’m not sure how often that happens – most of the time, a broken socket would lead to a regular 'error' first.

@ORESoftware
Copy link
Author

yeah, to be sure I have just used this:

s.write('foo\n', () => {
   process.nextTick(cb);
});

instead of

s.write('foo\n', cb);

@addaleax
Copy link
Member

Oh – sorry, I think I might have misunderstood. The callback can be called later than the next tick of the event loop, but it will never be called synchronously (similar to setImmediate()).

@ORESoftware
Copy link
Author

ORESoftware commented Sep 19, 2018

Got it, yeah the OP was just about whether it always fires asynchronously, sounds like that is the case, thanks. Obviously not every callback in Node.js core fires asynchronously, sometimes synchronously, such as in the streams API. I would be cool to mark functions in the docs with 3 different types:

  1. always sync
  2. sometimes sync sometimes async
  3. always aysnc

that would be a nice way to improve callback API docs

@addaleax
Copy link
Member

@ORESoftware Just fyi, the 2. option is almost never used (the only exception is the inspector module) – it’s usually too hard to get right for users. Most of our callbacks are always-async, so I guess it might make sense to mark the other two types?

@ORESoftware
Copy link
Author

ORESoftware commented Sep 21, 2018

@addaleax as I was saying the steam API definitely has a mixture, for example on 'data' callbacks etc, event emitters in general are mixture. the socket.write() callback is a great example of an ambiguous one, is it 1, 2 or 3?

EdmondChuiHW added a commit to EdmondChuiHW/react-native that referenced this issue Jun 9, 2024
…4811)

Summary:
Pull Request resolved: facebook#44811

Changelog:
[General][Added] - New messages for debugger disconnections
[General][Changed] - Temporary disconnections with the debugger no longer terminates the debugger immediately

The debugger is currently disconnected if a ping-pong message is missed.

This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188

There are a few ways to fix this as discused with motiz88 and robhogan:

1. Ensure the websocket has a chance to respond, e.g. in via web worker
1. Lengthen the time allowed for the pong resopnse

I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React.

Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond.

# V2

So upon further investigation, CDT websocket is actually responding to the pings in due time:

{F1679132204}

(CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing)

Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping:

https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307

If CDT doesn't respond in time, websocket would be terminated.

But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else.

The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket.

We can confirm this by passing a "sent" callback via `socket.ping(cb)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254

This gives us the timing between calling `socket.ping()` and when the ping is actually sent.

Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond:

 {F1679223326}

With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases.

However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds:

 {F1679163335}

This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware.

Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs:

https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157

This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message.

Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349

…which is guaranteed to fire the callback asynchronously:

nodejs/help#1504 (comment)

Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam.

The fix is simple:

1. Start the terminate-socket-timer until the `ping` is actually sent
1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset
    1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback:

 {F1679288626}

# V1

~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~

~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~

Differential Revision: D58220230
EdmondChuiHW added a commit to EdmondChuiHW/react-native that referenced this issue Jun 9, 2024
…4811)

Summary:
Pull Request resolved: facebook#44811

Changelog:
[General][Added] - New messages for debugger disconnections
[General][Changed] - Temporary disconnections with the debugger no longer terminates the debugger immediately

The debugger is currently disconnected if a ping-pong message is missed.

This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188

There are a few ways to fix this as discused with motiz88 and robhogan:

1. Ensure the websocket has a chance to respond, e.g. in via web worker
1. Lengthen the time allowed for the pong resopnse

I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React.

Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond.

# V2

So upon further investigation, CDT websocket is actually responding to the pings in due time:

{F1679132204}

(CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing)

Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping:

https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307

If CDT doesn't respond in time, websocket would be terminated.

But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else.

The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket.

We can confirm this by passing a "sent" callback via `socket.ping(cb)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254

This gives us the timing between calling `socket.ping()` and when the ping is actually sent.

Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond:

 {F1679223326}

With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases.

However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds:

 {F1679163335}

This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware.

Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs:

https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157

This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message.

Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349

…which is guaranteed to fire the callback asynchronously:

nodejs/help#1504 (comment)

Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam.

The fix is simple:

1. Start the terminate-socket-timer until the `ping` is actually sent
1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset
    1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback:

 {F1679288626}

# V1

~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~

~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~

Differential Revision: D58220230
EdmondChuiHW added a commit to EdmondChuiHW/react-native that referenced this issue Jun 10, 2024
…4811)

Summary:
Pull Request resolved: facebook#44811

Changelog:
[General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent

The debugger is currently disconnected if a ping-pong message is missed.

This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188

There are a few ways to fix this as discused with motiz88 and robhogan:

1. Ensure the websocket has a chance to respond, e.g. in via web worker
1. Lengthen the time allowed for the pong resopnse

I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React.

Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond.

# V2+

So upon further investigation, CDT websocket is actually responding to the pings in due time:

{F1679132204}

(CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing)

Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping:

https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307

If CDT doesn't respond in time, websocket would be terminated.

But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else.

The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket.

We can confirm this by passing a "sent" callback via `socket.ping(cb)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254

This gives us the timing between calling `socket.ping()` and when the ping is actually sent.

Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond:

 {F1679223326}

With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases.

However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds:

 {F1679163335}

This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware.

Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs:

https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157

This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message.

Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349

…which is guaranteed to fire the callback asynchronously:

nodejs/help#1504 (comment)

Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam.

The fix is simple:

1. Start the terminate-socket-timer until the `ping` is actually sent
1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset
    1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback:

 {F1679288626}

# V1

~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~

~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~

Differential Revision: D58220230
EdmondChuiHW added a commit to EdmondChuiHW/react-native that referenced this issue Jun 10, 2024
…4811)

Summary:
Pull Request resolved: facebook#44811

Changelog:
[General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent

The debugger is currently disconnected if a ping-pong message is missed.

This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188

There are a few ways to fix this as discused with motiz88 and robhogan:

1. Ensure the websocket has a chance to respond, e.g. in via web worker
1. Lengthen the time allowed for the pong resopnse

I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React.

Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond.

# V2+

So upon further investigation, CDT websocket is actually responding to the pings in due time:

{F1679132204}

(CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing)

Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping:

https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307

If CDT doesn't respond in time, websocket would be terminated.

But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else.

The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket.

We can confirm this by passing a "sent" callback via `socket.ping(cb)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254

This gives us the timing between calling `socket.ping()` and when the ping is actually sent.

Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond:

 {F1679223326}

With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases.

However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds:

 {F1679163335}

This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware.

Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs:

https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157

This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message.

Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349

…which is guaranteed to fire the callback asynchronously:

nodejs/help#1504 (comment)

Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam.

The fix is simple:

1. Start the terminate-socket-timer until the `ping` is actually sent
1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset
    1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback:

 {F1679288626}

# V1

~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~

~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~

Differential Revision: D58220230
EdmondChuiHW added a commit to EdmondChuiHW/react-native that referenced this issue Jun 10, 2024
…4811)

Summary:
Pull Request resolved: facebook#44811

Changelog:
[General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent

The debugger is currently disconnected if a ping-pong message is missed.

This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188

There are a few ways to fix this as discused with motiz88 and robhogan:

1. Ensure the websocket has a chance to respond, e.g. in via web worker
1. Lengthen the time allowed for the pong resopnse

I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React.

Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond.

# V2+

So upon further investigation, CDT websocket is actually responding to the pings in due time:

{F1679132204}

(CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing)

Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping:

https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307

If CDT doesn't respond in time, websocket would be terminated.

But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else.

The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket.

We can confirm this by passing a "sent" callback via `socket.ping(cb)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254

This gives us the timing between calling `socket.ping()` and when the ping is actually sent.

Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond:

 {F1679223326}

With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases.

However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds:

 {F1679163335}

This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware.

Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs:

https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157

This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message.

Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349

…which is guaranteed to fire the callback asynchronously:

nodejs/help#1504 (comment)

Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam.

The fix is simple:

1. Start the terminate-socket-timer until the `ping` is actually sent
1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset
    1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback:

 {F1679288626}

# V1

~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~

~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~

Differential Revision: D58220230
EdmondChuiHW added a commit to EdmondChuiHW/react-native that referenced this issue Jun 12, 2024
…4811)

Summary:
Pull Request resolved: facebook#44811

Changelog:
[General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent

The debugger is currently disconnected if a ping-pong message is missed.

This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188

There are a few ways to fix this as discused with motiz88 and robhogan:

1. Ensure the websocket has a chance to respond, e.g. in via web worker
1. Lengthen the time allowed for the pong resopnse

I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React.

Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond.

# V2+

So upon further investigation, CDT websocket is actually responding to the pings in due time:

{F1679132204}

(CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing)

Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping:

https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307

If CDT doesn't respond in time, websocket would be terminated.

But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else.

The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket.

We can confirm this by passing a "sent" callback via `socket.ping(cb)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254

This gives us the timing between calling `socket.ping()` and when the ping is actually sent.

Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond:

 {F1679223326}

With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases.

However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds:

 {F1679163335}

This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware.

Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs:

https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157

This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message.

Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349

…which is guaranteed to fire the callback asynchronously:

nodejs/help#1504 (comment)

Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam.

The fix is simple:

1. Start the terminate-socket-timer until the `ping` is actually sent
1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset
    1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback:

 {F1679288626}

# V1

~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~

~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~

Differential Revision: D58220230
EdmondChuiHW added a commit to EdmondChuiHW/react-native that referenced this issue Jun 12, 2024
…4811)

Summary:
Pull Request resolved: facebook#44811

Changelog:
[General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent

The debugger is currently disconnected if a ping-pong message is missed.

This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188

There are a few ways to fix this as discused with motiz88 and robhogan:

1. Ensure the websocket has a chance to respond, e.g. in via web worker
1. Lengthen the time allowed for the pong resopnse

I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React.

Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond.

# V2+

So upon further investigation, CDT websocket is actually responding to the pings in due time:

{F1679132204}

(CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing)

Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping:

https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307

If CDT doesn't respond in time, websocket would be terminated.

But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else.

The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket.

We can confirm this by passing a "sent" callback via `socket.ping(cb)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254

This gives us the timing between calling `socket.ping()` and when the ping is actually sent.

Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond:

 {F1679223326}

With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases.

However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds:

 {F1679163335}

This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware.

Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs:

https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157

This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message.

Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349

…which is guaranteed to fire the callback asynchronously:

nodejs/help#1504 (comment)

Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam.

The fix is simple:

1. Start the terminate-socket-timer until the `ping` is actually sent
1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset
    1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback:

 {F1679288626}

# V1

~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~

~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~

Reviewed By: huntie

Differential Revision: D58220230
facebook-github-bot pushed a commit to facebook/react-native that referenced this issue Jun 12, 2024
Summary:
Pull Request resolved: #44811

Changelog:
[General][Fixed] - Debugger frontend socket-termination countdown now begins after the ping message is actually sent

The debugger is currently disconnected if a ping-pong message is missed.

This causes the debugger to be unusable if it happens to be lagging, e.g. when the initialisation is competing with the flood of log spam T191394188

There are a few ways to fix this as discused with motiz88 and robhogan:

1. Ensure the websocket has a chance to respond, e.g. in via web worker
1. Lengthen the time allowed for the pong resopnse

I've done some digging to find the root cause of the UI being blocked in CDT, However, profiling shows that most of the work is not simple to break up, i.e. the number of expensive re-layout calls. Diving into that rabbit hole could mean accidentally writing React.

Because we ping every 10 seconds, we could get un/lucky where CDT happens to be busy _at that exact moment_, making this a flaky symptom to fix, even if we lengthen the allowed time-to-respond.

# V2+

So upon further investigation, CDT websocket is actually responding to the pings in due time:

{F1679132204}

(CDT doesn't show the ping/pong API as frames, so a custom tick/tock message was used to visualise the timing)

Over here in dev-middleware, we currently start a timeout to terminate the socket after sending the ping:

https://www.internalfb.com/code/fbsource/[813870db697a8701f2512d25a7fed730f0ec6ed9]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/InspectorProxy.js?lines=306-307

If CDT doesn't respond in time, websocket would be terminated.

But we saw CDT respond immediately above, even during the log spam, so the delay must be coming from somewhere else.

The intuition is that during the log-spam, the middleware takes a perf hit too when it's processing the spam from the device and forwarding it to the CDT websocket.

We can confirm this by passing a "sent" callback via `socket.ping(cb)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/websocket.js#L246-L254

This gives us the timing between calling `socket.ping()` and when the ping is actually sent.

Regular, stress-free operation without log-spam shows most pings are sent within the same millisecond:

 {F1679223326}

With the pong response grace period at 5 seconds, there's plenty of time for CDT to `pong` back. That's why it has been working in most cases.

However, during the log-spam, we easily see this send-sent delay over 5 seconds. In extreme cases, almost 30 seconds would have passed before middleware sent a message to CDT, which then responded under 2 seconds:

 {F1679163335}

This means while CDT is getting flooded and has observable lag in the UI, the smoking gun is actually the middleware.

Digging a little deeper, we know that incoming messages from the target goes into a Promise queue, including the console logs:

https://www.internalfb.com/code/fbsource/[d5d312082e9c]/xplat/js/react-native-github/packages/dev-middleware/src/inspector-proxy/Device.js?lines=155-157

This means during the flood of logs from the target, the Promise queue keeps getting chained rapidly for each message.

Meanhile, the `ws` lib uses the underlying NodeJS `Socket.write` method for `ping(…)` and `send(…)`:

https://github.com/websockets/ws/blob/9bdb58070d64c33a9beeac7c732aac0f4e7e18b7/lib/sender.js#L349

…which is guaranteed to fire the callback asynchronously:

nodejs/help#1504 (comment)

Promise queue is in the macro task queue, which gets priority before the micro task queue. So if the Promise queue is not cleared yet, the websocket queue will have a hard time getting executed in time – explaining the extreme send-sent durations during a log spam.

The fix is simple:

1. Start the terminate-socket-timer until the `ping` is actually sent
1. Treat any incoming message (along with `pong`s) as a terminate-socket-timer reset
    1. This also applies if `pong` comes in between `send` and `sent`, which can happen sometimes due to the async nature of the callback:

 {F1679288626}

# V1

~~In this diff, a more forgiving mechanism is introduced, i.e. CDT is allowed to miss a ping-pong roundtrip 3 times before the websocket connection is terminated.~~

~~This allows a bit more breathing room for CDT's initialisation during log spam while maintaining the same ping-pong interval for VS Code to keep the auto SSH tunnel alive.~~

Reviewed By: huntie

Differential Revision: D58220230

fbshipit-source-id: 7111c9878492d8755a6110a5cdf4ef622265001d
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants