Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v13.x backport] stream: invoke buffered write callbacks on error #31179

Closed

Conversation

ronag
Copy link
Member

@ronag ronag commented Jan 3, 2020

Buffered write callbacks were only invoked upon error if autoDestroy was invoked.

Refs: #30596

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added stream Issues and PRs related to the stream subsystem. v13.x labels Jan 3, 2020
Buffered write callbacks were only invoked upon
error if `autoDestroy` was invoked.

Backport-PR-URL: nodejs#31179
PR-URL: nodejs#30596
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
@BridgeAR
Copy link
Member

BridgeAR commented Jan 3, 2020

@ronag PTAL. Two of the new tests are failing.

@nodejs-github-bot
Copy link
Collaborator

@lpinca
Copy link
Member

lpinca commented Jan 5, 2020

The original PR had 3 TSC approvals so I think it's useless to ping TSC again but is it possible to get more opinions on this behavior before landing? My observations:

@lpinca
Copy link
Member

lpinca commented Jan 5, 2020

Another problem after #31082 is that this calls zlib.flush() callback when the flush write is queued up and the stream is destroyed:

const { createDeflateRaw } = require('zlib');

const data = Buffer.alloc(1024, 'a');
const deflate = createDeflateRaw();

deflate.resume();
deflate.write(data);
deflate.flush(function() {
  throw new Error('Unexpected callback invocation');
});

process.nextTick(function() {
  deflate.close();
});

Without this change zlib.flush() callback is not called so I think this makes it semver-major. Similar considerations can be done for (userland?) streams that work like zlib.

@ronag
Copy link
Member Author

ronag commented Jan 5, 2020

@lpinca: What is your suggestion? Should we revert the original change in 13.x? If #30596 is not reverted then I believe this backport should land.

Though, I think the current consensus going forward is that all callbacks should be invoked? Was this discussed by TSC btw?

In my opinion the flush callback should be invoked with an error. Though I'm unsure whether that is currently the case and whether that is a breaking change.

@lpinca
Copy link
Member

lpinca commented Jan 5, 2020

I think the current consensus going forward is that all callbacks should be invoked? Was this discussed by TSC btw?

I don't know.

Yes in my opinion calling buffered write callbacks with an error when a stream is destroyed/errors is a step in the wrong direction.

  • Not all callbacks must be called. A lot of callbacks are only called in response of a specific event / operation. If that event or operation never occurred its associated callback should not be called in my opinion.
  • I guess the idea behind this is to make things work consistently but I think it creates more problems than it solves. One example is the write after 'connect' above.
  • AFAIK no issues have been created? about buffered write callbacks not being called on error in all these years so I think this behavior is not broken.
  • It is hard to adapt user land code to this new behavior. For example socket.write() will now call all buffered write callbacks with an error if a write error occurs. Doing the same on Node.js < 13 will be hard and hacky for userland libs. They should wait until Node.js 12 goes EOL to have the same behavior across all supported Node.js versions.

@ronag
Copy link
Member Author

ronag commented Jan 5, 2020

My main objection is that in an async API I would personally always expect the callback to be invoked. Otherwise patterns such as:

await new Promise(w.write(chunk, err => err ? reject(err) : resolve());
await new Promise(delate.flush(err => err ? reject(err) : resolve());

Which I would expect to work and would seemingly "randomly" deadlock.

Although I would find it very strange, I don't have a strong objection on this other than it should probably be more explicit in the docs if the callback might not be invoked.

A lot of callbacks are only called in response of a specific event / operation.

Do we have any idea of how often this is the case? Whether these are a bugs or not I guess depends on how we want/expect it to work.

I guess the idea behind this is to make things work consistently but I think it creates more problems than it solves. One example is the write after 'connect' above.

I feel like that is a different problem where we are not properly handling the "pending" state of a stream. _write should not be called at all while the stream is not ready. That's being worked on in #29656.

AFAIK no issues have been created? about buffered write callbacks not being called on error in all these years so I think this behavior is not broken.

Good point. Though no issues does no necessarily mean not broken or no problems.

It is hard to adapt user land code to this new behavior. For example socket.write() will now call all buffered write callbacks with an error if a write error occurs. Doing the same on Node.js < 13 will be hard and hacky for userland libs. They should wait until Node.js 12 goes EOL to have the same behavior across all supported Node.js versions.

Good point. Not sure how big of a problem this is or not. This does kind of relate to an edge case so the ecosystem might or might not expect the callback to be invoked. The only time it is (or not) invoked is during a bit of an edge case. I don't know.

Sorry for the hypotheticals. I guess we just need to gain some consensus on what the expectations should be. Maybe we should have a separate issue just focused on this and ping node/streams?

@lpinca
Copy link
Member

lpinca commented Jan 5, 2020

Sorry for the hypotheticals. I guess we just need to gain some consensus on what the exception should be. Maybe we should have a separate issue to discuss this and raise awareness of the discussion?

No worries, this is a constructive discussion.

I think it is ok for a promise like

new Promise(function(resolve, reject) {
  w.write(chunk, err => err ? reject(err) : resolve()
});

to never be settled because it means the stream already errored or was destroyed so no more writes are done.

Good point. Not sure how big of a problem this is or not. This does kind of relate to an edge case so the ecosystem might or might not expect the callback to be invoked. The only time it is (or not) invoked is during a bit of an edge case. I don't know.

It is a problem for me (guess what? ws) or I wouldn't be doing all this noise 😄.

@ronag
Copy link
Member Author

ronag commented Jan 5, 2020

It is a problem for me (guess what? ws) or I wouldn't be doing all this noise 😄.

Oh yea, sorry about that :D. I wouldn't mind helping out sorting these kind of breakage in ws if that might be useful.

I think it is ok for a promise like

Let me extend the sample:

EDIT: ugly/incomplete fix for 'error'.

const src = fs.createReadStream(...);
const dst = fs.createWriteStream(...);
try {
  // Don't cause unhandled exception, we handle errors through the callback.
  // This assumes that the _write implementation uses the callback instead of 
  // (incorrectly) calling .destroy(err) explicitly. Also assumes that no
  // error occurs while opening the stream :/.

  // Another alternative is to create an errorPromise and use Promise.race.
  // But at that point there is no longer any need for cb to be invoked on error.
  // Thus this example is slightly contrived.
  dst.on('error', nop); 

  await for (const chunk of src) {
    new Promise(function(resolve, reject) {
      dst.write(chunk, err => err ? reject(err) : resolve());
    });
  }
  await new Promise(function(resolve, reject) {
    dst.end(err => err ? reject(err) : resolve());
  });
} finally {
  src.destroy(); // Uhoh? What if write/end never completes?
  dst.destroy();
}

Not very efficient in terms of batching writes on the dst but still seemingly valid given the current API.

@lpinca
Copy link
Member

lpinca commented Jan 5, 2020

I think that code is broken in multiple ways. Not only because 'error' events are not handled, writable.end() doesn't take a callback if there is no chunk to write, etc., but also because it assumes writes will continue even after the destination stream is closed. This has never been true and it still isn't even with this patch.

If users really want to shoot themselves in the foot with code like that they can destroy the source stream when the destination stream closes/errors.

Also I think this is the reason why we have extensive documentation and utilities to pipe from async iterators.

I don't think this is something that should be fixed because in my opinion it is not broken and works as intended.

@ronag
Copy link
Member Author

ronag commented Jan 5, 2020

Not only because 'error' events are not handled,

Async iterators do. But the writable side is indeed a bit problematic, which is why I've previously proposed that 'error' w callback should not cause unhandled exception. But that's a different discussion.

EDIT: Even with that errors during the "pending" state would not be handled through callbacks currently.

"Fixed" the example in this regard.

writable.end() doesn't take a callback if there is no chunk to write

This is wrong. It does. Don't remember what version this was fixed, maybe 13.x?

but also because it assumes writes will continue even after the destination stream is closed.

I don't follow.

Also I think this is the reason why we have extensive documentation and utilities to pipe from async iterators.

Actually I think those are slightly broken. At least the async iterator example is. await once(writable, 'drain'); doesn't handle 'error'.

Though I feel were getting slightly off topic. You've found enough questions in my example to make me reconsider its relevance.

@lpinca
Copy link
Member

lpinca commented Jan 5, 2020

In the example above the destination stream does not buffer anything so I'm not sure if it can actually hang. Only callbacks of buffered writes are not called if the stream is closed prematurely.

This is wrong. It does. Don't remember what version this was fixed, maybe 13.x?

Yes, my bad sorry, too tired.

once(writable, 'drain') handles 'error' events.

@ronag
Copy link
Member Author

ronag commented Jan 5, 2020

In the example above the destination stream does not buffer anything so I'm sure if your example can actually hang. Only callbacks of buffered writes are not called if the stream is closed prematurely.

It might buffer if it is still waiting for the stream to be ready/open, i.e. open the file descriptor.

In terms of generally all callbacks should be invoked I don't know if that's really relevant.

once(writable, 'drain') handles 'error' events.

Oh yes, that's actually pretty cool. My mistake.

@ronag
Copy link
Member Author

ronag commented Jan 5, 2020

Though I would prefer if callback were always invoked. I'm alright (-0) with changing this back to how it was. Maybe with some more explicit docs. Makes things simpler implementation/maintenance wise which I appreciate.

But I would like to have some more input from the rest of the @nodejs/streams team.

@lpinca
Copy link
Member

lpinca commented Jan 5, 2020

@nodejs/streams

@nodejs-github-bot
Copy link
Collaborator

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@lpinca
Copy link
Member

lpinca commented Jan 6, 2020

@mcollina @nodejs/streams the ping was to get your opinions on the possible revert of #30596 based on the discussion on #30596 and here.

@lpinca
Copy link
Member

lpinca commented Jan 6, 2020

For completeness, the writable.end() callback in the above example does not take an error as it is a listener for the 'finish' event and is another example of a callback that may never be called?

@ronag
Copy link
Member Author

ronag commented Jan 6, 2020

For completeness, the writable.end() callback in the above example does not take an error as it is a listener for the 'finish' event and is another example of a callback that may never be called?

It does. Please see #29747.

@lpinca
Copy link
Member

lpinca commented Jan 6, 2020

Ok, documentation must be updated then.

@ronag
Copy link
Member Author

ronag commented Jan 6, 2020

Ok, documentation must be updated then.

Indeed. Created an issue (#31220). I will fix that once we agreed on what to do. If we go the revert direction, we might need to revert #29747 as well to be consistent.

@lpinca: I'll try to ping you as well for these kinds of PR's in the future. Btw, did you get added to nodejs/streams?

Also, I still think this discussion would be better in a dedicated issue. One way or another I believe the TSC might need to get involved once we've agreed on the options.

@ronag
Copy link
Member Author

ronag commented Jan 6, 2020

For completeness, the writable.end() callback in the above example does not take an error as it is a listener for the 'finish' event and is another example of a callback that may never be called?

Even though this is not correct. The example is still not very good since as you pointed out before it doesn't properly handle 'error'. I've tried to add notes in this regard.

@lpinca
Copy link
Member

lpinca commented Jan 6, 2020

Btw, did you get added to nodejs/streams?

Yes.

Also, I still think this discussion would be better in a dedicated issue. One way or another I believe the TSC might need to get involved once we've agreed on the options.

Perhaps but afaik TSC prefers issues to be resolved by dedicated teams.

@lpinca
Copy link
Member

lpinca commented Jan 6, 2020

Btw I think @mcollina is the only active member of the streams team. We really need to expand it.

@BridgeAR
Copy link
Member

BridgeAR commented Jan 6, 2020

As a note: I won't include this PR in 13.6.0. @mcollina please have a closer look at the comments and maybe leave a comment about your opinion on the issue.

@BridgeAR BridgeAR added the blocked PRs that are blocked by other issues or PRs. label Jan 6, 2020
@mcollina
Copy link
Member

mcollina commented Jan 8, 2020

the ping was to get your opinions on the possible revert of #30596 based on the discussion on #30596 and here.

@lpinca can you open up a separate issue on why we should rever that? I'm ok to consider it, but I generally feel that having a callback that is not reliable to be called is extremely problematic when writing values.

@lpinca
Copy link
Member

lpinca commented Jan 8, 2020

Will do.

I generally feel that having a callback that is not reliable to be called is extremely problematic when writing values.

It is reliable until the stream errors, or is destroyed. After that buffered writes will stay buffered indefinitely. They will never be processed, so calling the callbacks of those writes is a mistake in my opinion. It's like calling the 'connect' callback when a connection is never established.

@mcollina
Copy link
Member

mcollina commented Jan 9, 2020

The moment we allow callbacks to not be called (if not stated/documented), then we are at risk of introducing memory leaks and hard to track bugs. We should probably call the callback with an error if the stream is already destroyed then.

@lpinca
Copy link
Member

lpinca commented Jan 9, 2020

We should probably call the callback with an error if the stream is already destroyed then

And we do if write() is called after the stream is no longer writable/destroyed as the write is no longer buffered. I mean what's wrong with current behavior? Why are we changing it? The "deadlock" example above is a no issue because no write is buffered so callback is always called.

@ronag
Copy link
Member Author

ronag commented Jan 9, 2020

The "deadlock" example above is a no issue because no write is buffered so callback is always called.

A write can be buffered before the writable is 'ready', e.g. opening the destination file.

@lpinca
Copy link
Member

lpinca commented Jan 9, 2020

Wait for 'ready', 'connect' or whatever? For net.Socket the write is delayed after connect anyway.

@lpinca lpinca removed the blocked PRs that are blocked by other issues or PRs. label Feb 27, 2020
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@ronag ronag added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Feb 27, 2020
@nodejs-github-bot
Copy link
Collaborator

@codebytere
Copy link
Member

Landed in d0a0071.

@codebytere codebytere closed this Feb 29, 2020
codebytere pushed a commit that referenced this pull request Feb 29, 2020
Refs: #30596

Buffered write callbacks were only invoked upon
error if `autoDestroy` was invoked.

Backport-PR-URL: #31179
PR-URL: #30596
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Rich Trott <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants