Skip to content
This repository was archived by the owner on Apr 22, 2023. It is now read-only.

Conversation

@kanongil
Copy link

There is a need to clarify the relationship between push(), _read(), and its callback. I propose the attached patch, which tries to limit the changes needed for existing extensions.

Reasoning
Currently, the read() callback is used inconsistently when combined with push(). If push() is used, it will signal that any pending _read() has happened, resulting in a new _read() call before any async callbacks have been handled.

This is illustrated in the attached revision to the test-stream-push-order.js test which fails on master.

The problem can be fixed by always waiting for the callback before calling _read() again but this would break the pattern described in the push() documentation, since it never calls the callback.

The patch
With this patch, every time _read(n, cb) is called you are expected to eventually call the callback before another _read() will be called.

As a new option, you when you omit the callback in your _read() function definition the Readable will infer a sync callback with empty data right after you return. This allows the push() pattern to continue working by simply removing the arguments.

As a sidenote, this also fixes _read() being called incorrectly during push()'es.

…stent

With this change every time _read(n, cb) is called you are expected to eventually
call the callback before another _read() will be called.

As a new option, you when you omit the callback in your _read() function definition
the Readable will infer a sync callback with empty data right after you
return.
@isaacs
Copy link

isaacs commented Feb 25, 2013

There are a few problems with this:

  1. It's doing too much in one commit.
  2. Branching behavior based on function.length is not wise.

I think that the semantics already are actually fine. _read() is a signal that the consumer is ready for more data. push() is a way to put some data in. The push function effectively IS the _read() callback.

If you have complaints about the semantics now, it would be best to start with a failing test, and we can discuss first whether the existing behavior is a problem or by design.

I like how onread and push are split up into parts here. It'd be good to do that, but in a few separate steps, without changing anything else.

@isaacs isaacs closed this Feb 25, 2013
@kanongil
Copy link
Author

Thanks for taking the time to review. I was not expecting this to be pushed just like that.

For a test that fails, check the modified test-stream-push-order.js test, which only changes the callback to be async (something I find more likely in a real scenario). I think that to have more consistent behavior, you either need to change the semantics of _read(), or throw / silently ignore any callback after a push.

@isaacs
Copy link

isaacs commented Feb 25, 2013

I think that this is a contrived edge case. If you have data available from whatever the source is, you should put it into the stream right now. If you can't guarantee the order of the actual data, then you can't guarantee the order of the data in the stream, and it ought to behave as your test is showing. All your proving is that a stream that pushes its data out of order will have data out of order.

@kanongil
Copy link
Author

From a user POV, I still find the duplicity of push() and the read callback confusing. Currently the Readable will wait for an async callback from _read() (as expected), but when something is push()ed from anywhere it suddenly won't (huh?). I would naturally expect that _read() won't be called again until the current callback is dealt with, and not deferred to some time in the future.

I'm not saying that push followed by an async callback is a good way to design your _read handler, but I expect that future implementors might experiment with this, and find it doesn't behave as they thought. Also, the solution might just be clearer documentation.

@TooTallNate
Copy link

From a user POV, I still find the duplicity of push() and the read callback confusing.

I agree. In fact, I'd push for undocumenting the .push() functions since it's just a crazy frankenstein thing, and having 2 implementing APIs to choose from isn't gonna make it easy for anyone.

An even better idea would be to implement a subclass of Readable: PushReadable. And that's where the .push() function gets defined, and we can come up with an actual good API to work with for sources that "push" their data.

@kanongil
Copy link
Author

@TooTallNate Thanks for the support, and I like the idea of a PushReadable with a clean implementation.

As I see it push() doesn't actually solve anything that can't already be handled from _read(), except potential throughput optimizations when dealing with multiple buffers. Ie. it's purely a convenience method, but one that is not always obvious how to use.

A potential issue with a PushReadable, is the other subclasses. Ie. will you have to create a PushDuplex and PushTransform as well?

@isaacs
Copy link

isaacs commented Mar 1, 2013

When you get down to it, push() is significantly more fundamental (and useful in real world use cases) than a single callback.

The solution here is to remove the callback to _read, and instead just use push() to put data in the queue.

Care to review? #4878

It's easy enough to implement CallbackReadable in terms of this API:

util.extends(CallbackReadable, Readable);
function CallbackReadable(options) {
  if (!(this instanceof CallbackReadable))
    return new CallbackReadable(options);
  Readable.call(this, options);
}
CallbackReadable.prototype._read = function(n) {
  var self = this;
  this._readCb(n, function(er, chunk) {
    if (er) this.emit('error', er);
    else this.push(chunk);
  });
};
CallbackReadable.prototype._readCb = function(n, cb) {
  cb(new Error('not implemented'));
};

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants