
ReadableStream[Symbol.asyncIterator] path to stability #333

Closed
benjamingr opened this issue Apr 26, 2018 · 38 comments
@benjamingr (Member) commented Apr 26, 2018

Hey,

After landing unhandledRejection, I'd like to investigate and gather community feedback about our API for Symbol.asyncIterator, as well as identify possible problems, before we make the feature non-experimental. I think Node v12 is a good target for marking the API as stable.

My current plan (which I'd love feedback on) is to:

  • Use code search to identify `for await` usage in Node projects using ReadableStream, analyze it, and post the findings.
  • Post a request for people who are using the feature to provide feedback. Interview them to gather use cases, current usage, and what they're doing.
  • Identify requirements for stabilizing the API and suggest a plan for marking the API as stable.

What do you think?

cc @bmeurer @mcollina @zenparsing @nodejs/streams

@benjamingr benjamingr changed the title Community feedback for Async Iterator initiative ReadableStream[Symbol.asyncIterator] path to stability Apr 26, 2018
@targos (Member) commented Apr 26, 2018

I'm going to try and migrate a small project in a few months. The code is private but I'll be happy to provide feedback.

@benjamingr (Member Author)

Here are the questions I was hoping to ask people; please edit in your own:

  • How are you currently utilizing async iterators?
  • What native and community modules do you commonly use async iterators with?
  • Have you noticed any semantic issues with Symbol.asyncIterator?
  • Have you found the Symbol.asyncIterator semantics clear and convenient?
  • Has performance been an issue with Symbol.asyncIterator? Have there been any cases where you were unable to use Symbol.asyncIterator because of it?
  • Is there anything else you'd like to tell us about your ReadableStream usage?

Of course, suggestions to remove/replace these are also welcome.

@benjamingr (Member Author) commented Apr 26, 2018

Also, if you're interested and we have time, we can do this in time for the Berlin collaborator summit and discuss it there - if it magically aligns for everyone.

@mcollina (Member)

Definitely +1 to all of what you said.

I think the following are also needed:

  1. Make sure that our semantics match those of WHATWG streams. Ideally, code that uses this language construct should work the same with both.
  2. Export it in readable-stream@3, and get significant usage on that (this is a big to-do).
  3. Verify our implementation also works when transpiled (it should) for browsers/Node versions that do not support it.

We can drop the experimental flag during the Node 10 LTS cycle without problems.

@benjamingr (Member Author)

make sure that our semantics matches the ones of WHATWG streams. Ideally code that uses this language construct should work the same.

@annevk @domenic - what would be the correct way for Node to verify that our semantics match WHATWG expectations?

@annevk commented Apr 27, 2018

I suspect you can make use of https://github.com/whatwg/streams/tree/master/reference-implementation (in particular the tests).

@annevk commented Apr 27, 2018

The tests are located at https://github.com/w3c/web-platform-tests/tree/master/streams.

@benjamingr (Member Author)

I created a Google Form, I'll start distributing it later https://goo.gl/forms/BCcCoM81wtpcRWg12

@mcollina (Member) commented Aug 6, 2018

@benjamingr Should we remove experimental for Node 10? I fear we are not really getting any feedback atm, or everybody is happy with it.

@benjamingr (Member Author)

@mcollina the feedback I've been getting is that only a few people are actually using it - but those who are seem pretty happy. Also, the Google form I made was way too long to answer, lol. I did try to solicit feedback but did not get much beyond "works fine" :/

@benjamingr (Member Author)

Personally I'm +0.5 on removing experimental - it's not as if we have a lot of wiggle room in the semantics of streams with async iterators anyway.

@mcollina (Member) commented Aug 6, 2018

I think there is value in having it out of experimental before Node 10 goes LTS.

@benjamingr (Member Author)

@mcollina feel free to ping me in the PR for a LGTM :)

@Wimpje commented Sep 17, 2018

I've just used Symbol.asyncIterator, and liked using it 👍 I will fill in the form to provide some feedback ;) Just leaving a note here, since I hope this functionality will be part of the next LTS...

@mcollina (Member)

@benjamingr can you please share the answers with me ([email protected])?

@jhiesey commented Sep 24, 2018

The current implementation breaks IE11 support (which the readme claims to have), since Symbol isn't defined there. I'll open a separate issue for that.

@jhiesey commented Sep 24, 2018

The IE11 issue is #364

@benjamingr (Member Author)

@mcollina I am not available until the end of next week (holiday during most of September) - in any case - shared the form with you (not many responses)

@MattMorgis commented Nov 12, 2018

I have been loving async generators and iterators in Node v10, as using try/catch for errors is very nice.

However, I am having trouble building pipelines and connecting them with other streams (maybe I'm missing something).

I want to write to a pipe():

const fs = require('fs');
const csvStream = fs.createReadStream('./large-csv-file');

async function* convertChunksToLines(stream) {
  // ...
}

// This doesn't work: the async generator has no .pipe() method
convertChunksToLines(csvStream).pipe(fs.createWriteStream('results.csv'))

I want to read from a pipe, especially from the JSONStream module:

const request = require("request");
const JSONStream = require("JSONStream");

const httpStream = request.get(
  "large-json-url"
);

const parsedJsonStream = httpStream.pipe(
  JSONStream.parse("results.transcripts.*.transcript")
);

async function test() {
  for await (const chunk of parsedJsonStream) {
    console.log(chunk);
  }
}

// This doesn't work
test();

Gist: https://gist.github.com/MattMorgis/f2e71a83b2caaabd1ec61a35baa3999c

@MattMorgis

I was able to get the second part of this to work by wrapping the pipe in a PassThrough stream:

const request = require("request");
const JSONStream = require("JSONStream");
const stream = require("stream");
const passThrough = new stream.PassThrough({
  objectMode: true
});

const httpStream = request.get(
  "large-json-url"
);

const parsedJsonStream = httpStream
  .pipe(JSONStream.parse("results.transcripts.*.transcript"))
  .pipe(passThrough);

async function test() {
  for await (const chunk of parsedJsonStream) {
    console.log(chunk);
  }
}


test();

@benjamingr (Member Author) commented Nov 13, 2018

@MattMorgis would a utility on "stream" help? Like a stream.fromAsyncIterator?

We can also alternatively teach wrap to wrap an async iterator.

@mcollina (Member)

@MattMorgis you have to pipe it to a PassThrough because JSONStream does not inherit from Transform or use this module, so there is no async iterator support there. On that one, there is not much we can do.

@benjamingr I think we should develop a set of quick utilities to work with async iterators and streams.

@targos (Member) commented Nov 13, 2018

I think we should develop a set of quick utilities to work with async iterators and streams.

That would be awesome. I can help on that front, but maybe we should first experiment in a userland module?

@MattMorgis commented Nov 13, 2018

JSONStream does not inherit from Transform or use this module.

Probably one for a separate thread, but I'd love to learn more about this. There are variations of JSONStream that use readable-stream and/or through2; I'm guessing those would work if I tried them out? I'm still figuring out this landscape as a newcomer to streams.

I think we should develop a set of quick utilities to work with async iterators and streams.

I hacked this up back in April; it lets you wrap your generator chain and pipe to a writable stream. I think it still needs some work to sit in the middle of a pipeline, but I can now revisit that, as I was blocked by the missing PassThrough stream.

I wasn't sure if there was a better way. I found a discussion from last year about a possible wrap method, which might do the trick.

@mcollina (Member)

Check out https://github.com/mcollina/stream-iterators-utils. @benjamingr @targos I've added both of you.

@reconbot

@mcollina oh that's perfect, I wish I had seen that earlier. I've written https://github.com/reconbot/streaming-iterables and I just added writeToStream() the other day. Making a pipeline like

  await pipeline(getUsers, grabImagesToo, serializeNodeIfNotSaved, writeToStream(output))

Feels really nice.

Not having to manage the overhead, error handling, highWaterMarks, etc. of intermediate streams also feels good. And it's increased the CPU-bound throughput by about 20%. I'm still testing, but it looks like it's going to replace most of my stream-based approaches at Bustle sooner rather than later.

@felixfbecker

I love it for scanning process output:

const gitProcess = await spawn('git', ['log', '--format=%H'])
try {
	for await (const commit of gitProcess.stdout.pipe(split())) {
		const version = versionsByCommit.get(commit)
		if (version) {
			return version
		}
	}
	return undefined
} finally {
	gitProcess.kill()
}

@danm commented Feb 21, 2019

I'm using this feature a lot with readline, where the data coming in has to maintain order.

const fs = require('fs');
const readline = require('readline');

const rl = readline.createInterface({
  input: fs.createReadStream(file),
});
for await (const line of rl) {
  // do something with line
}

I hope it isn't an anti pattern!

Any idea when ReadableStream[Symbol.asyncIterator] will make it to stable?

@benjamingr (Member Author)

@danm not an anti-pattern, I believe this is intended usage :)

@alexwhitman

This would be really good to get into Node 12 for the next LTS. Anyone know what the status is?

@mcollina (Member)

I need some help. I need somebody to go through the WHATWG Streams spec for async iterators and check whether our behavior is compatible with theirs, ideally adding some tests.

I’m convinced it is, but I need some more confirmation. The end goal should be that the same function could consume both.

Would you like to help?

@mcollina (Member)

PR to make them stable open in nodejs/node#26989

@guybedford

If you're still looking for feedback, here's something that just caught me out when using this, coming at it from a complete novice level just today.

In the following:

const test = fs.createWriteStream('./teststream');
for await (const chunk of readableStream) {
  test.write(chunk);
}
test.end();

I missed that I should be "awaiting" the write. That is, a promise version of the write call would possibly have saved me here:

const test = fs.createWriteStream('./teststream');
for await (const chunk of readableStream) {
  await test.writeAsync(chunk);
}
test.end();

unless I've misunderstood a better way that is (which is likely).

@guybedford

It would also be really useful to have toReadable(generator, opts) in core.

@reconbot commented Apr 1, 2019

This would be a different issue, as it's not part of the async iterator. However, you probably want to check for backpressure and wait for the 'drain' event. In any case, after you call test.end() you want to wait for the 'finish' event, and then you know all your data has been flushed to disk.

I've been able to solve this problem with the following code, borrowed from streaming-iterables (https://github.com/bustle/streaming-iterables/) (it doesn't close streams and has error handling):

// https://github.com/bustle/streaming-iterables/blob/master/lib/write-to-stream.ts
function once(event: string, stream: IWritable): Promise<any> {
  return new Promise(resolve => {
    stream.once(event, resolve)
  })
}

async function writeToStream(stream: IWritable, iterable: AnyIterable<any>): Promise<void> {
  for await (const value of iterable) {
    if (stream.write(value) === false) {
      await once('drain', stream)
    }
  }
  stream.end()
  await once('finish', stream)
}

@guybedford

@reconbot thanks so much for sharing, that explains a lot. I was wondering how to handle backpressure but couldn't find anything on it. I also didn't understand at all why once was in stream-iterators-utils... and now that is clear too :)

@guybedford

@reconbot also note once should probably include error handling, see https://github.com/mcollina/stream-iterators-utils/blob/master/index.js#L31.

@targos (Member) commented Apr 1, 2019

btw once is now in core :)
https://nodejs.org/dist/latest-v11.x/docs/api/events.html#events_events_once_emitter_name
