
ReadableStream.prototype.arrayBuffer() #1019

Open
ricea opened this issue Oct 15, 2019 · 34 comments

@ricea
Collaborator

ricea commented Oct 15, 2019

It would be good to have a convenience method to concatenate the contents of a ReadableStream into a (promise for a) Uint8Array.

It is unclear what should happen if the stream contains chunks that are not ArrayBuffers or ArrayBufferViews.

Other similar methods we might consider are .blob() and .text().

@domenic
Member

domenic commented Oct 15, 2019

A more generic method might be stream.toArray() which returns a promise for an array of chunks. (If https://github.com/tc39/proposal-iterator-helpers lands this would basically become an alias for stream.values().toArray(), although potentially easier to optimize since it's more direct.)

If we had that, then we'd just need a way of turning an array of Uint8Arrays into a single Uint8Array. Surprisingly I can't find any easy way to do this, especially not performantly. So, maybe that is not the right path.
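For the record, it can be done by hand at the cost of one copy per chunk; a minimal sketch (concatUint8Arrays is not an existing API, just a made-up helper name):

```javascript
// Concatenate an array of Uint8Arrays into a single Uint8Array.
// Two passes: one to sum the lengths, one to copy each chunk once.
function concatUint8Arrays(chunks) {
  let total = 0;
  for (const chunk of chunks) total += chunk.length;
  const result = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    result.set(chunk, offset);
    offset += chunk.length;
  }
  return result;
}
```

Every byte still gets copied out of its original chunk, which is exactly the allocation cost a built-in method could avoid.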

@annevk
Member

annevk commented Dec 13, 2019

Probably also want to consider an equivalent API where the buffer is already allocated (similar to encodeInto(), BYOB).

@MattiasBuelens
Collaborator

Probably also want to consider an equivalent API where the buffer is already allocated (similar to encodeInto(), BYOB).

I like the idea of a method that fills up an entire ArrayBuffer (or ArrayBufferView). I think it would work better as a reader method, for example ReadableBYOBReader.readFully(view) (taken from Java's DataInput.readFully()). Optionally, we could also add a shorthand method on ReadableStream, similar to how ReadableStream.cancel() is a shorthand for ReadableStreamDefaultReader.cancel(). The readInto() function in this example seems like a good starting point.

It would be different from ReadableStream.arrayBuffer() and/or ReadableStream.toArray() though. Those two methods would always read the entire contents of the stream, whereas a readFully() method may leave some bytes on the stream if the given buffer is not large enough to fit the entire contents.

@jakearchibald

How about:

  • array() - Array of chunks
  • blob(BlobPropertyBag) - Blob. Rejects and cancels stream if any chunks are not BlobPart.
  • arrayBuffer({ into }) - Rejects and cancels stream if any chunks are not buffer sources, or if into is provided and isn't big enough to hold the data. Resolves with an arraybuffer, which will be a slice of into if it's provided.

Maybe we can look at text() later, but (await stream.array()).join('') doesn't seem too bad.

@jimmywarting

jimmywarting commented May 24, 2021

The new Response(blob).body hack was the reason why we got blob.arrayBuffer(), blob.text() & blob.stream()
You can use the same hack here ... new Response(stream).blob()
Could you imagine what the fetch would look like instead if we had done response.body.blob()? then we would probably not have response.blob() in the first place

It would be nice to have all of array/blob/text/arrayBuffer. Hopefully Node streams would follow suit; then you could use express.js to do stuff like

app.post('/route', async (req, res) => {
  const buffer = await req.arrayBuffer()
  // ...
})

Rejects and cancels stream if any chunks are not BlobPart.

Not sure what you mean by this

The Blob constructor can take any kind of object you throw at it. The default fallback behavior is to cast anything it can't handle into a string...
If you do new Blob([{}]) then it will dump "[object Object]" as the content. Maybe this should do the same thing.

@jakearchibald

jakearchibald commented May 24, 2021

Could you imagine what the fetch would look like instead if we had done response.body.blob()? then we would probably not have response.blob() in the first place

That would have delayed shipping fetch and therefore service workers for many years. Also, the methods on response are still useful, because:

  • Space can be pre-allocated using the content-length
  • The mime type on Blob can be filled in from the request/response content-type

The Blob constructor can take any kind of object you throw at it. The default fallback behavior is to cast anything it can't handle into a string...
If you do new Blob([{}]) then it will dump "[object Object]" as the content. Maybe this should do the same thing.

Yeah, agreed. My intention was: Chunks can be anything that can appear in the sequence used in the blob constructor. As you say, almost everything is acceptable, with a few exceptions like detached buffers.

@taralx

taralx commented May 30, 2021

FWIW, +1 to a readFully method on BYOB readers. Maybe a new issue for it?

@ricea
Collaborator Author

ricea commented Jul 12, 2021

An interesting use of .blob() that I hadn't thought of is to efficiently handle streams that would use excessive amounts of memory to load completely, and so would ideally be stored on disk. The platform doesn't provide a primitive to stream data into a Blob, but it appears implementations do have this as part of their implementation of the blob() method on Response.

See https://stackoverflow.com/questions/67362823/progress-for-a-fetch-blob-javascript/67364010#67364010 for an example we could make much cleaner.

@o-t-w

o-t-w commented Aug 19, 2021

Looks like Node has just implemented this for arrayBuffer, blob, json, and text, but the API is a bit different.
https://nodejs.org/api/webstreams.html#webstreams_utility_consumers

@wffurr

wffurr commented Sep 20, 2023

Any chance of this ever landing? It's amazing to me that there's no "stream.collect" equivalent.

@jimmywarting

Just figured this could be part of something like iterator helpers.
Since streams are iterable, this should be essentially free, and it would then also work for Node streams and for any kind of iterator that could yield any form of ArrayBufferView.

@bakkot
Contributor

bakkot commented Oct 27, 2023

just figured if this could be part of something like iterator helpers

As mentioned above, iterator helpers have a .toArray method, and async iterator helpers will have the same. The helper will let you do await stream.values().toArray(), which is nicer than what you currently do - but getting all the bytes into a single buffer after is still going to be awkward, and even with a helper like this you'd still have a choice between either copying every chunk after reading all of them or giving up locality. This use case really wants a more specific "put all the bytes from a readable byte stream into a single buffer" helper.

And iterator helpers are intended as fully general utilities, so they're probably not the right place to have a method specifically for producers which yield chunks of bytes. That's a pretty uncommon thing to do with iterators anyway; the ReadableStream machinery is much better suited to handle such producers efficiently.

@datner

datner commented Nov 9, 2023

I believe that if native streaming instruments had a better API they would be used more, which should not be a controversial opinion.
Even if we are collectively 'stuck' on this issue, letting it rot will not move this forward. Streaming is a great concept, and is becoming more and more relevant in the browser.
Do we not think that this should be given another inspection?

@jasnell

jasnell commented Mar 23, 2024

Related: #1238

@Jarred-Sumner

Jarred-Sumner commented Mar 24, 2024

We would like to implement this in Bun, along with ReadableStream.prototype.text, ReadableStream.prototype.blob, and ReadableStream.prototype.json.

We also would be interested in implementing ReadableStream.prototype.array and ReadableStream.prototype.formData but feel less strongly about those. formData() could accept an optional Content-Type argument: if provided, parse as a multipart form; otherwise parse as a URL-encoded form, rejecting the promise (and canceling the stream) if invalid data is passed.

@bakkot
Contributor

bakkot commented Mar 24, 2024

Can I suggest instead having a .bytes() method which gives you a Uint8Array? See w3ctag/design-principles#463, whatwg/fetch#1724 (comment), etc. My proposal for base64 support in JS also operates on Uint8Arrays.

@jasnell

jasnell commented Mar 24, 2024

I'm happy with this approach also (in contrast to what I suggest in #1238).

A key challenge I have with having these methods directly on ReadableStream is that they are not useful with ReadableStreams that are not byte-oriented. Putting these on a Reader instead makes it a bit nicer to limit them to just byte-oriented streams, similar to how ReadableStreamBYOBReader does. If we do add these directly to ReadableStream.prototype then they should only work on byte-oriented streams.

ReadableStream.prototype.array() really isn't necessary given that Array.fromAsync(...) (https://github.com/tc39/proposal-array-from-async) is coming.

const chunks = await Array.fromAsync(readable);

ReadableStream.prototype.text() is a bit tricky if we can't specify the text encoding. With response.text() we can use the content-type to identify the encoding, and at that point there's a bit of overlap with TextDecoderStream that we'd need to reconcile.

Honestly, the more I think about it, the more I'd really prefer the ...fromAsync(...) pattern for multiple things...

const u8 = await Uint8Array.fromAsync(iterable);  // iterable must provide BufferSources
const str = await String.fromAsync(iterable);  // iterable must provide stringifiable chunks
const val = await JSON.fromAsync(iterable);  // iterable must provide stringifiable chunks
// etc

For text encoding, the pattern is pretty natural:

const str = await String.fromAsync(readable.pipeThrough(new TextDecoderStream("...")));
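String.fromAsync does not exist today, but the shape of that pattern can be approximated with shipped primitives; a rough stand-in (joining the decoded chunks takes the place of the hypothetical String.fromAsync):

```javascript
// Approximation of the String.fromAsync(...) idea using primitives
// that exist today: pipe the byte stream through a UTF-8 decoder
// transform and join the resulting string chunks.
async function stringFromStream(readable) {
  let out = '';
  for await (const part of readable.pipeThrough(new TextDecoderStream())) {
    out += part;
  }
  return out;
}
```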

@Jarred-Sumner

If we do add these directly to ReadableStream.prototype then they should only work on byte-oriented streams.

If the streams are passing strings that should be fine too for json and text, right? Particularly if they're internally buffering rather than returning the next chunk of valid data

Honestly, the more I think about it, the more I'd really prefer the ...fromAsync(...) pattern for multiple things...

From an implementer perspective, it sounds nice.

From a JS developer perspective, I have trouble imagining developers prefer this over .text(), .bytes(), .json().

My gut here is that most JS developers would prefer the idea of both, but probably 98% of all usage would be one of .text(), .json(), .arrayBuffer(), or .bytes() rather than the Constructor.fromAsync(iterable) variant.

People are used to fetch-like .text() instead of fromAsync and fromAsync is more complicated to explain.

This chain of reasoning is more complicated:

  1. it's a ReadableStream
  2. therefore it's an async iterable
  3. therefore to get the value as a String you call String.fromAsync

Compared to:

  1. call .text() on the ReadableStream

if we can't specify the text encoding

Isn't there precedent for defaulting to UTF-8? Or is that Windows-1252?

For text encoding, the pattern is pretty natural:
const str = await String.fromAsync(readable.pipeThrough(new TextDecoderStream("...")));

I think this pattern composes well, but would only feel natural to developers who understand a lot of streams and spec authors. It would feel complicated for less experienced developers

@bakkot
Contributor

bakkot commented Mar 25, 2024

I would guess that something like Uint8Array.fromAsync would be unlikely to be optimized well: that is, it would probably end up actually allocating and reifying each chunk of bytes as a JS object. It's possible to optimize that away under some circumstances, but it would require special cases for each producer/consumer pair, and there's a lot of fiddly details which would need to get implemented. Whereas something like ReadableStream.prototype.bytes() could trivially avoid those intermediate allocations.

@annevk
Member

annevk commented Mar 25, 2024

There is precedent for enforcing UTF-8 for text() and I'd object to doing anything else here. Folks can write their own wrapper if they can't be bothered to fix the legacy content.

@ricea
Collaborator Author

ricea commented Mar 25, 2024

It looks like we have a plan here, and we just need someone to do the work. I don't have time to commit to this in the near future, but if someone has time to draft a PR, I'd be happy to look at it.

I believe we have rough consensus on the following issues:

  1. Chunks should be of the same types as accepted by the Blob constructor. As an implementer, I'm a little unhappy with implementing Blob chunks because reading it is async and will make my life hard, but I can live with it. I also have some ergonomics concerns that accepting USVString will result in a lot of [object Object] in the output, but people have been living with that with Blob and it isn't the end of the world.
  2. We should only handle UTF-8 for input and output. TextDecoderStream already exists for people who need more encoding support.
  3. text(), arrayBuffer() and blob() seem uncontroversial. json() is easy to do if we spec it as parsing at the end like fetch does but maybe there isn't much demand? Other methods seem to require further discussion.
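As a rough illustration of points 2 and 3, a UTF-8-only text() consumer could look like this sketch (streamToText is a hypothetical standalone helper, assuming the stream yields Uint8Array chunks):

```javascript
// Sketch of a UTF-8-only text() consumer. TextDecoder in streaming
// mode correctly handles multi-byte sequences split across chunks.
async function streamToText(stream) {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  for await (const chunk of stream) {
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode(); // flush any trailing partial sequence
}
```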

@annevk
Member

annevk commented Mar 25, 2024

I think it's worth considering doing bytes() (Uint8Array) instead of arrayBuffer() as @bakkot suggested. We should probably do bytes() on the other objects that offer such methods.

@ricea
Collaborator Author

ricea commented Mar 25, 2024

I like bytes() but I thought it might be controversial.

@Jarred-Sumner

bytes() seems nice. I think it’s unusual for developers to want to create an ArrayBuffer; they’re almost always going to either pass it somewhere else or create a Uint8Array wrapping it. In JSC, iirc, creating an ArrayBuffer ends up being more expensive since it becomes a WastefulTypedArray internally. As a sidenote, are there plans for BodyMixin / Response / Request & Blob to get a bytes()?

@annevk
Member

annevk commented Mar 25, 2024

No concrete plans, but PRs will be accepted.

@jasnell

jasnell commented Mar 25, 2024

All SGTM!

@bakkot
Contributor

bakkot commented Apr 3, 2024

As a sidenote, are there plans for BodyMixin / Response / Request & Blob to get a bytes()?

I have whatwg/fetch#1732 but there's no expressed implementer interest yet.

@saschanaz
Member

I believe we have rough consensus on the following issues:

  1. Chunks should be of the same types as accepted by the Blob constructor

Maybe I'm missing something but I don't see this part in this thread, has there been discussion about this somewhere else? new Response(stream).arrayBuffer() throws if the chunk is not Uint8Array, so this is a bit surprising to me.

@ricea
Collaborator Author

ricea commented Apr 8, 2024

Maybe I'm missing something but I don't see this part in this thread, has there been discussion about this somewhere else? new Response(stream).arrayBuffer() throws if the chunk is not Uint8Array, so this is a bit surprising to me.

This was based on @jimmywarting
and @jakearchibald's comments: #1019 (comment)

However, on re-reading I realised that @jakearchibald was only saying this should work for the blob() method. This comes for free if you create a Blob by converting the ReadableStream to an array and then calling the Blob(array) constructor, but I would like to stream data into the Blob if at all possible.

Anyway, it looks like we don't have consensus on this item, so let's discuss it a bit more.

First question: should all of text(), blob() and bytes() accept the same chunk types?

Arguments for:

  • Easy to explain.
  • Avoids people doing hacks like await (await response.body.blob()).text() just to accept the largest possible set of chunk types.

Arguments against:

  • Implicit conversions between text and binary are nasty.
  • Blob chunks are annoying to implement because the contents may only be available asynchronously.

Anything else?

@saschanaz
Member

First question: should all of text(), blob() and bytes() accept the same chunk types?

I vote yes.

My second question: do we really want to accept everything possible in non-byte ReadableStream instead of simply requiring ArrayBufferView as ReadableByteStreamControllerEnqueue does? If yes, should we also accept everything possible in the byte stream controller?

@taralx

taralx commented Apr 8, 2024

Is there an equivalent for getting the endings option if we allow strings?

If I had a vote I'd say "yes, but only buffer sources". Allowing Blobs and strings in a constructor is not the same as allowing them in a stream.

@jasnell

jasnell commented Apr 9, 2024

I've made a first attempt at drafting some spec around this in #1311 ... it intentionally limits to just text, Uint8Array, and Blob, ignoring json() for now. Still likely a number of rough edges but it's a start

@ricea
Collaborator Author

ricea commented Apr 10, 2024

#1311 allows cancellation of blob(), text() and bytes() with an AbortSignal. It cancels the ReadableStream if abort is signalled. An alternative would be to leave data in the stream. Personally I prefer the cancellation approach since it is more likely to be consistent between implementations, but if anyone feels differently please speak up.
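In userland terms, the cancel-on-abort behavior described above might look like this sketch (bytesWithSignal is a made-up name, and Uint8Array chunks are assumed; #1311 would do this natively):

```javascript
// Sketch of cancel-on-abort semantics: if the signal fires, the
// underlying stream is cancelled and the promise rejects.
async function bytesWithSignal(stream, { signal } = {}) {
  signal?.throwIfAborted();
  const reader = stream.getReader();
  const onAbort = () => reader.cancel(signal.reason);
  signal?.addEventListener('abort', onAbort, { once: true });
  try {
    const chunks = [];
    let total = 0;
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      chunks.push(value);
      total += value.byteLength;
    }
    signal?.throwIfAborted(); // a cancelled stream reports done, so re-check
    const out = new Uint8Array(total);
    let offset = 0;
    for (const chunk of chunks) {
      out.set(chunk, offset);
      offset += chunk.byteLength;
    }
    return out;
  } finally {
    signal?.removeEventListener('abort', onAbort);
    reader.releaseLock();
  }
}
```

A preventCancel-style option would simply skip the reader.cancel() call and leave unread data on the stream.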

@jasnell

jasnell commented Apr 10, 2024

Yeah, definitely worth talking about that point. I had considered making it so that it would only cancel the read loop and not the underlying stream. A preventCancel option like with pipeTo/pipeThrough might be worthwhile?
