Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for reading from a WebStreams #635

Merged
merged 13 commits into from
Jul 7, 2024

Conversation

Borewit
Copy link
Collaborator

@Borewit Borewit commented Jul 4, 2024

Changes:

  • Add support for reading from a WebStreams
  • Stream Blob via a WebStreams, instead of buffering the full content

Moves remaining native Node Readable streams to the Node.js specific entry point (index.js). As the fileTypeStream is based on the the native Node Readable stream, that function is no longer available in the primary (other then Node engine) entry point (core.js).

⚠️ fromBlob() requires Node.js ≥ 20 as the strtok3.fromStream requires a Readable from which a ReadableStreamBYOBReader can be created. Which is only available since Node.js 20.

Updating dependencies to:

graph TD;
    FTN("file-type (Node.js entry point)")-->FTP
    FTP("file-type (primary entry point)")-->S
    S(strtok3)-->P(peek-readable)
    S(strtok3)-->TO("@tokenizer/token")
    TY(token-types)-->TO
    TY-->IE("ieee754")
    FTP-->TY
    NS("node:stream")
    FTN-->NS
    FTP-->UAE(uint8array-extras)
    style NS fill:#F88,stroke:#A44
    style IE fill:#CCC,stroke:#888
    style FTN fill:#FAA,stroke:#A44
Loading

Previous dependency diagram can be found here: #578 (comment)

@Borewit Borewit force-pushed the add-support-for-webstreams branch from 3a3247a to 03e64c8 Compare July 4, 2024 21:31
@Borewit Borewit added API change Major change, dependents may need to update their code enhancement Add new functionality update dependencies and removed API change Major change, dependents may need to update their code labels Jul 4, 2024
@sindresorhus
Copy link
Owner

I would prefer to have a single fromStream method that supports both.

@Borewit
Copy link
Collaborator Author

Borewit commented Jul 4, 2024

I would prefer to have a single fromStream method that supports both.

That was quick.
I will give that some thought, I like to release this one together #633, as joined effort to work without Node Streams and Buffer.

@Borewit Borewit self-assigned this Jul 5, 2024
@Borewit
Copy link
Collaborator Author

Borewit commented Jul 5, 2024

I would prefer to have a single fromStream method that supports both.

@sindresorhus, are you not concerned that this make it more difficult to for tree-shaking to exclude node:buffer and nodes:stream when these are effectively not being used?

@Borewit Borewit force-pushed the add-support-for-webstreams branch from 03e64c8 to 1ba1549 Compare July 5, 2024 16:31
@Borewit Borewit added the API change Major change, dependents may need to update their code label Jul 5, 2024
@Borewit Borewit force-pushed the add-support-for-webstreams branch 4 times, most recently from 72dee28 to c22cd43 Compare July 6, 2024 10:26
@Borewit Borewit linked an issue Jul 6, 2024 that may be closed by this pull request
@Borewit Borewit force-pushed the add-support-for-webstreams branch 3 times, most recently from 2b20275 to d26ecf2 Compare July 6, 2024 10:55
@sindresorhus
Copy link
Owner

I was thinking that fromStream in core.js would only support web streams, same with the one in browser.js, but in index.js it would get the additional support for Node.js streams. That way, only Node.js users gets the Node.js imports.

@sindresorhus
Copy link
Owner

Alternatively, we could just drop support for Node.js streams, and just tell users to call .toWeb() on their Node.js streams. That may be even better actually.

@sindresorhus
Copy link
Owner

Alternatively, we could just drop support for Node.js streams, and just tell users to call .toWeb() on their Node.js streams. That may be even better actually.

Let's go with this.

@Borewit Borewit force-pushed the add-support-for-webstreams branch 2 times, most recently from 774743a to 569d467 Compare July 6, 2024 11:10
@Borewit
Copy link
Collaborator Author

Borewit commented Jul 6, 2024

Alternatively, we could just drop support for Node.js streams, and just tell users to call .toWeb() on their Node.js streams. That may be even better actually.

Let's go with this.

The [toWeb() method] (https://nodejs.org/api/stream.html#streamreadabletowebstreamreadable) is still experimental.

Maybe first release with as a hybrid solution, including Node.js & Web Streams. And indeed, moving the Node.js dependencies to the Node.js specific entry point index.js . Also note that in addition to the fileTypeFromStream(), there is the fileStream() which is still depends on Node.js streams, and also shared to with the primary entry point. I am honestly not very motivated to update that method, as it not complaint with the current architecture where file-type has access to the entire file. The argument for backward compatibility is now fading, as Web Streams are now becoming widely available.

- Stream Blob via a WebStreams, instead of buffering the full content
- Update strtok3 to v7.0.0
@Borewit Borewit force-pushed the add-support-for-webstreams branch from 569d467 to 1fce3c4 Compare July 6, 2024 13:28
@sindresorhus
Copy link
Owner

Maybe first release with as a hybrid solution, including Node.js & Web Streams. And indeed, moving the Node.js dependencies to the Node.js specific entry point index.js .

Ok, sure. But at least add a TODO comment to remove Node.js streams support and recommend toWeb() instead when it's stable.

@sindresorhus
Copy link
Owner

there is the fileStream() which is still depends on Node.js streams

It could be moved to index.js as it's already documented to only be available on Node.js.

@Borewit Borewit force-pushed the add-support-for-webstreams branch from 2684c67 to 9d0cfaf Compare July 6, 2024 14:39
* Typings for primary entry point, Node.js specific typings can be found in index.d.ts
*/

import type {ReadableStream as WebReadableStream} from 'node:stream/web';
Copy link
Owner

@sindresorhus sindresorhus Jul 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type should be available globally, so I don't think we need to import it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The global type in incompatible with Node.js type. If I change it here, the problem will appear elsewhere,

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what that means, but this is core.d.ts, so it shouldn't import types only available for Node.js.

Copy link
Collaborator Author

@Borewit Borewit Jul 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Node.js Readable types are incompatible with the global lib.dom.d.ts types, one things the challenges mentioned in #588 (comment), related to this issue DefinitelyTyped/DefinitelyTyped#60377, which unfortunately got closed with PR aiming to resolve that.

I used the Node.js, as Node.js has been the primary supported platform. Using the lib.dom.d.ts I need hack in types mapping cast somewhere.

Maybe good accept both, an do an ugly type cast, just for the convenience for our users.

Off-topic: The other mind f*ck, was the BYOB Readable Stream, The "Bring-Your-Own-Buffer" reader:

defines a reader for a ReadableStream that supports zero-copy reading from an underlying byte source

Well the first things the zero-copy method does is hijacking the buffer you bring, and essentially turns into junk (there is formal property for this state, forgot the name), meaning it can no longer be used. Which essentially forces you to create a new Buffer (as it becomes totally useless after providing it), and then copying the data to the Buffer you wanted to have it written in the first place. So the only feature BYOB has, versus the ReadableStreamDefaultReader is that you can control the chunk length to be written. How confusing, what a disappointment.

core.d.ts Outdated
@@ -511,7 +516,7 @@ export declare class FileTypeParser {
/**
Works the same way as {@link fileTypeFromStream}, additionally taking into account custom detectors (if any were provided to the constructor).
*/
fromStream(stream: ReadableStream): Promise<FileTypeResult | undefined>;
fromStream(stream: NodeReadableStream): Promise<FileTypeResult | undefined>;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this and toDetectionStream support web streams too?

Copy link
Collaborator Author

@Borewit Borewit Jul 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, should probably be done as well. (see #636)

@Borewit
Copy link
Collaborator Author

Borewit commented Jul 7, 2024

there is the fileStream() which is still depends on Node.js streams

It could be moved to index.js as it's already documented to only be available on Node.js.

That is very doable. We can raise an issue, to implement the same concept using Web Streams. I actually starting working on that, but it takes some time. Tricky stuff those streams.

Update, raised issue: #636 to implementing detecting web stream

@Borewit Borewit force-pushed the add-support-for-webstreams branch 6 times, most recently from d042dd5 to 1ab8b6d Compare July 7, 2024 11:10
@Borewit Borewit force-pushed the add-support-for-webstreams branch from 1ab8b6d to d5d55ef Compare July 7, 2024 11:15
@sindresorhus sindresorhus merged commit b815b5e into main Jul 7, 2024
6 checks passed
@sindresorhus sindresorhus deleted the add-support-for-webstreams branch July 7, 2024 22:05
@sindresorhus
Copy link
Owner

@Borewit Are we ready to do another release now or do you plan more changes? None of these changes are technically breaking, right? Since Buffer is also a Uint8Array, so 00e051b should not be breaking (it's only breaking if we would have changed a return value from Buffer to Uint8Array).

@Borewit
Copy link
Collaborator Author

Borewit commented Jul 8, 2024

@Borewit Are we ready to do another release now or do you plan more changes? None of these changes are technically breaking, right? Since Buffer is also a Uint8Array, so 00e051b should not be breaking (it's only breaking if we would have changed a return value from Buffer to Uint8Array).

This morning I realized, what could be documented better is the NodeFileTypeParser class

export class NodeFileTypeParser extends FileTypeParser {

which should be used instead of the FileTypeParser, for those implementing custom extensions and usign Node.js.

No objection to release now.

@Borewit Borewit mentioned this pull request Jul 8, 2024
@Borewit
Copy link
Collaborator Author

Borewit commented Jul 10, 2024

Part of v19.1.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change Major change, dependents may need to update their code enhancement Add new functionality update dependencies
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support reading from browser streams in server-side export Make Blob lazy
2 participants