Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(io): iterateReader[Sync]() #4247

Merged
merged 10 commits into from
Feb 19, 2024
Merged

feat(io): iterateReader[Sync]() #4247

merged 10 commits into from
Feb 19, 2024

Conversation

iuioiua
Copy link
Contributor

@iuioiua iuioiua commented Jan 29, 2024

While working on deprecation documentation, I realised that there's no new means for synchronous iteration for the ReaderSync interface. I thought we might as well un-deprecate this functionality, along with its asynchronous sibling.

Note: this hasn't yet been decided upon.

@kt3k
Copy link
Member

kt3k commented Jan 29, 2024

These used to be called iterateReader and iterateReaderSync (https://github.com/denoland/deno_std/blob/aad99992653b45d9bd09ccba1f645accb22f5cf4/streams/iterate_reader.ts). Is it intentional to change these names?

@iuioiua
Copy link
Contributor Author

iuioiua commented Jan 29, 2024

Yes. I slightly liked the name more. This is not a strong opinion. Happy to change it back if we'd like.

@kt3k
Copy link
Member

kt3k commented Jan 29, 2024

They were originally Deno.iter and Deno.iterSync, and then moved to std and renamed to iterateReader and iterateReaderSync (These names were suggested by @nayeemrmn IIRC).

I don't have strong opinions to them, but I'd like to hear more opinions from the community.

@kt3k kt3k added the feedback welcome We want community's feedback on this issue or PR label Jan 29, 2024
@iuioiua
Copy link
Contributor Author

iuioiua commented Jan 29, 2024

I've now renamed them to toIterator() and toIteratorSync() to align it with the adjacent toReadableStream() and toWritableStream().

@iuioiua iuioiua changed the title feat(io): re-introduce iterate() and iterateSync() feat(io): toIterator() and toIteratorSync() Jan 29, 2024
@Leokuma
Copy link

Leokuma commented Jan 29, 2024

Is there a risk of ambiguity with toIterator? Like maybe there's another function that creates an iterator from something other than Reader. Would the other function also be called toIterator, just imported from a different module?

@iuioiua
Copy link
Contributor Author

iuioiua commented Jan 31, 2024

No risk of ambiguity that I'm aware of.

@iuioiua iuioiua enabled auto-merge (squash) January 31, 2024 10:02
@Leokuma
Copy link

Leokuma commented Jan 31, 2024

I'm concerned that the name "toIterator" might not be descriptive enough that it targets a Reader, especially if we eventually add another toIterator to another module that turns something other than Reader into an iterator.

OTOH, I think we have toText() somewhere, which also doesn't indicate the datatype it receives, so maybe toIterator is more consistent with that.

@nayeemrmn
Copy link
Contributor

Almost any unary generator function can be called toIterator(). IMO it's better to name them based on what the body of the generator function is doing. Rather than a conversion function, it's a utility for 'iterating the content of' or 'iteratively reading' a Reader. So perhaps it should contain a read or iterate verb.

@iuioiua
Copy link
Contributor Author

iuioiua commented Jan 31, 2024

Valid points by all. Which sounds better; toReaderIterator() or iterateReader()?

@Leokuma
Copy link

Leokuma commented Jan 31, 2024

+1 toReaderIterator, but maybe readerToIterator would be better so that it aligns with readerFromIterable?

iterateReader doesn't look so good when assigned to a variable. It might give the impression that it's doing something mutable in the Reader or that it receives a callback to be run onto Reader.

@iuioiua iuioiua changed the title feat(io): toIterator() and toIteratorSync() feat(io): toReaderIterator() and toReaderIteratorSync() Jan 31, 2024
@iuioiua
Copy link
Contributor Author

iuioiua commented Jan 31, 2024

Ok, I think I've settled for toReaderIterator(), which addresses the concerns mentioned above and indicates the purpose of the function.

@iuioiua
Copy link
Contributor Author

iuioiua commented Feb 2, 2024

Are we happy with the name now?

@kt3k
Copy link
Member

kt3k commented Feb 2, 2024

'ReaderIterator' sounds a bit strange to me as I've never heard of that word combination in js context.

So perhaps it should contain a read or iterate verb.

Also does toReaderIterator address @nayeemrmn's above comment?

@Leokuma
Copy link

Leokuma commented Feb 2, 2024

It could also be iteratorFromReader, in addition to my previous suggestion readerToIterator.

Personally I'd prefer to have the noun "iterator" over the verb "iterate" because the function itself doesn't really iterate: it produces an iterator, which then can be iterated later, or never. It's different from Array.reduce for example, which iterates immediately.

@iuioiua
Copy link
Contributor Author

iuioiua commented Feb 2, 2024

'ReaderIterator' sounds a bit strange to me as I've never heard of that word combination in js context.

So perhaps it should contain a read or iterate verb.

Also does toReaderIterator address @nayeemrmn's above comment?

No, but IMO, it shouldn't. The function neither reads nor iterates. @Leokuma makes this same point, which I agree with. The function converts a Reader to an iterator. Hence, the name toReaderIterator(), which, I feel, conveys its purpose to readers well. Otherwise, readerToIterator() might be another alternative.

@nayeemrmn
Copy link
Contributor

nayeemrmn commented Feb 2, 2024

The problem is the methods on iterator also don't convey that IO is taking place -- they convey 'give me the next element'. That's why it's fine to say it in the generator function's name. To me it's similar to an async function, you don't need to add ..Promise to clarify that it doesn't happen immediately.

@iuioiua
Copy link
Contributor Author

iuioiua commented Feb 2, 2024

My opinion on this isn't strong enough to block this PR. Would we be happy with just resorting to the original iterateReader()?

@Leokuma
Copy link

Leokuma commented Feb 3, 2024

My opinion isn't strong either. I'm fine with iterateReader.

@iuioiua iuioiua changed the title feat(io): toReaderIterator() and toReaderIteratorSync() feat(io): iterateReader[Sync]() Feb 4, 2024
@iuioiua
Copy link
Contributor Author

iuioiua commented Feb 4, 2024

Name restored to the original iterateReader[Sync]().

@kt3k
Copy link
Member

kt3k commented Feb 6, 2024

I'd like to explore some other ideas. How about the below candidates:

@iuioiua
Copy link
Contributor Author

iuioiua commented Feb 6, 2024

I'd like to explore some other ideas. How about the below candidates:

I agree with Nayeem that the function name should contain "iterate". readThrough()'s functionality could be confused with that of readAll() from std/io.

@kt3k
Copy link
Member

kt3k commented Feb 6, 2024

I agree with Nayeem that the function name should contain "iterate".

Nayeem suggests it should include read or iterate verb. #4247 (comment)

readThrough()'s functionality could be confused with that of readAll() from std/io.

I think what this API does is similar to readAll, but in a different form (Promise vs Async Iterable). So I think giving them similar names isn't too unnatural.

@nayeemrmn
Copy link
Contributor

nayeemrmn commented Feb 6, 2024

Maybe we should have been comparing this to readLines() and readDelim() (now deprecated) in the sense that this is the base form of those that uses the raw delimitation. Just call it readChunks().

It also occurred to me that we shouldn't bring back the non-sync version. Why wouldn't that one be deprecated when its aforementioned siblings are, and why do they need to be coupled? We should just bring back readChunksSync().

By the way an old discussion from when std/streams was added is how std/io would be ugly, in the sense that the non-deprecated stuff there would be a small handful of random utils that barely aren't applicable to 'streams'. You can kinda see that: https://deno.land/[email protected]/io/mod.ts. It's even uglier now that Reader etc. have been deprecated from the CLI. I think we're only keeping it around to prevent breakages in the ecosystem.

Sorry for the scattered thoughts. I think we should:

  • Deprecate std/io as a whole. Stop adding new things to it. The interfaces it revolves around are deprecated in the CLI so it has no blessed significance in the Deno ecosystem. We don't have to take the last step of removing it to protect dependents, but I doubt anyone can tell me what the vision or structure for this module is. IO utils for a typical user exist in std/streams now.
  • Add std/streams/sync.ts for stuff like readChunksSync(), readAllSync(), writeAllSync(). Keep in mind ReaderSync and WriterSync are 'streams', just not 'web streams'. I argued that to get iterateReader() moved there in the past.
  • Make these sync methods accept Deno.FsFile as an argument instead of ReaderSync/WriterSync, since nothing else implements these in Deno proper. Maybe.

WDYT @iuioiua @kt3k

@kt3k
Copy link
Member

kt3k commented Feb 6, 2024

Deprecate std/io as a whole.

I don't think deprecation (again) of std/io is an option here (at least in the near future). We decided to bring std/io back in #4120 #4128 to keep supporting Reader, Writer, and related stuff.

I doubt anyone can tell me what the vision or structure for this module is.

The purpose of std/io is now to support Reader, Writer, and related operations, which are not primary way to perform I/O in Deno, but they are still supported for the cases where Web Streams are insufficient/inappropariate.

BTW I think this topic is far beyond this PR.

@iuioiua
Copy link
Contributor Author

iuioiua commented Feb 6, 2024

readChunks() sounds alright.

@syhol
Copy link
Contributor

syhol commented Feb 10, 2024

Is there any interest in having this be part of the runtime API (say on FsFile) by implementing Symbol.iterator and Symbol.asyncIterator?

I feel this would be in line with the Deno philosophy: aligning with Web APIs and JS primitives. Personally I love working with iterators as I think they are a great abstraction and should be treated as first class.

@syhol
Copy link
Contributor

syhol commented Feb 10, 2024

However, If it's out of the question then I would like to throw the following into the ring:

iterateChunks();

Iterate chunks tells me 2 things I want to know:

  1. I'm going to get an iterator from this function
  2. That iterator is going to contain chunks

I liked the chunks from readChunks() but "read" is too ambiguous for me. I find "iterate" is much clearer.

I noticed a few options used "Reader" but I don't think the input type is too important in the function name because I would aim to optimise for reading and it should already be obvious what the input is e.g. iterateChunks(file). This only becomes an issue if you intend to create a function that iterates chunks for a different type in the same directory. For example, if you wanted to iterate chunks on streams. However you could support both types in the same function.

Alternatively instead of chunks we could have:

  • iterateByteArrays
  • iterateByteChunks
  • iterateUint8Arrays

streams/iterate_reader.ts Outdated Show resolved Hide resolved
@kt3k
Copy link
Member

kt3k commented Feb 19, 2024

Thank you for the suggestions and opinions. We discussed this internally and decided to go with the original names iterateReader and iterateReaderSync to minimize the ecosystem impact.

Copy link
Member

@kt3k kt3k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@iuioiua iuioiua merged commit a828744 into main Feb 19, 2024
10 checks passed
@iuioiua iuioiua deleted the undeprecate-iter branch February 19, 2024 10:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feedback welcome We want community's feedback on this issue or PR io streams
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants