Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document and export Base.peek #28811

Merged
merged 5 commits into from
May 7, 2020
Merged

Document and export Base.peek #28811

merged 5 commits into from
May 7, 2020

Conversation

ararslan
Copy link
Member

It's generally useful but is currently both undocumented and unexported.

Closes #28797.

@ararslan ararslan added docs This change adds or pertains to documentation io Involving the I/O subsystem: libuv, read, write, etc. labels Aug 22, 2018
@ararslan ararslan requested a review from Keno August 22, 2018 00:00
@StefanKarpinski
Copy link
Member

Does it work on all kinds of I/O streams now?

@ararslan
Copy link
Member Author

Maybe? I didn't change any methods, but there is a method already there for peek(::IO).

@StefanKarpinski
Copy link
Member

@Keno, you recently added some methods to peek, any sense of how complete it is at this point? Another issue is making it more generic—i.e. combine peek and peekchar?

@bicycle1885
Copy link
Member

There is a generic method:

julia/base/iostream.jl

Lines 513 to 519 in e4d933d

function peek(s::IO)
mark(s)
try read(s, UInt8)
finally
reset(s)
end
end
.

But it depends on mark and reset, and the default implementations for these functions are not enough generic to support all kinds of IO streams.

@ViralBShah
Copy link
Member

I am guessing we do not want to export a new API here. Please suggest to merge or close.

Since this changes the export, I'll remove the doc label (since it's not strictly a doc PR).

@ViralBShah ViralBShah removed the docs This change adds or pertains to documentation label May 1, 2020
@ararslan
Copy link
Member Author

ararslan commented May 1, 2020

I think peek is useful enough to be exported but this PR is quite stale and I don't know when I'd have time to revisit it.

@StefanKarpinski
Copy link
Member

I think the API should be that peek(io) returns a single byte and peek(io, Char) returns a character. Since Base.peek already implements peek(io) that can just be exported. The methods for Base.peekchar can be turned into Base.peek(io, Char) methods too.

@ararslan
Copy link
Member Author

ararslan commented May 1, 2020

Found a spare couple of minutes for a rebase. peekchar seems not to be a thing anymore though, but I added peek(io, Char) for consistency with read.

@StefanKarpinski
Copy link
Member

I'm having trouble seeing where peek(io, Char) is defined...

@ararslan
Copy link
Member Author

ararslan commented May 1, 2020

I changed it to peek(io, T) where T defaults to UInt8. The method in base/io.jl for peek(io, T) calls read(io, T) internally instead of unconditionally calling read(io, UInt8) as it does on master. That allows peek(io, Char) to work while also permitting peeking other types supported by read.

@ararslan
Copy link
Member Author

ararslan commented May 2, 2020

I noticed that peek(::IOStream) returns an Int32. That is inconsistent with everything else, but changing it would be potentially breaking...

@StefanKarpinski
Copy link
Member

Ah, that seems bad. Does it actually read/return 32-bits of data or just a byte cast to Int32?

@ararslan
Copy link
Member Author

ararslan commented May 4, 2020

It's doing a ccall to return a Cint but the underlying C function seems to only be reading a byte; it actually does an explicit cast to unsigned char then implicit cast to int. Indeed:

julia> open(io -> print(io, "こんにちは"), "thing", "w")

julia> open(Base.peek, "thing", "r")
227

@StefanKarpinski
Copy link
Member

StefanKarpinski commented May 4, 2020

So in C it's traditional to return an int value that's not equal to any possible byte value when the stream is EOF but in Julia we consider that to be an exception instead and expect the programmer to check for EOF using eof before doing a read and it seems like peek should work the same. So it seems like this should check if the returned value is the EOF value and raise an error if so and otherwise return the value converted to a byte. @JeffBezanson, does that seem right to you?

@ararslan
Copy link
Member Author

ararslan commented May 4, 2020

Something like the definition of read(::IOStream) but with ios_peekc instead of ios_getc, then?

@StefanKarpinski
Copy link
Member

Exactly.

Copy link
Member

@StefanKarpinski StefanKarpinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really nice bit of API rationalization and making peek coherent enough to export. Thank you! Might be good for @JeffBezanson to take a look since he's got a good eye for details I may have missed.

@ararslan ararslan requested review from JeffBezanson and removed request for Keno May 5, 2020 15:49
@@ -1193,9 +1193,9 @@ unmark(x::LibuvStream) = unmark(x.buffer)
reset(x::LibuvStream) = reset(x.buffer)
ismarked(x::LibuvStream) = ismarked(x.buffer)

function peek(s::LibuvStream)
function peek(s::LibuvStream, ::Type{T}) where T
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be identical to the fallback method. Not a problem though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I noticed that. Figured I'd punt on dealing with that though, just in case there was something I was missing.

@JeffBezanson
Copy link
Member

I'm a little concerned that the default peek(io, Char) will be slow but I guess we can fix that later.

@JeffBezanson
Copy link
Member

In io.jl around line 354 there's a for loop that defines another peek method that needs to accept a type argument.

@StefanKarpinski
Copy link
Member

I'm a little concerned that the default peek(io, Char) will be slow but I guess we can fix that later.

I had the same thought: the efficient way to do this is to add support for peeking four bytes at a time and then decoding a UTF-8 character from that.

@JeffBezanson JeffBezanson merged commit d4d9303 into master May 7, 2020
@JeffBezanson JeffBezanson deleted the aa/peek-a-boo branch May 7, 2020 00:56
@ararslan
Copy link
Member Author

ararslan commented May 7, 2020

Thanks everyone! Always fun to revisit a nearly 2 year old PR and get it merged. :)

eulerkochy added a commit to JuliaCollections/DataStructures.jl that referenced this pull request May 14, 2020
added import from `Base`. 
Reason : JuliaLang/julia#28811
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
io Involving the I/O subsystem: libuv, read, write, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants