Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make 'read!(::Base.LibuvStream, ::Array{UInt8, 1})' indicate the number of bytes actually received on EOFError #14630

Merged
merged 1 commit into from
Jan 20, 2016

Conversation

mmaxs
Copy link
Contributor

@mmaxs mmaxs commented Jan 11, 2016

... and perhaps not lose data on UVError.
Before doing the changes I want to ask - what is the purpose of employing an additional reference to the LibuvStream internal buffer in the following excerpts?

function read(this::LibuvStream, ::Type{UInt8})
    wait_readnb(this, 1)
    buf = this.buffer
    @assert buf.seekable == false
    read(buf, UInt8)
end

function readavailable(this::LibuvStream)
    wait_readnb(this, 1)
    buf = this.buffer
    @assert buf.seekable == false
    takebuf_array(buf)
end

function readuntil(this::LibuvStream, c::UInt8)
    wait_readbyte(this, c)
    buf = this.buffer
    @assert buf.seekable == false
    readuntil(buf, c)
end

Why are the functions not like just:

function read(this::LibuvStream, ::Type{UInt8})
    wait_readnb(this, 1)
    read(this.buffer, UInt8)
end

function readavailable(this::LibuvStream)
    wait_readnb(this, 1)
    takebuf_array(this.buffer)
end

function readuntil(this::LibuvStream, c::UInt8)
    wait_readbyte(this, c)
    readuntil(this.buffer, c)
end

I want to change read!(::Base.LibuvStream, ::Array{UInt8, 1}) so that on exceptions it resizes its array argument to be able somehow indicate the number of bytes actually received.

@tkelman tkelman added the domain:io Involving the I/O subsystem: libuv, read, write, etc. label Jan 11, 2016
@kshyatt kshyatt added the domain:error handling Handling of exceptions by Julia or the user label Jan 12, 2016
@tkelman tkelman added the needs tests Unit tests are required for this change label Jan 14, 2016
@tkelman
Copy link
Contributor

tkelman commented Jan 14, 2016

This is one of several methods that is broken for UDPSocket or any other subtype of LibuvStream that does not have a buffer field.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Jan 14, 2016

a LibuvStream should be assumed to implement fields of those names if meaningful, or to fail to find them if not meaningful (for example: UDPSocket.buffer is not meaningful since UDP doesn't support read/write). it is intended essentially as a typealias for the union over the libuv-backed IO objects in Base that supported all of the same operations (but with a nice name).

@tkelman
Copy link
Contributor

tkelman commented Jan 14, 2016

Without the fields that "the same operations" are assuming to be present, UDPSocket doesn't support the same operations. Ref #14552, it should probably not be a subtype if it doesn't implement the assumed interface.

@mmaxs mmaxs changed the title WIP: make 'read!(::Base.LibuvStream, ::Array{UInt8, 1})' indicate the number of bytes actually received on EOFError make 'read!(::Base.LibuvStream, ::Array{UInt8, 1})' indicate the number of bytes actually received on EOFError Jan 16, 2016
@mmaxs
Copy link
Contributor Author

mmaxs commented Jan 16, 2016

Thanks for explanation. I little expected that it may have something to do with UDPSocket. I thought it is something concerned with GC and the stream buffer substitution trick in the read!(::Base.LibuvStream, ::Array{UInt8, 1}) for array size greater than SZ_UNBUFFERED_IO (65535 bytes) .

Of course UDP socket is not of a "stream" type since it is not involve establishing a connection between both sides of communication. And in libuv itself uv_udp_t is not a "subtype" of uv_stream_t like uv_tcp_t is. Perhaps a more general supertype for all libuv-backed IO objects should be considered, e.g:

IO :>
    LibuvIO :>
        UDPSocket
        LibuvStream :>
            PipeEndpoint
            TTY
            TCPSocket

if !isempty(s.readnotify.waitq)
start_reading(x) # resume reading iff there are currently other read clients of the stream
start_reading(s) # resume reading iff there are currently other read clients of the stream
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where did x come from? seems like like a typo and missing test coverage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think no one having ever seen a UndefVarError or suchlike from here suggests that this branch has very little chance to be executed in concurrency context.

I also suspect that after the stream's buffer has been switched, all other simultaneous pended reads from the same socket that are waiting for the readnotify event in wait_readnb() loop begin to check the loop condition while isopen(x) && nb_available(x.buffer) < nb against the substituted buffer. But when they return from the wait_readnb() and strat copying data (i.e. read!(sbuf, a)) they do it from their saved local reference sbuf rather than from the actual stream's buffer.

@tkelman
Copy link
Contributor

tkelman commented Jan 16, 2016

Yeah, I like that idea. Can be separate from this PR of course. Adding backport pending label for the typo fix commit. If you can come up with a test that hits that branch that'd be much appreciated.

@tkelman tkelman added backport pending 0.4 and removed needs tests Unit tests are required for this change labels Jan 16, 2016
rethrow()
end
finally
close(c)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent

@samoconnor
Copy link
Contributor

It seems to me that read! is intended to have only two possible results: 1) the array is completely filled, 2) EOFError. It doesn't seem like a good thing to rely on a partial result. I'm pretty sure no other read! methods have this resizing on EOFError behaviour.

The readbytes! function allows for partial reads and returns the number of bytes read.

Note that there is ongoing discussion about the names of these functions.

See also Jeff's comment here: #14660 (comment)

@nalimilan
Copy link
Member

If we want to merge read! and readbytes, we need to choose which behavior is the best one. As @JeffBezanson, I would say that the most practical solution is to leave the array untouched, and return the number of read bytes. Then it's very easy to resize the array if you want to, but you can also keep it as a buffer for future calls.

@mmaxs
Copy link
Contributor Author

mmaxs commented Jan 18, 2016

There is no method of readbytes!() for LibuvStream. There is such a one for its general supertype - IO:

# read up to nb bytes into nb, returning # bytes read
function readbytes!(s::IO, b::AbstractArray{UInt8}, nb=length(b))
    olb = lb = length(b)
    nr = 0
    while nr < nb && !eof(s)
        a = read(s, UInt8)
        nr += 1
        if nr > lb
            lb = nr * 2
            resize!(b, lb)
        end
        b[nr] = a
    end
    if lb > olb
        resize!(b, nr) # shrink to just contain input data if was resized
    end
    return nr
end

Which delegates the actual byte by byte read operation to

function read(this::LibuvStream, ::Type{UInt8})
    wait_readnb(this, 1)
    buf = this.buffer
    @assert buf.seekable == false
    read(buf, UInt8)
end

Apparently this is slow. (This have just now recalled me the PR #14667 which improves the performance for wait_readnb() and eof().) Therefore I've made the suggestion. No concern with #14608 or whether read!() should resize its argument or return the number of read bytes. If the set of IO functions is going to be redesigned, it's a matter of practical sense and programming experience how their interfaces should be implemented.

@samoconnor
Copy link
Contributor

I was thinking that something like this might be a better way to get the effect you want:

--- a/base/stream.jl
+++ b/base/stream.jl
@@ -881,8 +881,7 @@ function readbytes(stream::LibuvStream)
     return takebuf_array(stream.buffer)
 end

-function read!(s::LibuvStream, a::Array{UInt8, 1})
-    nb = length(a)
+function readbytes!(s::LibuvStream, a::Array{UInt8, 1}, nb=length(a))
     sbuf = s.buffer
     @assert sbuf.seekable == false
     @assert sbuf.maxsize >= nb
@@ -893,7 +892,7 @@ function read!(s::LibuvStream, a::Array{UInt8, 1})

     if nb <= SZ_UNBUFFERED_IO # Under this limit we are OK with copying the array from the stream's buf
         wait_readnb(s, nb)
-        read!(sbuf, a)
+        r = readbytes!(sbuf, a, nb)
     else
         try
             stop_reading(s) # Just playing it safe, since we are going to switch buffers.
@@ -902,6 +901,7 @@ function read!(s::LibuvStream, a::Array{UInt8, 1})
             s.buffer = newbuf
             write(newbuf, sbuf)
             wait_readnb(s, nb)
+            r = nb_available(newbuf)
         finally
             s.buffer = sbuf
             if !isempty(s.readnotify.waitq)
@@ -909,7 +909,16 @@ function read!(s::LibuvStream, a::Array{UInt8, 1})
             end
         end
     end
-    return a
+    return r
+end
+
+function read!(s::LibuvStream, b::Array{UInt8, 1})
+    nb = length(b)
+    r = readbytes!(s, a, nb)
+    if r < nb
+        throw(EOFError())
+    end
+    return b
 end

@samoconnor
Copy link
Contributor

I'm working on a test case to compare the behaviour of various read*() methods for various ::IO subtypes over here: https://github.com/samoconnor/julia/blob/uv_fs_read_eof/test/read.jl
I'll add LibuvStream to that using your oneshot_accept code.

@samoconnor
Copy link
Contributor

There is no method of readbytes!(::LibuvStream, ...)

It seems that there is, but that it maybe doesn't play nicely with the SZ_UNBUFFERED_IO optimisation...
https://github.com/JuliaLang/julia/blob/master/base/stream.jl#L897

@samoconnor
Copy link
Contributor

See related readbytes! fix / test case : #14699 cf6fdc1

@mmaxs
Copy link
Contributor Author

mmaxs commented Jan 19, 2016

As far as I've understood you suggest to turn read!(::Base.LibuvStream, ...) into readbytes!(::Base.LibuvStream, ...) and then make read!() based on readbytes!(). Right?
I think it might be a better solution for this local problem. So for this PR not to interfere with other eventual future changes I only leave the typo fixing commit in it.

samoconnor added a commit to samoconnor/julia that referenced this pull request Jan 19, 2016
tkelman added a commit that referenced this pull request Jan 20, 2016
make 'read!(::Base.LibuvStream, ::Array{UInt8, 1})' indicate the number of bytes actually received on EOFError
@tkelman tkelman merged commit 2319785 into JuliaLang:master Jan 20, 2016
tkelman pushed a commit that referenced this pull request Mar 7, 2016
(cherry picked from commit bcc6e2f)
ref #14630
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:error handling Handling of exceptions by Julia or the user domain:io Involving the I/O subsystem: libuv, read, write, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants