Add a Read::initializer method #42002
src/libstd/io/buffered.rs
Outdated
@@ -182,6 +206,10 @@ impl<R: Read> Read for BufReader<R> {
    }
}

// we cant impl unconditionally because of the large buffer case in read.
#[unstable(feature = "trusted_len", issue = "0")] |
Is this supposed to be "trusted_read"? "trusted_len" also appears in a few other places.
Oops, will fix
src/libstd/io/buffered.rs
Outdated
impl<R> MakeBuffer for R
    where R: TrustedRead
{
    fn make_buffer(capacity: usize) -> Vec<u8> {
Can this be simplified by removing the MakeBuffer trait and just having make_buffer as a free function?
edit: disregard. The trait is necessary to allow for specialization.
There's probably a cleaner way of doing this but I don't really know what I'm doing with specialization beyond the basics :)
FWIW I don't think this can be a free function, I believe specialization only applies to trait impls right now
👎 As I said in the other thread, I'd prefer an 'uninitialized buffer' (or 'appendable') trait, with a corresponding method on
Agree with @comex . I'd prefer to encode that you must write some array or slice inside the language over this. We don't need super sweet sugar for it, a trait would be enough.
@comex I don't think I quite understood your proposal or what it would look like. Not requiring all @sfackler I think FWIW this pattern isn't the first of its kind, @Kixunil do you have thoughts on the marker trait pattern?
@sfackler presumably this could also change the default implementation of Also, I think the various
@comex what do you mean by "forcing every @est31 I don't really understand what that means. Any given call to
Updated
0014e80 to e2653f7
To be more explicit about my proposal: The real problem is that the type signature of
    fn read(&mut self, buf: &mut [u8]) -> Result<usize>;
If implementations aren't supposed to read from
    fn read(&mut self, buf: &mut Vec<u8>) -> Result<usize>;
Then,
    let ptr = buf.as_mut_ptr().offset(buf.len() as isize);
    let size = buf.capacity() - buf.len();
    let actual = os_read(ptr, size);
    buf.set_len(buf.len() + actual);
Of course, baking
But we could abstract over the required functionality, such as with a trait:
    unsafe trait UninitializedBuffer<T> {
        fn get(&mut self) -> *mut [T];
        unsafe fn did_fill(&mut self, size: usize);
    }
This encapsulates exactly the pattern from before: unsafe code asks for an uninitialized buffer, fills it out, then reports back how much was filled. Then
    fn read_to_uninitialized(&mut self, buf: impl UninitializedBuffer<u8>) -> Result<usize>;
There are a few concerns here:
So what are the actual benefits of this scheme over @sfackler's (seemingly simpler) proposal? Well...
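For readers following along, here is a minimal sketch of the fill-then-report pattern described in the comment above, applied to the spare capacity of a `Vec<u8>`. The thread never defines `os_read`, so the version here is a hypothetical stand-in that copies from an in-memory slice; `read_into_spare` is likewise an illustrative name and not a std API.

```rust
use std::ptr;

/// Hypothetical stand-in for an OS-level read: writes up to `size` bytes
/// from `src` into `dst` and returns how many bytes were written.
///
/// Safety: `dst` must be valid for writes of `size` bytes.
unsafe fn os_read(src: &[u8], dst: *mut u8, size: usize) -> usize {
    let n = size.min(src.len());
    // SAFETY: the caller guarantees `dst` is writable for `size >= n` bytes.
    // The destination is only written to, never read.
    unsafe { ptr::copy_nonoverlapping(src.as_ptr(), dst, n) };
    n
}

/// Appends bytes from `src` into the unused capacity of `buf` without
/// zeroing that capacity first. Returns the number of bytes appended.
fn read_into_spare(src: &[u8], buf: &mut Vec<u8>) -> usize {
    // Number of allocated-but-unused bytes after the current length.
    let size = buf.capacity() - buf.len();
    unsafe {
        // Pointer to the first unused byte of the allocation.
        let ptr = buf.as_mut_ptr().add(buf.len());
        let actual = os_read(src, ptr, size);
        // Report back how much was filled; `actual <= size` by construction.
        buf.set_len(buf.len() + actual);
        actual
    }
}
```

The `unsafe` is confined to the code that owns the buffer, which is the property the proposal is after: the writer never receives a `&mut [u8]` that it could read from.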
How does changing from a
It is no more unsafe than use of
Where does this
Wouldn't it be equally easy to forget to implement
Yes it's totally fine. Uninitialized memory is used extensively throughout the standard library.
Because
It is more unsafe because my suggestion allows non-FFI readers to be written using only safe code.
Among the non-IO standard library
Not equally easy, because it's one thing to do right rather than wrong (implement the new one rather than the deprecated old one), not an extra thing to remember in addition to the main thing (
@sfackler as one minor technical piece, mind deleting all the read_to_end specializations in libstd? I think there's a bunch inside of src/libstd/sys/* and they should all be obviated now I think.
@alexcrichton Thank you very much for mentioning me and being interested in my opinion, I highly appreciate it! As documentation for One advantage I see in a marker trait is that if you know the implementation is correct, you can express it without any change to implementation. Also, I wasn't targeting specialization. I intended to write a helper which turns any Regarding I'm not particularly convinced which approach is "the right one". Currently my decision process would be based on whether something is impossible with either one which is possible with the other. In that case I'd go with the more capable one. If someone proves them equivalent, then I'd look at ergonomics.
@carllerche Yeah,
src/libstd/io/buffered.rs
Outdated
use memchr;

trait MakeBuffer {
    fn make_buffer(capacity: usize) -> Vec<u8>;
Shouldn't this be unsafe fn given that it potentially creates an uninitialized buffer?
I am still having a hard time visualizing what Could you maybe put together an equivalent implementation to this PR for comparison?
Removed all of the old read_to_end impls.
My 2c, just rephrasing a bit: The problem is: implementer of
Second approach is (disregarding backward-compat): the compiler statically ensures the buffer is only written to, not read from. I think that in principle, the second approach is favorable, if it can be made to work. I think that's what @comex is trying to achieve, before giving up and going the But fundamentally, it doesn't seem workable: in order to write, you must have a |
This would have been preferable, and indeed was discussed 2.5 years ago shortly before Rust 1.0:
But 1.0 is done, |
Ok from the technical side of things, looks great to me! So thinking about this I personally find the biggest downside to be that this won't work through trait objects. For example if you have Another possibility for implementing this functionality would be taking a tokio-io style approach:
    pub trait Read {
        unsafe fn is_trusted(&self) -> bool { false }
    }
That's still backwards compatible, works through trait objects, and shouldn't come at a perf cost because except in the trait object case it should always be inlined away. The major downside I see to this approach is that I'm curious, what do others think of that solution? Are there more downsides I'm not thinking of?
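To make the method-based alternative concrete, here is a small sketch under stated assumptions. `HintedRead` is a hypothetical stand-in for `Read` (so nothing here claims to be the real trait), and `Untrusted`, `Fill`, and `must_zero` are illustrative names. The point it shows is that a defaulted method, unlike a marker trait, can still be queried through a trait object.

```rust
use std::io;

/// Hypothetical stand-in for `std::io::Read` with the proposed hint method.
trait HintedRead {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;

    /// `true` means `read` never inspects the buffer it is handed.
    /// The default is `false`, so existing impls stay conservative.
    unsafe fn is_trusted(&self) -> bool {
        false
    }
}

/// A reader that keeps the default and therefore must get zeroed buffers.
struct Untrusted;

impl HintedRead for Untrusted {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Ok(0)
    }
}

/// A reader that only ever writes to the buffer, so it opts in.
struct Fill(u8);

impl HintedRead for Fill {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        buf.fill(self.0);
        Ok(buf.len())
    }

    unsafe fn is_trusted(&self) -> bool {
        true
    }
}

/// Decides, through a trait object, whether a buffer must be zeroed
/// before it is passed to `read`.
fn must_zero(reader: &dyn HintedRead) -> bool {
    // SAFETY: in this sketch the call only returns a flag.
    !unsafe { reader.is_trusted() }
}
```

In generic code the call can be inlined to a constant, while through `&dyn HintedRead` it costs one virtual call per query.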
However, what would w/ branch prediction if
src/libstd/process.rs
Outdated
@@ -211,6 +211,10 @@ impl Read for ChildStdout {
    fn read_to_end(&mut self, buf: &mut Vec<u8>) -> io::Result<usize> {
This can be removed, right? I think there's one below on ChildStderr too.
Yep, fixed
👍 looks great! Sort of unfortunate the
src/libstd/io/mod.rs
Outdated
/// `Read`er - the method only takes `&self` so that it can be used through
/// trait objects.
///
/// # Safety
I think we typically title sections like this "Unsafety"
src/libstd/io/mod.rs
Outdated
/// This method is unsafe because a `Read`er could otherwise return a
/// non-zeroing `Initializer` from another `Read` type without an `unsafe`
/// block.
#[unstable(feature = "maybe_initialized", issue = "0")]
Just a reminder to file a tracking issue to fill in before landing
@@ -0,0 +1,7 @@
# `should_initialize` |
I think tidy will also request that this file is renamed
Updated with a tracking issue - should be ready to go I think.
This is an API that allows types to indicate that they can be passed buffers of uninitialized memory which can improve performance.
Read::initializer
@bors: r+
📌 Commit ecbb896 has been approved by
Add a Read::initializer method This is an API that allows types to indicate that they can be passed buffers of uninitialized memory which can improve performance. cc @SimonSapin r? @alexcrichton
So… this is landing hours after a new design being proposed (as far as people not in the triage discussion are concerned), but this new design uses By adding a precedent in the standard library, this is effectively changing the meaning of a language feature. Is this OK? Was this discussed? @alexcrichton, you called this a “major downside” in #42002 (comment).
☀️ Test successful - status-appveyor, status-travis
@SimonSapin It's not backwards. Certain implementations of the method always return I'd agree with you if the method was only returning a
This looks like an evil hack. The reason safe code can't return an
The unsafety here is in two locations:
We realized during the discussion that both are necessary. The documentation should indicate this. If it doesn't then it should be noted on the tracking issue that the unsafety is unclear. |
Is the documentation here not clear? https://github.com/rust-lang/rust/pull/42002/files#diff-668f8f358d4a93474b396dcb3727399eR521 |
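As a reading aid for the unsafety discussion above, here is a rough re-creation of the shape of the design, not the std source. It assumes the two `unsafe` locations being discussed are the `initializer` method on the trait and the constructor of a non-zeroing `Initializer`; `SketchRead`, `Conservative`, `Repeat`, `prepare`, and the constructor names are illustrative and may not match what landed.

```rust
use std::io;

/// Illustrative re-creation: records whether a buffer must be zeroed.
pub struct Initializer(bool);

impl Initializer {
    /// Safe to construct: buffers will be zeroed before use.
    pub fn zeroing() -> Initializer {
        Initializer(true)
    }

    /// Unsafe to construct: the caller promises the reader this is
    /// returned for never reads from the buffer it is given.
    pub unsafe fn nop() -> Initializer {
        Initializer(false)
    }

    pub fn should_initialize(&self) -> bool {
        self.0
    }

    /// Zeroes `buf` only when initialization is required.
    pub fn initialize(&self, buf: &mut [u8]) {
        if self.should_initialize() {
            buf.fill(0);
        }
    }
}

/// Hypothetical stand-in for `Read`; takes `&self` so it works on `dyn`.
trait SketchRead {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;

    /// Unsafe so that a wrapper cannot forward another reader's
    /// non-zeroing `Initializer` without writing `unsafe` itself.
    unsafe fn initializer(&self) -> Initializer {
        Initializer::zeroing()
    }
}

/// Keeps the zeroing default.
struct Conservative;

impl SketchRead for Conservative {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Ok(0)
    }
}

/// Only writes to the buffer, so it opts out of zeroing.
struct Repeat(u8);

impl SketchRead for Repeat {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        buf.fill(self.0);
        Ok(buf.len())
    }

    unsafe fn initializer(&self) -> Initializer {
        // SAFETY: `read` above never reads from `buf`.
        unsafe { Initializer::nop() }
    }
}

/// Prepares a buffer for `reader`, zeroing it only if the reader asks.
fn prepare(reader: &dyn SketchRead, buf: &mut [u8]) {
    let init = unsafe { reader.initializer() };
    init.initialize(buf);
}
```

The buffers in this sketch are always initialized memory (the tests below pre-fill them with a marker byte), so it demonstrates only the decision logic, not the handling of truly uninitialized memory.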