Reading a CString safely without overhead from Read #59229
Comments
The second option seems very unlikely to be helpful; I can't think of any use cases for having a NonZeroU8 Vec beyond this API. Plus, converting to it would mean essentially implementing this same code, and then you could just use the from_vec_unchecked method. I think the first proposal is quite reasonable though. |
I think allowing to safely convert |
I wonder why Here's what an implementation for BufRead could look like:
fn from_reader(mut reader: impl BufRead) -> Result<CString, std::io::Error>
{
    let mut buffer = Vec::new();
    reader.read_until(0, &mut buffer)?;
    if buffer.len() > 0 {
        // 0 has been read into the vector, pop it
        buffer.pop(); //TODO: consider in-place transmute
    } else {
        // no bytes have been read, nothing to do
    }
    return Ok(unsafe { CString::from_vec_unchecked(buffer) });
} |
@Shnatsel I completely agree! I think in most cases (depending on how Read::read is implemented) an implementation for BufRead is probably much faster. I do however wonder: Wouldn't it be better to do it like this?
This way we can avoid doing a |
The implementation PR is here: #59314. I could create one for BufRead as well, however. Should I call it from_bufread? |
Using Here is a fixed (and performant) implementation:
/// # Safety
///
/// - `BufRead` implementor must enforce its contract
unsafe
fn from_reader_unchecked (
    mut reader: impl BufRead,
) -> Result<CString, ::std::io::Error>
{
    let mut buffer = Vec::new();
    reader.read_until(b'\0', &mut buffer)?;
    if let Some(&b'\0') = buffer.last() {
        // # Safety
        //
        // - no null bytes before `buffer.last()` (thanks to `.read_until()` contract)
        //
        // - last byte has already been checked to be null
        CString::from_vec_unchecked(buffer)
    } else {
        buffer.reserve_exact(1);
        buffer.push(b'\0');
        // # Safety
        //
        // - no null bytes before `buffer.last()` (thanks to `.read_until()` contract)
        //
        // - terminating null byte has been appended
        CString::from_vec_unchecked(buffer)
    }
}

/// # Safety:
///
/// - `read_until` must enforce its contract.
unsafe trait TrustedBufRead : BufRead {}

#[inline]
fn from_reader (
    reader: impl TrustedBufRead,
) -> Result<CString, ::std::io::Error>
{
    unsafe { from_reader_unchecked(reader) }
} |
@danielhenrymantilla @Shnatsel's implementation was correct to pop the null-byte, as unsafe
fn from_reader_unchecked (
    mut reader: impl BufRead,
) -> Result<CString, ::std::io::Error>
{
    let mut buffer = Vec::new();
    reader.read_until(b'\0', &mut buffer)?;
    if let Some(&b'\0') = buffer.last() {
        buffer.pop();
        // # Safety
        // last was the first null byte encountered, it's been removed.
        Ok(CString::from_vec_unchecked(buffer))
    } else {
        // # Safety
        // no null bytes in `buffer` (thanks to `.read_until()` contract)
        Ok(CString::from_vec_unchecked(buffer))
    }
} |
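For comparison, here is a minimal sketch of the fully checked counterpart of the unchecked versions above. The helper name `read_cstring_checked` is hypothetical and not a standard library API. It uses only `BufRead::read_until` and `CString::new`, so it needs no `unsafe`, at the cost of `CString::new` scanning the buffer a second time for interior nul bytes that `read_until` has already ruled out.

```rust
use std::ffi::CString;
use std::io::{self, BufRead};

// Hypothetical helper (not in std): safe, checked variant of the code above.
fn read_cstring_checked(mut reader: impl BufRead) -> io::Result<CString> {
    let mut buffer = Vec::new();
    // Reads up to and including the first 0 byte, or to EOF if there is none.
    reader.read_until(b'\0', &mut buffer)?;
    // Drop the delimiter only if it is really there (EOF may come first).
    if buffer.last() == Some(&b'\0') {
        buffer.pop();
    }
    // `CString::new` rescans the buffer for nul bytes; this is the
    // redundant work the unchecked variants try to avoid.
    CString::new(buffer).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

fn main() -> io::Result<()> {
    // `&[u8]` implements `BufRead`, so a byte slice works as input.
    let terminated = read_cstring_checked(&b"hello\0world"[..])?;
    assert_eq!(terminated.as_bytes(), b"hello");

    // Without a terminator, everything up to EOF is kept.
    let unterminated = read_cstring_checked(&b"hello"[..])?;
    assert_eq!(unterminated.as_bytes(), b"hello");
    Ok(())
}
```

The error branch of `CString::new` should be unreachable here, which is exactly why the thread treats the extra scan as pure overhead.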
It is beside the question of safety and performance, but shouldn't |
…_from_vec_of_nonzerou8, r=KodrAus

Added From<Vec<NonZeroU8>> for CString

Added a `From<Vec<NonZeroU8>>` `impl` for `CString`

# Rationale

- `CString::from_vec_unchecked` is a subtle function that makes `unsafe` code harder to audit when the generated `Vec`'s creation is non-trivial. This `impl` allows writing safer `unsafe` code thanks to the very explicit semantics of the `Vec<NonZeroU8>` type.
- One such situation is when trying to `.read()` a `CString`, see issue rust-lang#59229.
  - this led to a PR: rust-lang#59314, that was closed for being too specific / narrow (it only targeted being able to `.read()` a `CString`, when this pattern could have been generalized).
  - the issue suggested another route, based on `From<Vec<NonZeroU8>>`, which is indeed a less general and more concise code pattern.
  - quoting @Shnatsel:
    - > For me the main thing about making this safe is simplifying auditing - people have spent like an hour looking at just this one unsafe block in libflate because it's not clear what exactly is unchecked, so you have to look it up when auditing anyway. This has distracted us from much more serious memory safety issues the library had. Having this trivial impl in stdlib would turn this into safe code with compiler more or less guaranteeing that it's fine, and save anyone auditing the code a whole lot of time.
This is addressed by #64069 and should ship in 1.43 |
The new trait impl addressing this has shipped in 1.43, so this issue can be closed. |
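As a hedged illustration of how the shipped `From<Vec<NonZeroU8>>` impl can be applied to the reading problem from this issue, the sketch below collects bytes as `NonZeroU8` until the first nul byte or EOF and then converts without any check or `unsafe`. The helper name `read_cstring_nonzero` is an assumption made for this example, not a standard library function.

```rust
use std::ffi::CString;
use std::io::{self, Read};
use std::num::NonZeroU8;

// Hypothetical helper (not in std): reads up to the first nul byte or EOF.
fn read_cstring_nonzero(reader: impl Read) -> io::Result<CString> {
    let mut bytes: Vec<NonZeroU8> = Vec::new();
    for byte in reader.bytes() {
        match NonZeroU8::new(byte?) {
            // Non-zero byte: keep it. The type records that it is not nul.
            Some(b) => bytes.push(b),
            // Zero byte: this is the terminator, stop reading.
            None => break,
        }
    }
    // Infallible conversion via `From<Vec<NonZeroU8>>` (Rust 1.43+).
    Ok(CString::from(bytes))
}

fn main() -> io::Result<()> {
    let s = read_cstring_nonzero(&b"abc\0def"[..])?;
    assert_eq!(s.as_bytes(), b"abc");
    Ok(())
}
```

`Read::bytes` pulls one byte per `read` call, so for unbuffered sources such as files or sockets it is probably worth wrapping the reader in a `BufReader` first.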
How does having a You have a stream of bytes and want to interpret the non-nul prefix as a CString, and you don't want to have a non-stdlib unsafe{} block to do this without extra checks and unwraps. It's strange that there's no safe |
Hello everyone,
When I was inspecting libflate for unsafe code I found this piece of code:
The Problem
If I am correct and didn't miss anything this function is completely safe. However because there is (for as far as I know) no functionality that can safely read a CString without performance overhead, the author probably felt forced to implement it himself.
I think reading CStrings from an object that implements Read is a pretty common operation when handling binary files, so it might be good to provide functionality in the standard library for doing so without sacrificing performance.
Currently two solutions have been presented.
Solution 1: Add a CString::from_reader method
One solution would be to add a CString::from_reader method as follows:
Pros and cons:
CStrings can be read using a simple one liner if the source being read implements Read. &[u8] also implements Read so I think this is already quite flexible.
Read. I am not sure if there are any scenarios in which this would not be sufficient?
-> Playground example
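To make the point about `&[u8]` implementing `Read` concrete, here is a small sketch of what calling such a reader-based function could look like. The free function `cstring_from_reader` is a hypothetical stand-in for the proposed method and is not the code from the issue or the PR; it uses the checked `CString::new` so that it stays entirely safe.

```rust
use std::ffi::CString;
use std::io::{self, Read};

// Hypothetical stand-in for the proposed `CString::from_reader`.
fn cstring_from_reader(reader: &mut impl Read) -> io::Result<CString> {
    let mut buffer = Vec::new();
    for byte in reader.bytes() {
        let byte = byte?;
        if byte == 0 {
            // The terminator is consumed from the reader but not stored.
            break;
        }
        buffer.push(byte);
    }
    // Checked conversion; it cannot fail here because zero bytes are never pushed.
    CString::new(buffer).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

fn main() -> io::Result<()> {
    // Reading from a `&[u8]` advances the slice, so consecutive
    // nul-terminated strings can be read one after another.
    let mut input: &[u8] = b"first\0second\0";
    let a = cstring_from_reader(&mut input)?;
    let b = cstring_from_reader(&mut input)?;
    assert_eq!(a.as_bytes(), b"first");
    assert_eq!(b.as_bytes(), b"second");
    Ok(())
}
```

Taking `&mut impl Read` rather than the reader by value is a design choice of this sketch: it leaves the caller in control of the reader so that the remaining data can still be consumed afterwards.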
Solution 2: Add a way to convert from Vec<NonZeroU8> to CString
Credits to @alex
Another solution would be to add a conversion method (for example by using the From trait) for Vec<NonZeroU8> to CString. Since we can be sure that no zero characters are included in the vector, we could perform a cheap conversion. I made an attempt at implementing From for CString but it might be improved.
Pros and cons:
Vec<NonZeroU8> can be safely converted into a CString.
Read-able object manually and converting them into a Vec in order for this to work.
-> Playground example
I wonder what you all think of this proposal and whether or not it could be improved.