-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stacked Borrows: raw pointer usable only for T
too strict?
#134
Comments
This is also a problem in pub fn into_raw(this: Self) -> *const T {
let ptr: *const T = &*this;
mem::forget(this);
ptr
}
|
cast the entire slice to a raw pointer, not just the first element A strict reading of pointer provenance implies that when a `&T` gets cast to `*const T`, you may only use the raw pointer to access that `T`, not its neighbors. That's what Miri currently implements, though it is less strict around statics (which is why this one does not currently cause a Miri failure -- I'd like to make Miri more strict though). Cc rust-lang/unsafe-code-guidelines#134
This also came up in Gilnaa/memoffset#21, where @Amanieu proposed an |
@Amanieu the problem with code like this struct Pair { f1: u16, f2: u16 };
let p = Pair { f1: 2, f2: 3 };
let c = container_of!(&p.f1, Pair, f1);
let _val = c.f2; arises when you imagine splitting it across several functions: struct Pair { f1: u16, f2: u16 };
let p = Pair { f1: 2, f2: 3 };
foo(&p.f1);
p.f2 = 4; We want the compiler to be able to move the assignment to So, I think there is a real conflict here between being able to bound the effects of a call like |
How does this work with |
@Lokathor yes. Those methods do the right thing. They cast the wide reference to a wide raw pointer, and only then go to thin -- so the ref-to-raw cast has the right "span" in memory. |
rust-lang/rust#64980 gives a minor reason to maintain the status quo here. It includes a test-case for a dataflow analysis that computes whether a given It is possible to relax the analysis so that it remains sound if this behavior became defined (see rust-lang/rust#65030). |
See this discussion for a related example. |
Another example caught by miri. The offending code: // self.storage: Box<[MaybeUninit<u8>]>
self.storage[self.cursor]
.as_mut_ptr()
.cast::<T>()
.write_unaligned(component); |
avoid creating unnecessary reference in Windows Env iterator Discovered in rust-lang/miri#1225: the Windows `Env` iterator violates Stacked Borrows by creating an `&u16`, turning it into a raw pointer, and then accessing memory outside the range of that type. There is no need to create a reference here in the first place, so the fix is trivial. Cc @JOE1994 Cc rust-lang/unsafe-code-guidelines#134
That is an interesting special case. It is actually less problematic than the other cases because there is no Rust reference pointing to that "extra" memory (unlike the example in the OP where |
soooo I don't think miri can handle AVX code or anything but I imagine this might give Miri even more of a fit (from rust-lang/rust#71025): use std::arch::x86_64::*;
#[inline(always)]
pub unsafe fn mutate_chunk(rows: [__m256d; 4]) -> [__m256d; 4] {
[
_mm256_permute2f128_pd(rows[0], rows[1], 0x20),
_mm256_permute2f128_pd(rows[2], rows[3], 0x20),
_mm256_permute2f128_pd(rows[0], rows[1], 0x31),
_mm256_permute2f128_pd(rows[2], rows[3], 0x31),
]
}
#[target_feature(enable = "avx")]
pub unsafe fn mutate_array(input: *const f64, output: *mut f64) {
let mut input_data = [_mm256_setzero_pd(); 4];
for i in 0..4 {
input_data[i] = _mm256_loadu_pd(input.add(4*i));
}
let output_data = mutate_chunk(input_data);
for i in 0..4 {
_mm256_storeu_pd(output.add(4*i), output_data[i]);
}
} |
That doesn't seem to have any UB on its own, it just requires that input and output have provenance over at [f64; 16] or similar. |
I didn't post it because I think it's UB, but because I suspect it might be an Exciting Case Study. |
Hm, but I don't think I quite understand the case study here, other than being a real-world example of using the C idiom of passing an array (of known length) via a raw pointer to its first element? |
Just wanted to follow up on @saethlin's example with a minimal reproduction (from rkyv/rkyv#259): #[repr(C)]
pub struct RelSlice {
offset: [u8; 4],
len: [u8; 4],
}
impl RelSlice {
pub fn as_slice(&self) -> &[u8] {
let offset = i32::from_le_bytes(self.offset) as isize;
let len = u32::from_le_bytes(self.len) as usize;
let base = self as *const Self as *const u8;
unsafe {
::core::slice::from_raw_parts(base.offset(offset), len)
}
}
}
unsafe fn get_root<T>(bytes: &[u8]) -> &T {
let root_pos = bytes.len() - ::core::mem::size_of::<T>();
&*bytes.as_ptr().offset(root_pos as isize).cast::<T>()
}
fn main() {
let bytes: &[u8] = &[
0, 1, 2, 3,
0xfc, 0xff, 0xff, 0xff,
4, 0, 0, 0,
];
let root = unsafe { get_root::<RelSlice>(bytes) };
println!("{:?}", root.as_slice());
} Under
It appears that the issue here is that the base pointer is created by That said, rkyv currently enforces a strict ownership model with relative pointers, so I don't think a situation like the one described above using |
I cannot comment on changes to the model which may accommodate this code, I think Ralf will take care of that.
I think this points to a common misunderstanding that I've already seen at least twice. I think you're confusing the borrow checker and lifetimes with the (prototype) aliasing rules. The lifetime connection between these two references is irrelevant and not even accessed by Miri. You could make them both
Relative pointers work perfectly fine in Stacked Borrows. What doesn't work is putting a reference anywhere in the chain of custody between the outer object and the access back out from the inner object to the outer object. I'm not saying that this is a good thing, or that you should write code like what I'm including below. I've just noticed that people often say things like "X is impossible under SB/SB with raw pointer tagging" and that is very rarely true. Almost always the thing is possible, but it's unergonomic or inconvenient to do the thing in question while avoiding references. So I don't know if people are just being terse or they don't understand. Anyway, this code does relative pointers and passes Miri with raw pointer tagging: #[repr(C)]
pub struct RelSlice<'a> {
offset: [u8; 4],
len: [u8; 4],
_marker: std::marker::PhantomData<&'a u8>,
}
impl<'a> RelSlice<'a> {
pub fn as_slice(slf: *const RelSlice<'a>) -> &'a [u8] {
unsafe {
let offset = i32::from_le_bytes((*slf).offset) as isize;
let len = u32::from_le_bytes((*slf).len) as usize;
let base = slf as *const u8;
::core::slice::from_raw_parts(base.offset(offset), len)
}
}
}
unsafe fn get_root<'a, T>(bytes: &'a [u8]) -> *const T {
let root_pos = bytes.len() - ::core::mem::size_of::<T>();
bytes.as_ptr().offset(root_pos as isize).cast::<T>()
}
fn main() {
let bytes: &[u8] = &[
0, 1, 2, 3,
0xfc, 0xff, 0xff, 0xff,
4, 0, 0, 0,
];
let root: *const RelSlice = unsafe { get_root::<RelSlice>(bytes) };
println!("{:?}", RelSlice::as_slice(root));
} |
Thanks for the detailed explanation, that cleared up my confusion a lot. I see now that producing a reference causes the issue. I guess this is an instance where the aliasing information generated by relative pointers is correct, but not compatible with stricter stacked borrows semantics. I am somewhat confused why |
Yeah, and you can run But strict provenance plus aliasing rules means we should also track provenance on raw pointers. Or, put differently, default Stacked Borrows treats raw pointers basically like integers -- as not having provenance. That is in direct opposition to the goal of strict provenance, where provenance ought to be tracked properly everywhere. |
let inner = Box::new(ErrorImpl {
vtable,
handler,
_object: error,
});
// Erase the concrete type of E from the compile-time type system. This
// is equivalent to the safe unsize coersion from Box<ErrorImpl<E>> to
// Box<ErrorImpl<dyn StdError + Send + Sync + 'static>> except that the
// result is a thin pointer. The necessary behavior for manipulating the
// underlying ErrorImpl<E> is preserved in the vtable provided by the
// caller rather than a builtin fat pointer vtable.
let erased = mem::transmute::<Box<ErrorImpl<E>>, Box<ErrorImpl<()>>>(inner);
let inner = ManuallyDrop::new(erased);
Report { inner } This transmute on its own is fine, but in a few places it then tries to un-erase: // Safety: requires layout of *e to match ErrorImpl<E>.
unsafe fn object_drop<E>(e: Box<ErrorImpl<()>>) {
// Cast back to ErrorImpl<E> so that the allocator receives the correct
// Layout to deallocate the Box's memory.
let unerased = mem::transmute::<Box<ErrorImpl<()>>, Box<ErrorImpl<E>>>(e);
drop(unerased);
} and due to the retag in the test test_iter ... error: Undefined Behavior: trying to reborrow <164086> for SharedReadOnly permission at alloc62436[0x18], but that tag does not exist in the borrow stack for this location
--> /tmp/eyre-0.6.8/src/error.rs:541:5
|
541 | &(*(e as *const ErrorImpl<()> as *const ErrorImpl<E>))._object
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| trying to reborrow <164086> for SharedReadOnly permission at alloc62436[0x18], but that tag does not exist in the borrow stack for this location
| this error occurs as part of a reborrow at alloc62436[0x18..0x28]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <164086> was created due to a retag at offsets [0x0..0x18]
--> /tmp/eyre-0.6.8/src/error.rs:541:9
|
541 | &(*(e as *const ErrorImpl<()> as *const ErrorImpl<E>))._object
| ^
= note: inside `eyre::error::object_ref::<eyre::error::ContextError<i32, eyre::Report>>` at /tmp/eyre-0.6.8/src/error.rs:541:5 |
Turns out it won't due to: rust-lang/unsafe-code-guidelines#134 so disable that check
Here is another example possibly worth adding to this issue. On x86_64-linux-gnu struct |
Yeah, dirent is basically the same problem as |
* We hae binaries. So dependabot everything. Also remote nostd * Remove our old CI * Add miri flag, disable loom * Update config.toml to take linking options out. Those should remain local, not checked in. * Disable isolation in miri as we need to read in test input files * Bump msrv to 1.65 * suppress leak suppressions file for now * Bump msrv comment explaining why * Change test to see if miri gets happier * Fixup tests trying to get miri to work. Turns out it won't due to: rust-lang/unsafe-code-guidelines#134 so disable that check * take 2 for multiple miri flags * Try adding a leak suppresions file * try putting lsan at top level * Rework a few things: sanitizer - specify path relative in github actions miri - use sampling of tests from cpu. Otheriwse it OOMs * Specify tests better * Break miri up for c64basic vs cpu For CPU only do a subset and run in separate runs to avoid OOMs. Try again to get leak detector happy * Change to a test miri can run. Try a different way to pass lsan suppressions file
Tree Borrows solves this issue by not doing any retagging on raw pointers, and just giving them the same permission as the reference they are created from. However doing that without two-phase borrows is tricky; here's some discussion. |
The problem is not restricted to casts, but also affects coercions, right? Consider the equivalent (AFAICT): .let val = [1u8, 2];
let ptr: *const u8 = &val[0];
let _val = unsafe { *ptr.add(1) }; A lot of the discussion regarding this has been about "casts" but even if one avoids casts, one could still be affected by this. |
Yes, it applies to "ref-to-ptr coërcions" as well. I think the "cast" terminology is used here kind of indiscriminately to cover both explicit |
Yes indeed, we often view coercions as just automatically inserted casts. Opsem discussions generally are happening on a level where all these implicit operations are made explicit. I can see how that can make the terminology confusing though. |
Currently, the following is illegal according to Stacked Borrows:
The problem is that the cast to
*const u8
creates a raw pointer that may only be used for theu8
it points to, not anything else. The most common case is to do&slice[0] as *const _
instead ofslice.as_ptr()
.This has lead to problems:
&slice[0]
thing.Rc::into_raw
+Rc::from_raw
don't work well together because of this.&slice[0]
patternMaybe this is too restrictive and raw pointers should be allowed to access their "surroundings"? I am not sure what exactly that would look like though. It would probably require having the raw pointer fully inherit all permissions from the reference it is created from.
I'll use this issue to collect such cases.
The text was updated successfully, but these errors were encountered: