Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miri and provenance using raw_ref macros #1695

Closed
danielhenrymantilla opened this issue Jan 29, 2021 · 4 comments
Closed

Miri and provenance using raw_ref macros #1695

danielhenrymantilla opened this issue Jan 29, 2021 · 4 comments
Labels
A-aliasing Area: This affects the aliasing model (Stacked/Tree Borrows) C-support Category: Not necessarily a bug, but someone asking for support

Comments

@danielhenrymantilla
Copy link
Contributor

danielhenrymantilla commented Jan 29, 2021

The following code:

#[derive(Debug)]
#[repr(C)] // for easier computation of the field offsets (no need to `offset_of!`)
struct S {
    a: u8,
    b: u8,
}
let s = S { a: 42, b: 27 };
let at_b = ::core::ptr::raw_const!(s.b);
unsafe {
    dbg!(
        &*at_b.cast::<u8>().offset(-1 /* offset of `.b` in `S` */).cast::<S>()
    );
}

triggers the following Miri error:

error: Undefined Behavior: trying to reborrow for SharedReadOnly at alloc1414+0x1, but parent tag <untagged> does not have an appropriate item in the borrow stack
  --> src/main.rs:15:13
   |
15 |             &*at_b.cast::<u8>().offset(-1 /* offset of `.b` in `S` */).cast::<S>()
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ trying to reborrow for SharedReadOnly at alloc1414+0x1, but parent tag <untagged> does not have an appropriate item in the borrow stack
   |
   = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
   = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
   = note: inside `main` at src/main.rs:15:13
   = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
   = note: inside `std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:125:18
   = note: inside closure at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:66:18
   = note: inside `std::ops::function::impls::<impl std::ops::FnOnce<()> for &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>::call_once` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:259:13
   = note: inside `std::panicking::r#try::do_call::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:379:40
   = note: inside `std::panicking::r#try::<i32, &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:343:19
   = note: inside `std::panic::catch_unwind::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:396:14
   = note: inside `std::rt::lang_start_internal` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:51:25
   = note: inside `std::rt::lang_start::<()>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:65:5

Which means that ptr::raw_const!(s.b) is interpreted as having provenance over s.b only; i.e., that it is not equivalent to doing ptr::raw_const!(s).cast::<u8>().offset( offset_of!(S, b) )

More surprisingly, if at_b is computed as:

let at_b = ::core::ptr::raw_const!((*&s).b);

then Miri does not complain.


This looks like a bug (Miri false positive), although I can imagine that having the raw_ref! macros "guess" the right provenance might be hard to implement, if not impossible; but these differences in semantics with changes that subtle look error-prone and footgunny to me; at the very least the docs of the raw_ref macros ought to mention that extra care should be taken when dealing with provenance (that is, to recommend using the explicit at_s.cast::<u8>.offset(offset_of!(S, field)).cast::<FieldTy>(), with offset_of! being the one that uses raw_ref! macros under the hood).

This might be an important thing to clarify / to fix now that the macros are about to be stabilized (rust-lang/rust#80886)

@RalfJung
Copy link
Member

This is caused by how Stacked Borrows treats the "reference to raw transition" in

let s = S { a: 42, b: 27 };
let at_b = ::core::ptr::raw_const!(s.b);

Here, the raw pointer is conceptually being created to s.b, and thus only that field may be accesses by the raw pointer and its descendants. If you want to use raw pointers to access the entire struct, you need to explicitly create a raw pointer to the entire struct:

let s = S { a: 42, b: 27 };
let s_raw = &s as *const _;
let at_b = ::core::ptr::raw_const!((*s_raw).b);

Now, at_b may be offset to also access a.

So, as far as I can see Miri is working as intended here; Stacked Borrows just doesn't behave the way you expected it to. And that's a fair point, Stacked Borrows maybe should be more permissive here -- but that's not a Miri issue, that's a memory model issue: rust-lang/unsafe-code-guidelines#134.

More surprisingly, if at_b is computed as:

Interesting; I'll look into this. At first sight I am surprised that this works with just a reference.

at the very least the docs of the raw_ref macros ought to mention that extra care should be taken when dealing with provenance

Given that hardly anything solid can be said about our aliasing model at this point, I am not sure what the docs could usefully say that is actually "official".

@RalfJung
Copy link
Member

RalfJung commented Jan 30, 2021

More surprisingly, if at_b is computed as:

I cannot reproduce this. I adjusted your code to make it actually run, simplified a bit, and arrived at this, which still fails:

#![feature(raw_ref_macros)]

fn main() {
#[derive(Debug)]
#[repr(C)] // for easier computation of the field offsets (no need to `offset_of!`)
struct S {
    a: u8,
    b: u8,
}
let s = S { a: 42, b: 27 };
let raw_s = &s; // adding `as *const S` makes it work
unsafe {
    let at_b = std::ptr::raw_const!((*raw_s).b);
    dbg!(
        &*at_b.offset(-1 /* offset of `.b` in `S` */)
    );
}
}

@danielhenrymantilla
Copy link
Contributor Author

danielhenrymantilla commented Jan 31, 2021

I cannot reproduce this.

FWIW, I can't reproduce this either 😅 If the Playground's miri hasn't been updated since, then I guess I had a typo on my test. Disregard that part, then.


Miri is working as intended here; Stacked Borrows just doesn't behave the way you expected it to.

To clarify, I get the SB memory model w.r.t., aliasing, or at least the issue you linked to. I am not trying to refute it; on the contrary, I am trying to act based on them, for the sake of conservativeness (assume UB until the requirement is loosened, if ever).

That's what I meant by provenance: I am wondering about the "SB-related rights" the obtained pointers have.

In order to think / study about that, precisely because there is no documentation on the topic, I've had to resort to using Miri. Given that miri has had instances of both false positives and false negatives, the fact that ptr::addr_of! { s.field } and ptr::addr_of! { (*(&s as *const S)).field } yield different Miri-empirical results (the former has provenance over that field only, the latter over the whole S) has led me to asking:

  • was that intended Miri behavior (and thus, intended semantics in the language),

  • or was that a bug within Miri's implementation.

Given your answer, I now understand that it is the former: the addr_of{,_mut}! macros yield pointers whose provenance is that of the last raw-pointer dereferenced to "access" the place, or else (when no raw pointers are dereferenced), whose provenance spans exactly over that place only.


(What follows does not really belong to the miri repo, but rather the Rust one: if we do open a new issue I'll cut-&-paste that there)

Given that hardly anything solid can be said about our aliasing model at this point, I am not sure what the docs could usefully say that is actually "official".

I know this is not an easy topic, since in Rust many "formal semantics" have not been decided / fixed yet, but precisely because of that, I think we1 should try to assume the stricter models available, while waiting for the semantics to be loosened, if ever. In this instance, until the SB aliasing model is "officially blessed" or partly "officially rejected" / changed, it is best if code that we1 write right now treats the SB memory model as "already officially blessed", so as to avoid it getting rejected in the future because of there being too many instance of "incorrect code". So, in the same fashion that uninitialized() integers are conservatively assumed UB, I strive to write code that avoids violating the SB aliasing rules, and I think that that's what most people should try to write.

Or in more practical terms: the fact that Miri itself rejects code violating parts of the SB memory model does mean that we1 should be given a way to know why.

  • 1 we = the programmers / Rust users

Thus, I do think that the docs could benefit from rules expressing the most conservative hypothesis about UB rules. For instance, the addr_of{,_mut}! docs would benefit (either directly, or indirectly, by linking to some nomicon chapter) from some disclaimer about the provenance of the so-generated pointers (i.e., I think that my italic "rule", or a better-phrased version of it, should be written somewhere).

@RalfJung RalfJung added A-aliasing Area: This affects the aliasing model (Stacked/Tree Borrows) C-support Category: Not necessarily a bug, but someone asking for support labels Feb 17, 2021
@RalfJung
Copy link
Member

I think this is not really actionable on the Miri side. We now enable raw pointer tagging by default, so the rules are hopefully less confusing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-aliasing Area: This affects the aliasing model (Stacked/Tree Borrows) C-support Category: Not necessarily a bug, but someone asking for support
Projects
None yet
Development

No branches or pull requests

2 participants