Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validity of aggregate types (structs, enums, tuples, arrays, ...) #69

Open
RalfJung opened this issue Jan 10, 2019 · 26 comments
Open

Validity of aggregate types (structs, enums, tuples, arrays, ...) #69

RalfJung opened this issue Jan 10, 2019 · 26 comments
Assignees
Labels
A-validity Topic: Related to validity invariants S-writeup-assigned Status: Ready for a writeup and someone is assigned to it

Comments

@RalfJung
Copy link
Member

RalfJung commented Jan 10, 2019

Discussing what the validity invariants of aggregate types are (and assembling a full list of aggregate types).

Safe compound types include enums, structs, tuples, arrays, slices, closures, generators, SIMD vectors.

The obvious invariant is

  • If applicable: The discriminant matches a possible variant (for enums). This applies to repr(C) enums as well! See #[repr(C)] C-like enums and out of range values rust-memory-model#41 for some discussion of that specific case.
  • All fields (of the active variant, for enums) are valid at their respective type.
  • All bytes not covered by any field ("padding") may have arbitrary content (including uninitialized).

Is there any exception? Currently at least, generators are an exception: Their fields may be uninitialized, leading to special cases in both layout computation code and Miri.

(I put these all together because my expectation is that there's not much to say here. We can split this up into several topics if that seems necessary.)

@RalfJung RalfJung added active discussion topic A-validity Topic: Related to validity invariants labels Jan 10, 2019
@nikomatsakis
Copy link
Contributor

@RalfJung

Is there any exception?

Since unions are covered in #73 -- I think this looks right to me.

One question mark might be some of the "well-known" types for example:

I'm not sure how to categorize those. (Similarly, rustc itself supports a richer attribute #[rustc_layout_scalar_valid_range_end] that lets us specify "invalid ranges" on similar structs (which must contain exactly one integer field, or something like that).

Currently at least, generators are an exception: Their fields may be uninitialized, leading to special cases in both layout computation code and Miri.

Probably that should be fixed by having their fields by MaybeUninit types, no? At least conceptually, I guess.

@RalfJung
Copy link
Member Author

One question mark might be some of the "well-known" types for example:

Ah, good point. This all boils down to rustc_layout_scalar_valid_range_end, the types just use that attribute nowadays.

@RalfJung
Copy link
Member Author

Hm, the fact that these are "wrapper structs" actually has some interesting consequences.

#![feature(rustc_attrs)]

#[rustc_layout_scalar_valid_range_start(1)]
#[repr(transparent)]
pub(crate) struct NonZero(u32);

fn main() { unsafe {
    let mut x = Some(NonZero(1)); 
    match x {
        Some(NonZero(ref mut c)) => {
            // Just writing 0 into an &mut u32
            *c = 0;
        }
        None => panic!(),
    }
    assert!(x.is_some());
} }

The assertion fails. And yet, I cannot find a good argument for where this program would raise UB. After all, it never creates an rvalue that violates its validity invariant.

@Amanieu
Copy link
Member

Amanieu commented Jan 31, 2019

@RalfJung Wouldn't an rvalue be created in the implementation of Option::is_some where it matches on *self?

@RalfJung
Copy link
Member Author

Not really, it is just going to load the discriminant. But even then, it would be an rvalue of type Option<NonZero>, and 0x0 is perfectly valid for that type.

@Amanieu
Copy link
Member

Amanieu commented Jan 31, 2019

Hmm, I still feel that is_some is the right place for UB to occur. At the validity level the Option would need to somehow "remember" that it holds a Some even though the inner value has been set to 0. Maybe we could define validity checking as ignoring the enum layout optimization.

@CAD97
Copy link

CAD97 commented Feb 1, 2019

There's three potential places I see that could be declared UB:

  • taking the reference to the inner type
  • writing an invalid for the outer type to the inner type
  • observing the bad state

To me it feels like we should have UB when assigning 0 to the u32 wrapped in the NonZeroU32. This, however, would require the u32 to inherit validity requirements.

I noted that the NonZero types in core don't ever create refs to the inner type at all. (I don't know about other #[rustc_layout_scalar_valid_range] types.) Maybe it could be the creation of the mutable reference which is disallowed and we only allow copying (and immutable refs)?

There's a bit more freedom in where UB can be around stdlib-only attributes I feel. Attributes like this one are unsafe without writing unsafe so safe code within the safety barrier can cause UB.

For those curious, all uses of rustc_layout_scalar_valid in rust-lang/rust: NonNull*, ptr::Unique, ptr::NonNull, and tests (as of my checking).

@RalfJung
Copy link
Member Author

RalfJung commented Feb 4, 2019

At the validity level the Option would need to somehow "remember" that it holds a Some even though the inner value has been set to 0.

I think that is the key point here, and it worries me a lot: we'd have to somehow keep around some extra state (the "true discriminant"), and that is far from trivial. In particular, with types like Option<Option<bool>>, there can be multiple discriminants stored at the same location. This is moving towards a C-style memory model where memory is a tree of paths that can be accesses through fields... and that's just awfully complicated to reconcile with byte-level operations.

I noted that the NonZero types in core don't ever create refs to the inner type at all. (I don't know about other #[rustc_layout_scalar_valid_range] types.) Maybe it could be the creation of the mutable reference which is disallowed and we only allow copying (and immutable refs)?

rustc_layout_scalar_valid_range is just one of many ways here to reveal the problem; Option<bool> has exactly the same problem:

fn main() { unsafe {
    let mut x = Some(true); 
    match x {
        Some(ref mut b) => {
            let u = b as *mut bool as *mut u8;
            // Just writing into a *mut u8
            *u = 2;
        }
        None => panic!(),
    }
    assert!(x.is_some());
} }

@CAD97
Copy link

CAD97 commented Feb 4, 2019

In that example I'd think the UB is clearer, personally. You're writing 2 into a bool memory slot via pointer type punning. That sounds wrong, as opposed to the scalar_valid_range which is more subtle.

I don't know how the rule would be written, or exactly when the UB would be defined to happen, but writing 2 into a pointer derived from &mut bool feels wrong. That's also the point that requires unsafe.

Actual ad-hoc proposal: while there is a reference on the stacked-borrows stack (conceptually, I think this can work for other borrowing models), it's UB to write a value that doesn't meet the validity requirement of all of the borrows. (Either that, or it's UB when popping the raw and getting to the ref on top.)

@RalfJung
Copy link
Member Author

RalfJung commented Feb 5, 2019

There's no such thing as a "bool memory slot". Rust has no C-style rules about "type punning" -- memory is a bunch of bytes, and every load/store operation interprets these types appropriately.

That makes it much easier to explain legal byte-level accesses that can be mixed with "properly" typed accesses.


To take the discussion into a different direction, is it even a problem if this is not UB? Are there optimizations that are in conflict with this code? The one optimization I can imagine is something along the lines of "nobody wrote the discriminant so it cannot have changed".

@RalfJung
Copy link
Member Author

RalfJung commented Feb 5, 2019

Here's another variant:

fn main() { unsafe {
    let mut x = Some(&0); 
    match x {
        Some(ref mut b) => {
            let u = b as *mut &i32 as *mut usize;
            // Just writing into a *mut u8
            *u = 0;
        }
        None => panic!(),
    }
    assert!(x.is_some());
} }

In terms of what happens with the memory, this code is basically the same as

fn main() { unsafe {
    let mut x = Some(&0); 
    (&mut x as *mut _ as *mut usize) = 0;
    assert!(x.is_some());
} }

which is certainly allowed. It seems really hard to allow the latter but disallow the former.

@hanna-kruppe
Copy link

The one optimization I can imagine is something along the lines of "nobody wrote the discriminant so it cannot have changed".

I don't see how, or rather, writing a value "of the wrong type" (intuitively, even if memory is untyped) isn't the only cause for it. Taking Option<bool> as a concrete example: since the enum is one byte large, and shared borrows permissions aren't "typed", just borrowing the Some.0 field mutably is pretty much the same as borrowing the whole enum, and gives permission to e.g. cast &mut bool -> &mut Option<bool> and write None. That doesn't even clash with the unspecified nature of enum layout optimizations, it is indeed just writing a valid Option value to a memory slot that was created to hold an Option, but it has the same effect of allowing borrows of an enum variant's field to affect the discriminant.

@gnzlbg
Copy link
Contributor

gnzlbg commented Feb 5, 2019

We guarantee the enum niche optimization for Option<bool>, but do we also guarantee which niche is used to represent None?

The example @RalfJung showed (#69 (comment)) exploits the knowledge that the niche 2 is used to represent None, and miri does not complain about that. Writing 3 instead does trigger undefined behavior, which miri detects.

AFAIK we don't guarantee which niche is used, and we are allowed to change that, so AFAICT the Option<bool> example is UB that just so happens to work with the current rustc version and that miri fails to detect.

We could start guaranteeing which niches are used for the enum optimizations, although that might get complicated as we start guaranteeing more enum optimizations, e.g., for Result<(), NonZero>.

Personally, I find it weird that using bool directly and writing 2 to it triggers UB only if the bool is not behind an enum, but I understand that because the memory of the bool is the same as the memory of the enum, the valid values for that memory is the union of the valid values for the bool and the enum, and 2 is a valid value in this case (at least for this rustc version). Since we are writing through a *mut u8, we can't really tell whether we are writing to the bool or to the enum, so if we make this UB I don't know how miri could efficiently check this. Maybe we should express the valid values for enum discriminants in miri not using integers, but using something else so that the valid values of the enum memory are {true, false, NoneDiscr} where NoneDiscr is different from any integer.

That way to avoid UB one would need to write:

fn main() { unsafe {
    let mut x = Some(true); 
    match x {
        Some(ref mut b) => {
            let u = b as *mut bool as *mut u8;
            let o: Option<bool> = None;
            // Just writing into a *mut u8
            *u = std::mem::transmute(o);
        }
        None => panic!(),
    }
    assert!(x.is_some());
} }

@RalfJung
Copy link
Member Author

RalfJung commented Feb 5, 2019

AFAIK we don't guarantee which niche is used, and we are allowed to change that, so AFAICT the Option<bool> example is UB that just so happens to work with the current rustc version and that miri fails to detect.

That doesn't make it UB, but it makes the code rely on unspecified layout details. It's like transmuting structs around making assumptions about layout. If the compiler makes the expected choice, that's not UB, but it might become UB if the compiler choice ever changes.

But for my latest example, using references, we do specify the layout optimization, so there would be no issue with this.

I find it weird that using bool directly and writing 2 to it triggers UB only if the bool is not behind an enum

That's not the case. This is not UB either:

fn main() { unsafe {
  let mut x = true;
  let xptr = &mut x as *mut bool as *mut u8;
  *xptr = 2;
} }

The point is that x does not get used again after the "bad" assignment -- nothing uses this at bool type, so nothing requires the invariant.

@gnzlbg
Copy link
Contributor

gnzlbg commented Feb 5, 2019

That doesn't make it UB, but it makes the code rely on unspecified layout details. It's like transmuting structs around making assumptions about layout. If the compiler makes the expected choice, that's not UB, but it might become UB if the compiler choice ever changes.

Makes sense.

This is not UB either:

Ah, in my experiments in the playground I was always using println!("{:?}", x); to print the value, which introduces an use after wards. For the option, it returns None, and for the bool, miri detects the UB.


Safe compound types include enums, structs, tuples, arrays, slices, closures, generators.

We should probably add Vectors to the list, with the caveat that we currently can't have Vectors whose element types are unions, so we can't currently have a Vector of MaybeUninit, although maybe we should be able to do that.

@RalfJung
Copy link
Member Author

RalfJung commented Feb 5, 2019

We should probably add Vectors to the list, with the caveat that we currently can't have Vectors whose element types are unions, so we can't currently have a Vector of MaybeUninit, although maybe we should be able to do that.

By vector your mean SIMD types? These behave exactly like arrays, as far as I am concerned. But yeah, we should probably list them... not sure when/if that list will be complete.^^

@gnzlbg
Copy link
Contributor

gnzlbg commented Feb 5, 2019

By vector your mean SIMD types? These behave exactly like arrays, as far as I am concerned.

Yes I mean SIMD types. From the validity point of view, they do behave exactly as arrays. They are only a distinct kind of type because they have a different layout.

@RalfJung
Copy link
Member Author

RalfJung commented Feb 6, 2019

I propose we move the discussion about validity of inner fields and unused variables to #84.

@RalfJung
Copy link
Member Author

IIRC @gnzlbg agreed to do a write-up of this.

@RalfJung RalfJung added the S-writeup-assigned Status: Ready for a writeup and someone is assigned to it label Feb 28, 2019
@the8472
Copy link
Member

the8472 commented May 23, 2019

For cases where one wants to transmute [u8] into structs, would one of the following help avoiding UB? a) a const method that verifies that all bit patterns of a type are valid values or b) a runtime method that checks whether a sequence of bytes would be a valid value of a specific type?

@sdroege
Copy link

sdroege commented May 31, 2019

See #137.

It would be good to explicitly document that fieldless repr(C) enums are not behaving like C enums but only have the same memory representation. That is, even for repr(C) enums it is UB to have values that are not part of the enum unlike in C where enums are just fancy integers and it's fine to have values that are not defined for the enum. repr(C) could give the idea that such an enum actually behaves like the C equivalent, which is simply wrong and easily leads to unexpected UB in FFI code where the correct solution then would be to pass around plain integers, check them and convert them to an enum on the Rust side.

This is a common misunderstanding and as such it seems to be a good idea to explicitly mention it.

@TyPR124
Copy link

TyPR124 commented Jul 7, 2020

It may be good to explicitly state that the #[non_exhaustive] attribute does not affect the validity of enum values. That is, that this code is UB:

#[repr(u8)]
#[non_exhaustive]
enum Foo {
    Zero,
    One,
}
fn this_is_not_okay() -> Foo {
   unsafe { std::mem::transmute(2u8) }
}

@jsha
Copy link

jsha commented Nov 8, 2021

Chiming in to say I recently made the mistake of accepting a fieldless #[repr(C)] enum as an input parameter in an FFI function: rustls/rustls-ffi#152. It would have been quite helpful to me for there to be a page in the unsafe code guidelines explicitly describing that pattern as undefined behavior.

It looks like this issue has been quiescent for a bit over a year. What's needed to make progress on documenting enum validity? Are there still decisions to be made or is it a matter of writing up some verbiage and making a pull request?

Thanks,
Jacob

@Lokathor
Copy link
Contributor

Lokathor commented Nov 8, 2021

a PR would be good.

Note that even if something is documented in these guidelines or issues then it's all still "non-normative", as they say. In other words, unofficial, guidelines only, the compiler doesn't necessarily respect it, etc etc.

Most stuff described in this repository needs to eventually become various RFCs. Until then, writing it down in this repo is the best we got.

@jsha
Copy link

jsha commented Nov 8, 2021

Thanks, I'll work on a PR!

@RalfJung
Copy link
Member Author

RalfJung commented Nov 9, 2021

Yeah, there was not a lot of discussion here because we have consensus on most things, I would say:

  • enum discriminants have to be valid
  • all the fields have to recursively satisfy their validity invariant

So, a PR definitely makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-validity Topic: Related to validity invariants S-writeup-assigned Status: Ready for a writeup and someone is assigned to it
Projects
None yet
Development

No branches or pull requests