Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix provenance UB and alignment UB #27

Merged
merged 4 commits into from
Feb 15, 2022
Merged

Fix provenance UB and alignment UB #27

merged 4 commits into from
Feb 15, 2022

Conversation

saethlin
Copy link
Contributor

I've been running MIRIFLAGS="-Zmiri-tag-raw-pointers" cargo miri test on a number of crates, and I recently came across this one. Here's what I found:

  • A &Header cannot be used to get a useful pointer to data beyond it, because the pointer from the as-cast of the &Header only has provenance over the Header.
  • After a set_len call that decreases the length, it is invalid to create a slice then try to get_unchecked into the region between the old and new length, because the reference in the slice that the ThinVec now Derefs to does not have provenance over that region. Alternatively, this is UB because the docs stipulate that you're not allowed to use get_unchecked to index out of bounds.
  • Misaligned data pointers were produced when the gecko-ffi feature was enabled and T has an alignment greater than 4.
  • I think the use of align_offset in tests is subtly wrong, align_offset seems to be for optimizations only. The docs say that a valid implementation may always return usize::MAX.

FYI, the provenance issue here can be tricky to understand from Miri's output but I have a Miri PR up that improves the diagnostics dramatically for cases like this. Also, I'm told this code might be shipped as part of a web browser? If an interested party wants to see if these issues affect a codebase that doesn't work under Miri I have a rustc PR up which detects the alignment and get_unchecked issue when code is built with -Zbuild-std.

A &Header cannot be used to get a useful pointer to data beyond it,
because the pointer from the as-cast of the &Header only has
provenance over the Header.

After a set_len call that decreases the length, it is invalid to create a
slice then try to get_unchecked into the region between the old and new
length, because the reference in the slice that the ThinVec now Derefs
to does not have provenance over that region. Alternatively, this is UB
because the docs stipulate that you're not allowed to use
`get_unchecked` to index out of bounds.

Misaligned data pointers were produced when the gecko-ffi feature
was enabled and T has an alignment greater than 4.

I think the use of align_offset in tests is subtly wrong, align_offset
seems to be for optimizations only. The docs say that a valid
implementation may always return usize::MAX.
Copy link
Owner

@Gankra Gankra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks great, thanks! (I'm well acquainted with stacked borrows, but this code was written quite a while ago when everything was much more up in the air).

I've been trying to find where an unaligned access can happen, and how any of these changes would possibly change that. Could your provide some more details?

@@ -470,7 +453,18 @@ impl<T> ThinVec<T> {
unsafe { self.ptr.as_ref() }
}
fn data_raw(&self) -> *mut T {
self.header().data()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 this being wrong makes sense to me.

@@ -565,7 +559,7 @@ impl<T> ThinVec<T> {
// doesn't re-drop the just-failed value.
let new_len = self.len() - 1;
self.set_len(new_len);
ptr::drop_in_place(self.get_unchecked_mut(new_len));
ptr::drop_in_place(self.data_raw().add(new_len));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 this being wrong makes sense to me

assert_eq!(
head_ptr.align_offset(std::mem::align_of::<$typename>()),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

god i just looked this up and what a nightmare api

👍 to the change

.data::<T>()
.as_ptr()
.add(1)
.cast::<T>()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to this change for provenance

@saethlin
Copy link
Contributor Author

saethlin commented Feb 13, 2022

I've been trying to find where an unaligned access can happen, and how any of these changes would possibly change that. Could your provide some more details?

Yeah totally. You can probably find the extant alignment issue on the latest release or commit to your branch by running Miri with Stacked Borrows off and gecko-ffi on:

MIRIFLAGS="-Zmiri-disable-stacked-borrows" cargo +nightly miri test --features=gecko-ffi

You can definitely detect it with

MIRIFLAGS="-Zmiri-disable-stacked-borrows -Zmiri-symbolic-alignment-check" cargo +nightly miri test --features=gecko-ffi

The symbolic alignment check is known to have false positives, but the normal alignment checker checks the actual value of a pointer, so it's possible to have a false negative because things just happen to be correctly aligned. So this is a Heisenbug. In a lot of cases if I try to stick in a println! or dbg! to extract a bit more information from the program, the alignment stops being wrong.

Extra debug assertions to the rescue! If I apply only this diff to fix the indexing out of bounds problem but retain the alignment issue:

diff --git a/src/lib.rs b/src/lib.rs
index 19e38b8..a5e3928 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -565,7 +565,7 @@ impl<T> ThinVec<T> {
                 // doesn't re-drop the just-failed value.
                 let new_len = self.len() - 1;
                 self.set_len(new_len);
-                ptr::drop_in_place(self.get_unchecked_mut(new_len));
+                ptr::drop_in_place(self.data_raw().add(new_len));
             }
         }
     }

Then running under gdb, we get a SIGILL from the assertion failure in core at this location:

#0  core::slice::raw::from_raw_parts_mut::{closure#0}<alloc::boxed::Box<i32, alloc::alloc::Global>> ()
    at /home/ben/rust/library/core/src/intrinsics.rs:1975
#1  0x00005555555a37c5 in core::ops::function::FnOnce::call_once<core::slice::raw::from_raw_parts_mut::{closure#0}, ()> ()
    at /home/ben/rust/library/core/src/ops/function.rs:227
#2  0x000055555559a6d3 in core::intrinsics::const_eval_select<(), fn(), core::slice::raw::from_raw_parts_mut::{closure#0}, ()>
    (arg=(), _called_in_const=0x555555669d4c <thin_vec::EMPTY_HEADER>, called_at_rt=...)
    at /home/ben/rust/library/core/src/intrinsics.rs:2340
#3  0x00005555555bc511 in core::slice::raw::from_raw_parts_mut<alloc::boxed::Box<i32, alloc::alloc::Global>> (
    data=0x555555669d54, len=0) at /home/ben/rust/library/core/src/slice/raw.rs:134
#4  0x000055555557ed06 in thin_vec::ThinVec<alloc::boxed::Box<i32, alloc::alloc::Global>>::as_mut_slice<alloc::boxed::Box<i32, alloc::alloc::Global>> (self=0x7ffff7852630) at src/lib.rs:589
#5  0x000055555558a24e in thin_vec::{impl#9}::deref_mut<alloc::boxed::Box<i32, alloc::alloc::Global>> (self=0x7ffff7852630)
    at src/lib.rs:1045
#6  0x00005555555a5163 in thin_vec::{impl#7}::drop<alloc::boxed::Box<i32, alloc::alloc::Global>> (self=0x7ffff7852630)
    at src/lib.rs:1029
#7  0x00005555555a47ea in core::ptr::drop_in_place<thin_vec::ThinVec<alloc::boxed::Box<i32, alloc::alloc::Global>>> ()
    at /home/ben/rust/library/core/src/ptr/mod.rs:191
#8  0x000055555558e8de in core::clone::Clone::clone_from<thin_vec::ThinVec<alloc::boxed::Box<i32, alloc::alloc::Global>>> (
    self=0x7ffff7852630, source=0x7ffff7852638) at /home/ben/rust/library/core/src/clone.rs:131
#9  0x00005555555aa159 in thin_vec::std_tests::test_clone_from () at src/lib.rs:1684

And looking at what we're building a slice out of...

(gdb) f 3
#3  0x00005555555bc511 in core::slice::raw::from_raw_parts_mut<alloc::boxed::Box<i32, alloc::alloc::Global>> (
    data=0x555555669d54, len=0) at /home/ben/rust/library/core/src/slice/raw.rs:134
134	       assert_unsafe_precondition!(
(gdb) info args
data = 0x555555669d54
len = 0

I think the bug actually might originate in padding. Though frankly I got my brain twisted up in knots trying to adjust that function. I think the problem case is when we're using gecko-ffi, the alignment of Header is 4. But its size is 8. And inside padding, the code only checks if alloc_align > header_size to decide to actually do some padding. But here, the header size is the alignment of T, so the code selects a padding of 0, and if the header is only aligned to 4, the contents end up misaligned.

I think this patch sidesteps the issue by not considering padding when deciding what pointer to return. If the capacity is 0, always dangling. It wouldn't hurt to fix the underlying problem, if there is one.

@Gankra
Copy link
Owner

Gankra commented Feb 13, 2022

Ah, I bet I know the issue. It doesn't affect real geck-ffi code but it would affect miri/tests which are in a weird zombie configuration. But very importantly: I think this would affect the actual production non-gecko-ffi when align(T) == header_size! That's especially terrible because align=16 is a key motivating usecase! As we'll see the align UB you found is relatively benign UB, but UB nonetheless!

First let's look at gecko-ffi:

When really running under gecko-ffi, we're dynamically linking the "empty singleton" to the following symbol defined in Firefox's C++:

alignas(8) const nsTArrayHeader sEmptyTArrayHeader = {0, 0, 0};

But we include no such repr(align(8)) in the Rust definition:

#[repr(C)]
struct Header {
    _len: SizeType,
    _cap: SizeType,
}

The reason this blows up in drop is because Drop is:

impl<T> Drop for ThinVec<T> {
    fn drop(&mut self) {
        unsafe {
            ptr::drop_in_place(&mut self[..]);
            self.deallocate();
        }
    }
}

Which eventually shells out to data but doesn't take the dangling() path because padding==0! The empty slice we construct here must be well-aligned even though we never actually access its contents. So this is our UB! (Although we can at least pat ourselves on the back for this probably not being an issue that will lead to actual miscompilations, much like the other stacked borrows violations this PR fixes. Of course we're still gonna fix it, it's just not what I would consider a security issue.)

The problem is that padding is expressing (I think correct) logic for how much padding is needed for allocating a non-empty header, but we are also using it to query "would the empty-singleton be underaligned for one-past-the-end to be a data pointer?", which is a different question!

When we allocate we use the max of the Header and T's align, so it is correct to say that when align(T) == header_size, there is no padding. But we don't actually align the empty singleton to T, so we need an even stricter guard -- we need to also check the align(Header) >= align(T). If it is then the max will just be align(Header) and so the naturally-aligned empty singleton will have the same alignment padding uses in its calculations.

With all of that said, you can probably see why non-gecko-ffi would also have this issue when align(T) == header_size: it would also report padding == 0, and is using the exact same naturally-aligned empty singleton that caused the problem for our tests! So in this crate's default configuration we could be producing underaligned empty slices for empty singletons! (depends on how the linker lays things out though, might get lucky).

You "fixed" this when you removed the padding > 0 && guard but you actually removed an important safety guard from the system! I don't think I actually noticed the removed guard in my review because you rewrote the whole routine (and moved it). This was the key point I was missing, and why I was confused about where you had actually fixed the alignment issue.

The important safety guard is: calling padding unconditionally. padding includes an assertion that will crash the program if something of too-high-alignment for gecko is ever used. By gating all attempts to get a pointer to data behind a function that calls padding, we have included an assertion that the type isn't over-aligned before actually grabbing the pointer.

This assertion helps keep us confident that you aren't accidentally creating/handling ThinVecs that aren't compatible with nsTArray's design, even if FFI hides the ctor from us. Conveniently it's using entirely static values so this guard has no runtime effects when you don't use invalid types.

I would also like to restore some version of that guard because it also serves an important purpose: it removes the runtime branch on cap for types which statically don't need the alignment.

Here's my proposed tweaks (with extra docs so we all remember The Reasoning):

// In "real" gecko-ffi mode, the empty singleton will be aligned
// to 8 by gecko. But in tests we have to provide the singleton
// ourselves, and Rust makes it hard to "just" align a static.
// To avoid messing around with a wrapper type around the
// singleton *just* for tests, we just force all headers to be
// aligned to 8 in this weird "zombie" gecko mode.
//
// This shouldn't affect runtime layout (padding), but it will
// result in us asking the allocator to needlessly overalign
// non-empty ThinVecs containing align < 8 types in 
// zombie-mode, but not in "real" geck-ffi mode. Minor.
#[cfg_attr(all(feature = "gecko-ffi", any(test, miri)), repr(align(8)))]
#[repr(C)]
struct Header {
    _len: SizeType,
    _cap: SizeType,
}

(Really we should wrap the empty singleton static in a struct that is aligned to 8 but I don't want to write all that out just for tests..)

fn data_raw(&self) -> *mut T {
    // `padding` contains ~static assertions against types that are
    // incompatible with the current feature flags. Even if we don't
    // care about its result, we should always call it before getting
    // a data pointer to guard against invalid types!
    let padding = padding::<T>();

    // Although we ensure the data array is aligned when we allocate,
    // we can't do that with the empty singleton. So when it might not
    // be properly aligned, we substitute in the NonNull::dangling
    // which *is* aligned.
    //
    // To minimize dynamic branches on `cap` for all accesses
    // to the data, we include this guard which should only involve
    // compile-time constants. Ideally this should result in the branch
    // only be included for types with excessive alignment.
    let empty_header_is_aligned = if cfg!(feature = "gecko-ffi") {
        // in gecko-ffi mode `padding` will ensure this under
        // the assumption that the header has size 8 and the
        // static empty singleton is aligned to 8.
        true 
    } else {
         // In non-gecko-ffi mode, the empty singleton is just
         // naturally aligned to the Header. If the Header is at
         // least as aligned as T *and* the padding would have
         // been 0, then one-past-the-end of the empty singleton
         // *is* an acceptable out-of-thin-air data pointer and
         // we can remove the `dangling` special case.
         align_of::<Header>() >= align_of::<T>() && padding == 0
    };

    unsafe {
        if !empty_header_is_aligned && self.header().cap() == 0 {
            NonNull::dangling().as_ptr()
        } else {
            // This could technically result in overflow, but padding would have to be absurdly large for this to occur.
            let header_size = mem::size_of::<Header>();
            let ptr = self.ptr.as_ptr() as *mut u8;
            ptr.add(header_size + padding) as *mut T
        }
    }
}

It would be good to also include a test for the align==header_size empty ThinVec case. Basically just do ThinVec::new() and drop it.

And also explain the nuanced explanation in detail, and try to back up
the implementation with tests that ensure that the data pointer for a
zero-length ThinVec is correct, in and out of gecko-ffi mode.

Co-authored-by: Aria Beingessner <[email protected]>
@saethlin
Copy link
Contributor Author

saethlin commented Feb 14, 2022

When I turn on gecko-ffi, the gecko-specific test fails on my machine with SIGILL, which seems wrong.

@Gankra
Copy link
Owner

Gankra commented Feb 14, 2022

Ha, funny. My bad.

The SIGILL is a double-panic from the over-align assertion. Basically with the way I recommended, when you try to access an empty invalid ThinVec, it panics, which tries to run its destructor and get the data pointer, and then panics again.

I think the simplest fix is to add

let _ = padding::<T>();

To the start of ThinVec::new so that the assertion triggers before we ever create one. The user will still get a Bad UX if they bypass our checks with FFI or whatever, but that's ~ok.

Comment on lines +250 to +265
// The header of a ThinVec.
//
// The _cap can be a bitfield, so use accessors to avoid trouble.
//
// In "real" gecko-ffi mode, the empty singleton will be aligned
// to 8 by gecko. But in tests we have to provide the singleton
// ourselves, and Rust makes it hard to "just" align a static.
// To avoid messing around with a wrapper type around the
// singleton *just* for tests, we just force all headers to be
// aligned to 8 in this weird "zombie" gecko mode.
//
// This shouldn't affect runtime layout (padding), but it will
// result in us asking the allocator to needlessly overalign
// non-empty ThinVecs containing align < 8 types in
// zombie-mode, but not in "real" geck-ffi mode. Minor.
#[cfg_attr(all(feature = "gecko-ffi", any(test, miri)), repr(align(8)))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative to this conditional attribute would be to make the static be of an over-aligned wrapper around the actual Header type, like:

#[repr(C, align(8))]
struct EmptyHeader(Header);

#[cfg(any(not(feature = "gecko-ffi"), test, miri))]
static EMPTY_HEADER: EmptyHeader = EmptyHeader(Header { _len: 0, _cap: 0 });

#[cfg(all(feature = "gecko-ffi", not(test), not(miri)))]
extern "C" {
    #[link_name = "sEmptyTArrayHeader"]
    static EMPTY_HEADER: EmptyHeader;
}

This would allow us to treat the empty header as 8-byte aligned with all configurations, despite the actual Header struct not being 8-byte aligned, which might be useful even for non-gecko-ffi code to avoid the check for empty ThinVec<&T>. We'd just need to change the check of for the empty header being aligned to:

// The empty header is over-aligned using the `EmptyHeader` wrapper, 
// meaning that if the header is at least as aligned as T, then the 
// one-past-the-end of the empty singleton is an acceptable out-of-thin-air 
// data pointer, and we can remove the dangling special case.
let empty_header_is_aligned = align_of::<EmptyHeader>() >= align_of::<T>();

It might also be worth adding a test to explicitly assert std::align_of::<EmptyHeader>() >= std::align_of::<Header>(), but IIRC rust doesn't silently under-align types with #[repr(C, align(...))].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I completely forgot that SizeType without gecko-ffi on 64-bit systems is already defined using usize, not u32, so there's actually not much benefit to doing it this way other than making the empty_header_is_aligned check consistent across configurations.

Only asserting alignment in data_raw is a bit too late, because the
check will occur for an always-empty ThinVec in the destructor, which
will cause a double-panic and a SIGILL. Duplicating it in the
constructors prevents creating an invalid ThinVec in the first place.

The check in data_raw is retained as a prevention against using
an invalid ThinVec through FFI.
@Gankra
Copy link
Owner

Gankra commented Feb 15, 2022

Oops, looks like you left out an unsafe block!

src/lib.rs Outdated Show resolved Hide resolved
@Gankra
Copy link
Owner

Gankra commented Feb 15, 2022

Thanks so much for finding this stuff and helping fix it!

We can do the Right Thing with the wrapper that nika describes in a follow-up without bothering you with this anymore.

@Gankra Gankra merged commit cf8749e into Gankra:master Feb 15, 2022
@saethlin
Copy link
Contributor Author

It's not bothersome, I'm out here because I'm trying to fix things like this 😄

@Gankra
Copy link
Owner

Gankra commented Feb 15, 2022

Oh also in case you didn't notice I updated your commit message with some extra notes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants