-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BTreeSet causes UB by having (partially) dangling shared reference #54957
Comments
cc @gankro |
Good catch, and very interesting. I wasn't aware that this was a thing. When was this semantic decided/documented? This can be easily fixed by hoisting the |
Also: what value does that metadata have? Allowing speculative loads? |
I don't think that's enough. All the code that does read-only operations (
Exactly. Moreover, even just enforcing the |
@nikomatsakis points out there might also be alignment problems. @eddyb are there any types that have alignment bigger than pointers? |
One possible fix -- but it'd be a big lang change -- is supporting generic statics (so that the static node can have a valid type). Of course that's not a "short term" usable thing. |
Another thing @nikomatsakis suggested is to essentially use |
I need to go over the code to check but I assume the “right” fix is to define an untyped base header type that LeafNode “inherits” from, just like we did in the Swift stdlib (where swift has TBAA) |
Deferring working on this for a bit while some other stuff lands, some notes: the core of the right fix is to add something like this code: fn as_key_slice(&self) -> &'a [K] {
// When taking a pointer to the keys, if our key has a stricter
// alignment requirement than the shared root does, then the pointer
// would be out of bounds, which LLVM assumes will not happen. If the
// alignment is more strict, we need to make an empty slice that doesn't
// use an out of bounds pointer.
if mem::align_of::<K>() > mem::align_of::<LeafPrefix>() && self.is_shared_root() {
&[]
} else {
// Here either it's not the root, or the alignment is less strict,
// in which case the keys pointer will point "one-past-the-end" of
// the node, which is allowed by LLVM.
unsafe {
slice::from_raw_parts(
&(*self.node as *const LeafNode<K, V>).keys as *const K,
self.len()
)
}
}
} to |
I am not sure why comparing the alignments is enough to guarantee that things are in-bounds (I assume "the pointer" which is out of bounds is |
The idea is we're trying to avoid any additional checks. Alignment differences determine padding, which in turn determines if the keys would be placed within the bounds of the LeafPrefix's size. |
Oh I see, the alignment checks would only involve constants and hence not be present at run-time... |
Yeah it compares a dynamic pointer to a static value, but it's &&'d, so under reasonable conditions it will never run. |
Is there a way to observe some kind of memory problem in practice because of this issue, such as an out-of-bounds read? I'm asking because it would be an interesting test for https://github.com/blt/bughunt-rust - it would be very useful to see whether the testing setup detects this issue or not in order to gauge how it performs in practice. |
No, I don't think LLVM exploits this UB and it doesn't by itself lead to bad memory accesses or such things. |
I'm not sure I follow whether this is relevant, but this is the case for:
|
@gankro I have a fix for the out-of-bounds problem... but there's another one. ;) Consider fn into_val_slice_mut(mut self) -> &'a mut [V] {
debug_assert!(!self.is_shared_root());
unsafe {
slice::from_raw_parts_mut(
self.as_leaf_mut().vals.as_mut_ptr() as *mut V,
self.len()
)
}
}
fn into_slices_mut(self) -> (&'a mut [K], &'a mut [V]) {
let k = unsafe { ptr::read(&self) };
(k.into_key_slice_mut(), self.into_val_slice_mut())
} I find the But anyway, this is a blatant violation of mutable references being unique. Namely, when we call Cases like this and the discussions we have been having around two-phase borrows make me wonder whether it is indeed the right call to consider creating a mutable reference as being roughly equivalent to a write access. The trouble is, if you don't do that, then when do you even assert uniqueness of mutable references? Only when they are written to? That can't be it, we couldn't say |
The only reason we create a mutable reference here is because we lack |
Ah right, using raw pointers likely fixes the problem. @nikomatsakis @eddyb lack of auto-deref for raw pointers considered harmful... |
I'd rather change the deref operator from prefix to postfix than make it implicit for raw pointers. |
Yeah, a postfix deref operator would be nice. Currently, things get a bit ugly. |
Couldn't we add some kind of method (inherently postfix) as an 80-20 solution? Or is the problem that we couldn't give it an appropriate type? |
I don't think |
I don't think postfix |
I submitted a proposed fix for this bug at #56648 |
Fix BTreeMap UB BTreeMap currently causes UB by created a shared reference to a too-small allocation. This PR fixes that by introducing a `NodeHeader` type and using that until we really need access to the key/value arrays. Avoiding run-time checks in `into_key_slice` was somewhat tricky, see the comments embedded in the code. I also adjusted `as_leaf_mut` to return a raw pointer, because creating a mutable reference asserts that there are no aliases to the pointee, but that's not always correct: We use `as_leaf_mut` twice to create two mutable slices for keys and values; the second call overlaps with the first slice and hence is not a unique pointer. Fixes rust-lang#54957 Cc @nikomatsakis @gankro
The following code causes UB (found by a patched miri enforcing validity invariants):
What happens is that
BTreeSet::new
sets the root node to a pointer to a staticEMPTY_ROOT_NODE
shared for this purpose by all B-trees that has typeLeafNode<(), ()>
. That step is okay.However, in the next step,
insert
callsensure_root_is_owned
which indirectly turns this ptr into&LeafNode<i32, ()>
. The trouble is thatLeafNode<i32, ()>
is bigger thanLeafNode<(), ()>
, so this reference is partially dangling: Its beginning points to allocated memory (the staticEMPTY_ROOT_NODE
), but that allocated memory ends "too early". The invariant that&LeafNode<i32, ()>
be dereferencable formem::size_of::<LeafNode<i32, ()>>()
bytes is violated. Since we are passingdereferencable(n)
to LLVM withn
set to the full size of the struct, this means we have UB not just according to our own validity rules, but according to LLVM's rules.The text was updated successfully, but these errors were encountered: