-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Amend Rc/Arc::from_raw() docs regarding unsafety #68099
Conversation
Constructing an Rc/Arc is unsafe even if the wrapped `T` is never dereferenced.
r? @shepmaster (rust_highfive has picked a reviewer for you, use r? to override) |
I'm not sure if it applies to this PR as well, but worth noting this comment by @Mark-Simulacrum regarding something similar on another PR that added documentation about something being unsafe: |
src/liballoc/rc.rs
Outdated
/// This function is unsafe because improper use may lead to memory problems. For example, a | ||
/// double-free may occur if the function is called twice on the same raw pointer. | ||
/// This function is unsafe because improper use may lead to memory unsafety, | ||
/// even if `T` is never accessed. For example, a double-free may occur if the function is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this means "if Rc<T>
is never accessed"`?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True as well, although the pitfall here is to assume accessing Rc<T>
itself was safe if T
is never touched.
even if
Rc<T>
orT
is never accessed.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole paragraph seems a bit strange to me. Underspecified in a way. So what are the conditions for this method to be safe?
- What's specified in the first paragraph (from a previous
into_raw
call) - Only call
from_raw
once per raw pointer. T
must (still) be initialized so that it can be dropped.
That's it, right? Shouldn't we explicitly document that? We can still mention the double free later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth noting (in my opinion) that the from_raw
once per raw pointer makes relatively little sense, and IMO represents more hassle than it's worth. It means you can't technically store a pointer and then clone off copies of it, i.e., something like this would not be valid, but I think should be.
Other than that, I don't see any issues with the other points.
unsafe fn clone_arc(ptr: *const T) -> Arc<T> {
let a = Arc::from_raw(ptr);
let copy = a.clone();
mem::forget(a);
copy
}
let ptr = Arc::into_raw(some_arc);
clone_arc(ptr);
clone_arc(ptr);
clone_arc(ptr);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I guess we can just say "the user of from_raw
has to make sure a specific T
is only dropped once". Then it's the users responsibility.
src/liballoc/rc.rs
Outdated
@@ -573,10 +573,11 @@ impl<T: ?Sized> Rc<T> { | |||
/// Constructs an `Rc` from a raw pointer. | |||
/// | |||
/// The raw pointer must have been previously returned by a call to a | |||
/// [`Rc::into_raw`][into_raw]. | |||
/// [`Rc::into_raw`][into_raw] using the same `T`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not quite true, or at least is likely over-constraining. For example, I believe it is fine to go between the unsized representation and the sized representation via pointer casts (e.g,. Rc<T>
unsizes to Rc<dyn Debug>
and then you into_raw, cast to T, and from_raw).
I'm not sure what exact wording we want to guarantee here, though, maybe just same T is in practice both sufficient and good enough for most use cases. Maybe something like "layout and alignment equivalent to T; note that this is essentially a reference to reference transmute if the T differs from the one used by into_raw, and so the same conditions apply"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought about that as well, it could be done. My thinking, however, is that it's too flaky to try to document under which conditions this is safe (and therefore binds all future changes to Rc
/Arc
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this? (basically the same what @Mark-Simulacrum suggested)
The raw pointer must have been previously returned by a call to a
Rc<U>::into_raw
, whereU
must have the same alignment and memory layout asT
(which is trivially satisfied ifT == U
). Note that ifU
is notT
, this is basically like transmuting references of different types. Seemem::transmute
for more information on what restrictions apply in that case.
I usually wouldn't mind over-constraining, but if we document this exactly, this PR is just a doc bug fix.
No, that's specifically because we have not yet firmly decided whether Any should be an There's open questions here as well but I don't think they're "some other PR is changing things here as well" at this point. I'm going to r? @LukasKalbertodt -- we'll likely want to nail down some proposed wording and then possibly FCP merge (since it's specifying, and likely narrowing in some sense, the allowed uses of into/from_raw). |
But if this PR just specifies things that were true all along, this is just a doc bug fix, right? In that case an FCP shouldn't be necessary, should it? Not entirely familiar with the procedures yet. |
I believe so. However, I sort of expect we'll be moving from "unclear specification" to "fairly clear specification" -- in that case I would get sign off, personally. But if we end up with just general commentary without specification, then it seems fine to forgo FCP. |
Ping from triage: @LukasKalbertodt any updates on this review? |
New proposal for
The same would apply to Notice that it says "... the same size and alignment ...", not "... the same alignment and memory layout ...", because the inner layout of I'm unsure about the single sentence regarding "... only dropped once". Are people aware how to meet this requirement if they tinker with raw pointers? |
@lukaslueg That sounds good to me!
I think this should be sufficient. It's an If you push that as a commit, I'd be fine with starting an FCP merge (as was suggested above). |
Pushed. Quick recap: The point of this PR is to amend the docs regarding the behavior of |
Let's see if I already have the rights to... @rfcbot fcp merge This is a rather minor documentation fix regarding the safety of |
@rfcbot fcp merge |
Team member @sfackler has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
/// The raw pointer must have been previously returned by a call to | ||
/// [`Rc<U>::into_raw`][into_raw] where `U` must have the same size | ||
/// and alignment as `T`. This is trivially true if `U` is `T`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is same size and align strong enough? What stops rustc from ordering the strong
and weak
fields of RcBox and ArcInner in a different order for different T?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, nothing, but we know today that rustc will not do so I believe. I continue to believe that it would be a good idea to add #[repr(C)]
or something like it to RcBox and ArcInner, since there should be no reason to not guarantee (internally at least) that the field layout is consistent.
Notably, the data field must be last, due to unsizing, and reodering two usize fields is not currently ever going to happen anyway, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reordering two usize fields is not currently ever going to happen anyway, I believe
Currently it doesn't, but rust-lang/unsafe-code-guidelines#36 is about whether to make this a guarantee, and I don't think it's decided yet. For example PGO-based field layout would absolutely reorder fields with the same size and align to keep frequently read data together and frequently written data in a different cache line.
I don't necessarily intend on blocking this PR on this, but I would feel more comfortable with #[repr(C)] on those types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be in favor of adding #[repr(C)] as well. Doing so in this PR makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amended RcBox
and ArcInner
with repr(C)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add a comment to those repr(C) annotations that notes that they're not for C-compat, but strictly so that the layout is equivalent for transmutable types (or so)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
force-pushed
Future-proof these types in case rustc reorders the inner fields. As per discussion in PR rust-lang#68099.
f90d359
to
586c7e3
Compare
/// This function is unsafe because improper use may lead to memory problems. For example, a | ||
/// double-free may occur if the function is called twice on the same raw pointer. | ||
/// The user of `from_raw` has to make sure a specific value of `T` is only | ||
/// dropped once. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean? Isn’t Rc::drop
the one responsible for dropping T
when the refcount reaches zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The user is responsible for not calling from_raw
(and allowing the returned value to drop) "too many times" is what this is getting at. I want to avoid saying something like "only once," since it can be incredibly useful to keep around a single pointer in FFI code and "clone out" Rc's from it, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see what you’re getting at but I feel this one sentence does not express it well. Maybe we could talk about how each Rc
or Arc
“owns” one strong reference, and into_raw
/from_raw
transfer ownership? I think that’s a better mental model, and double-drop is just one of the possible (bad) consequences.
(By the way I think there should be an alternative to from_raw
that does not transfer ownership of the strong reference as a less fragile alternative to ManuallyDrop::new(Rc::from_raw(ptr))
, but that’s out of scope for this PR. Maybe &*mut T -> &Rc<T>
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on both points. I'm not sure of the exact wording, but maybe @lukaslueg can come up with some draft wording?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little hesitant on this point, for fear of convolution.
Calling
into_raw()
/from_raw()
does not modify the internal reference counter, cloning and dropping the resultingRc<T>
does. One can think of these functions as temporarily lending exactly one "unit of ownership" to the raw pointer. The compiler can't check unique ownership through raw pointers, though, especially as all raw pointers areCopy
. The user has to manually ensureT
is always dropped exactly once, which is not the case if (among other things)T
is dropped through the raw pointer or iffrom_raw()
is called multiple times on inadvertently made copies of the raw pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mark-Simulacrum any thoughts on the previous comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's certainly more detailed :)
However, I think it should clarify that even if you know T is safe to drop lots of times (e.g., ()
unit type), it is still not okay to let Rc
drop more than once for the "last" time.
🔔 This is now entering its final comment period, as per the review above. 🔔 |
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. The RFC will be merged soon. |
@bors r+ rollup |
📌 Commit 586c7e3 has been approved by |
Rollup of 10 pull requests Successful merges: - rust-lang#68099 (Amend Rc/Arc::from_raw() docs regarding unsafety) - rust-lang#70172 (parse/lexer: support `StringReader::retokenize` called on external files.) - rust-lang#70209 (parser: recover on `for<'a> |...| body` closures) - rust-lang#70223 (fix type of const params in associated types.) - rust-lang#70229 (more clippy fixes) - rust-lang#70240 (Return NonZeroU64 from ThreadId::as_u64.) - rust-lang#70250 (Remove wrong entry from RELEASES.md) - rust-lang#70253 (Remove another wrong entry from RELEASES.md) - rust-lang#70254 (couple more clippy fixes (let_and_return, if_same_then_else)) - rust-lang#70266 (proc_macro_harness: Use item header spans for errors) Failed merges: r? @ghost
This question on SO boils down to "is it safe to
::from_raw()
aRc<T>
/Arc<T>
using a dummyT
even ifT
is never dereferenced via the newRc
/Arc
?". It almost never is.This PR amends the docs of
from_raw()
regarding this point.