Publicly expose the allocated representation of Rc and Arc #80029

ggriffiniii · 2020-12-14T16:16:10Z

As discussed on the internals forum https://internals.rust-lang.org/t/expose-a-way-to-create-an-arc-fallibly/13551

I'm working on a project that is very strict about when it's appropriate to allocate and is not allowed to panic on allocation failure. The current mechanism of creating an Rc/Arc does not allow gracefully failing to allocate. This series of changes provides an optional way of constructing an Rc/Arc where the user provides an already Box'd version of the Rc/Arc internal representation and the Rc/Arc can be constructed by taking over the allocation. This gives control to the user over how to construct the Box, which can be done in a fallible way in our case.

In the process of making the types visible it seemed to me that they should be renamed. I'm not set on the names RcRepr and ArcRepr and would welcome better suggestions.

The goal of this change is to provide more flexibility over how an `Arc<T>` is allocated. An example of when this could be useful is when the user does not want to panic on allocation failure and instead return a Result. The changes here provide a way of constructing an `Arc<T>` by first creating an `ArcInner<T>` wrapping that in a `Box<ArcInner<T>>`, and finally creating an `Arc<T>` from the `Box<ArcInner<T>>`. The last step does no allocation of it's own and merely takes control of the `ArcInner<T>` from the allocation within the Box. This does not make any of the fields within the `ArcInner<T>` visible so it is *not* stabilizing the exact layout of the `ArcInner<T>`. The only operations that can be done with `ArcInner<T>` is initializing one from a `T` and subsequently creating an `Arc<T>` from a `Box<ArcInner<T>>`.

`ArcInner` does not seem like the most descriptive name for a type that's now publicly accessible. Rename it to `ArcRepr` since it's the underyling representation of an `Arc`.

The goal of this change is to provide more flexibility over how an `Rc<T>` is allocated. An example of when this could be useful is when the user does not want to panic on allocation failure and instead return a Result. The changes here provide a way of constructing an `Rc<T>` by first creating an `RcBox<T>` wrapping that in a `Box<RcBox<T>>`, and finally creating an `Rc<T>` from the `Box<RcBox<T>>`. The last step does no allocation of it's own and merely takes control of the `RcBox<T>` from the allocation within the Box. This does not make any of the fields within the `RcBox<T>` visible so it is *not* stabilizing the exact layout of the `RcBox<T>`. The only operations that can be done with `RcBox<T>` is initializing one from a `T` and subsequently creating an `Rc<T>` from a `Box<RcBox<T>>`.

`RcBox` does not seem like the most descriptive name for a type that's now publicly accessible. Rename it to `RcRepr` since it's the underyling representation of an `Rc`.

rust-highfive · 2020-12-14T16:16:13Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @m-ou-se (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

the8472 · 2020-12-14T19:04:21Z

Going through Box seems somewhat ad hoc in the face of ongoing work with the allocator API. Wouldn't it be better to have a try_new that returns an uninitialized Rc or Error (via Result<Rc<MaybeUninit<T>>, AllocErr>) which can then be initialized and cast later as needed.

Somewhat similar to fallible allocations in Vec (via new + try_reserve).

SimonSapin · 2020-12-14T19:17:29Z

(Hi @cramertj, what was wrong with the auto-assigned reviewer? Not having my name in https://github.com/rust-lang/highfive/blob/master/highfive/configs/rust-lang/rust.json is deliberate. I might be slow to respond further.)

This PR exposes implementation details that I feel should never be stabilized. I’d be more inclined to consider a fallible try_new constructor in the spirit of https://rust-lang.github.io/rfcs/2116-alloc-me-maybe.html. Though the discussion in #48043 has stalled pending resolution of some aspects of https://github.com/rust-lang/wg-allocators.

ggriffiniii · 2020-12-14T20:54:02Z

Going through Box seems somewhat ad hoc in the face of ongoing work with the allocator API. Wouldn't it be better to have a try_new that returns an uninitialized Rc or Error (via Result<Rc<MaybeUninit<T>>, AllocErr>) which can then be initialized and cast later as needed.

Somewhat similar to fallible allocations in Vec (via new + try_reserve).

I'm fine with try_new, (that was the second option listed in https://internals.rust-lang.org/t/expose-a-way-to-create-an-arc-fallibly/13551). I just chose this option since I didn't get strong feedback one way or the other there and it let me avoid the question of what Error to return when try_new fails. I didn't see this approach conflicting with the ongoing allocator API work. In the future when Arc has a second type param for the allocator then it would simply get added to the from_repr requirements. Something like:

impl<T, A> Arc<T, A> {
  fn from_repr(repr: Box<ArcRepr<T, A>>) -> Arc<T, A> {
    ...
  }
}

Your prototype returns an Rc<MaybeUninit<T>>. Is the MaybeUninit necessary? or should it return Rc<T> in the Ok path? Not that my goal is not to define an Arc in an uninitialized state, it's merely to allow handling allocation failures gracefully.

ggriffiniii · 2020-12-14T20:57:12Z

This PR exposes implementation details that I feel should never be stabilized. I’d be more inclined to consider a fallible try_new constructor in the spirit of https://rust-lang.github.io/rfcs/2116-alloc-me-maybe.html. Though the discussion in #48043 has stalled pending resolution of some aspects of https://github.com/rust-lang/wg-allocators.

I have no problem preferring a try_new method, but on the point of stabilizing implementation details I wanted to point out that this does not stabilize the implementation details of the Rc/Arc. It only stabilizes the fact that Arc/Rc are wrappers around a Box'd version of something. The implementation of that something is not stabilized and they are still free to change.

the8472 · 2020-12-14T21:14:22Z

It only stabilizes the fact that Arc/Rc are wrappers around a Box'd version of something. The implementation of that something is not stabilized and they are still free to change.

Well, for one it would make the size of the heap allocation observable which wasn't before. It would also set in stone that it can represented by a single allocation, conceivably the counters and the payload itself could have been stored separately.

Is the MaybeUninit necessary? or should it return Rc in the Ok path? Not that my goal is not to define an Arc in an uninitialized state, it's merely to allow handling allocation failures gracefully.

It sounded like you wanted to separate the time at which the storage is allocated vs. the time at which it is actually used. If that's not the goal then yeah the MaybeUninit is not necessary. In fact it simplifies things even further.

I just chose this option since I didn't get strong feedback one way or the other there and it let me avoid the question of what Error to return when try_new fails.

From previous discussions the try_ methods seem to be the preferred approach so that question will have to be settled before stabilization anyway to provide a consistent API across the multiple container types which will get fallible allocation.

Or to put it differently, there already are vague plans for what you want but they're not fully implemented yet, only try_reserve on collections exists so far but nothing for pointer-like containers. So it probably has to fit into that story.

ggriffiniii · 2020-12-14T21:36:19Z

It only stabilizes the fact that Arc/Rc are wrappers around a Box'd version of something. The implementation of that something is not stabilized and they are still free to change.

Well, for one it would make the size of the heap allocation observable which wasn't before. It would also set in stone that it can represented by a single allocation, conceivably the counters and the payload itself could have been stored separately.

Valid points.

Is the MaybeUninit necessary? or should it return Rc in the Ok path? Not that my goal is not to define an Arc in an uninitialized state, it's merely to allow handling allocation failures gracefully.

It sounded like you wanted to separate the time at which the storage is allocated vs. the time at which it is actually used. If that's not the goal then yeah the MaybeUninit is not necessary. In fact it simplifies things even further.

Great.

I just chose this option since I didn't get strong feedback one way or the other there and it let me avoid the question of what Error to return when try_new fails.

From previous discussions the try_ methods seem to be the preferred approach so that question will have to be settled before stabilization anyway to provide a consistent API across the multiple container types which will get fallible allocation.

Or to put it differently, there already are vague plans for what you want but they're not fully implemented yet, only try_reserve on collections exists so far but nothing for pointer-like containers. So it probably has to fit into that story.

Okay. I'll go forward with a try_ approach. Do you think it's appropriate to use the try_reserve feature to guard the try_new method? or should it be a separate feature gate specifically for pointer-like containers?

the8472 · 2020-12-14T22:59:05Z

A new feature makes sense since previous discussion and RFC 2116 only discussed collections, not pointer-like containers. So some new discussion will be necessary.

I have thought about the MaybeUninit case some more, I think it might be convenient to have a try_new but it can also be inefficient for some types if you have to construct a value up-front only to discard it when the allocation fails. The collections avoid that by separating allocation from insertion. On the other hand MaybeUninit requires unsafe code, so having a safe convenience method may still be good enough for the start.

brandenburg · 2020-12-18T10:01:18Z

I'd like to add my +1 to the spirit of this PR. In my project I also need a way to construct (potentially large) Arc objects such that OOM is handled gracefully by returning an error rather than panicing the system.

As a systems language, Rust definitely needs something like Arc::try_new(), at least behind a feature flag in Nightly. Currently, the only way out is to re-implement Arc and friends just to get fallible allocations, which is a lot of wasted effort for one corner case. It would be so much nicer and more productive to be able to simply use what alloc already provides.

the8472 · 2020-12-22T23:08:25Z

See #80310 for similar work being done for Box.

ggriffiniii · 2020-12-22T23:39:55Z

See #80310 for similar work being done for Box.

Thanks for the pointer. I'm planning on putting together a new PR with the try_new api, and I'll follow that Box discussion to ensure that Rc/Arc match appropriately.

CAD97 · 2020-12-29T19:52:05Z

@the8472

Well, for one it would make the size of the heap allocation observable which wasn't before. It would also set in stone that it can represented by a single allocation, conceivably the counters and the payload itself could have been stored separately.

FWIW, into_raw and from_raw, since they roundtrip between *const T and Rc<T>, already basically require that Arc/Rc are implemented as a single allocation.

the8472 · 2020-12-30T15:16:52Z

@the8472

Well, for one it would make the size of the heap allocation observable which wasn't before. It would also set in stone that it can represented by a single allocation, conceivably the counters and the payload itself could have been stored separately.

FWIW, into_raw and from_raw, since they roundtrip between *const T and Rc<T>, already basically require that Arc/Rc are implemented as a single allocation.

Do they? The counters could be looked up by pointer value, e.g. in a static, thread-safe table. Not the greatest implementation, but theoretically possible with the current design. Lookup would work since you're not allowed to provide arbitrary pointers, only pointers created by into_raw.

m-ou-se · 2020-12-30T19:00:23Z

#63291 (comment) proposes a separate 'builder' type for Rc called RcMut which represents an Rc with a counter of 1 that has not been shared yet. Would a function on RcMut<MaybeUninit<T>> that writes a T into it and converts it to a Rc<T> solve the problem this PR is trying to solve? Then RcMut::uninit() could be used to allocate the object before initializing the T.

I'd prefer some kind of 'builder' like this over exposing implementation details like RcRepr.

bors · 2021-01-01T13:43:04Z

☔ The latest upstream changes (presumably #80310) made this pull request unmergeable. Please resolve the merge conflicts.

ggriffiniii · 2021-01-01T21:20:19Z

#63291 (comment) proposes a separate 'builder' type for Rc called RcMut which represents an Rc with a counter of 1 that has not been shared yet. Would a function on RcMut<MaybeUninit<T>> that writes a T into it and converts it to a Rc<T> solve the problem this PR is trying to solve? Then RcMut::uninit() could be used to allocate the object before initializing the T.

I think that's solving a different problem. I'm working in an environment where we want to be sure that allocation failures do not cause a panic and are handled at the call-site where the allocation occurs. The motivation for this change was to allow external unsafe code to allocate and initialize an ArcRepr itself handling it however we want, then the conversion into an Arc would not require any additional allocation. Having a try_new method on Arc also accomplishes my goal, which is what I was planning on proposing next, but luckily #80310 beat me to it.

I'm going to go ahead and close this PR since #80310 added all the functionality I wanted.

ggriffiniii added 5 commits December 14, 2020 07:56

Rename ArcInner to ArcRepr.

d30e9d6

`ArcInner` does not seem like the most descriptive name for a type that's now publicly accessible. Rename it to `ArcRepr` since it's the underyling representation of an `Arc`.

Rename RcBox to RcRepr.

907e441

`RcBox` does not seem like the most descriptive name for a type that's now publicly accessible. Rename it to `RcRepr` since it's the underyling representation of an `Rc`.

Add documentation comments for ArcRepr and RcRepr.

ea14e22

rust-highfive assigned m-ou-se Dec 14, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 14, 2020

ggriffiniii added 2 commits December 14, 2020 08:43

Some tidy fixes.

dcdda7d

./x.py fmt

f3b554d

cramertj requested a review from SimonSapin December 14, 2020 18:04

cramertj removed the request for review from SimonSapin December 14, 2020 21:25

m-ou-se added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Dec 30, 2020

ggriffiniii closed this Jan 1, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Publicly expose the allocated representation of Rc and Arc #80029

Publicly expose the allocated representation of Rc and Arc #80029

ggriffiniii commented Dec 14, 2020

rust-highfive commented Dec 14, 2020

the8472 commented Dec 14, 2020

SimonSapin commented Dec 14, 2020

ggriffiniii commented Dec 14, 2020

ggriffiniii commented Dec 14, 2020 •

edited

Loading

the8472 commented Dec 14, 2020

ggriffiniii commented Dec 14, 2020

the8472 commented Dec 14, 2020 •

edited

Loading

brandenburg commented Dec 18, 2020 •

edited

Loading

the8472 commented Dec 22, 2020

ggriffiniii commented Dec 22, 2020

CAD97 commented Dec 29, 2020

the8472 commented Dec 30, 2020

m-ou-se commented Dec 30, 2020

bors commented Jan 1, 2021

ggriffiniii commented Jan 1, 2021

Publicly expose the allocated representation of Rc and Arc #80029

Publicly expose the allocated representation of Rc and Arc #80029

Conversation

ggriffiniii commented Dec 14, 2020

rust-highfive commented Dec 14, 2020

the8472 commented Dec 14, 2020

SimonSapin commented Dec 14, 2020

ggriffiniii commented Dec 14, 2020

ggriffiniii commented Dec 14, 2020 • edited Loading

the8472 commented Dec 14, 2020

ggriffiniii commented Dec 14, 2020

the8472 commented Dec 14, 2020 • edited Loading

brandenburg commented Dec 18, 2020 • edited Loading

the8472 commented Dec 22, 2020

ggriffiniii commented Dec 22, 2020

CAD97 commented Dec 29, 2020

the8472 commented Dec 30, 2020

m-ou-se commented Dec 30, 2020

bors commented Jan 1, 2021

ggriffiniii commented Jan 1, 2021

ggriffiniii commented Dec 14, 2020 •

edited

Loading

the8472 commented Dec 14, 2020 •

edited

Loading

brandenburg commented Dec 18, 2020 •

edited

Loading