Add &own T #965

Closed
wants to merge 5 commits into from

Conversation

mahkoh
Contributor

@mahkoh mahkoh commented Mar 11, 2015

Add reference and slice types that take ownership of the objects they reference.

The motivation has not yet been updated. You might want to read the following and subsequent posts instead of the motivation in the RFC: #965 (comment)

Rendered

@theemathas

Unresolved question: how to get a &move [T] from a Vec<T>? What are the required traits? DerefMove and IndexMove?

@theemathas

Unresolved question: how would deref coercions work?

@theemathas

Unresolved question: is it possible to convert &move &mut T to &mut T? See also rust-lang/rust#14270

@comex

comex commented Mar 11, 2015

Three thoughts:

  • Why does the callee need to be responsible for deallocation, rather than the caller? I don't understand the need for vtables at all.
  • Is it actually useful to have a pointer you can move out of, as opposed to must? i.e. the latter would be semantically equivalent to passing by value, but compatible with trait objects and other use cases. That would avoid the drop flag. I'm not convinced by the collect_strings example.
  • A pointer you can/must move out of has a natural dual: a pointer you must move into. If this is considered useful, then it would be better to call them &in and &out than the ambiguous &move. I think it would be most useful for guaranteeing efficient calling conventions for functions that semantically return large structs, rather than relying on LLVM optimizing it, which it's currently rather bad at - although this situation ought to be improved on LLVM's end. But it could also be useful for other purposes: for example, when implementing a C function exposed via FFI as int get_something(foo *in_param, bar *out_param), the Rust signature could be fn get_something(in_param: &in Foo, out_param: &out Bar) -> c_int... Can anyone think of a more interesting use case?

@mahkoh
Contributor Author

mahkoh commented Mar 11, 2015

@comex

Why does the callee need to be responsible for deallocation, rather than the caller? I don't understand the need for vtables at all.

You're correct. I didn't consider that you could drop the container and the contained object at different points. I've updated the detailed design of the RFC.

Unfortunately this makes &move [T] less powerful than before. However, I believe this problem will be solved automatically in the future.

Is it actually useful to have a pointer you can move out of, as opposed to must?

When you drop a &move reference without moving out of it, the compiler will automatically move out of it.

@theemathas

I think I've addressed your comments in the new detailed design.

@mahkoh
Contributor Author

mahkoh commented Mar 11, 2015

The restrictions in place are somewhat natural:

  • You cannot move out of anything but T and Box<T> so you can only create &move T from T or Box<T>.
  • You cannot move [T] out of Vec<T> so you cannot create a &move [T] from a Vec<T>.

@mahkoh mahkoh changed the title Add &move T and &move [T] Add &move T Mar 11, 2015
@mahkoh
Contributor Author

mahkoh commented Mar 11, 2015

Just to document this since I accidentally killed the first revision: A previous version of this RFC supported &move [T] created from lots of containers such as Vec<T> by containing 5 pointers.

@mahkoh mahkoh changed the title Add &move T Add &out T Mar 11, 2015
@mahkoh
Contributor Author

mahkoh commented Mar 11, 2015

I've renamed &move to &out to avoid ambiguity with the closures syntax &move || ....

@kennytm
Member

kennytm commented Mar 11, 2015

I would expect &out T to mean an uninitialized reference to be filled, i.e. #98, not a moved reference. How about &unique T or &owned T if we want to create a new keyword anyway?

@mahkoh
Contributor Author

mahkoh commented Mar 11, 2015

Yes, &out is somewhat ambiguous.

  • This is a reference one can move out of.
  • This is where you store the _out_put of your function.

Both interpretations seem valid.

@aidancully

I had some concerns with being able to return &out pointers. Code like the following should be prevented:

fn soundness_hole() -> &out T {
  let x: T = T::new();
  &out x
}

@mahkoh
Contributor Author

mahkoh commented Mar 11, 2015

@aidancully: Like all references, the lifetime of &out references is bound by the lifetime of the thing they reference. Your code should give the same error you get with &mut.

@aidancully

@mahkoh Yes, that makes sense. If I may say so, lifetime annotations should probably be added to deref_out and to your functions f, g, and h; they would have helped me (at least) avoid confusion.

I am for this RFC.

@Florob

Florob commented Mar 11, 2015

I have to say, from reading the RFC I can't quite grasp why this is desirable. The examples seem to boil down to "This gives you trait objects you can move out of, without requiring a heap allocation".
Assuming that is indeed the main motivation: Is this use-case common enough to warrant a whole new reference type?
Like some others, I also found the name &out confusing. From the name, I was expecting something completely different from what the RFC actually describes.

@mahkoh mahkoh changed the title Add &out T Add &own T Mar 12, 2015

```rust
fn f() {
let x: Option<String> = Some(String::new());
```

Contributor

let mut x

@oli-obk
Contributor

oli-obk commented Mar 12, 2015

I don't really see how T differs from &own T and how &own [T] differs from unsized arrays ([T])?

Why would we need references to move out of, if we could just move the type directly. If you need to move out of traits, use generics. I see that something needs to be done about slices, but I'd much prefer to have unsized arrays.

@mahkoh
Contributor Author

mahkoh commented Mar 12, 2015

@Florob

I have to say, from reading the RFC I can't quite grasp why this is desirable.

@oli-obk

Why would we need references to move out of

Let me try to explain myself some more.

I think the idea that &own brings to the language can be expressed in one sentence:

I have a reference to an object I own but I don't own/care about the memory it is stored in

Why this is useful for sized objects

As @oli-obk pointed out, you can already transfer ownership of sized objects by passing them by value. This also requires you to copy the object to a place where the new owner can see it. Most of the time this is absolutely what you want and what you should do!

There are, however, certain situations where you don't want to do this:

The first case is when you want to precisely control when something is copied. If I have a large object x and I pass it through many functions f -> g -> h, then the large object will be copied unnecessarily often which can harm performance. One prominent example of this is Box::new:

fn f(x: X) -> Box<X> {
    Box::new(x)
}

This will first copy x from the stack of f into the stack of Box::new and then into the newly allocated Box. One of the reasons the special box keyword exists is this unnecessary copy. If Box::new accepted &own T, this problem would be solved automatically.

The second case is when your object is in a smart pointer. A smart pointer (currently the only one that exists is Box) has ownership of your object (often via a heap allocation.) Consider the following code:

fn f(x: X, y: Box<X>) {
    g(x);
    g(*y);
}

fn g(x: X) { /* ... */ }

The function g takes X by value because it wants ownership. Another way to write the code is like this:

fn f(x: X, y: Box<X>) {
    g(box x);
    g(y);
}

fn g(x: Box<X>) { /* ... */ }

This time g takes a Box which is currently the only way to express an owner pointer. You can see that neither version is ideal. In the first case we have to throw the owned pointer y away. In the second case we have to allocate a new box. With &own this could be written efficiently:

fn f(x: X, y: Box<X>) {
    g(&own x);
    g(&own *y);
}

fn g(x: &own X) { /* ... */ }

C has these features because C's pointers are fundamentally unsafe: you cannot distinguish between owned and borrowed pointers. Rust's references are much safer but they also can't express the case where a pointer is owned. Rust is a systems language and I believe that something like this should be expressible in a systems language.
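
In Rust as it stands, the ownership-transfer half of this idea can be roughly approximated with `&mut Option<T>`: the caller keeps the storage and the callee takes the value. The sketch below is only an analogy, not the RFC's design. The type `X` and its field are illustrative stand-ins, the `Option` discriminant plays the role of a drop flag, and unlike the proposed `&own`, `take()` still moves the bytes out of the slot.

```rust
/// Stand-in for some large type (illustrative only).
struct X {
    data: Vec<u8>,
}

/// The callee takes ownership of the value; the caller keeps the storage.
fn g(slot: &mut Option<X>) -> usize {
    // `take` moves the value out and leaves `None` behind, so the caller
    // will not drop the value a second time.
    let x = slot.take().expect("slot was already emptied");
    x.data.len()
}

/// Hands both a stack value and a boxed value to `g` without allocating a new box.
fn f(x: X, y: Box<X>) -> (usize, usize) {
    let mut a = Some(x);
    let mut b = Some(*y); // moving out of a Box is already allowed today
    (g(&mut a), g(&mut b))
}
```

After `g` returns, the slot is `None`, which is how the caller learns that the value is gone and must not be dropped again.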

Why this is useful for slices

In Rust, slices are unsized arrays. Slices are written [T] and since they are unsized you can't have a slice on the stack. Instead you store references to slices: &[T] and &mut [T]. In the case of sized types we saw that we can emulate owned pointers by passing the object by value. This is not possible with slices because they are unsized.

If a function wants to flexibly accept slices of type T, then it has to look like this:

fn f(x: &mut [T]) { /* ... */ }

fn g(mut x: [T; 16], mut y: Vec<T>, mut z: Box<[T]>) {
    f(&mut x[..]);
    f(&mut y[..]);
    f(&mut *z);
}

The problem here is that the ownership of the elements in the slices cannot be transferred to f. f will have to clone or copy the elements (if T is cloneable.)
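
For element types that implement `Default`, a stopgap that works through `&mut [T]` is to swap each element out with `std::mem::take`. This is a sketch of that workaround, not something proposed in the RFC; `move_out_each` and `demo` are hypothetical names. It leaves a default value behind in every position and requires `T: Default`, which is exactly the kind of extra bound an owned slice would avoid.

```rust
use std::mem;

/// Moves every element out of the slice and hands it to `sink` by value,
/// leaving `T::default()` behind in each position.
fn move_out_each<T: Default>(xs: &mut [T], mut sink: impl FnMut(T)) {
    for slot in xs.iter_mut() {
        sink(mem::take(slot));
    }
}

/// Accepts the same three kinds of container as the example above.
fn demo(mut x: [String; 2], mut y: Vec<String>, mut z: Box<[String]>) -> Vec<String> {
    let mut received = Vec::new();
    move_out_each(&mut x[..], |s| received.push(s));
    move_out_each(&mut y[..], |s| received.push(s));
    move_out_each(&mut *z, |s| received.push(s));
    received
}
```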

One way to work around this is by having f accept Box<[T]>:

fn f(x: Box<[T]>) { /* ... */ }

fn g(x: [T; 16], y: Vec<T>, z: Box<[T]>) {
    f(box x);
    f(y.into_boxed_slice());
    f(z);
}

This again requires an unnecessary allocation for x and also loses the memory contained in y. Let's see how this would look with owned slices:

fn f(x: &own [T]) { /* ... */ }

fn g(x: [T; 16], mut y: Vec<T>, z: Box<[T]>) {
    f(&own x[..]);
    f(y.as_owned_slice());
    f(&own *z);
}

No unnecessary allocations this time. One additional benefit is that, after f returns, you can use the Vec<T> as before. The semantics of owned references guarantee that all elements have been moved out of the vector when it is no longer borrowed. That is, as_owned_slice would look like this:

fn as_owned_slice(&mut self) -> &own [T] {
    unsafe {
        let slice = mem::transmute(self.as_slice());
        self.set_len(0);
        slice
    }
}

Why this is useful for traits

As with slices, there is no way to pass ownership of traits without allocating a Box<Trait>. One alternative here is to use monomorphization:

fn f<T: Trait>(x: T) { /* ... */ }

There are some downsides to this:

Monomorphization can cause significant code/binary bloat for little benefit. While monomorphization avoids a virtual function call, this is often not necessary, even in a language used for systems programming.

The syntax becomes unwieldy when used for multiple arguments:

fn f<T: Trait, U: Trait, V: Trait>(x: T, y: U, z: V) { /* ... */ }

It is not possible to pass slices this way. The following code will only ever accept one type:

fn f<T: Trait>(xs: &mut [T]) { /* ... */ }

Since you currently can only express owned traits with the Box<Trait> syntax, people are writing code like this:

trait Trait {
    fn f(self: Box<Self>) -> T;
}

which is a backwards compatibility hazard if we ever get other smart pointers. With owned trait references this becomes

trait Trait {
    fn f(&own self) -> T;
}
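
For comparison, here is a minimal runnable sketch of the `self: Box<Self>` pattern as it can be written without `&own`. The trait `Job`, the type `Greeting`, and the helper `run_all` are made up for illustration; the point is only that a by-value method on a trait object has to go through a `Box`.

```rust
trait Job {
    /// Consumes the job. The `Box<Self>` receiver keeps the method
    /// callable on `Box<dyn Job>`.
    fn run(self: Box<Self>) -> String;
}

struct Greeting {
    name: String,
}

impl Job for Greeting {
    fn run(self: Box<Self>) -> String {
        // `*self` moves the value out of the box, so the field is moved
        // rather than cloned.
        let Greeting { mut name } = *self;
        name.push('!');
        name
    }
}

/// Runs every job, consuming each box in turn.
fn run_all(jobs: Vec<Box<dyn Job>>) -> Vec<String> {
    jobs.into_iter().map(|job| job.run()).collect()
}
```

Every caller has to heap-allocate a box to use `run`, even when the value already lives on the stack.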

@oli-obk
Contributor

oli-obk commented Mar 12, 2015

wow... just reading stuff from you makes me want to accept everything you say :D You defend your arguments well.

One of the reasons the special box keyword exists is this unnecessary copy.

Are you inferring this or are there discussions on this?

I was under the impression we were trusting LLVM on this, even more so since we got jemalloc-optimizations last week (optimizing out jemalloc allocations in situations like boxing just for ownership). Especially since the book says that we should not return boxes: http://doc.rust-lang.org/book/pointers.html#returning-pointers

Though I'm not sure about unboxing and reboxing.

So just to be sure: Your suggestions are just optimizations to prevent copies and allocations? If so, I'd rather suggest making the internal optimizations a language guarantee, like tail-call optimization in Scheme.

@mahkoh
Contributor Author

mahkoh commented Mar 12, 2015

wow... just reading stuff from you makes me want to accept everything you say :D You defend your arguments well.

Thanks.

Are you inferring this or are there discussions on this?

I think there is actually an issue in the Rust repo that explicitly says this short and sweet. But I can't find it right now. You can also read the motivation in this RFC: https://github.com/rust-lang/rfcs/blob/8486fed94fbe85d8da62f3dd6ffddd58564c37f6/text/0000-placement-box.md

I was under the impression we were trusting LLVM on this

LLVM is not part of the language while references are. It is true that LLVM might optimize this away, it currently does not (if that has not changed very recently). Another Rust backend might do different things. You can do this explicitly in C and I believe it should also be possible in Rust.

Especially since the book says that we should not return boxes

It is true that

fn f() -> X {
    X {
          /* ... */
    }
}

will be optimized to code that does not copy. However this breaks down when you give the return value a name:

fn f() -> X {
    let mut x = /* ... */;
    /* fill x with data */
    x
}

This will cause a copy. The thing Rust would have to implement to make this not copy is called named return value optimization.

Your suggestions are just optimizations to prevent copies and allocations?

I think it's not "just optimizations". Being efficient is one of the explicit goals of Rust.

@oli-obk
Contributor

oli-obk commented Mar 12, 2015

I think it's not "just optimizations". Being efficient is one of the explicit goals of Rust.

But it should do so automatically, without requiring the programmer to write things just to make the code efficient. In C++ we learned the hard way NOT to use const T& parameters.

Another Rust backend might do different things. You can do this explicitly in C and I believe it should also be possible in Rust.

It could simply be a rule, that a rust compiler needs to optimize this properly.

This will cause a copy. The thing Rust would have to implement to make this not copy is called named return value optimization.

Alas, let's do it.

It is true that LLVM might optimize this away, it currently does not (if that has not changed very recently).

See rust-lang/llvm#37 . llvm detects intermediate heap allocations: http://is.gd/LPQJbJ . It does not (yet?) detect intermediate heap deallocations: http://is.gd/liX8kO

@mahkoh
Contributor Author

mahkoh commented Mar 12, 2015

It could simply be a rule, that a rust compiler needs to optimize this properly.

You cannot make such a guarantee. Rustc can't even do it right now. If there is a regression in LLVM then you've suddenly broken the language specification. What would that guarantee even look like? In some cases (sufficiently small structs) it might be better to pass the struct by value.

When I write Rust I want to approximately know what the generated assembly looks like. Certain core team members have expressed similar sentiments when speaking about Rust publicly. When I write box x then I expect a heap allocation. It might not do a heap allocation and that's great, but I cannot "approximately know" that.


I think I might have undersold the RFC a bit when I replied above to the statement that it is "just about optimizations". Not having owned references will affect libraries because APIs will have to accommodate for the fact that you can't move out of slices/references/traits.

I already mentioned above that the fn f(x: Box<Self>) syntax is a backwards compatibility hazard.

People learn from the first day that creating boxes is something you try to avoid at all costs in Rust. If Box<Self> is the only way to move out of a trait, then traits will not be designed with the capability of moving out of them in mind.

Let's consider an API where the user wants to move a variable number of objects into a function:

fn f<T>(xs: &own [T]) { /* ... */ }

Without this syntax available the library might look like the following:

fn f_clone<T: Clone>(xs: &[T]) { /* ... */ }
fn f_vec<T>(xs: Vec<T>) { /* ... */ }

Even though the library really just wants to move out of a slice and has no need for either T: Clone or a vector.
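
In later Rust releases, where arrays implement `IntoIterator` by value, a third shape for such an API is a generic function over `IntoIterator`. The sketch below uses a hypothetical non-`Clone` type `Token` to show that neither `Clone` nor a `Vec` is needed at the call site. It is still a workaround rather than an owned slice: it relies on monomorphization and cannot be stored in a struct field the way a `&own [T]` could.

```rust
/// A type that is deliberately not `Clone` (illustrative only).
struct Token(String);

/// Takes ownership of a variable number of tokens from any by-value iterable.
fn f<I>(xs: I) -> Vec<String>
where
    I: IntoIterator<Item = Token>,
{
    // Each `Token` is moved into the closure and destructured by value.
    xs.into_iter().map(|Token(s)| s).collect()
}
```

A stack array such as `[Token(..), Token(..)]` and a `Vec<Token>` can both be passed directly, and in both cases the elements are moved, not cloned.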

@oli-obk
Contributor

oli-obk commented Mar 12, 2015

I have come around to your view. This could also be used in things like Vec::remove (which now exists for ranges, but out of necessity returns an iterator)

@mahkoh
Contributor Author

mahkoh commented Mar 17, 2015

@dgrunwald: Thank you for your comments.

If I currently have a function that accepts a large struct by value, can I always replace the parameter with &own to eliminate the copy?

Correct. If the argument has type &own, then only the pointer will be passed to the called function, just like with other references.

If so, that sounds like a mechanical transformation that the compiler can do automatically. [...] And once the compiler does this automatically for structs that aren't known to be small

This looks like an implementation detail that cannot be part of the language specification without restricting future platforms or implementations. It's really hard to say what "small" means here: Two pointers in a struct is definitely not "large", we want to pass those by value because they fit nicely into two registers. On future architectures we might want to pass very large structs by value because it's very efficient there. The language spec probably can't say more than "the compiler will decide if passing by value or reference is more efficient," which is not worth adding to the spec.

we could allow unsized types as parameters, which would allow passing around slices by value, and using functions that take self by value on object-safe traits.

I'm not a fan of allowing unsized types as "lvalue" types. E.g., the following looks weird:

fn f(x: [u8]) {
    let y: [u8] = x;
}

I'm not really sure what's going on here.

Of course, &own is a bit more general than this

For example, the compiler magic you suggest wouldn't allow the following:

struct X<'a> {
    field: &'a own [u8],
}

fn f<'a>(field: &'a own [u8]) -> X<'a> {
    X { field: field }
}

@Ericson2314

The fact is right now Rust does not understand owning pointers or uninitialized pointers. I argue that given Rust's worldview, these concepts should be part of the language.

I think Rust wants to be a systems language in the sense that it can successfully replace C and C++ (as long as compilers are available.) This means that it has to be usable at the firmware/driver level. Owned references give you control that is currently not available in the language. "Knowing better than the compiler" is definitely something that can happen in a lower-level language.

@dgrunwald

large types are implicitly passed by reference,

When would the destructor of the box be called in the following code?

let x: Box<T> = ...
f(*x);

Why have to teach new users "you have to use an &own type to pass a slice by value" if you can just make the simple case work as expected?

People will have to learn about unsized types anyway, e.g., you can't have an unsized type field in a struct.

But if set_len doesn't free the memory, who does?

The &own [T] only takes ownership of the elements in the vector, not the memory they are stored in. That memory is still the responsibility of the vector. Consider the following code:

let mut x: Vec<i32> = Vec::with_capacity(5);
x.push(1);
{
    let y: &own [i32] = x.as_owned_slice();
    // x borrowed here
}
assert!(x.len() == 0);
assert!(x.capacity() == 5);
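
The closest existing analogue to this behaviour is `Vec::drain`, which hands the elements out by value while the vector keeps its allocation. The following is a sketch of that analogy only; `consume` and `demo` are hypothetical names, and a `Drain` is an iterator, not a slice, so it does not give the callee contiguous access the way `&own [T]` would.

```rust
/// Takes ownership of every element it is given.
fn consume(xs: impl Iterator<Item = String>) -> usize {
    xs.map(|s| s.len()).sum()
}

/// Returns (bytes consumed, length afterwards, whether the capacity survived).
fn demo() -> (usize, usize, bool) {
    let mut x: Vec<String> = Vec::with_capacity(5);
    x.push(String::from("one"));
    // `x` is mutably borrowed while the drain is alive.
    let total = consume(x.drain(..));
    // The elements are gone, but the buffer still belongs to the vector.
    (total, x.len(), x.capacity() >= 5)
}
```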

@mahkoh
Contributor Author

mahkoh commented Mar 17, 2015

@dgrunwald

Which is why I asked for motivating use cases for those "other things".

On a "language design" level: &own allows unsized types to be passed around by value. You suggested that the compiler allows this in function calls but &own is more explicit and more powerful (struct fields.)

@dgrunwald
Contributor

When would the destructor of the box be called in the following code?
let x: Box<T> = ...
f(*x);

Same way with my compiler magic as with &own: the T destructor gets called by f, the box gets deallocated by the caller after f returns. Which is unfortunate (for both ideas), because it adds more compiler magic to the Box (essentially requiring two drop flags), and I don't see a way to get the same semantics from a library type through a DerefMove trait.

I'm not against these semantics for passing ownership into functions without copying memory, I'm just not sure if it's worth extra syntax when a bit of compiler magic can get us half the way there.
The compiler magic has some advantages with generic code, as some existing generic functions that pass stuff by-value can be extended to support unsized types in a backward-compatible fashion.
The extra syntax has the advantage that it also goes the other half of the way.

@mahkoh
Contributor Author

mahkoh commented Mar 17, 2015

Same way with my compiler magic as with &own: the T destructor gets called by f, the box gets deallocated by the caller after f returns.

What if T is small enough to be passed around by value?

@Ericson2314
Contributor

How does deref-consuming of Box work now, btw? I have absolutely no idea, other than a guess it is probably black magic.

@dgrunwald
Contributor

What if T is small enough to be passed around by value?

In that case it does get passed by value. Not sure whether the box should get deallocated before or after the call in that case. In fact, "after the call" should probably read "at the end of the statement" (when temporaries get destroyed).

But I'm not sure if the language even needs to specify when exactly the box has to be deallocated, since the difference usually isn't visible to user code (unless using a custom memory allocator).

How does deref-consuming of Box work now, btw?

No idea, but it's definitely compiler magic.
I was hoping a &move RFC would give us a way to make Box a library type, but that doesn't seem to be possible with this RFC.
Unlike Vec where the length can be set to 0 to indicate the contents were destroyed and only the memory needs to be deallocated, Box doesn't have any room to store whether dropping the box still needs to destroy the contents. This may also cause problems for other smart pointers that would want to implement DerefMove.

@Ericson2314
Contributor

I was hoping a &move RFC would give us a way to make Box a library type, but that doesn't seem to be possible with this RFC.

Just thinking about that actually. Arguably we should have:

struct Box<T>(&'static own T);
struct EmptyBox<T>(&'static undef T);

Taking &own *a_box leaves one with an empty box after the lifetime expires, because the only way to destroy a &own T is to move out / destroy T. Taking &undef *an_empty leaves a box after the lifetime expires, because the only way to destroy an &undef T is to assign to it.

I don't want to bring back typestate, but... such symmetry :)

@mahkoh
Contributor Author

mahkoh commented Mar 17, 2015

I was hoping a &move RFC would give us a way to make Box a library type

There has been some discussion (e.g. a smart pointer has to specify a second destructor that is run if and only if you've moved out before drop) but this is orthogonal to &own. &own can be used in such a design.

@mahkoh
Contributor Author

mahkoh commented Mar 17, 2015

E.g.

struct Container<T> {
    random_field: Vec<i32>,
    data: ManuallyDrop<T>,
}

trait Smaht: DerefMut {
    /// Used when moving out or using `&own *Smaht`.
    fn deref_move(&mut self) -> &own <Self as Deref>::Target;

    /// Drop method called if the content has been moved out.
    fn drop_moved(&mut self);
}

impl<T> Smaht for Container<T> {
    fn deref_move(&mut self) -> &own T {
        unsafe { mem::transmute(&*self.data) }
    }

    fn drop_moved(&mut self) { }
}

impl<T> Drop for Container<T> {
    fn drop(&mut self) {
        unsafe { ptr::read(&*self.data); }
    }
}

ManuallyDrop comes from another RFC. The content of ManuallyDrop is not automatically dropped in the destructor.
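
The part of this design that does not depend on `&own` can be sketched with `std::mem::ManuallyDrop` as it exists in the standard library today. The version below is an assumption-laden illustration, not the trait above: it adds an explicit `moved_out` flag (hypothetical) so that safe code cannot cause a double drop, and `take_content` returns the value by move where the proposal would hand out an `&own T`.

```rust
use std::mem::ManuallyDrop;

struct Container<T> {
    data: ManuallyDrop<T>,
    /// Hypothetical flag: records whether the content was moved out.
    moved_out: bool,
}

impl<T> Container<T> {
    fn new(value: T) -> Self {
        Container { data: ManuallyDrop::new(value), moved_out: false }
    }

    /// Moves the content out once; later calls return `None`.
    fn take_content(&mut self) -> Option<T> {
        if self.moved_out {
            return None;
        }
        self.moved_out = true;
        // SAFETY: the flag guarantees `data` is read at most once and is
        // never dropped by `Container` afterwards.
        Some(unsafe { ManuallyDrop::take(&mut self.data) })
    }
}

impl<T> Drop for Container<T> {
    fn drop(&mut self) {
        if !self.moved_out {
            // SAFETY: the content is still present and is dropped exactly once.
            unsafe { ManuallyDrop::drop(&mut self.data) }
        }
    }
}
```

The flag costs one extra field per container, which is the storage that a plain `Box` has no room for.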

@dgrunwald
Contributor

trait Smaht: That could work, though you'd have to mark deref_move and drop_moved as unsafe, or prevent unpaired manual calls from safe code in some other way (probably just outright ban manual calls as with fn drop).
Given that the ability to move out of containers seems to be the main benefit of having &own in the type system (as opposed to the compiler magic I'm suggesting), maybe you should add something like that trait to this RFC?

On the syntax side: We can use &move by solving the &move || ambiguity with a simple lookahead rule.
Note that the user can always disambiguate by adding parentheses: &(move ||) vs. &move (||).

Variant 1: &move || parses as &(move ||). This is backwards-compatible with current rust.
Variant 2: &move || parses as &move(||). This would require a pre-1.0 change to enforce the use of parentheses when using the & operator on a move closure. Not sure how much code this would break -- if a borrowed temporary closure lives long enough, why use a move closure?

Note that &own is also tricky to add in a completely backwards-compatible fashion, because this is currently valid code:

let own = 1;
let x = &own - 2; // (&own) - 2
let y = &mut - 2; // &mut (-2)

So either we'll have to bump some form of 'language version' to allow for new keywords or syntax changes, or we'll need some disambiguation rule. I'd guess that &move is easier to disambiguate; but since we can make both cases work, I'd suggest you pick the name that makes more sense.

@mahkoh
Contributor Author

mahkoh commented Mar 17, 2015

Given that the ability to move out of containers seems to be the main benefit

That's not given at all.

maybe you should add something like that trait to this RFC?

It's orthogonal to this RFC.

@dgrunwald
Contributor

Consider: Rust currently has generic functions that take T by value. Those functions are necessarily restricted to sized types, but we could relax those restrictions if we allow passing unsized types by value.
But if we don't allow passing unsized types by value and just add &own instead, those functions will get duplicated into T / &own T variants.

I don't want us to end up like C++ where all functions take const T& instead of T for efficiency. (Rust: all functions take &own T instead of T for efficiency, and for allowing the use of unsized types)
So, at some point in the future we'll probably want to go the "compiler optimization" approach anyways for the sake of generic code. (also consider: deref coercions, where the community decided that "is it borrowed or owned?" is the more important question than "how many levels of indirection are there?")

In that light, what motivation remains for &own? Yes it's more general, but I've seen few usecases that need that generality. [&own Any] is one, but moving out of containers is the more important one.
So even though DerefMove is technically orthogonal, it's a big part of what will make &own actually useful. (fully specifying the trait is probably too much for this RFC, but you should at least add it to the motivation section)

@joshtriplett
Member

I'm really hoping this doesn't happen. One of the things I found really appealing about Rust follows naturally from the default being immutable values and "mut" being explicit. For a read-only value, it doesn't semantically matter if you pass it by reference or by value, other than efficiency, so the compiler can always choose to DTRT (e.g. "if the value fits in a processor register, pass by value, else pass by reference"). Which means that the undecorated type name is the thing you almost always want.

@aidancully

@joshtriplett The availability of the facility does not force you to use it. You'd be free to continue passing by value wherever makes sense, and most people would in the same way that most people don't currently pass boxes around. On the other hand, the absence of the facility means that users are forced to assume that any time ownership changes, the address may also change. Usually this won't matter, but there are cases in which it does, especially with FFI. Further, as @mahkoh showed above, not having &own pointers makes it harder and less efficient than it should be to represent ownership of DSTs.

The one thing that still sticks a little for me is that &own T is not a perfect parallel for T, in that it immediately allows mutability. In my opinion, a mutable &own T should be declared something like let mut x = &own T;, and x as &mut T should only be valid if x were declared as mutable.

@joshtriplett
Member

@aidancully The language (as opposed to libraries) being wider is not necessarily a feature.

I would suggest, if there's a use case for this that isn't just efficiency, the motivation in the RFC needs to very clearly spell that out. Because if this is just about efficiency, the compiler can and should handle that. If this is needed to build certain interfaces that can't be built otherwise, then those use cases should be spelled out very clearly. The comment you linked to seems to assume that any use of "T" means a copy, but that's not the case. Assuming appropriate compiler optimizations, what use cases remain that can't be expressed without &own ?

@reem

reem commented Mar 19, 2015

@joshtriplett A by-value iterator over a stack allocated array of a constant number of Ts can't be written today, but could be written with &own [T]. The same is true for all other things which want to receive a slice that owns the data it contains without forcing heap allocation.

For what it's worth, I've run into these limitations surrounding owned DSTs several times in real code - these aren't just hypothetical concerns.

@joshtriplett
Member

@reem That makes sense; that couldn't be written without language changes.

Could that also be handled by allowing code to pass a value of slice type by non-& type, with the compiler optimizing that by passing it by pointer internally?

@aidancully

@joshtriplett:

The comment you linked to seems to assume that any use of "T" means a copy, but that's not the case.

No, that's not what I meant at all. That's why I used the word "may" in "users are forced to assume that any time ownership changes, the address may also change." (emphasis added) In other words, it's currently impossible for an FFI to rely on an address not moving, when ownership can change.

@joshtriplett
Member

I wasn't referring to your comment; I was talking about the rationale in #965 (comment) , which seemed to assume that a non-& type always meant a copy.

I'm not trying to argue that there are no use cases where &own makes sense; I'm asking that such use cases be added to the rationale in the RFC (along with why none of T, &T, &mut T, or a library type wrapping an unsafe T* will work for those use cases), to make it clear that this is about more than just optimizations.

@aidancully

I guess that's like the old saw that any post correcting a spelling error will always include at least one spelling error, sorry, I misunderstood your response. I see why you say that, but the part referring to moves requiring copies was only one part of the comment linked ("why this is useful for sized objects"). While I agree with your criticism of that part (pass by value will be more optimizable than pass by &own, as the compiler is less constrained to choose a representation strategy), the rest seemed (to me) to be about difficulty representing ownership of DSTs, which is a long-known language shortcoming.

@nikomatsakis nikomatsakis self-assigned this Mar 19, 2015
@nikomatsakis
Contributor

My feeling is that it is way too soon to consider adding new builtin pointer types. In teaching Rust, I've found that the current setup (T, &T, &mut T) hits a pretty sweet spot between being simple enough to explain and complex enough to cover almost everything you want to do. When we used to have more types (@T, ~T, etc) the cognitive overhead was very large and it was a very common complaint. Any such addition is going to have to pass a very high bar in terms of solving problems that we can't address another way; and even then I'm inclined to wait some time until we have more experience teaching Rust to general audiences.

In any case, I think many of the use cases in this RFC can be addressed using library types. For example, @eddyb once observed to me that if we have custom allocators, then you can probably express the idea of an owned value that resides in an outer stack frame (i.e., &'a own T) using something like Box<T, StackSlot<'a>>, where StackSlot is a custom allocator.

@eddyb
Member

eddyb commented Mar 21, 2015

I probably need to review everything else said here, but I have to make one thing clear:
This cannot be done only with library types. That StackSlot<'a> (or Stack<'a> as I like to call it) would have to be a lang item with a special operation to go from a T lvalue to Own<T, Stack<'a>> (which has been historically named &move, &own, Move, Open and Box).
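To make the point above concrete, here is a rough sketch of how far a pure library type gets today. The names Own, from_slot, into_inner, Noisy and own_demo are hypothetical and not part of the RFC or the standard library. The interesting part is that from_slot has to be unsafe: the library cannot stop the caller from touching the slot again, which is roughly the "special operation" that would need language support.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical library stand-in for `&'a own T`: it owns the value stored
/// in a caller-provided slot, but not the slot's storage.
pub struct Own<'a, T> {
    slot: &'a mut ManuallyDrop<T>,
}

impl<'a, T> Own<'a, T> {
    /// Safety: `slot` must hold an initialized value, and the caller must
    /// never read or drop that value through the slot again. A library
    /// cannot enforce this, hence `unsafe`.
    pub unsafe fn from_slot(slot: &'a mut ManuallyDrop<T>) -> Self {
        Own { slot }
    }

    /// Move the value out of the slot.
    pub fn into_inner(self) -> T {
        // Suppress `Own::drop` so the value is not dropped a second time.
        let this = ManuallyDrop::new(self);
        // SAFETY: `from_slot` guarantees the slot is initialized and that
        // nobody else will use the value afterwards.
        unsafe { ptr::read(&**this.slot) }
    }
}

impl<'a, T> Drop for Own<'a, T> {
    fn drop(&mut self) {
        // SAFETY: the value is still in the slot, because `into_inner`
        // suppresses this destructor when it moves the value out.
        unsafe { ManuallyDrop::drop(&mut *self.slot) }
    }
}

/// Test helper: counts how many times it has been dropped.
struct Noisy<'c>(&'c Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the drop count after dropping an `Own`, after moving out of a
/// second `Own`, and after dropping the moved-out value.
pub fn own_demo() -> (u32, u32, u32) {
    let drops = Cell::new(0);
    let mut a = ManuallyDrop::new(Noisy(&drops));
    let mut b = ManuallyDrop::new(Noisy(&drops));

    // Dropping the owning pointer drops the value in place.
    let own_a = unsafe { Own::from_slot(&mut a) };
    drop(own_a);
    let after_drop = drops.get();

    // Moving out transfers ownership without running the destructor.
    let own_b = unsafe { Own::from_slot(&mut b) };
    let moved = own_b.into_inner();
    let after_move = drops.get();

    drop(moved);
    (after_drop, after_move, drops.get())
}
```

This only covers sized T and says nothing about DSTs or trait objects, which are the harder cases discussed earlier in the thread.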

@nikomatsakis
Contributor

@eddyb yes, sorry, I didn't mean to imply that there was no need for language treatment, just no need for builtin types. (That said, it may be that we can find some kind of solution that doesn't involve growing the language at all, I don't know. I haven't thought about this in a while and would like to revisit it at some point.)

In any case, we discussed this RFC at triage and came to the conclusion that it should be postponed. There is definitely a real use case here but it's not yet time to dive into it. I was surprised that I did not find a suitable pre-existing issue, so I opened #998.

Thanks @mahkoh for another thoughtful RFC.

@joshtriplett
Member

A question about a case where &own might have some useful semantics:

Suppose you have a structure "Options" with a ::new() constructor and a set of chainable self-mutating methods that each return a &mut to the structure. You can write this:

fn main() {
    let mut opts = Options::new();
    opts.opt1(...).opt2(...);
    method_with_options(&opts);
}

And you can write this:

fn main() {
    method_with_options(&Options::new().opt1(...).opt2(...));
}

But you can't write this:

fn main() {
    let mut opts = Options::new().opt1(...).opt2(...);
    method_with_options(&opts);
}

That fails the borrow checker because the return value from ::new() owns the Options structure, and the &mut references returned from the chained methods just borrow it, so it dies at the end of the first statement, while opts lives until the end of the block.
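For reference, a self-contained version of the &mut-returning builder described above might look like the following. MutOptions, its fields, and the concrete arguments are placeholders invented for the sketch (the snippets above elide them). The rejected form is left as a comment; with current compilers I would expect it to be reported as a temporary dropped while borrowed.

```rust
/// Placeholder options struct; the fields are made up for illustration.
#[derive(Debug, Default)]
pub struct MutOptions {
    level: u32,
    verbose: bool,
}

impl MutOptions {
    pub fn new() -> Self {
        Self::default()
    }

    /// Chainable setter that borrows the builder and hands the borrow back.
    pub fn opt1(&mut self, level: u32) -> &mut Self {
        self.level = level;
        self
    }

    /// Chainable setter that borrows the builder and hands the borrow back.
    pub fn opt2(&mut self, verbose: bool) -> &mut Self {
        self.verbose = verbose;
        self
    }
}

/// Stand-in consumer: returns a number derived from the options.
pub fn method_with_mut_options(opts: &MutOptions) -> u32 {
    opts.level + opts.verbose as u32
}

pub fn mut_builder_demo() -> (u32, u32) {
    // Form 1: bind the owner first, then mutate it through the chain.
    let mut opts = MutOptions::new();
    opts.opt1(3).opt2(true);
    let first = method_with_mut_options(&opts);

    // Form 2: the temporary owner lives until the end of this statement,
    // which is long enough for the call.
    let second = method_with_mut_options(&MutOptions::new().opt1(3).opt2(true));

    // Form 3 (rejected): the temporary owner is dropped at the end of the
    // `let` statement, while `opts` would still borrow from it.
    // let opts = MutOptions::new().opt1(3).opt2(true);
    // method_with_mut_options(opts);

    (first, second)
}
```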

If &own existed, with the proposed semantics, could .opt1 and .opt2 accept and return an &own instead of a &mut to allow the above syntax?

@nagisa
Member

nagisa commented Apr 18, 2015

> Suppose you have a structure "Options" with a ::new() constructor and a set of chainable self-mutating methods that each return a &mut to the structure.

A builder.

> If &own existed, with the proposed semantics, could .opt1 and .opt2 accept and return an &own instead of a &mut to allow the above syntax?

Yes. You can still do the same by passing self by value, though it might become inefficient for large Options structures. &own works pretty much like passing by value, only via one level of indirection, so you can avoid all the moves.
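A minimal sketch of the by-value variant, which compiles today without &own. The struct name, fields, and argument values are again placeholders chosen for the example; only the shape of the method signatures matters.

```rust
/// Placeholder options struct; the fields are made up for illustration.
#[derive(Debug, Default, PartialEq)]
pub struct ValueOptions {
    opt1: u32,
    opt2: bool,
}

impl ValueOptions {
    pub fn new() -> Self {
        Self::default()
    }

    /// Takes the builder by value and returns it, so ownership flows
    /// through the whole chain.
    pub fn opt1(mut self, value: u32) -> Self {
        self.opt1 = value;
        self
    }

    /// Same pattern as `opt1`.
    pub fn opt2(mut self, value: bool) -> Self {
        self.opt2 = value;
        self
    }
}

/// Stand-in consumer: formats the options it was given.
pub fn method_with_value_options(opts: &ValueOptions) -> String {
    format!("opt1={} opt2={}", opts.opt1, opts.opt2)
}

pub fn value_builder_demo() -> String {
    // The chain ends in an owned value, so binding it to a local is fine.
    let opts = ValueOptions::new().opt1(3).opt2(true);
    method_with_value_options(&opts)
}
```

Each call in the chain is semantically a move of the whole struct. Whether those moves turn into actual copies depends on the optimizer, which is the efficiency concern mentioned above.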

@joshtriplett
Member

@nagisa Oh, that makes sense. So the compiler can relatively easily decide to optimize both exactly the same way, making &own again just an optimization hint that the compiler could derive on its own for builders.
