Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autoreferencing Copy Types #2111

Closed
wants to merge 5 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
199 changes: 199 additions & 0 deletions text/0000-autoref-copy.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,199 @@
- Feature Name: autoref_copy
- Start Date: (fill me in with today's date, YYYY-MM-DD)
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

This RFC allows autoreferencing of `T` where `T: Copy`.

Example:

```rust
[derive(Copy, Clone)]
struct S(u8);

fn by_val(_: S) {}
fn by_ref(_: &S) {}
fn by_mut_ref(_: &mut S) {}

fn main() {
let s = S(5);
by_val(s);
by_ref(s); // This works after this RFC.
// by_mut_ref(s); // ERROR -- expected `&mut S`, found `S`
}
```

# Motivation
[motivation]: #motivation

When working with `Copy` data, the distinction between borrowed and owned data
is often unimportant. However, generic code often results in code which will
only accept a particular variant of a `Copy` type (`T`, `&T`, `&&T`, ...).
This is a frustration in many places:

```rust
// Comparisons:
let v = vec![0, 1, 2, 3];
v.iter().filter(|x| x > 1); // ERROR: expected &&_, found integral variable
// These work:
0 > 1;
&0 > &1;
// But these don't:
&0 > 1;
0 > &1;

// Trait instantiations:
let mut map: HashMap::new();
map.insert(0, "Hello");
map.insert(1, "world!");
map[0]; // ERROR: expected &{integer}, found integral variable
// or
map.get(1); // ERROR: expected `&{integer}`, found integral variable

// Numeric operators:
// These work:
&0 + &1;
0 + &1;
&0 + 1;
// But these don't:
&&0 + 1;
&&0 + &1;
&&0 + &&1;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RFC should note that the proposed coercion will not make the examples with binary operations compile. Or the examples should be removed, they don't really motivate this RFC.

```

These interactions confuse both new and experienced users without providing
any significant value. It's clear what's intended by `map.get(1)` or
`vec.iter().filter(|x| x > 1)`. When users encounter these errors in practice,
the only reasonable thing they can do is add `&` and `*` as necessary to make
their code compile.

This RFC seeks to address one particular variant of this problem: passing
owned `Copy` data where a reference was expected.

Example:

```rust
use std::collections::HashMap;
fn main() {
let mut map = HashMap::new();
map.insert(1, "hello!");

println!("{:?}", map.get(1)); // ERROR: expected `&{integer}`

// Instead, users have to write this:
println!("{:?}", map.get(&1));
}
```

This is an unnecessary frustration. This RFC proposes to prevent this issue
by auto-referencing `Copy` types.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

When calling a function which expects a reference to a `Copy` type, you can
pass an owned `Copy` type and a reference will be created automatically:

```rust
fn print_num(x: &usize) { println!("{:?}", x); }

fn main() {
let x = 5;
print_num(x);
}
```

This is particularly convenient when working with some generic types whose
methods take references as arguments:

```rust
use std::collections::HashMap;
fn main() {
let mut map = HashMap::new();
map.insert(1, "hello!");

// We can write this:
println!("{:?}", map.get(1));

// Instead of having to write this:
println!("{:?}", map.get(&1));
}
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

`Copy` types autoreference: at
[coercion sites](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md#coercions),
when a reference is expected, but an owned
`Copy` type is found, an `&` will be automatically inserted.

Some types have interior mutability through `UnsafeCell`, so passing them
via a hidden reference could allow surprising mutations to occur.
To prevent this, the autoreferencing coercion is limited to types which do not
directly contain an internal `UnsafeCell`.
Types which reference an `UnsafeCell` through an indirection can still be
coerced (e.g. `&MyCopyCell<u8>` to `&&MyCopyCell<u8>`).

This coercion will not occur if the lifetime of the resulting reference would
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That... sounds like a good way to cause "inference ripping itself apart" issues, at the very least. Type-checking is necessarily independent of region inference, and even more necessarily independent of it with NLL.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need a dependency upon region inference? I'd expect it'd only need to examine the signature of the function being called to check if the argument lifetime either comes from outside the function or escapes via the return type.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cramertj On function calls perhaps, but function calls are not the only coercion site.

@arielb1 I'm a noob on compiler internals, but perhaps the type checker could coerce ignoring the lifetime restriction and pass the restriction down to borrowck which would emit a "plz type & here" error if the lifetime is too long.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cramertj

Say hi to generic types, associated types, etc.

last longer than the call site. For example:

```rust
// `foo`'s argument can be coerced to from `u8` because it only lasts for the
// lifetime of the function call:
fn foo(x: &u8) { ... }

// `bar`'s argument cannot be coerced to from `u8` because the lifetime of the
// required reference outlives the scope of `bar`.
fn bar<'a>(x: &'a u8) -> &'a u8 { ... }

fn main() {
foo(1); // OK

let x = bar(5); // ERROR: expected `&u8`, found `u8`.
}
```

# Drawbacks
[drawbacks]: #drawbacks

- This increases the special behavior of `Copy` types, making it potentially
confusing to new users. Some simple, non-generic code becomes more confusing
because `fn foo(x: &i32) { ... }` can now be called like `foo(5)`;
- The existing coercion mechanism doesn't allow for coercion of generic
arguments, nor does it use coercions in trait lookup. Because of this, users
will still need to manually add `&` when trying to, for example, index into
a `HashMap<usize, T>`
(users will still need to write `x[&5]` instead of `x[5]`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do coercions in indexing? At least in the "single impl" case? You can write this today:

use std::collections::HashMap;

fn main() {
    let s : HashMap<u32, u32> = HashMap::new();
    let i: &&u32 = &&0;
    s[i];
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. I don't know why I thought we didn't. In particular, AsRef examples like &Rc<i32> work just fine:

use std::collections::HashMap;
use std::rc::Rc;

fn main() {
    let mut x = HashMap::new();
    x.insert(5, 5);
    x[&Rc::new(5)];
}

I will remove indexing from the examples. Thanks for catching me on that!


# Rationale and Alternatives
[alternatives]: #alternatives
One alternative would be to do nothing. However, this issue is frequently annoying for new users, and the solution is relatively simple.

We could also allow borrows which outlive function call sites, but these could
produce surprising errors (for example, if the user attempted to mutate a
variable while it was implicitly borrowed).

We could make the autoreferencing coercion more general and allow `T -> &T`
for all types which don't contain an `UnsafeCell`. However, this could harm
users' ability to reason about when variables are `move`d.

Another alternative would be to also add the following conversions for
`T: Copy`:
- `T` to `&mut T`: This conversion has the potential to introduce hidden
mutations. With this change, passing a variable to a function could allow
the function to change the value of the variable, even if no `&mut` is present.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already mentioned elsewhere I think, using & does not exclude this. For example, Cell could well be Copy, and it is possible for a user crate to soundly implement a Copy Cell today.

I think this RFC should take the possibility of Copy interior mutable types into account.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically, this is not possible, because UnsafeCell is not Copy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't this get fixed recently? Or am I imagining it?

Copy link
Member

@RalfJung RalfJung Aug 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not Copy yet, but AFAIK there is any good reason for that. So this RFC should work under the assumption that UnsafeCell can be Copy.

EDIT: Also see rust-lang/rust#25053

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find @RalfJung's logic compelling -- we should consider interior mutability. That said, when we last discussed the Copy question, we had thought to add a lint to reflect the "can copy" vs "should copy" distinction (similar to "must use"). I thought this was a good plan, but it never happened. If it had -- and of course it still can! -- it could certainly apply here.

- `&mut T` and `&T` to `T`: This conversion would cause extra copies to occur
which could be difficult to identify when attempting to optimize code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the fact that copying is already implicit in many places, I don't think this is a strong concern. I would argue that if copying a type is actually noticeable in cost, it should not be Copy -- then the clone calls will make it obvious where the cost is incurred.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree with everything you said, it seems like the PoR is for [i128; 1_000_000] to be Copy...


# Unresolved questions
[unresolved]: #unresolved-questions
- Can we make it easier to use copy values where references are expected when
working with traits and generic code? In particular, it would be nice to make
operators such as `Index` or `Add` auto-reference and auto-dereference.
It would also be great if we could find some way to trigger existing deref
coercions in generic cases such as passing `&Rc<MyT>` to
`fn foo<T: SomeTrait>(t: &T)`.