Tracking issue for RFC #1909: Unsized Rvalues (unsized_locals, unsized_fn_params) #48055

Open
2 of 13 tasks
aturon opened this issue Feb 7, 2018 · 64 comments
Labels
B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC F-unsized_fn_params `#![feature(unsized_fn_params)]` F-unsized_locals `#![feature(unsized_locals)]` S-tracking-design-concerns Status: There are blocking design concerns. S-tracking-needs-summary Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@aturon
Member

aturon commented Feb 7, 2018

This is a tracking issue for the RFC "Unsized Rvalues" (rust-lang/rfcs#1909).

Steps:

Blocking bugs for unsized_fn_params:

Related bugs:

Unresolved questions:

@aturon aturon added B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. T-lang Relevant to the language team, which will review and decide on the PR/issue. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Feb 7, 2018
@Aaron1011
Member

How do we handle truly-unsized DSTs when we get them?

@aturon: Are you referring to extern type?

@aturon
Member Author

aturon commented Feb 20, 2018

@Aaron1011 that was copied straight from the RFC. But yes, I presume that's what it's referring to.

@ldr709
Contributor

ldr709 commented Feb 28, 2018

Why would unsized temporaries ever be necessary? The only way it would make sense to pass them as arguments would be by fat pointer, and I cannot think of a situation that would require the memory to be copied/moved. They cannot be assigned or returned from functions under the RFC. Unsized local variables could also be treated as pointers.

In other words, is there any reason why unsized temporary elision shouldn't be always guaranteed?

@F001
Contributor

F001 commented May 11, 2018

Is there any progress on this issue?
I'm trying to implement VLA in the compiler. For the AST and HIR part, I added a new enum member for syntax::ast::ExprKind::Repeat and hir::Expr_::ExprRepeat to save the count expression as below:

enum RepeatSyntax { Dyn, None }
syntax::ast::ExprKind::Repeat(P<Expr>, P<Expr>, RepeatSyntax)

enum RepeatExprCount {
  Const(BodyId),
  Dyn(P<Expr>),
}
hir::Expr_::ExprRepeat(P<Expr>, RepeatExprCount)

But for the MIR part, I have no idea how to construct a correct MIR. Should I update the structure of mir::RValue::Repeat and corresponding trans_rvalue function? What should they look like? What is the expected LLVM-IR?

Thanks in advance if someone would like to write a simple mentoring instruction.

@qnighy
Contributor

qnighy commented May 26, 2018

I'm trying to remove the Sized bounds and translate MIRs accordingly.

@mikeyhew
Contributor

mikeyhew commented Jul 14, 2018

An alternative that would solve both of the unresolved questions would be explicit &move references. We could have an explicit alloca! expression that returns &move T, and truly unsized types work with &move T because it is just a pointer.

If I remember correctly, the main reason for this RFC was to get dyn FnOnce() to be callable. Since FnOnce() is not implementable in stable Rust, would it be a backward-compatible change to make FnOnce::call_once take &move Self instead? If that was the case, then we could make &move FnOnce() be callable, as well as Box<FnOnce()> (via DerefMove).

cc @arielb1 (RFC author) @qnighy (currently implementing this RFC in #51131) @eddyb (knows a lot about this stuff)

@eddyb
Member

eddyb commented Jul 14, 2018

@mikeyhew There's not really much of a problem with making by-value self work and IMO it's more ergonomic anyway. We might eventually even have DerefMove without &move at all.

@mikeyhew
Contributor

mikeyhew commented Jul 20, 2018

@eddyb

I guess I can see why people think it's more ergonomic: in order to opt into it, you just have to add ?Sized to your function signature, or in the case of trait methods, do nothing. And maybe it will help new users of the language, since &move wouldn't show up in documentation everywhere.

If we're going to go ahead with this implicit syntax, then there are a few details that would be good to nail down:

  • If this is syntactic sugar for &move references, what does it desugar to? For function arguments, this could be pretty straightforward: the lifetime of the reference would be limited to the function call, and if you want to extend it past that, you'd have to use explicit &move references. So

    fn call_once(f: FnOnce(i32)) -> i32

    desugars to

    fn call_once(f: &move FnOnce(i32)) -> i32

    and you can call the function directly on its argument, so call_once(|x| x + 1) desugars to call_once(&move (|x| x + 1)).

    And to do something fancier, you'd have to resort to the explicit version:

    fn make_owned_pin<'a, T: 'a + ?Sized>(value: &'a move T) -> PinMove<'a, T> { ... }
    
    struct Thunk<'a> {
        f: &'a move FnOnce()
    }

    Given the above semantics, DerefMove could be expressed using unsized rvalues, as you said:

    EDIT: This is kind of sketchy though. What happens if the implementation is wrong, and doesn't call f?

    // this is the "closure" version of DerefMove. The alternative would be to have an associated type
    // `Cleanup` and return `(Self::Target, Self::Cleanup)`, but that wouldn't work with unsized
    // rvalues because you can't return a DST by value
    fn deref_move<F: FnOnce(Self::Target) -> O, O>(self, f: F) -> O;
    
    // explicit form
    fn deref_move<F: for<'a>FnOnce(&'a move Self::Target) -> O, O>(&'a move self, f: F) -> O;

    I should probably write an RFC for this.

  • When do there need to be implicit allocas? I can't actually think of a case where an implicit alloca would be needed. Any function arguments would last as long as the function does, and wouldn't need to be alloca'd. Maybe something involving stack-allocated dynamic arrays, if they are returned from a block, but I'm pretty sure that's explicitly disallowed by the RFC.

@mikeyhew
Contributor

@eddyb have you seen @alercah's RFC for DerefMove? rust-lang/rfcs#2439

bors added a commit that referenced this issue Aug 19, 2018
Implement Unsized Rvalues

This PR is the first step to implement RFC1909: unsized rvalues (#48055).

## Implemented

- `Sized` is removed for arguments and local bindings. (under `#![feature(unsized_locals)]`)
- Unsized locations are allowed in MIR
- Unsized places and operands are correctly translated at codegen

## Not implemented in this PR

- Additional `Sized` checks:
  - tuple struct constructor (accidentally compiles now)
  - closure arguments at closure generation (accidentally compiles now)
  - upvars (ICEs now)
- Generating vtable for `fn method(self)` (ICEs now)
- VLAs: `[e; n]` where `n` isn't const
- Reduce unnecessary allocations

## Current status

- [x] Fix `__rust_probestack` (rust-lang/compiler-builtins#244)
  - [x] Get the fix merged
- [x] `#![feature(unsized_locals)]`
  - [x] Give it a tracking issue number
- [x] Lift sized checks in typeck and MIR-borrowck
  - [ ] <del>Forbid `A(unsized-expr)`</del> will be another PR
- [x] Minimum working codegen
- [x] Add more examples and fill in unimplemented codegen paths
- [ ] <del>Loosen object-safety rules (will be another PR)</del>
- [ ] <del>Implement `Box<FnOnce>` (will be another PR)</del>
- [ ] <del>Reduce temporaries (will be another PR)</del>
@qnighy
Contributor

qnighy commented Aug 20, 2018

As a next step, I'll be working on trait object safety.

@alexreg
Contributor

alexreg commented Aug 22, 2018

@mikeyhew Sadly @alercah just postponed their DerefMove RFC, but I think a separate RFC for &move that complements that (when it does get revived) would be very much desirable. I would be glad to assist with that even, if you're interested.

@mikeyhew
Contributor

@alexreg I would definitely appreciate your help, if I end up writing an RFC for &move.

The idea I have so far is to treat unsized rvalues as a sort of sugar for &move references with an implicit lifetime. So if a function argument has type T, it will either be passed by value (if T is Sized) or as a &'a move T, and the lifetime 'a of the reference will outlive the function call, but we can't assume any more than that. For an unsized local variable, the lifetime would be the variable's scope. If you want something that lives longer than that, e.g. you want to take an unsized value and return it, you'd have to use an explicit &move reference so that the borrow checker can make sure it lives long enough.

@alexreg
Contributor

alexreg commented Aug 24, 2018

@mikeyhew That sounds like a reasonable approach to me. Has anyone specified the supposed semantics of &move yet, even informally? (Also, I'm not sure if bikeshedding on this has already been done, but we should probably consider calling it &own.)

@eddyb
Member

eddyb commented Aug 25, 2018

Not sure if this is the right place to document this, but I found a way to make a subset of unsized returns (technically, all of them, given a T -> Box<T> lang item) work without ABI (LLVM) support:

  • only Rust ABI functions can return unsized types
  • instead of passing a return pointer in the call ABI, we pass a return continuation
    • we can already pass unsized values to functions, so if we could CPS-convert Rust functions (or wanted to), we'd be done (at the cost of a stack that keeps growing)
    • @nikomatsakis came up with something similar (but only for Box) a few years ago
  • however, only the callee (potentially a virtual method) needs to be CPS-like, and only in the ABI, the callers can be restricted and/or rely on dynamic allocation, not get CPS-transformed
  • while Clone becoming object-safe is harder, this is an alright starting point:
// Rust definitions
trait CloneAs<T: ?Sized> {
    fn clone_as(&self) -> T;
}
impl<T:  Trait + Clone> CloneAs<dyn Trait> for T {
    fn clone_as(&self) -> dyn Trait { self.clone() }
}
trait Trait: CloneAs<dyn Trait> {}
// Call ABI signature for `<dyn Trait as CloneAs<dyn Trait>>::clone_as`
fn(
     // opaque pointer passed to `ret` as the first argument
    ret_opaque: *(),
    // called to return the unsized value
    ret: fn(
        // `ret_opaque` from above
        opaque: *(),
        // the `dyn Trait` return value's components
        ptr: *(), vtable: *(),
    ) -> (),
    // `self: &dyn Trait`'s components
    self_ptr: *(), self_vtable: *(),
) -> ()
  • the caller would use the ret_opaque pointer to pass one or more sized values to its stack frame
    • could allow ret to return one or two pointer-sized values, but that's an optional optimization
  • we can start by allowing composed calls, of this MIR shape:
y = call f(x); // returns an unsized value
z = call g(y); // takes the unsized value and returns a sized one
// by compiling it into:
f(&mut z, |z, y| { *z = call g(y); }, x)
  • this should work out of the box for {Box,Rc,...}::new(obj.clone_as())
  • while we could extract entire portions of the MIR into these "return continuations", that's not necessary for being able to express most things: worst case, you write a separate function
  • since Box::new works, anything with a global allocator around could fall back to that
    • let y = f(x); would work as well as let y = *Box::new(f(x));
    • its cost might be a bit high, but so would that of a proper "unsized return" ABI
  • we can, at any point, switch to an ABI where e.g. the value is copied onto the caller's stack, effectively "extending it on return", and there shouldn't be any observable differences
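The return-continuation idea above can be approximated in today's stable Rust, which may help make the shape of the call concrete. This is only a rough sketch under stated assumptions, not the proposed ABI: the trait `Shape`, the method `clone_with`, the type `Square`, and the function `cloned_area` are all hypothetical names, and the continuation receives the unsized value by reference rather than by value.

```rust
/// Hypothetical trait. `clone_with` stands in for
/// `fn clone_as(&self) -> dyn Shape`, which stable Rust cannot express.
trait Shape {
    fn area(&self) -> f64;

    /// Instead of returning an unsized value, hand it to the `ret` continuation.
    fn clone_with(&self, ret: &mut dyn FnMut(&dyn Shape));
}

#[derive(Clone)]
struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }

    fn clone_with(&self, ret: &mut dyn FnMut(&dyn Shape)) {
        // The clone lives in the callee's stack frame for the duration of `ret`.
        let copy = self.clone();
        ret(&copy);
    }
}

/// Mirrors `f(&mut z, |z, y| { *z = call g(y); }, x)`:
/// the sized result `z` is written by the continuation.
fn cloned_area(obj: &dyn Shape) -> f64 {
    let mut z = 0.0;
    obj.clone_with(&mut |y: &dyn Shape| z = y.area());
    z
}
```

Here the closure plays the role of both `ret_opaque` (its captured `&mut z`) and `ret` (its body), so only sized data ever crosses back into the caller's frame.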

cc @rust-lang/compiler

@mikeyhew
Contributor

@alexreg

Has anyone specified the supposed semantics of &move yet, even informally?

I don't think it's been formally specified. Informally, &'a move T is a reference that owns its T. It's like

  • an &'a mut T that owns the T instead of mutably borrowing it, and therefore drops the T when dropped, or
  • a Box<T> that is only valid for the lifetime 'a, and doesn't free heap allocated memory when dropped (but still drops the T).
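As an informal illustration of these two bullet points, an owning reference can be roughly emulated in stable Rust with a wrapper around `&mut ManuallyDrop<T>`. This is a sketch only: `Own`, `Noisy`, and `drop_counts` are hypothetical names, and the real `&move` would be checked by the compiler instead of relying on an `unsafe` constructor.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::ops::{Deref, DerefMut};

/// Hypothetical stand-in for `&'a move T`: borrows the storage, owns the value.
struct Own<'a, T: ?Sized> {
    slot: &'a mut ManuallyDrop<T>,
}

impl<'a, T: ?Sized> Own<'a, T> {
    /// # Safety
    /// The caller must never read or drop the value in `slot` again.
    unsafe fn new(slot: &'a mut ManuallyDrop<T>) -> Self {
        Own { slot }
    }
}

impl<T: ?Sized> Deref for Own<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &**self.slot
    }
}

impl<T: ?Sized> DerefMut for Own<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut **self.slot
    }
}

impl<T: ?Sized> Drop for Own<'_, T> {
    fn drop(&mut self) {
        // Drops the `T` in place; the storage itself is not freed here.
        // SAFETY: `Own::new` promised exclusive ownership of the value.
        unsafe { ManuallyDrop::drop(&mut *self.slot) }
    }
}

/// Helper type that counts how many times it has been dropped.
struct Noisy<'c>(&'c Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the drop count before and after the owning reference is dropped.
fn drop_counts() -> (u32, u32) {
    let counter = Cell::new(0);
    let mut storage = ManuallyDrop::new(Noisy(&counter));
    // SAFETY: `storage` is never touched again after this point.
    let owned = unsafe { Own::new(&mut storage) };
    let before = owned.0.get();
    drop(owned); // runs `Noisy::drop`, but `storage` stays on the stack
    (before, counter.get())
}
```

The value is dropped exactly once, when the `Own` goes away, while the stack slot it lived in is left for the enclosing frame to reclaim.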

(Also, I'm not sure if bikeshedding on this has already been done, but we should probably consider calling it &own.)

Don't think that bikeshed has been painted yet. I guess &own is better. It requires a new keyword, but afaik it can be a contextual keyword, and it more accurately describes what is going on. Often you would use it to avoid moving something in memory, so calling it &move T would be confusing. Plus there's the problem of &move ||{}, which looks like &move (||{}) but would have to mean & (move ||{}) for backward compatibility.

github-actions bot pushed a commit to rust-lang/glacier that referenced this issue Oct 29, 2020
=== stdout ===
=== stderr ===
warning: the feature `unsized_locals` is incomplete and may not be safe to use and/or cause compiler crashes
 --> /home/runner/work/glacier/glacier/ices/67981.rs:1:12
  |
1 | #![feature(unsized_locals)]
  |            ^^^^^^^^^^^^^^
  |
  = note: `#[warn(incomplete_features)]` on by default
  = note: see issue #48055 <rust-lang/rust#48055> for more information

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
 --> /home/runner/work/glacier/glacier/ices/67981.rs:4:24
  |
4 |     let f: fn([u8]) = |_| {};
  |                        ^ doesn't have a size known at compile-time
  |
  = help: the trait `Sized` is not implemented for `[u8]`
  = help: unsized fn params are gated as an unstable feature
help: function arguments must have a statically known size, borrowed types always have a known size
  |
4 |     let f: fn([u8]) = |&_| {};
  |                        ^

error: aborting due to previous error; 1 warning emitted

For more information about this error, try `rustc --explain E0277`.
==============
github-actions bot pushed a commit to rust-lang/glacier that referenced this issue Oct 29, 2020
=== stdout ===
=== stderr ===
warning: the feature `unsized_locals` is incomplete and may not be safe to use and/or cause compiler crashes
 --> /home/runner/work/glacier/glacier/ices/68538.rs:1:12
  |
1 | #![feature(unsized_locals)]
  |            ^^^^^^^^^^^^^^
  |
  = note: `#[warn(incomplete_features)]` on by default
  = note: see issue #48055 <rust-lang/rust#48055> for more information

error[E0277]: the size for values of type `[u8]` cannot be known at compilation time
 --> /home/runner/work/glacier/glacier/ices/68538.rs:4:27
  |
4 | pub fn take_unsized_slice(s: [u8]) {
  |                           ^ doesn't have a size known at compile-time
  |
  = help: the trait `Sized` is not implemented for `[u8]`
  = help: unsized fn params are gated as an unstable feature
help: function arguments must have a statically known size, borrowed types always have a known size
  |
4 | pub fn take_unsized_slice(&s: [u8]) {
  |                           ^

error: aborting due to previous error; 1 warning emitted

For more information about this error, try `rustc --explain E0277`.
==============
github-actions bot pushed a commit to rust-lang/glacier that referenced this issue Oct 29, 2020
=== stdout ===
=== stderr ===
warning: the feature `unsized_locals` is incomplete and may not be safe to use and/or cause compiler crashes
 --> /home/runner/work/glacier/glacier/ices/68543.rs:1:12
  |
1 | #![feature(unsized_locals)]
  |            ^^^^^^^^^^^^^^
  |
  = note: `#[warn(incomplete_features)]` on by default
  = note: see issue #48055 <rust-lang/rust#48055> for more information

error[E0277]: the size for values of type `(dyn Future<Output = T> + Unpin + 'static)` cannot be known at compilation time
 --> /home/runner/work/glacier/glacier/ices/68543.rs:6:17
  |
6 | async fn bug<T>(mut f: dyn Future<Output = T> + Unpin) -> T {
  |                 ^^^^^ doesn't have a size known at compile-time
  |
  = help: the trait `Sized` is not implemented for `(dyn Future<Output = T> + Unpin + 'static)`
  = help: unsized fn params are gated as an unstable feature
help: function arguments must have a statically known size, borrowed types always have a known size
  |
6 | async fn bug<T>(&mut f: dyn Future<Output = T> + Unpin) -> T {
  |                 ^

error: aborting due to previous error; 1 warning emitted

For more information about this error, try `rustc --explain E0277`.
==============
@ldr709
Contributor

ldr709 commented Dec 20, 2020

Will there be any way to do an unsized coercion on an unsized local without using dynamic memory allocation? The RFC didn't seem clear on this point. At the moment, as far as I can tell the only way is to go through Box and try to get the compiler to optimize out the memory allocation. For example, if you have

fn run_fn_dyn<'a>(f: dyn FnOnce() -> u32 + 'a) -> u32 {
    f() + 1
}

and want to run it on a known size FnOnce() -> u32, you have to convert it like this:

fn run_fn<'a, F: FnOnce() -> u32 + 'a>(f: F) -> u32 {
    // With optimizations enabled the dynamic allocation seems to be removed.
    let f = {
        // Declare b in local scope so that it gets dropped before run_fn_dyn is called. Otherwise
        // the compiler isn't smart enough to figure out that the memory allocation is unnecessary
        // and remove it.
        let b = Box::new(f) as Box<dyn FnOnce() -> u32 + 'a>;
        *b
    };
    run_fn_dyn(f)
}
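For comparison, a commonly used stable-Rust workaround avoids both the feature gate and the Box by passing a `&mut dyn FnMut` shim that takes the `FnOnce` out of an `Option`. This is a sketch of that alternative, not something the RFC specifies; the `run_fn_dyn` signature below is changed from the one above to take a reference, and the name `shim` is made up for the example.

```rust
/// Stable variant: takes the callable by reference instead of as an unsized value.
fn run_fn_dyn(f: &mut dyn FnMut() -> u32) -> u32 {
    f() + 1
}

fn run_fn<F: FnOnce() -> u32>(f: F) -> u32 {
    // Park the `FnOnce` in an `Option` on the stack; no heap allocation is involved.
    let mut slot = Some(f);
    // The shim is `FnMut`, but it can only succeed once because `take` empties the slot.
    let mut shim = || {
        let f = slot.take().expect("shim must be called at most once");
        f()
    };
    run_fn_dyn(&mut shim)
}
```

The trade-off is that the "called at most once" guarantee moves from the type system to a runtime check inside the shim.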

@JohnScience
Contributor

JohnScience commented Dec 24, 2021

I'm not sure exactly how such a data structure (https://stackoverflow.com/questions/70463366/the-data-structure-that-is-the-result-of-stack-based-flattening-of-nested-homoge) for some known dimensionality (= level of nesting) should interact with allocation on the heap. It seems that my data structure must keep track of its lengths in terms of gcd(sizeof(T), sizeof(usize)) and allow conversion to length in bytes.

EDIT: even better than that, it can track the count of lengths len_count and count of elements elem_count. Then the byte-length of the data structure will be the integer linear combination len_count * sizeof(usize) + elem_count * sizeof(T).

@joshtriplett joshtriplett added S-tracking-design-concerns Status: There are blocking design concerns. S-tracking-needs-summary Status: It's hard to tell what's been done and what hasn't! Someone should do some investigation. labels Mar 16, 2022
@Jules-Bertholet
Contributor

@rustbot label F-unsized_fn_params

@rustbot rustbot added the F-unsized_fn_params `#![feature(unsized_fn_params)]` label May 5, 2023
github-actions bot pushed a commit to rust-lang/glacier that referenced this issue Sep 24, 2023
=== stdout ===
=== stderr ===
warning: the feature `unsized_locals` is incomplete and may not be safe to use and/or cause compiler crashes
 --> /home/runner/work/glacier/glacier/ices/61335.rs:2:12
  |
2 | #![feature(unsized_locals)]
  |            ^^^^^^^^^^^^^^
  |
  = note: see issue #48055 <rust-lang/rust#48055> for more information
  = note: `#[warn(incomplete_features)]` on by default

warning: the feature `async_await` has been stable since 1.39.0 and no longer requires an attribute to enable
 --> /home/runner/work/glacier/glacier/ices/61335.rs:1:12
  |
1 | #![feature(async_await)]
  |            ^^^^^^^^^^^
  |
  = note: `#[warn(stable_features)]` on by default

error[E0277]: the size for values of type `dyn std::fmt::Display` cannot be known at compilation time
 --> /home/runner/work/glacier/glacier/ices/61335.rs:7:9
  |
7 |     let _x = *x;
  |         ^^ doesn't have a size known at compile-time
  |
  = help: the trait `Sized` is not implemented for `dyn std::fmt::Display`
  = note: all values live across `await` must have a statically known size

error: aborting due to previous error; 2 warnings emitted

For more information about this error, try `rustc --explain E0277`.
==============
github-actions bot pushed a commit to rust-lang/glacier that referenced this issue Sep 24, 2023
=== stdout ===
=== stderr ===
warning: the feature `unsized_locals` is incomplete and may not be safe to use and/or cause compiler crashes
 --> /home/runner/work/glacier/glacier/ices/68543.rs:1:31
  |
1 | #![feature(unsized_fn_params, unsized_locals)]
  |                               ^^^^^^^^^^^^^^
  |
  = note: see issue #48055 <rust-lang/rust#48055> for more information
  = note: `#[warn(incomplete_features)]` on by default

error[E0277]: the size for values of type `(dyn Future<Output = T> + Unpin + 'static)` cannot be known at compilation time
 --> /home/runner/work/glacier/glacier/ices/68543.rs:6:17
  |
6 | async fn bug<T>(mut f: dyn Future<Output = T> + Unpin) -> T {
  |                 ^^^^^ doesn't have a size known at compile-time
  |
  = help: the trait `Sized` is not implemented for `(dyn Future<Output = T> + Unpin + 'static)`
  = note: all values captured by value by a closure must have a statically known size

error: aborting due to previous error; 1 warning emitted

For more information about this error, try `rustc --explain E0277`.
==============
github-actions bot pushed a commit to rust-lang/glacier that referenced this issue Sep 24, 2023
=== stdout ===
=== stderr ===
warning: the feature `unsized_locals` is incomplete and may not be safe to use and/or cause compiler crashes
 --> /home/runner/work/glacier/glacier/ices/88212.rs:1:12
  |
1 | #![feature(unsized_locals)]
  |            ^^^^^^^^^^^^^^
  |
  = note: see issue #48055 <rust-lang/rust#48055> for more information
  = note: `#[warn(incomplete_features)]` on by default

error[E0277]: the size for values of type `dyn Example` cannot be known at compilation time
  --> /home/runner/work/glacier/glacier/ices/88212.rs:16:18
   |
15 |     (move || {  // ERROR
   |           -- this closure captures all values by move
16 |         let _y = x;
   |                  ^ doesn't have a size known at compile-time
   |
   = help: the trait `Sized` is not implemented for `dyn Example`
   = note: all values captured by value by a closure must have a statically known size

error: aborting due to previous error; 1 warning emitted

For more information about this error, try `rustc --explain E0277`.
==============
@RalfJung RalfJung changed the title Tracking issue for RFC #1909: Unsized Rvalues Tracking issue for RFC #1909: Unsized Rvalues (unsized_locals, unsized_fn_params) Nov 1, 2023
@RalfJung
Member

RalfJung commented Nov 1, 2023

We should probably reject unsized arguments for non-Rust ABIs... it makes little sense to do this with an extern "C" function since the C ABI does not support unsized arguments.
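To illustrate the point, this is roughly how slice-like data is conventionally passed across a C ABI boundary today: as an explicit pointer and length rather than as an unsized value. The function names `sum_bytes` and `sum_slice` are hypothetical and exist only for this sketch.

```rust
use std::slice;

/// C-ABI function: the "unsized" data is split into a pointer and a length.
///
/// # Safety
/// `ptr` must be valid for reads of `len` bytes.
unsafe extern "C" fn sum_bytes(ptr: *const u8, len: usize) -> u64 {
    // SAFETY: guaranteed by the caller, as documented above.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    bytes.iter().map(|&b| u64::from(b)).sum()
}

/// Safe Rust wrapper that decomposes the slice before crossing the ABI boundary.
fn sum_slice(data: &[u8]) -> u64 {
    // SAFETY: the pointer and length come from a live slice.
    unsafe { sum_bytes(data.as_ptr(), data.len()) }
}
```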

@RalfJung
Member

RalfJung commented Dec 3, 2023

With #111374, unsized locals are no longer blatantly unsound. However, they still lack an actual operational semantics in MIR -- and the way they are represented in MIR doesn't lend itself to a sensible semantics; they need a from-scratch re-design I think. We are getting more and more MIR optimizations and without a semantics, the interactions of unsized locals with those optimizations are basically unpredictable.

The issue with their MIR form is that an assignment let x = y; gets compiled to MIR like

StorageLive(x); // allocates the memory for x
x = Move(y); // copies the data from y to x

However, when x is unsized, we cannot allocate the memory for x in the first step, since we don't know how big x is. The IR just fundamentally doesn't make any sense, with the way we now understand StorageLive to work.

If they were suggested for addition to rustc today, we'd not accept a PR adding them to MIR without giving them semantics. Unsized locals are the only part of MIR that doesn't even have a proposed semantics that could be implemented in Miri. (We used to have a hack, but I removed it because it was hideous and affected the entire interpreter.) I'm not comfortable having even an unstable feature be in such a bad state, with no sign of improvement for many years. So I still feel that unsized locals should be either re-implemented in a well-designed way, or removed -- the current status is very unsatisfying and prone to bugs. Unstable features are what we use to experiment, and sometimes the result of an experiment is that whatever we do doesn't work and we need to try something else.

(Unsized arguments do not have that issue: function arguments get marked as "live" when the stack frame is pushed, and at that moment we know the values for all the arguments. Allocating the local and initializing it are done together as part of argument passing. That means we can use the size information we get from the value that the caller chose to allocate the right amount of memory in the callee.)
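The underlying point, that the size of an unsized value is only available once a concrete value (and its pointer metadata) exists, can be observed in stable Rust with `std::mem::size_of_val`. This is a small sketch; the helper names `slice_bytes` and `object_bytes` are made up for illustration.

```rust
use std::fmt::Debug;
use std::mem::size_of_val;

/// Size of a slice value: computed at run time from the length in the fat pointer.
fn slice_bytes(s: &[u16]) -> usize {
    size_of_val(s)
}

/// Size of a trait object value: read at run time from the vtable.
fn object_bytes(obj: &dyn Debug) -> usize {
    size_of_val(obj)
}
```

Neither function can know its answer before it is handed a value, which is the same information a `StorageLive` for an unsized local would need before the assignment has happened.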
