Skip to content
Closed
Changes from 1 commit
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
aa5c8e1
Start procedural vtables RFC
oli-obk Aug 1, 2020
da9f3fc
First draft
hackmd-deploy Aug 1, 2020
446d1ad
Add PR id
hackmd-deploy Aug 1, 2020
5d1b767
Add future possibilty for C++-ish vtables
hackmd-deploy Aug 1, 2020
4938454
Add lots of full examples
hackmd-deploy Aug 1, 2020
0dbcf30
Correctly use `unsafe` in the examples
hackmd-deploy Aug 1, 2020
6157040
Discuss `dyn A + B`
hackmd-deploy Aug 1, 2020
4e73560
Split unsizing and other ops into separate functions
hackmd-deploy Aug 2, 2020
66c7438
Show how to use this for `[T]`
hackmd-deploy Aug 2, 2020
24f8f5a
Vtables have a const generic param for the length, not the trait tree
hackmd-deploy Aug 2, 2020
49f2b68
Rewrite the RFC to be more type safe
hackmd-deploy Aug 2, 2020
b7e9b92
Clarify some things and write a more extensive intro
hackmd-deploy Aug 2, 2020
41599c4
Fix up variable names
hackmd-deploy Aug 6, 2020
b4ecf47
Copy paste mistakes
hackmd-deploy Aug 6, 2020
45b844b
Generalize the scheme to allow unsizing from arbitrary types
hackmd-deploy Aug 8, 2020
e7a0ef7
Fix trait method argument types
hackmd-deploy Aug 8, 2020
70a224c
custom index operation results
hackmd-deploy Aug 8, 2020
85b9d36
Remove reference to old version of RFC
hackmd-deploy Aug 8, 2020
5f63195
Various smaller review changes
hackmd-deploy Aug 8, 2020
c8c9a87
Explain how unsized structs work
hackmd-deploy Aug 12, 2020
6c89ffe
Intermediate push (no projection into unsized fields for C++-vtables)
hackmd-deploy Sep 10, 2020
e0c8ca7
Address some review nits
hackmd-deploy Sep 10, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
148 changes: 114 additions & 34 deletions text/0000-procedural-vtables.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,9 +62,26 @@ As an example, consider the function which is what normally generates your metad

```rust
#[repr(C)]
struct Pointer<T> {
ptr: *mut T,
vtable: &'static VTable<T>,
struct Pointer {
ptr: *mut u8,
vtable: &'static VTable,
}

/// For going from a `&SomeStruct<dyn Trait>` to a field of `SomeStruct`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These might be better as defaulted methods in Unsized?

unsafe impl CustomProjection for Pointer {
unsafe fn project(ptr: Pointer, offset: usize) -> *mut u8 {
ptr.ptr.offset(offset),
}
}

unsafe impl CustomProjectionToUnsized for Pointer {
/// For going from `&SomeStruct<dyn Trait>` to the unsized field
unsafe fn project_unsized(ptr: Pointer, offset: usize) -> Pointer {
Pointer {
ptr: ptr.ptr,
vtable: ptr.vtable,
}
}
}

/// If the `owned` flag is `true`, this is an owned conversion like
Expand All @@ -75,14 +92,14 @@ struct Pointer<T> {
/// unsizings by triggering a compile-time `panic` with an explanation
/// for the user.
unsafe impl<T: MyTrait> const CustomUnsize<Pointer> for T {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might just be the Unsize trait.

fn unsize<const owned: bool>(ptr: *mut T) -> Pointer<T> {
fn unsize<const owned: bool>(ptr: *mut T) -> Pointer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still unconvinced that allowing arbitrary stuff (including reallocation) to happen during implicit coercion is a good idea. I know that, at minimum, some procedural action must be taken to construct the extension. But I'd like it if you mentioned in the drawbacks that implicit coercions are no longer simple transmutes/bitwise-copies unless we replace this function with an associated const.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I did not appreciate how quickly this can go crazy. I mean, my first design (unpublished) was to emit a value of a struct that describes the pointer layout and how to extract the pointer. Then I realized that I'm just inventing a bad way to write code and went full "just write your own pointer creation and management", but as noted in the unresolved question section: I don't know if we really want this level of control at all.

I mean we could reduce the capabilities by forcing unsize to be const fn, but that still means that in the future you'll be able to allocate memory and stuff..

So... one weird idea I had was to make it a const fn, and then invoke the function in CTFE with a symbolic pointer. The resulting struct is then analyzed at compile-time and a transformation function is generated that will replicate the result memory (of the wide pointer) but replace all occurences of the symbolic pointer with the pointer we're unsizing.

This would severely limit what kind of transformations you can do, while making it trivial for the compiler to figure out the replication strategy (raw copy all bits before the symbolic pointer, write the unsize pointer, raw copy all bits after the symbolic pointer). In case the symbolic pointer was offset, apply said offset.

Copy link
Contributor

@samsartor samsartor Sep 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. This is really interesting. Yah, you could use some totally opaque type as a placeholder for the data pointer (maybe an extern type?) to be swapped out at coercion time. I think that would also solve the issues with projection because Rust could to do whatever it wants with that particular field?

I'll have to think about it some more but I like that direction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't actually need a type, we can create a dangling symbolic pointer (this is a value in mir interpreter!) and just use *const T for whatever T we desire. Any interaction with the pointer beyond copying it around will make CTFE complain to you at compile-time. Since this is a special case, we can actually not just tell you about "dangling pointer" but special case the diagnostic to be really helpful

// We generate the metadata and put a pointer to the metadata into
// the field. This looks like it's passing a reference to a temporary
// value, but this uses promotion
// (https://doc.rust-lang.org/stable/reference/destructors.html?highlight=promotion#constant-promotion),
// so the value lives long enough.
Pointer {
ptr,
ptr: ptr as *mut u8,
vtable: &default_vtable::<T>(),
}
}
Expand All @@ -92,6 +109,7 @@ unsafe impl<T: MyTrait> const CustomUnsize<Pointer> for T {
/// default trait objects. Your own trait objects can use any metadata and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow what "default trait objects" means in this context.

/// thus "vtable" layout that they want.
unsafe impl Unsized for dyn MyTrait {
// Using a dummy type for the vtable
type WidePointer = Pointer;
fn size_of(ptr: Pointer) -> usize {
ptr.vtable.size
Expand All @@ -112,7 +130,7 @@ impl Drop for dyn MyTrait {
}
}

const fn default_vtable<T: MyTrait>() -> VTable<T> {
const fn default_vtable<T: MyTrait>() -> VTable {
// `VTable` is a `#[repr(C)]` type with fields at the appropriate
// places.
VTable {
Expand Down Expand Up @@ -157,28 +175,43 @@ a `std::slice::Slice` (or `StrSlice`) trait.
pub trait Slice<T> {}

#[repr(C)]
struct SlicePtr<T> {
ptr: *mut T,
struct SlicePtr {
ptr: *mut u8,
len: usize,
}

unsafe impl CustomProjection for SlicePtr {
unsafe fn project(ptr: Pointer, offset: usize) -> *mut u8 {
ptr.ptr.offset(offset)
}
}

unsafe impl CustomProjectionToUnsized for SlicePtr {
unsafe fn project_unsized(ptr: Pointer, offset: usize) -> Pointer {
Pointer {
ptr.ptr.offset(offset),
ptr.len,
}
}
}

// This impl must be in the `vec` module, to give it access to the `vec`
// internals instead of going through `&mut Vec<T>` or `&Vec<T>`.
impl<T> CustomUnsize<SlicePtr<T>> for Vec<T> {
fn unsize<const owned: bool>(ptr: *mut Vec<T>) -> SlicePtr<T> {
impl<T> CustomUnsize<SlicePtr> for Vec<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorta like the CStr example, Vec uses Deref. I think this should be implemented on [T; N] instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see what the advantage is beyond having a simpler implementation. We can't reduce the complexity of the trait system described in this RFC if we limit ourselves to simple to convert types.

Copy link
Contributor

@samsartor samsartor Sep 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just the semantics of Vec. Vec is not a slice, it manages a slice. This is clear if you think about the implications of a &Struct<Vec<T>> to &Struct<[T]> coercion.
I suppose it isn't a big deal. I'd just like to see a use case where there is no simple work around to this RFC via Deref (e.g. [T; N]).

fn unsize<const owned: bool>(ptr: *mut Vec<T>) -> SlicePtr {
SlicePtr {
ptr: vec.data,
ptr: vec.data as *mut _,
len: vec.len,
}
}
}

impl<T> CustomUnsized for dyn Slice<T> {
type WidePointer = SlicePtr<T>;
fn size_of(ptr: SlicePtr<T>) -> usize {
type WidePointer = SlicePtr;
fn size_of(ptr: SlicePtr) -> usize {
ptr.len * std::mem::size_of::<T>()
}
fn align_of(_: SlicePtr<T>) -> usize {
fn align_of(_: SlicePtr) -> usize {
std::mem::align_of::<T>()
}
}
Expand All @@ -203,27 +236,40 @@ Most of the boilerplate is the same as with regular vtables.

```rust
unsafe impl<T: MyTrait> const CustomUnsize<dyn MyTrait> for T {
fn unsize<const owned: bool>(ptr: *mut T) -> *mut (Vtable, ()) {
fn unsize<const owned: bool>(ptr: *mut T) -> CppPtr {
unsafe {
let new = Box::new((default_vtable::<T>(), std::ptr::read(ptr)));
std::alloc::dealloc(ptr);

Box::into_ptr(new) as *mut _
CppPtr(Box::into_ptr(new) as *mut _)
}
}
}

struct CppPtr(*mut (Vtable, ()));

unsafe impl CustomProjection for CppPtr {
unsafe fn project(ptr: CppPtr, offset: usize) -> *mut u8 {
(&raw mut (*ptr.0).1).offset(offset)
}
}

// No CustomProjectionToUnsized as you can't have
// `struct Foo(i32, dyn MyTrait);` as that would require
// us to rewrite the vtable on unsizing. Rust puts the
// unsized field at the end, while C++ puts the in the front of
// the class.

unsafe impl CustomUnsized for dyn MyTrait {
type WidePointer = *mut (VTable, ());
fn size_of(ptr: Self::WidePointer) -> usize {
type WidePointer = CppPtr;
fn size_of(ptr: CppPtr) -> usize {
unsafe {
(*ptr).0.size
(*ptr.0).0.size
}
}
fn align_of(self: Self::WidePointer) -> usize {
fn align_of(self: CppPtr) -> usize {
unsafe {
(*ptr).0.align
(*ptr.0).0.align
}
}
}
Expand Down Expand Up @@ -269,30 +315,46 @@ impl<'a, T, U, V> Index<&'a SliceInfo<U, V>> for Array2<T> {
## Zero sized references to MMIO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this example, what happens if I do this?

// Reads the data behind `r` as bytes. Sound if `T` has no padding between fields.
unsafe fn read_memory_behind<T>(r: &T) -> Vec<u8> {
    let len = std::mem::size_of_val(r) as isize;
    let ptr = r as *const T as *const u8;
    (0..len).map(|o| *ptr.offset(o)).collect()
}

let bank: &dyn MyRegisterBank = ...;
read_memory_behind(bank); // WTF

The natural assumption is that this should output the 4 bytes in the register bank. But it doesn't necessarily because despite there being 4 bytes behind r according to the Unsize impl, r doesn't actually point to anything. What is the value of ptr? Is it undefined?

The best solution I can come up with is for as *const u8 to call project(0) but if that is the case, it should probably be documented here.


Instead of having one type per MMIO register bank, we could have one
trait per bank and use a zero sized wide pointer format.
trait per bank and use a zero sized wide pointer format. There's no `Unsize`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on this? Why do we want a trait per bank instead of simply

pub fn flip_important_bit() {
    std::mem::volatile_write::<bool>(REG_ADDR, !std::mem::volatile_read::<bool>(REG_ADDR));
}

or, if exclusivity is needed, an ordinary mutable reference to zero-sized struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just one of the design scopes. Right now there are already setups where ppl do this, but they do it with one ZST struct per bank and methods on those. I don't want to get into the merit of this at all, even if I have strong opinions on it. This is just to show that you can do it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's simply a design parameter that significantly complicates the design space, so I think the RFC would be improved if it was strongly justified.

impl as you can't create these pointers except by transmuting a zst to them.

```rust
const REG_ADDR: usize = 42;

trait MyRegisterBank {
fn flip_important_bit(&mut self);
}

struct NoPointer;

impl CustomProjection for NoPointer {
fn project(NoPointer: NoPointer, offset: usize) -> *mut u8 {
(REG_ADDR + offset) as *mut u8
}
}

// No CustomProjectionToUnsized as there's nothing there to access

unsafe impl CustomUnsized for dyn MyRegisterBank {
type WidePointer = ();
fn size_of(():()) -> usize {
4
type WidePointer = NoPointer;
fn size_of(NoPointer:NoPointer) -> usize {
4 // MMIO registers on our hypothetical systems are 32 bit
}
fn align_of(():()) -> usize {
fn align_of(NoPointer:NoPointer) -> usize {
4
}
}

impl MyRegisterBank for dyn MyRegisterBank {
fn flip_important_bit(&mut self) {
std::mem::volatile_write::<bool>(42, !std::mem::volatile_read::<bool>(42))
std::mem::volatile_write::<bool>(REG_ADDR, !std::mem::volatile_read::<bool>(REG_ADDR))
}
}
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

## Unsized structs

This RFC does not actually affect unsized structs (or tuples for that matter), because the unsizing of aggregates
Expand All @@ -307,7 +369,7 @@ struct Foo<T: ?Sized + MyTrait> {
}
```

and you're using it to convert from a pointer to the sized version to an unsized version (Side note: I'm assuming you are talking about sized vs unsized, even though you mentioned "owned". `Box<dyn Trait>` is also owned, just owning an unsized thing).
and you're using it to convert from a pointer to the sized version to an unsized version

```rust
let x = Foo { a: 42, b: SomeStruct };
Expand All @@ -325,17 +387,22 @@ Pointer {
}
```

where `vtable` is the same vtable you'd get for `&SomeStruct as &dyn MyTrait`. Since you can't invoke `MyTrait` methods on `Foo<dyn MyTrait>`, there are no pointer indirection problems or anything. This is also how it works without this RFC. If you want to invoke methods on the `b` field, you have to do `z.b.foo()`, which will give you a `&dyn MyTrait` reference via `&z.b` and then everything works out just like with direct references (well, because now you have a direct reference).
where `vtable` is the same vtable you'd get for `&SomeStruct as &dyn MyTrait`. Since you can't invoke `MyTrait` methods on `Foo<dyn MyTrait>`, there are no pointer indirection problems or anything. This is also how it works without this RFC.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation
If you want to invoke methods on the `b` field, you have to do `z.b.foo()`, which will works by
invoking `CustomProjectionToUnsized::project_unsized(z, offset!(SomeStruct::b))`. The resulting pointer
is again a wide pointer `&dyn MyTrait`, but with an adjusted data pointer to allow any trait methods to properly
work on the type. This data pointer adjustment is wide pointer specific and overridable via the `CustomProjectionToUnsized` trait.
For regular fields the `CustomProjection` trait handles the extraction of the sized pointer to the field.

## Traits managing the unsizing and projecting

When unsizing, the `<dyn MyTrait as CustomUnsize>::unsize` is invoked.
The only reason that trait must be
`impl const CustomUnsize` is to restrict what kind of things you can do in there, it's not
strictly necessary. This restriction may be lifted in the future.

For all other operations, the methods on `<dyn MyTrait as CustomUnsized>` is invoked.
For all other operations, the methods on `<dyn MyTrait as CustomUnsized>::WidePointer` are invoked.

When obtaining function pointers from vtables, instead of computing an offset, the `MyTrait for dyn MyTrait` impl's
methods are invoked, allowing users to insert their own logic for obtaining the runtime function pointer.
Expand All @@ -355,6 +422,16 @@ unsafe trait CustomUnsized {
unsafe trait CustomUnsize<DynTrait> where DynTrait: CustomUnsized {
fn unsize<const owned: bool>(t: *mut Self) -> DynTrait::WidePointer;
}

unsafe trait CustomProjection: CustomUnsized {
/// The offset is in bytes.
unsafe fn project(ptr: <Self as CustomUnsized>::WidePointer, offset: usize) -> *mut u8;
}

unsafe trait CustomProjectionToUnsized: CustomUnsized {
/// The offset is in bytes and must be the exact offset from the start of the unsized struct to its unsized field.
unsafe fn project_unsized(ptr: <Self as CustomUnsized>::WidePointer, offset: usize) -> <Self as CustomUnsized>::WidePointer;
}
```

The
Expand Down Expand Up @@ -391,7 +468,9 @@ This list is shamelessly taken from [strega-nil's Custom DST RFC](https://github
- [Syntax of ?Sized](https://github.com/rust-lang/rfcs/pull/490)

This RFC differs from all the other RFCs in that it focusses on a procedural way to generate vtables,
thus also permitting arbitrary user-defined compile-time conditions by aborting via `panic!`.
thus also permitting arbitrary user-defined compile-time conditions by aborting via `panic!`. Another
difference is that this RFC allows arbitrary layouts of the wide pointer instead of just allowing custom
metadata fields of wide pointers.

# Unresolved questions
[unresolved-questions]: #unresolved-questions
Expand All @@ -408,4 +487,5 @@ thus also permitting arbitrary user-defined compile-time conditions by aborting

* We can change string slice (`str`) types to be backed by a `trait StrSlice` which uses this scheme
to generate just a single `usize` for the metadata (see also the `[T]` demo).
* This scheme is forward compatible to adding associated fields later, but it is a breaking change to add such fields to an existing trait.
* This scheme is forward compatible to adding associated fields later, but it is a breaking change to add such fields to an existing trait.
* We can add a scheme for safely converting from a wide pointer to its representation struct.