RFC: truly unsized types #709

Closed: pull request with 4 commits


mzabaluev
Contributor

Further subdivide unsized types into dynamically-sized types, implementing
an intrinsic trait DynamicSize, and types of indeterminate size. References
for the latter kind will be thin, while allocating slots or copying values
of such types is not possible outside unsafe code.
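
For context, stable Rust commonly approximates a type of indeterminate size with a zero-sized opaque struct that is only ever handled behind a raw pointer. The sketch below shows that workaround, not the mechanism this RFC proposes; `Opaque`, `Real`, `make`, `read` and `destroy` are hypothetical names chosen for illustration.

```rust
use std::mem::size_of;

/// Stand-in for a type whose real size is unknown to the user of the API.
/// It is only ever used behind a pointer.
#[repr(C)]
pub struct Opaque {
    _private: [u8; 0],
}

/// The actual layout, known only to the implementation side.
struct Real {
    value: u32,
}

/// Hands out a thin pointer to a heap-allocated `Real`, typed as `Opaque`.
fn make(value: u32) -> *mut Opaque {
    Box::into_raw(Box::new(Real { value })) as *mut Opaque
}

/// Reads the value back.
/// Safety: `p` must come from `make` and must not have been destroyed.
unsafe fn read(p: *const Opaque) -> u32 {
    unsafe { (*(p as *const Real)).value }
}

/// Frees the allocation.
/// Safety: `p` must come from `make` and must not be used afterwards.
unsafe fn destroy(p: *mut Opaque) {
    unsafe { drop(Box::from_raw(p as *mut Real)) }
}

fn main() {
    // A pointer to the opaque type is one machine word wide.
    assert_eq!(size_of::<*mut Opaque>(), size_of::<usize>());

    let p = make(7);
    unsafe {
        assert_eq!(read(p), 7);
        destroy(p);
    }
}
```

Nothing stops safe code from copying or allocating an `Opaque` here, which is the gap the RFC's "indeterminate size" category is meant to close.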

Rendered


@ftxqxd
Contributor

ftxqxd commented Jan 23, 2015

What exactly is the difference between ‘truly’ unsized types and DSTs? The only difference I can see is that DSTs use fat pointers, while ‘truly’ unsized types use thin pointers. I don’t think that’s a necessary distinction to make, because they both have the same restriction: they cannot be used without being behind a pointer. If we, for example, introduced fat pointers that have more than one word of extra data (a feature I’ve wanted a few times), they wouldn’t need extra traits for each separate size, so why should having zero bytes of extra data be any different?

@mzabaluev
Contributor Author

The only difference I can see is that DSTs use fat pointers, while ‘truly’ unsized types use thin pointers.

DSTs are dynamically sized types, meaning that the size of the value is known at run time for as long as the value exists. To reconstruct a reference to a DST from a raw pointer, one has to obtain the size. This RFC proposes to lift this restriction in cases where the size of the whole value is not needed immediately or is unknown.
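
A minimal sketch of the restriction described above, using slices as the DST: rebuilding a `&[u16]` from a raw address requires the length as well. `to_parts` and `from_parts` are hypothetical helper names; `std::slice::from_raw_parts` is the standard library call doing the work.

```rust
use std::mem::size_of;
use std::slice;

/// Splits a slice reference into the two words a fat pointer carries.
fn to_parts(s: &[u16]) -> (*const u16, usize) {
    (s.as_ptr(), s.len())
}

/// Rebuilds the reference; both the address and the length are required.
/// Safety: `ptr` must point to `len` valid, live `u16` values.
unsafe fn from_parts<'a>(ptr: *const u16, len: usize) -> &'a [u16] {
    unsafe { slice::from_raw_parts(ptr, len) }
}

fn main() {
    let data = [1u16, 2, 3, 4];
    let (ptr, len) = to_parts(&data);
    let again = unsafe { from_parts(ptr, len) };
    assert_eq!(again, &data[..]);

    // A reference to the DST is twice as wide as a reference to a sized value.
    assert_eq!(size_of::<&[u16]>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&u16>(), size_of::<usize>());
}
```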

@Ericson2314
Contributor

A huge area this would help is disambiguating function pointers and functions. Basically it would be cool if fn(A..) -> B were the type of functions themselves, and &'a fn(A..) -> B the type of function pointers, 'static being the common case. While I prefer this out of elegance alone, it could potentially really help with safe dynamic linking, etc.

There was a thread around this, but I'm afraid I can't find it. IIRC @eddyb said it wouldn't work exactly because of the unsized vs dynamically sized problem.
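
For comparison, here is how current Rust draws the line, which differs from the design suggested above: the function itself has an anonymous zero-sized "function item" type, while `fn(A) -> B` is already the pointer type. The function `double` is a hypothetical example.

```rust
use std::mem::{size_of, size_of_val};

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    // `double` names a unique, zero-sized function item type.
    assert_eq!(size_of_val(&double), 0);

    // Coercing it yields a function pointer, which is one word wide.
    let p: fn(i32) -> i32 = double;
    assert_eq!(size_of_val(&p), size_of::<usize>());
    assert_eq!(p(21), 42);
}
```

The function pointer carries no lifetime, which is why it cannot express code that may be unloaded.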

@mzabaluev
Contributor Author

@Ericson2314: you might be referring to #661.

@Ericson2314
Contributor

Hmm, that didn't have the conversation with eddyb I remember, but that's definitely one example of people wanting safer dynamic linking. Might have been a conversation about JITing, where perhaps this is even more useful (I've never heard of unlinking dynamically linked libraries).

@ftxqxd
Contributor

ftxqxd commented Jan 26, 2015

@Ericson2314 Is this discuss post on fn lifetimes the discussion you were thinking about?

@Ericson2314
Contributor

@P1start well, not exactly as I remembered, but that's probably it. Thanks!

@Diggsey
Contributor

Diggsey commented Jan 29, 2015

This is a bit of a crazy idea, but you could give DynamicSize an associated type (Ptr) which the compiler would automatically use for references to that type. This would eliminate the need for special handling of slices and trait objects within the compiler.

This type would fulfill the role of a raw pointer, *T, by implementing Deref, from which the compiler constructs an implicit reference type &T which enforces the correct borrow rules, but otherwise acts like the Ptr type. A &T then implicitly converts only to DynamicSize::Ptr, but not *T.

The Unsized bound is equivalent to DynamicSize<Ptr = *T>, i.e. a raw pointer.

@eddyb
Member

eddyb commented Jan 29, 2015

My understanding of "unsized" is that it is the result of "unsizing", which turns static type info into dynamic values. Not that I have a better name for the types that lack any size information whatsoever.

@Kimundi
Member

Kimundi commented Jan 29, 2015

I agree with @Diggsey that a generalization of DST metadata might be more worthwhile than adding special cases to the possible DST values. At least, if those special cases add additional syntax and semantics, as in this proposal.

@mzabaluev
Contributor Author

@Kimundi, what additional syntax are you referring to? The only visible changes this proposal adds are a new marker type and the DynamicSize trait, both pretty conventional as per the current syntax.

Semantics do change, but I think there are two different use cases currently lumped in with Sized:

  1. The Sized bound tells that the value has a statically known size, so it can be copied or moved.
  2. Lifting the Sized bound to accommodate DSTs, while keeping the assumption that the size of the value is known at runtime.

I'm not entirely sure that case 2 is a real concern, since an implementation of a generic trait parametric on an ?Sized type would need to specify the DST in contexts where the size is needed. So there may be little need to use the DynamicSize bound explicitly. However, I haven't gone over this in a formal way and I'm not familiar with the intricacies of the type system, so your help is appreciated with proving or disproving my assumption.

@Diggsey
Contributor

Diggsey commented Jan 29, 2015

@mzabaluev
My suggestion is in no way an objection to this RFC - I think this has a chance of being accepted for 1.0 which would be great, while mine obviously doesn't, and I think with this RFC in place, expanding on DynamicSize can be done in a backward compatible way.

With regard to case 2: there was a PR somewhere to allow "size_of_val" and friends to work on DSTs, so the distinction between DynamicSize and !Sized is definitely needed, although there's still a bit of a grey area between the cases of "raw pointer, but can figure out the size at runtime (not necessarily in an efficient way)" vs "raw pointer, can't figure out size/type has no concept of size".
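
As a point of reference, `std::mem::size_of_val` does accept DSTs in current Rust. The sketch below wraps it in a hypothetical generic helper, `bytes`, to show that a `?Sized` bound alone is enough to ask for the size of a slice or `str` at run time.

```rust
use std::mem::size_of_val;

/// Returns the size in bytes of any value, sized or dynamically sized.
fn bytes<T: ?Sized>(v: &T) -> usize {
    size_of_val(v)
}

fn main() {
    assert_eq!(bytes("hello"), 5); // 5 bytes of UTF-8
    assert_eq!(bytes(&[1u32, 2, 3][..]), 12); // 3 elements of 4 bytes each
    assert_eq!(bytes(&7u64), 8); // a sized value works too
}
```

A type with no discoverable size could not satisfy this helper, which is the case the `DynamicSize` bound is meant to exclude.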

@Kimundi
Member

Kimundi commented Jan 29, 2015

@mzabaluev: I meant the addition of the additional marker types and traits as additional "syntax".

In my opinion, there is no difference between unsized and dynamically sized - in both cases being sized refers to knowing the size at compile time, and in both cases there is some runtime mechanism for finding it out.

Whether that runtime mechanism involves storing the size directly (slices), or in the form of a vtable (trait objects), or as part of the pointed-to data structure (CStr) seems unrelated to that core distinction to me.

I'm not saying that a DST value with a thin pointer representation is not useful, I just don't think it needs to be its own "thing".
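
The first two mechanisms mentioned above can be observed in current Rust. This is a small sketch; `dyn_size` is a hypothetical helper name.

```rust
use std::fmt::Debug;
use std::mem::size_of_val;

/// Size of the concrete value behind a trait object, looked up via its vtable.
fn dyn_size(v: &dyn Debug) -> usize {
    size_of_val(v)
}

fn main() {
    // Slice: the length is stored in the fat pointer itself.
    let xs: &[u8] = &[10, 20, 30];
    assert_eq!(xs.len(), 3);

    // Trait object: the same static type `&dyn Debug` reports different
    // sizes depending on which vtable the fat pointer carries.
    assert_eq!(dyn_size(&1u8), 1);
    assert_eq!(dyn_size(&[0u64; 4]), 32);
}
```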

@Diggsey
Contributor

Diggsey commented Jan 29, 2015

@Kimundi I think there is a useful difference between thin/fat pointer DSTs, which is that thin pointers can be transmuted between each other, and passed to C/C++ code as a raw pointer. It's quite easy to come up with examples where the generic constraints should be "is a thin pointer" rather than "is not a DST" (for example, compatibility with void*).
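
A minimal sketch of the `void*` case, assuming the pointer is thin: the implicit `T: Sized` bound on the hypothetical helpers `erase` and `restore` is what stands in for an "is a thin pointer" constraint today.

```rust
use std::ffi::c_void;

/// Erases the type of a thin pointer, as one would before handing it to C.
fn erase<T>(r: &mut T) -> *mut c_void {
    r as *mut T as *mut c_void
}

/// Recovers the typed reference.
/// Safety: `p` must come from `erase::<T>`, and the referent must still be
/// alive and not otherwise borrowed.
unsafe fn restore<'a, T>(p: *mut c_void) -> &'a mut T {
    unsafe { &mut *(p as *mut T) }
}

fn main() {
    let mut n = 41u32;
    let p = erase(&mut n);
    unsafe { *restore::<u32>(p) += 1 };
    assert_eq!(n, 42);
}
```

Relaxing the helpers to `T: ?Sized` would not compile for the round trip, because the cast back from `*mut c_void` has no metadata to restore.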

@Ericson2314
Contributor

In my opinion, there is no difference between unsized and dynamically sized - in both cases being sized refers to knowing the size at compile time, and in both cases there is some runtime mechanism for finding it out.

It was my understanding that this would support cases where you couldn't find out. This is needed for functions.

@mzabaluev
Contributor Author

@Kimundi even the types with a size that can be calculated from content may have sufficiently different performance characteristics for this operation (e.g. O(N) for C strings vs O(1) for DSTs) so genericity may not be desirable. Anyway, there doesn't seem to be a generic way to calculate the size of a DST value, so the only difference is whether a fat pointer is required to represent a reference.
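
To make the cost difference concrete, here is a sketch contrasting the two lookups. `c_len` is a hypothetical hand-written equivalent of C's `strlen`.

```rust
use std::os::raw::c_char;

/// Counts bytes up to (not including) the NUL terminator.
/// This walks the data, so it is O(N) in the string length.
/// Safety: `p` must point to a NUL-terminated buffer.
unsafe fn c_len(p: *const c_char) -> usize {
    let mut n = 0;
    unsafe {
        while *p.add(n) != 0 {
            n += 1;
        }
    }
    n
}

fn main() {
    let bytes = b"hello\0";

    // Thin pointer: the length has to be rediscovered by scanning.
    let thin = bytes.as_ptr() as *const c_char;
    assert_eq!(unsafe { c_len(thin) }, 5);

    // Fat pointer: the length is read straight from the reference, O(1).
    let fat: &[u8] = &bytes[..5];
    assert_eq!(fat.len(), 5);
}
```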

@mzabaluev
Contributor Author

@Diggsey If I understood your proposal about DynamicSize::Ptr correctly, the DynamicSize trait is meant for the DSTs in their current form only, and a thin-pointer requirement would still need a negative bound on that (or a positive bound on its complement provided by the compiler), right?

@Diggsey
Contributor

Diggsey commented Jan 30, 2015

@mzabaluev Originally, I was thinking of something like this:

  • All types become either Sized or DynamicSize, depending solely on whether their size is known at compile-time.
  • DynamicSize has an associated type, Ptr
  • What used to be "Unsized" is now just "DynamicSize<Ptr = *T>", i.e. it uses a thin pointer. (edit: it's been pointed out that even *T is not always a thin pointer, see below for alternative)

However, it might make more sense like this:

  • The Unsized trait has an associated type, Ptr
  • All types whose size is known at compile-time implement Sized, unless they opt out
  • Unsized is implemented automatically for all T: Sized, with Ptr = *T
  • All other types must implement Unsized directly
  • Some !Sized types may implement RuntimeSized, which has a method to calculate the size of a value at runtime. This would include all current DSTs.
  • All Sized types implement RuntimeSized automatically.

So the useful bounds become:

  • Has thin pointer => size_of(::Ptr) == size_of(isize)
  • Has compile-time size => T: Sized
  • Has runtime size => T: RuntimeSized
  • And the complements of the above, using !

And for any type, its pointer type can be obtained via:

  • ::Ptr
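
The second scheme above can be modelled in user code as a rough sketch. The trait names `HasPtr` and `RuntimeSized`, the tuple used as the slice's pointer representation, and `has_thin_ptr` are all hypothetical; the real proposal would have the compiler provide these automatically.

```rust
use std::mem::size_of;

/// Hypothetical: every type names the pointer representation used to refer to it.
trait HasPtr {
    type Ptr;
}

/// Hypothetical: types whose size can be computed from a value at run time.
trait RuntimeSized {
    fn runtime_size(&self) -> usize;
}

// A sized type: thin pointer, size known statically.
impl HasPtr for u32 {
    type Ptr = *const u32;
}
impl RuntimeSized for u32 {
    fn runtime_size(&self) -> usize {
        size_of::<u32>()
    }
}

// A current DST: the pointer representation carries a length as well.
impl HasPtr for [u8] {
    type Ptr = (*const u8, usize);
}
impl RuntimeSized for [u8] {
    fn runtime_size(&self) -> usize {
        self.len()
    }
}

/// "Has thin pointer", expressed as a size comparison on the associated type.
fn has_thin_ptr<T: HasPtr + ?Sized>() -> bool {
    size_of::<T::Ptr>() == size_of::<isize>()
}

fn main() {
    assert!(has_thin_ptr::<u32>());
    assert!(!has_thin_ptr::<[u8]>());
    assert_eq!(7u32.runtime_size(), 4);
    assert_eq!([1u8, 2, 3][..].runtime_size(), 3);
}
```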

@SSheldon

"DynamicSize<Ptr = *T>", ie. uses a thin pointer

@Diggsey, fyi *T is not guaranteed to be a thin pointer; for types with fat references, *T is also fat (like *str, *Trait, *[T])
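
A short sketch of this point in current syntax (`*const T` rather than the older `*T`): the raw pointer to a `str` keeps the length, so the reference can be rebuilt from it without supplying a length again.

```rust
use std::mem::size_of;

fn main() {
    let s: &str = "hello";

    // The raw pointer carries the length alongside the address.
    let p: *const str = s;

    // Rebuilding the reference needs nothing beyond the raw pointer.
    let back: &str = unsafe { &*p };
    assert_eq!(back, "hello");

    // Raw pointers to DSTs are as wide as the corresponding references.
    assert_eq!(size_of::<*const str>(), size_of::<&str>());
    assert_eq!(size_of::<*const str>(), 2 * size_of::<usize>());
}
```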

@Diggsey
Contributor

Diggsey commented Jan 30, 2015

@SSheldon Ah, I didn't realise that - I've updated my previous post to reflect that.

@mzabaluev
Contributor Author

@Diggsey:

Has thin pointer => size_of(::Ptr) == size_of(isize)

That's quite a mouthful, and I don't think current Rust allows expressions as bounds. But if bounds like that could actually be used, a convenience trait could be provided to assert it.
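
Expressions still cannot be used as trait bounds, but in later versions of Rust a similar check can be written as a `const fn` and asserted at compile time. This is a sketch of that alternative, not of the convenience trait mentioned above; `is_thin` is a hypothetical helper.

```rust
use std::mem::size_of;

/// Hypothetical helper: true when pointers to `T` are a single machine word.
const fn is_thin<T: ?Sized>() -> bool {
    size_of::<*const T>() == size_of::<usize>()
}

// Evaluated at compile time; a failing assertion is a build error.
const _: () = assert!(is_thin::<u32>());
const _: () = assert!(!is_thin::<str>());

fn main() {
    assert!(is_thin::<u32>());
    assert!(!is_thin::<[u8]>());
    assert!(!is_thin::<dyn std::fmt::Debug>());
}
```

Unlike a trait bound, this check cannot appear in a `where` clause, so it guards a specific instantiation rather than constraining a generic parameter.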

@alexcrichton
Member

Given #738 it looks like we may end up removing all of the various marker structs, in which case adding a new NotSized may stick out a bit.

Perhaps we could beef up the compiler to consider types unsized such as:

struct CStr {
    data: c_char,
    marker: Phantom<[c_char]>,
}

In this case we'd basically be saying that, for type analysis, the compiler should consider CStr as containing [c_char], but for representation purposes it only has one c_char field.

(just a thought)

@pnkfelix
Member

pnkfelix commented Feb 5, 2015

postponing for post 1.0; cannot spend time thinking about this.

(Also, RFC is thin on details, at least for a change of this size.)

pnkfelix closed this Feb 5, 2015
pnkfelix added the "postponed" label Feb 5, 2015
petrochenkov added the "T-lang" label and removed the "postponed" label Feb 24, 2018