Add tuple cons cell syntax #1582

Closed
wants to merge 4 commits into from

Conversation

canndrew
Contributor

@canndrew canndrew commented Apr 17, 2016

Add syntax for expressing tuples as a head and tail pair, similar to a Lisp cons cell.

Rendered

@comex

comex commented Apr 17, 2016

One possible alternative syntax is (a, b, ...more_elms), like JavaScript, but that would conflict with the inclusive ranges RFC (not stable though). Ruby and Python use * for this, which would definitely be unsuitable here for obvious reasons.

@nagisa
Member

nagisa commented Apr 17, 2016

If you’re proposing syntax changes/additions/removals, please strive to describe the changes in terms of changes to the grammar in the RFC as well.

@petrochenkov
Contributor

Yay, variadic generics.
I think tuples shouldn't be considered in isolation, the solution should cover all heterogeneous type/value lists, with function parameters/arguments being, probably, the most important.

@glaebhoerl
Contributor

I feel like there must be some kind of complications around this but I can't think of them at the moment. Maybe they'll come to me later. If there actually aren't then let's definitely do this.

Is there any use case where you would ever want to use this outside of trait impls?

@nrc nrc added the T-lang Relevant to the language team, which will review and decide on the RFC. label Apr 17, 2016
@canndrew
Contributor Author

canndrew commented Apr 18, 2016

> One possible alternative syntax is ...

Thanks, I've updated the RFC to mention other syntaxes.

> If you’re proposing syntax changes/additions/removals, please strive to describe the changes in terms of changes to the grammar in the RFC as well.

You mean supply a diff for parser-lalr.y with the RFC?

> Yay, variadic generics.

Ah, I'd forgotten about that RFC.

> I think tuples shouldn't be considered in isolation, the solution should cover all heterogeneous type/value lists, with function parameters/arguments being, probably, the most important.

Good point. The RFC as it is would at least get you some of the way there because it would be possible to impl<H, T: Tuple> Fn<(H; T)> for Foo

> Is there any use case where you would ever want to use this outside of trait impls?

Not that I can think of. You do need to be able to pattern-match on generically sized tuples in order to write those trait impls though (eg. let (ref head; ref tail) = arg).

@ahicks92

It occurred to me that this could be extended for structs, perhaps with some sort of brace-based syntax (I consider the syntax unimportant because I have a feeling this would be a separate RFC).

Basically, in some sense, a struct is itself a tuple, with types equivalent to the types of the fields as declared in the struct. If the syntax were to be extended to structs, it would provide a way to implement traits that apply if all contained fields of a struct implement a specific trait. This would be hard to work with for traits that actually do things, but might be useful for marker traits (but then we need a syntax to opt out, which I'm not sure we have).

Possibly I'm saying the same thing @petrochenkov is.

On another note, this gets us to within 6 inches of variadic functions and let me be the first to say that I'll certainly be entering the race to submit that RFC, should this one get accepted. Speaking as a C++ programmer, I don't use them often. But they're a godsend when they're needed.

@bluss
Member

bluss commented Apr 18, 2016

The type (A, B, C) is laid out like a struct, and it does not in general contain a subrange of it that has identical representation with the tuple (B, C). Because of that you can not in general take a reference to the tail.

Example: (u8, u8, u64) does not contain a subrange with the same representation as (u8, u64).
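To make the point above concrete, here is a minimal sketch that measures where the fields sit. It models the two tuple types as `#[repr(C)]` tuple structs, which is an assumption made only so that the layout is well defined; real tuples use the default representation, whose field order and padding are unspecified. The names `Triple`, `Pair`, `tail_gap_in_triple` and `gap_in_pair` are made up for illustration.

```rust
// Models of the two tuple types with a fixed, well-defined layout.
// `#[repr(C)]` is an assumption for this demo; real tuples use the default
// representation, whose field order and padding are unspecified.
#[repr(C)]
struct Triple(u8, u8, u64);

#[repr(C)]
struct Pair(u8, u64);

/// Byte distance from the `u8` that would start the tail to the `u64`,
/// measured inside the three-element model.
fn tail_gap_in_triple() -> usize {
    let t = Triple(1, 2, 3);
    let start = &t.1 as *const u8 as usize;
    let end = &t.2 as *const u64 as usize;
    end - start
}

/// The same distance inside a standalone `(u8, u64)` model.
fn gap_in_pair() -> usize {
    let p = Pair(2, 3);
    let start = &p.0 as *const u8 as usize;
    let end = &p.1 as *const u64 as usize;
    end - start
}

fn main() {
    println!("u8 -> u64 distance inside Triple's tail: {}", tail_gap_in_triple());
    println!("u8 -> u64 distance inside Pair:          {}", gap_in_pair());
}
```

The two distances differ by one byte in this model, so a pointer to the second field of the triple cannot be reinterpreted as a pointer to a well-formed pair.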

@canndrew
Contributor Author

@bluss That sucks. Is the representation of tuples set in stone?

@canndrew
Contributor Author

Also, not sure whether to close this and just turn it into a comment on #376. It's the same basic idea as that RFC just done a little bit differently.

@bluss
Member

bluss commented Apr 18, 2016

The representation is not set in stone. That the tuple (u8, u8, u64) would use 24 bytes instead of 16 doesn't sound like a workable alternative though.

@canndrew
Contributor Author

Would a workable alternative be to be a bit cleverer about how we do padding? We could represent a (u8, u64) as 16 bytes where byte[0..7] are dummy bytes, byte[7] is the u8 and byte[8..16] are the u64. Then we can represent a (u8, u8, u64) the same way except with the first u8 at byte[6].

@bluss
Member

bluss commented Apr 18, 2016

@canndrew It's an interesting idea and it fixes some cases, but the rule does not handle this example

type T = (u32, u8, u8, u16);
type Tail = (u8, u16);

The tail of T would point to the last four bytes, the front byte being "padding" but also a value byte of the part outside of that (u8, u8, u16), which means that a mutable pointer to Tail could overwrite the value byte outside the tail.

This kind of cons-tail scheme seems to be much easier to work out for value-only destructuring (which may require copying). Just notice today how much (A, B, C) and (A, (B, C)) can differ in size depending on the types involved, they are definitely not represented the same way.
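The size difference between flat and nested forms can be checked with a short sketch. It again uses `#[repr(C)]` models rather than real tuples (an assumption, because default tuple layout is unspecified), and the names `Cons`, `Flat3`, `Flat4`, `Nested3`, `Nested4` and `sizes` are hypothetical.

```rust
use std::mem::size_of;

// A pair that always keeps its tail as one contiguous, self-contained field.
#[allow(dead_code)]
#[repr(C)]
struct Cons<H, T>(H, T);

// Flat models of (u8, u8, u64) and (u32, u8, u8, u16).
#[allow(dead_code)]
#[repr(C)]
struct Flat3(u8, u8, u64);

#[allow(dead_code)]
#[repr(C)]
struct Flat4(u32, u8, u8, u16);

// The same element types expressed as nested head/tail pairs.
type Nested3 = Cons<u8, Cons<u8, u64>>;
type Nested4 = Cons<u32, Cons<u8, Cons<u8, u16>>>;

/// Returns (flat size, nested size) for the three- and four-element examples.
fn sizes() -> [(usize, usize); 2] {
    [
        (size_of::<Flat3>(), size_of::<Nested3>()),
        (size_of::<Flat4>(), size_of::<Nested4>()),
    ]
}

fn main() {
    for (flat, nested) in sizes() {
        println!("flat: {flat} bytes, nested: {nested} bytes");
    }
}
```

On a typical 64-bit target the first pair should come out as 16 versus 24 bytes, matching the figures quoted earlier in the thread; in both cases the nested form is larger because every tail has to be padded as a complete value.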

@canndrew
Contributor Author

> which means that a mutable pointer to Tail could overwrite the value byte "outside" the tuple.

Not if we ensure that modifications to a tuple never modify any "padding" bytes at the start of the tuple. I imagine this might cause extra overhead some of the time. For example it might be more efficient to modify the u16 of a (u16, u32) by overwriting the first 32 bits.

@bluss
Member

bluss commented Apr 18, 2016

It's incompatible with all the tools of a low-level language like Rust, which assume they can, for example, memcpy over the whole memory of a value (like std::mem::swap does).
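The clobbering problem can be simulated safely with plain byte buffers. In the sketch below an 8-byte array stands in for a `(u32, u8, u8, u16)` laid out as in the example above, and the last four bytes stand in for a `(u8, u16)` tail whose first byte is nominally padding. The byte values and the names `Image`, `swap_tails` and `clobber_demo` are made up for illustration; this is a model of the idea, not of what the compiler does.

```rust
/// Byte image of a `(u32, u8, u8, u16)`: bytes 0..4 hold the u32, byte 4 the
/// first u8, byte 5 the second u8, bytes 6..8 the u16.
type Image = [u8; 8];

/// Swap the 4-byte tail regions wholesale, the way a memcpy-based swap of two
/// `(u8, u16)` values would, including the byte the tail regards as padding.
fn swap_tails(a: &mut Image, b: &mut Image) {
    a[4..8].swap_with_slice(&mut b[4..8]);
}

/// Returns byte 4 of each image after the swap. That byte is the first `u8`
/// of the outer tuple, which is not logically part of the tail.
fn clobber_demo() -> (u8, u8) {
    let mut a: Image = [0xA0, 0xA1, 0xA2, 0xA3, 0x11, 0x22, 0x33, 0x34];
    let mut b: Image = [0xB0, 0xB1, 0xB2, 0xB3, 0x44, 0x55, 0x66, 0x67];
    swap_tails(&mut a, &mut b);
    (a[4], b[4])
}

fn main() {
    let (a4, b4) = clobber_demo();
    println!("byte 4 of a: {a4:#04x}, byte 4 of b: {b4:#04x}");
}
```

After the swap, byte 4 of each buffer holds the other buffer's value, so a field outside the tail has been changed through what was supposed to be a tail-only operation.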

@canndrew
Contributor Author

Right, of course. Well that screws this whole plan then as far as I can see. You can't implement &self methods on a (H; T) if you can't take a reference to the T.

@glaebhoerl
Contributor

This also sounds potentially related to #1397

@bluss
Member

bluss commented Apr 18, 2016

It seems like this cannot be done with Rust's regular semantics for references to parts of a compound type; instead, variadic generics need their own syntax and semantics.

@canndrew
Contributor Author

@bluss Actually, if we implemented the suggestion in the link @glaebhoerl posted it seems like we could do this. Just lay out tuples the other way around from what I was proposing.

@eddyb
Member

eddyb commented Apr 18, 2016

There is an unused syntax, (head, tail...), because x... doesn't offer much value over x.. (maybe with floating-point it would include Infinity itself, but do we really need that?).

I like the solution for the representation incompatibility, seems easier than my contraptions to turn e.g. &(A, B, C) into (&A, &B, &C) (which work, mind you, but are still non-trivial).

I'm cc-ing @ubsan, as they and I have experimented with (working) prototypes of variadic generics, one of mine being https://play.rust-lang.org/?gist=030b0b74b3a72eecd09d567368a592bb&version=nightly.

What's really cool in that one is that CloneFn is for<T> <T as Clone>::clone and we might be able to keep that polymorphism on function item types so that you can do .map(Clone::clone) on a tuple.

@eddyb
Member

eddyb commented Apr 18, 2016

Ah, I should've known it was too good to be true.
Do you plan to waste space on unnecessary padding to make this happen?
That is, the (u32, u8, u8, u16) example above would have to be laid out in the exact same way that (u32, (u8, (u8, u16))) is today, which has 2 extra padding bytes.

In that case, I prefer my "take references to all fields" contraptions.

@canndrew
Contributor Author

@eddyb It works with #1397. But you have to either lay out the tuple fields in reverse order or treat tuples as a prefix+final_element pair rather than a head+tail pair.

@eddyb
Member

eddyb commented Apr 19, 2016

@canndrew I see your point about the reverse layout order. Seems interesting to say the least. It makes sense that to add an element (the head) and keep the rest (the tail), you need to add it at the end.

As for (Prefix..., Final) being the default way of working on tuples, it doesn't sound tractable, since most implementations I can think of need left-to-right evaluation.

@canndrew
Contributor Author

Yeah I'd prefer reverse layout order.

@HaronK

HaronK commented Apr 21, 2016

I'm wondering why we need extra syntax (";" or "...") for this feature at all.
What we can do is just group extra elements in a tuple:

let (head, tail) = (1, 2, 3, "4");         // 1. head = 1, tail = (2, 3, "4")
let (head1, head2, tail) = (1, 2, 3, "4"); // 2. head1 = 1, head2 = 2, tail = (3, "4")
let (head1, head2, tail) = (1, 2, 3);      // 3. head1 = 1, head2 = 2, tail = 3
let (head1, head2, tail) = (1, 2);         // 4. head1 = 1, head2 = 2, tail = ()

Am I missing some Rust feature that prevents this?

@eddyb
Member

eddyb commented Apr 21, 2016

@HaronK You can't do that without knowing you have a type mismatch first, so what if you need to retcon it?
Making the following work is impossible in the current type-checking system:

let x = (1, 2, 3);
let y: &(_, _) = &x;

You also still need a syntax for expanding a tuple value/type in-place and (variadic) function arguments.

@HaronK

HaronK commented Apr 21, 2016

I understand that it's not possible right now; that's why we have this RFC. I'm only suggesting that, instead of extending the tuple syntax, we update the current type-checking system to group the tail part of the tuple.
(I have updated an example code in my previous comment)

@burdges

burdges commented Oct 8, 2016

There is obviously a lot more one could do at compile time. For a start, it's quite obnoxious that mem::size_of::<T>() cannot be used when declaring types. It'd be nice if one could write:

struct BytesOf<T>([u8; mem::size_of::<T>()]);

In general, there are way too many libraries using [u8] instead of [u8; n] when they actually know the size n at compile time. I think every endianness library falls into this. And too many with lists of impls for different n. I'd think addressing those issues with [;] should be a higher priority than anything with tuples, but some RFC for that exists if I recall.
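For readers on a toolchain newer than this thread, part of that wish can be sketched with const generics. The example assumes a compiler with stable const generics and a `const fn` `size_of`; the fully generic `[u8; mem::size_of::<T>()]` form is, as far as I know, still not accepted on stable. The names `Bytes`, `U32_SIZE` and `u32_bytes` are hypothetical.

```rust
use std::mem;

/// A byte buffer whose length is part of its type.
struct Bytes<const N: usize>([u8; N]);

/// For a concrete type, the size can be computed in a constant expression.
const U32_SIZE: usize = mem::size_of::<u32>();

/// Little-endian bytes of a `u32`, with the length tracked in the type.
fn u32_bytes(x: u32) -> Bytes<{ U32_SIZE }> {
    Bytes(x.to_le_bytes())
}

fn main() {
    let b = u32_bytes(0x0102_0304);
    println!("{:?}", b.0);
}
```

This covers the "library knows n at compile time" case for concrete types; it does not give a `BytesOf<T>` that works for every `T`.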

About tuples though : I doubt one can impose alignment requirements on tuples.

Instead, one might do tuple cons backwards, so if t : (A,B,C,D) then let (..foo, bar, baz) = t gives foo:(A,B) and bar: C and baz: D, so no alignment issues for foo. There are alignment issues with bar and baz too, but presumably those could be handled as with total destructuring today. If that were not sufficient, maybe bar and baz could be given some special invisible Unaligned<T> type from which a properly aligned T gets moved, copied, etc. whenever anything happens.
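As an analogy only, arrays on newer stable compilers already support this "prefix plus trailing elements" shape through slice patterns. The sketch below uses a homogeneous array, so it says nothing about tuple layout or alignment; `split_back` is a made-up name.

```rust
/// Split a four-element array into a two-element prefix and the last two
/// elements, mirroring the `(..foo, bar, baz)` shape suggested above.
fn split_back(t: [i32; 4]) -> ([i32; 2], i32, i32) {
    // `foo @ ..` binds the leading elements as a smaller array.
    let [foo @ .., bar, baz] = t;
    (foo, bar, baz)
}

fn main() {
    let (foo, bar, baz) = split_back([1, 2, 3, 4]);
    println!("foo = {foo:?}, bar = {bar}, baz = {baz}");
}
```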

As for looping over tuples at compile time, @Amanieu's generic closures sound way nicer than static while, etc. You'll frequently need temporary trait and impls, which gets slightly C++ish.

@ahicks92

ahicks92 commented Oct 8, 2016

There's an RFC for type-level integers which I can never find quickly, and thus I don't have the link.

My work on layout which is maybe 2/3rds finished if I'm lucky will make changing the representation of a tuple a trivial operation. It will also probably make a tuple not necessarily ordered in memory to match the field indexes.

Miri could allow for compile-time maps and (with generic closures) folds of constant tuples. I can't quite envision how you write the types though; it seems like the constraints on the closure that the mapping function takes would have to somehow match those on the closure as written at the call site.

@NXTangl

NXTangl commented Oct 10, 2016

@burdges Yeah, we really need constexpr. I myself also like the idea of <const Param: T> for statically known parameters.

@mglagla
Contributor

mglagla commented Oct 19, 2016

> There's an RFC for type-level integers which I can never find quickly, and thus I don't have the link.

@camlorn Do you mean the const-dependent type system RFC?

@ahicks92

@mglagla
Yeah, but GitHub does everything by numbers and Google is bad at finding things by half-remembered title.

Someone needs to work out how to make a system like this come up with memorable identifiers.

@withoutboats
Contributor

@rfcbot fcp postpone

As of this writing we have 34 open lang RFCs. People who follow many RFCs may have noticed that there's been an effort recently to move many of the open RFCs to a resolution, especially if they haven't seen recent activity.

This RFC has had basically no activity since June, and isn't likely to be prioritized as a result of the 2017 roadmap. However, this RFC is a big proposal, taking steps to solve a big problem (variadic generics), and so I want to be emphatic that I'm proposing postponement only because it seems likely to sit quietly open for a good long while before we take further action on it. Once solving this problem is a more immediate term priority, I'd be glad to see it re-opened.

@withoutboats
Contributor

My own opinion is that it seems like the solution proposed by this RFC is improperly generalized.

The problem this RFC is trying to solve is "implementing a trait for all tuples." This RFC solves that problem, but the connection between the problem and the proposed syntax is not obvious. The syntax is general purpose, but as far as I can tell it would be dubious to use the syntax for anything other than implementing traits for all tuples.

Basically, I wonder if there is a solution to this problem that is more targeted on implementing traits for tuples. Unless we have other motivations for this syntax, I don't want to add overly general features because that has its own complexity cost.

I do think this RFC takes the right approach to solving variadic generics though - start with solving the most common problem (implementing a trait for all tuples), before moving to the wider problem.

@rfcbot
Collaborator

rfcbot commented Dec 7, 2016

Team member @withoutboats has proposed to postpone this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@petrochenkov
Contributor

Status update on "static for" mentioned above:
It turns out this is already possible to do in C++ with recently introduced fold expressions!
This program goes through a heterogeneous list and prints all its elements:

#include <iostream>

template<typename... T> 
void print(T... items) { 
  ([&](auto&& item) { 
      std::cout << item << std::endl; 
  }(items),...); 
}

int main() {
    print(10, 11.1, "abc");
}

The lambda's operator() is monomorphized several times for each item's type and all its versions are called in order. And everything is imperative, no recursion or specialization in user code, yay.
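Rust has no fold expressions, but a rough counterpart of the C++ program can be sketched with a `macro_rules!` repetition that calls one generic function per item, so that the function is monomorphized once per element type. This is only an approximation at the call site, not a variadic generic; `show`, `render_all` and `rendered` are hypothetical names.

```rust
use std::fmt::Display;

/// Generic over the item type; instantiated once per distinct argument type.
fn show<T: Display>(item: T) -> String {
    item.to_string()
}

/// Expands to one `show` call per argument, in order.
macro_rules! render_all {
    ($($item:expr),* $(,)?) => {
        vec![$( show($item) ),*]
    };
}

/// Renders the same heterogeneous list as the C++ example.
fn rendered() -> Vec<String> {
    render_all!(10, 11.1, "abc")
}

fn main() {
    for line in rendered() {
        println!("{line}");
    }
}
```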

@eddyb
Member

eddyb commented Dec 7, 2016

@petrochenkov I've mentioned generic closures for this before, I also think it has a lot of potential.

@rfcbot
Collaborator

rfcbot commented Jan 6, 2017

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Jan 6, 2017
@dckc

dckc commented Jan 11, 2017

I see a lot of discussion in the abstract with A and B and such, but more concrete examples would make the motivation clearer. Are there examples of projects that suffer for the lack of this?

@dckc

dckc commented Jan 11, 2017

Tuples are product types where cons cell lists are sum types. My intuition says mixing them together is asking for trouble.

The construct proposed here reminds me of HList, which, IIRC, requires HKT.

@eddyb
Member

eddyb commented Jan 11, 2017

HList doesn't require HKT AFAIK, there are implementations in Rust, and tuples are just that, but they're missing composition and decomposition tools.
The "cons cell" analogy is unfortunate, I agree. I prefer spread/capture operators myself.
As for usecases: variadic generics and variadic functions. Implementing traits for any tuple.
In particular, Fn* traits currently use a very hacky version of this to be variadic.
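For reference, a minimal HList can be written in ordinary Rust without HKT, roughly as follows. This is a sketch and not the API of any particular crate; `HNil`, `HCons`, `DescribeAll` and `hlist_demo` are names chosen for illustration. Because the tail is a real field, a `&self` method can recurse into it, which is exactly what a flat tuple cannot offer.

```rust
use std::fmt::Debug;

/// The empty list.
struct HNil;

/// A head element followed by the rest of the list.
struct HCons<H, T> {
    head: H,
    tail: T,
}

/// A trait implemented for every list whose elements are all `Debug`.
trait DescribeAll {
    fn describe_all(&self, out: &mut Vec<String>);
}

impl DescribeAll for HNil {
    fn describe_all(&self, _out: &mut Vec<String>) {}
}

impl<H: Debug, T: DescribeAll> DescribeAll for HCons<H, T> {
    fn describe_all(&self, out: &mut Vec<String>) {
        out.push(format!("{:?}", self.head));
        // Borrowing the tail is fine: it is stored as one contiguous field.
        self.tail.describe_all(out);
    }
}

/// Builds a three-element heterogeneous list and describes each element.
fn hlist_demo() -> Vec<String> {
    let list = HCons {
        head: 1u8,
        tail: HCons {
            head: "two",
            tail: HCons { head: 3.5f64, tail: HNil },
        },
    };
    let mut out = Vec::new();
    list.describe_all(&mut out);
    out
}

fn main() {
    println!("{:?}", hlist_demo());
}
```

Two impls cover lists of any length; what is missing is a built-in way to go between this nested form and a flat tuple.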

@scottmcm
Member

I find the "change the layout of tuples" solution very, very surprising.

It feels totally reasonable to me to expect that simd::i32x4, (i32, i32, i32, i32), and [i32;4] are all layout-compatible in the .0-is-[0] way.

I would be very confused to find my code breaking or its performance changing because I went from a (A,B,C) to a TupleStruct(A,B,C) or MyStruct{a:A,b:B,c:C}.

For consistency and simplicity, I hope that it always remains true that "a tuple is the same as the equivalent struct with 0, 1, ... for the field names".

@glaebhoerl
Contributor

@scottmcm I don't believe the layout of homogenous tuples would be impacted?

I think there is a completely principled reason for tuples and structs to have different representation: one is ordered, the other is not. Just like you presumably don't find it surprising that HashMap and BTreeMap are represented differently and perform differently. Copying my comment on this from earlier in the thread...

... while the alternative of laying out tuples less efficiently is shaky on "don't sacrifice potential performance ever ever" grounds, I do think it could pass the "don't pay for what you don't use" test. Heretofore, tuples and structs have been semantically equivalent, so it makes perfect sense that they would use the same representation. With this change, tuples would now be more powerful than structs: they would not merely be "anonymous structs", as they had been before, but heterogenously-typed lists which can be iterated over. Furthermore, struct fields are unordered, while tuple fields would have a strict ordering not just syntactically but now semantically as well. Just as unordered collections often get to use more efficient representations than ordered ones, it also makes sense that unordered structs would get to use more efficient representations than ordered tuples. But we should collect statistics and benchmarks about the impact on existing real-world Rust code if we choose to pursue this option. My suspicion is that the impact would be negligible - tuples with more than two elements and of different types are already an uncommon case, I think.

@canndrew
Contributor Author

Tuples wouldn't have to be strictly ordered under this proposal. The fundamental unit here is pairs, and pairs can be ordered either which way the compiler chooses. For an n-tuple that still gives the compiler 2^(n-1) different layout options (rather than n! in the case of structs).

i.e. for a tuple (a, b, c), any layout that keeps b and c adjacent is admissible.

@scottmcm
Member

scottmcm commented Jan 17, 2017

Something else I noticed: RFC 1506 uses and extends the "a tuple struct is equivalent to a braced struct with integer literal field names" idea: https://github.com/rust-lang/rfcs/blob/master/text/1506-adt-kinds.md.

@glaebhoerl The RFC says "Secondly, we layout tuples in reverse order". I don't see any note about homogeneous sections being different.

Also, it's not completely true that "one is ordered, the other is not". The last field in a braced struct is already special: only the last field of a struct may have a dynamically sized type. The drop order (currently and proposed) is also dependent on declaration order.

@rfcbot
Collaborator

rfcbot commented Jan 23, 2017

The final comment period is now complete.

@aturon
Member

aturon commented Jan 23, 2017

I'm going to close this RFC as postponed. While there's desire to eventually support this kind of tuple-generic programming, there are many thorny issues and in terms of the roadmap the feature doesn't rank high enough to devote the necessary bandwidth at this time.

@aturon aturon closed this Jan 23, 2017
@cramertj
Member

@eddyb Looking back on this, it seems like most of the disagreements were focused around the tuple representation and the new syntax. Is there any reason not to reopen a more minimal proposal based on your playground example? It seems like this approach avoids the representation issue, and any special syntax for pattern matching out the head of a tuple could be implemented as sugar for Split::unpack.

P.S. If such a proposal were to move forward, I'd also highly suggest implementing Cons<T> and Split for arrays of type [T; n] to allow for generic array impls without needing to wait for type-level numerics.
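A by-value version of this idea might look like the sketch below. The thread mentions `Split::unpack` but does not reproduce the playground code, so the trait shape here is an assumption, as are `impl_split!` and `TupleLen`; the macro stands in for whatever compiler support a real proposal would provide, and only covers arities one through four.

```rust
/// Hypothetical trait: split a tuple into its first element and the rest.
trait Split {
    type Head;
    type Tail;
    fn unpack(self) -> (Self::Head, Self::Tail);
}

// Implements `Split` for one arity; the type parameter names double as
// binding names inside the destructuring pattern.
macro_rules! impl_split {
    ($h:ident $(, $t:ident)*) => {
        impl<$h, $($t),*> Split for ($h, $($t,)*) {
            type Head = $h;
            type Tail = ($($t,)*);
            #[allow(non_snake_case)]
            fn unpack(self) -> (Self::Head, Self::Tail) {
                let ($h, $($t,)*) = self;
                ($h, ($($t,)*))
            }
        }
    };
}

impl_split!(A);
impl_split!(A, B);
impl_split!(A, B, C);
impl_split!(A, B, C, D);

/// A trait written once, by recursion on head and tail, for every tuple
/// that implements `Split`.
trait TupleLen {
    const LEN: usize;
}

impl TupleLen for () {
    const LEN: usize = 0;
}

impl<T> TupleLen for T
where
    T: Split,
    T::Tail: TupleLen,
{
    const LEN: usize = 1 + <T::Tail as TupleLen>::LEN;
}

fn main() {
    let (head, tail) = (1u8, 2.5f64, "x").unpack();
    println!("head = {head}, tail = {tail:?}");
    println!("len = {}", <(u8, f64, &str) as TupleLen>::LEN);
}
```

Because `unpack` moves the tuple apart rather than borrowing into it, this form sidesteps the representation question raised earlier, at the cost of working by value.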

@eddyb
Member

eddyb commented Jan 25, 2017

> to allow for generic array impls without needing to wait for type-level numerics

I thought so too, a long time ago, but we're actually closer to the latter at this point.

@cramertj
Member

@eddyb Even if so, I think there are many cases where it's useful to be generic over both tuples and arrays, and getting both with one trait impl seems like a great advantage to this approach.

That aside, is there a reason not to have these traits in the language? The only downside I can think of to avoiding them is that it wouldn't be immediately obvious to library consumers that a Split trait impl means they can use tuples.
