Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undisambiguated generic arguments in expressions #2527

Closed
wants to merge 5 commits into from

Conversation

varkor
Copy link
Member

@varkor varkor commented Aug 21, 2018

Make disambiguating generic arguments in expressions with :: optional, allowing generic arguments to be specified without :: (making the "turbofish" notation no longer necessary).

Rendered

This makes the following valid syntax:

struct Nooper<T>(T);

impl<T> Nooper<T> {
    fn noop<U>(&self, _: U) {}
}

fn id<T>(t: T) -> T {
    t
}

fn main() {
    id<u32>(0u32); // ok
    let n = Nooper<&str>(":)"); // ok
    n.noop<()>(()); // ok
}

Thanks to @Centril, @scottmcm, @joshtriplett and @rpjohnst for feedback!

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Aug 21, 2018
@KamilaBorowska
Copy link
Contributor

KamilaBorowska commented Aug 21, 2018

What about statements like this?

let a = (b<c, d>(e));

This can either mean a tuple

let a = ((b < c), (d > e));

Or a pair of generic arguments.

let a = b::<c, d>(e);

@varkor
Copy link
Member Author

varkor commented Aug 21, 2018

@xfix: it seems we had overlooked that ambiguity before, thanks for pointing it out. This looks pretty difficult to resolve to me, so I'm going to close the RFC unless someone has a resolution. It was a nice idea while it lasted 🙁

@varkor varkor closed this Aug 21, 2018
@eddyb
Copy link
Member

eddyb commented Aug 21, 2018

@xfix Aaaaah, that's what I couldn't remember, which made this infeasible last time!
I should've asked @bstrie I guess, they might have remembered.

@Centril
Copy link
Contributor

Centril commented Aug 21, 2018

This was a sad day! 😭

@varkor
Copy link
Member Author

varkor commented Aug 21, 2018

I've opened rust-lang/rust#53562 to ensure that any such attempt to unroot the Turbofish in the future shall have an earlier warning than we did.

@novacrazy
Copy link

Additionally, at least for me personally, turbofish is less ambiguous for mentally parsing code as well, so I’ve never minded the slightly more verbose syntax. I also really don’t like this recent trend of oversimplification in Rust just to save a few characters...

@varkor
Copy link
Member Author

varkor commented Aug 21, 2018

@novacrazy: I think you'll find that there's always been a trend of reducing characters in Rust (take a look at all the keywords) 😉
Regardless, the motivation here wasn't to reduce characters, but to make the syntax more consistent. I think you never get a syntax everyone agrees on, but in the case of turbofish, most of the feedback I've gathered has been positive towards making it unnecessary if possible.
It's a moot point, though.

@eddyb
Copy link
Member

eddyb commented Aug 21, 2018

@varkor Have you considered allowing only f<T>(), i.e. no commas?

@Havvy
Copy link
Contributor

Havvy commented Aug 21, 2018

@eddyb I feel like allowing it in the special case of zero or one type parameters is going to cause confusion and make the grammar that much more irregular.

@BatmanAoD
Copy link
Member

I'm actually with @novacrazy on this: I think the Rust syntax is superior to the C++ syntax. Angle brackets are pretty common; disambiguation is helpful.

@L-as
Copy link

L-as commented Aug 22, 2018

This is really unfortunate, but I'd still really like a unification of providing generic arguments. The simplest way I see would just be to require turbofish everywhere, but this is quite ugly.

@artemshein
Copy link

I like the approach Kotlin takes on the disambiguation -- in case it's not obvious how to parse an expression, Kotlin will ask to put parens. No turbofish and no additional parens 99% of the time.

@comex
Copy link

comex commented Aug 22, 2018

Hmm… interesting. Kotlin doesn't have tuples (or a comma operator like C++), but you can get a similar effect with function calls:

foo(b<c, d>(e))

This appears to always be parsed as a generic, not a pair of comparisons – see this test program.

I suppose that if you did want two comparisons, you could add parentheses:

foo((b<c), d>(e))
    ^   ^

The approach Kotlin uses seems to require infinite lookahead. When encountering a <, the parser can't commit to whether to parse it as a comparison or as the start of a generic parameter list. For example, if the parser has seen this fragment of code so far:

foo(b<c,b<c,b<c,b<c,b<c,b<c,b<c, 

…these are all legal ways for the line to end:

  1. foo(b<c,b<c,b<c,b<c,b<c,b<c,b<c, 4) (all comparisons)
  2. foo(b<c,b<c,b<c,b<c,b<c,b<c,b<c, Int>>>>>>>()) (all generics)
  3. foo(b<c,b<c,b<c,b<c,b<c,b<c,b<c, Int>>>()) (a mix of comparisons and generics!!)

(To play with this, look at lookahead.kt in the last link.)

That leaves me with a few questions:

  • Are there peculiarities in Rust's grammar that would make adopting the same approach difficult or impossible?

  • Was this approach ever considered in Rust's olden days, and rejected due to infinite lookahead or some other downside? (IMO, the benefit of killing turbofish would be well worth it.) Or did nobody realize it was a possibility?

  • Adopting this in Rust would necessarily be a breaking change. Given (a<b, c>(d)), the 'default' parse must be as a generic rather than a comparison – because the comparison can be disambiguated by adding parentheses (((a<b), c>(d))) but there's nowhere to add parentheses that would make a generic the only valid interpretation. However, Rust currently does the opposite, defaulting to a comparison and requiring turbofish to force interpretation as a generic.

    Unfortunately, even if everyone suddenly agreed Rust should switch, AFAIK it would be way too late for the Rust 2018 edition. So… when's the next edition? :)

@BatmanAoD
Copy link
Member

The advantage of ::< is that it looks obviously generics-related. This seems like a clear win for readability, especially compared to the disambiguate-with-parens strategy.

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Aug 22, 2018
…nagisa

Lament the invincibility of the Turbofish

Here a test case is added to ensure that any others attempting to drive the Turbofish to extinction have second thoughts. Previously the [entire test suite would succeed](rust-lang#53511) if generic arguments were accepted without disambiguation, making for [confusing and heartbreaking circumstances](rust-lang/rfcs#2527).
@Centril
Copy link
Contributor

Centril commented Aug 22, 2018

@comex

So… when's the next edition? :)

I'm betting on 2021 :)

@artemshein
Copy link

For me :: looks obviously namespaces-related.

@BatmanAoD
Copy link
Member

BatmanAoD commented Aug 22, 2018

@artemshein I think that makes some sense, though. MyGeneric::<args...> is selecting the <args...> variant from the set of possible functions or types provided by MyGeneric. "Select x from the set X" is also how :: behaves when used on namespaces: X::x.

@artemshein
Copy link

@BatmanAoD haha, ok, maybe, but it's still ugly and doesn't reflect the generic parameters definition syntax (Option::<T>, anyone?) and introduces one more unnecessary thing to learn (big enough to have a full page of explanations like this https://matematikaadit.github.io/posts/rust-turbofish.html).

@BatmanAoD
Copy link
Member

@artemshein The "ugliness" is of course a matter of personal aesthetic preference. Personally, coming from C++, when I saw the turbofish I immediately thought "oh, great, that's an improvement!" I realize I am in the minority in this view, though.

If consistency is the goal, I would personally rather see definition syntax match expression context, i.e. pub enum Option::<T>. I haven't proposed this because I assume that there are very few people who share that aesthetic preference and that the proposal would therefore immediately be shot down.

I also wouldn't call it an "unnecessary thing to learn." Clearly, per this discussion, it is necessary for Rust, even if it's different from C++!

@L-as
Copy link

L-as commented Aug 22, 2018

Yeah I think just using turbofish everywhere might be superior. One idea I also had was since people dislike writing ::<>, change the :: part to !. Then it would be transmute!<f32>(x) and Option!<Vec!<u32>>. You could argue that it would be confused with macros, but is it really so wrong to consider generic items to be a macro of sorts, that generate a variant from the arguments? Though the change this would require might not make it worth it to switch away from the turbofish.

@BatmanAoD
Copy link
Member

@laaas I had the ! thought last night as well. I've seen it used with essentially the correct semantics (except with the reverse order) in non-programming contexts, e.g. referring to the version of Harry Potter from HPMOR as "Rational!Harry". So I think for some people, ! would be a natural way to express the concept.

Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Aug 22, 2018
…nagisa

Lament the invincibility of the Turbofish

Here a test case is added to ensure that any others attempting to drive the Turbofish to extinction have second thoughts. Previously the [entire test suite would succeed](rust-lang#53511) if generic arguments were accepted without disambiguation, making for [confusing and heartbreaking circumstances](rust-lang/rfcs#2527).
@petrochenkov
Copy link
Contributor

petrochenkov commented Aug 23, 2018

chained comparisons are now banned in Rust

I assumed that they are banned now so they can be allowed in the future in their intuitive meaning - a < b < c == a < b && b < c.
In this case we cannot reuse the syntax for generic arguments.

backtracking is already present for some (uncommon) cases in the Rust parser

All these cases exist purely for diagnostics and can be removed without affecting Rust grammar.

This feature also directly interferes with const generic parameters.
When they are implemented, even the most innocent stuff like x < 10 && y > 10 will start looking like generic arguments.

@varkor
Copy link
Member Author

varkor commented Aug 23, 2018

I assumed that they are banned now so they can be allowed in the future in their intuitive meaning - a < b < c == a < b && b < c.
In this case we cannot reuse the syntax for generic arguments.

That was one of the motivations, but note that chained comparisons in the same direction (e.g. a < b < c) are unambiguous with generics. The syntax that is not is a < b > c: I think a strong case could (and would) be made against this syntax for chained comparisons as it is far less readable than a < b && c < b (or equivalent). So I don't think this conflicts with generic arguments.

@petrochenkov:
Is this backtracking for diagnostic use only:
https://github.com/rust-lang/rust/blob/e9e7e53c31a3d1169ac81f0e82ca9168167ce711/src/libsyntax/parse/parser.rs#L3136-L3145
I couldn't tell from a quick look, but it looked plausibly like it could be used for non-error situations.

This feature also directly interferes with const generic parameters.
When they are implemented, even the most innocent stuff like x < 10 && y > 10 will start looking like generic arguments.

Expressions that are const generic arguments must be enclosed within a block, therefore no additional ambiguity over the tuple case given above is created.

@petrochenkov
Copy link
Contributor

petrochenkov commented Aug 23, 2018

Is this backtracking for diagnostic use only

Yes, if self.parse_ty_no_plus() (which is the only interpretation according to the grammar) fails, then we return an error on every path.

Expressions that are const generic arguments must be enclosed within a block, therefore no additional ambiguity over the tuple case given above is created.

Text and examples in that RFC contradict to itself, the precise rules were written out somewhere in the discussion thread, but were never moved into the RFC text properly.
The important part is that anything starting with a literal is certainly an expression and is unambiguously parsed as a const argument, so things like Type<1, 1 + 2> work without requiring braces.

@comex
Copy link

comex commented Aug 23, 2018

Well, x < 10 && y > 10 wouldn’t be ambiguous because 10 can’t appear after a type or expression.

However, x < 10 && y > -10 would be...

…that is, under the parsing rule for const parameters that you describe. But it’s worth noting that const generics are not stable (or even fully implemented) yet, so it would be possible to change the rule back to what’s described in the RFC. It’s certainly nice to minimize the need for braces in const parameters, but on the other hand, “expressions that start with literals but reference variables later on” seems like a relatively narrow category.

@comex
Copy link

comex commented Aug 23, 2018

...that said, if x<10 && y> - 10 (i.e. the const generics interpretation) would otherwise be valid syntax, and would thus conflict with x < 10 && y > -10, the grammar could and should still resolve the ambiguity in favor of the latter. That’s because in this case, both alternatives have a place to add parentheses to force them: you could force the first interpretation by writing x<{10 && y}> - 10, or force the second with (x < 10) && (y > -10). Thus, either option would be an acceptable default, in the sense of not making anything impossible to express.

Since it’s rather unlikely that 10 && y would typecheck (albeit not impossible), it would clearly be better to default to the other interpretation.

kennytm added a commit to kennytm/rust that referenced this pull request Aug 24, 2018
…nagisa

Lament the invincibility of the Turbofish

Here a test case is added to ensure that any others attempting to drive the Turbofish to extinction have second thoughts. Previously the [entire test suite would succeed](rust-lang#53511) if generic arguments were accepted without disambiguation, making for [confusing and heartbreaking circumstances](rust-lang/rfcs#2527).
@estebank
Copy link
Contributor

Is this backtracking for diagnostic use only:
https://github.com/rust-lang/rust/blob/e9e7e53c31a3d1169ac81f0e82ca9168167ce711/src/libsyntax/parse/parser.rs#L3136-L3145
I couldn't tell from a quick look, but it looked plausibly like it could be used for non-error situations

@varkor If you look at all 3 possible branches after that, the parser will always emit an error, either the original error or the specific error suggesting to use parenthesis around the expression. That code path could be removed and the compiler would start accepting that code as valid (which is what rust-lang/rust#42578 originally did), but the agreement was reached that it was a bad idea to change the grammar in that way.

I would love it if with the code you wrote in rust-lang/rust#53578, we could add the same kind of suggestions for people that may not have the full grasp of how to use the turbofish yet. We would not be fixing the ergonomics issue, but we'd be fixing part of the discoverability problem.

@jorgbrown
Copy link

At the risk of moving the anger from one fanaticism to another, may I suggest that < be considered to mean "less than" only if it is surrounded by whitespace on both sides? And likewise that < be considered to be generic-oriented only if it is NOT surrounded by whitespace on both sides?

Then this x<10 && y> - 10 would be interpreted as generics-code, while this x < 10 && y > -10 would be considered to be comparative-code. I know this will upset those who insist that whitespace shouldn't matter, but for most people I know who read code, their brain is already interpreting space, or the lack of space, in this way.

@varkor
Copy link
Member Author

varkor commented Mar 8, 2021

At the risk of moving the anger from one fanaticism to another, may I suggest that < be considered to mean "less than" only if it is surrounded by whitespace on both sides? And likewise that < be considered to be generic-oriented only if it is NOT surrounded by whitespace on both sides?

This was discussed on Zulip at one point. I thought this could be a reasonable design decision under the circumstances, but there are those (including on the lang team) who don't agree, so I don't think this changes the impasse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.