Overhaul MacArgs
#96546
This commit rearranges the `match`. The new code avoids testing for `MacArgs::Eq` twice, at the cost of repeating the `self.print_path()` call. I think this is worthwhile because it puts the `match` in a more standard and readable form.
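The shape of that rearrangement can be sketched as follows. This is only an illustration under stated assumptions: the `MacArgs` here is a simplified stand-in (the real enum carries spans and token streams), and `print_path`, `print_before`, and `print_after` are hypothetical helpers that write into a `String`, not the actual pretty-printer code.

```rust
// Simplified stand-in for illustration; not the real rustc `MacArgs`.
enum MacArgs {
    Empty,
    Delimited(String),
    Eq(String),
}

// Hypothetical helper standing in for `self.print_path()`.
fn print_path(out: &mut String, path: &str) {
    out.push_str(path);
}

// Before: `Empty` and `Eq` share an arm, so `Eq` has to be tested a second time.
fn print_before(path: &str, args: &MacArgs) -> String {
    let mut out = String::new();
    match args {
        MacArgs::Delimited(tokens) => {
            print_path(&mut out, path);
            out.push_str(tokens);
        }
        MacArgs::Empty | MacArgs::Eq(_) => {
            print_path(&mut out, path);
            if let MacArgs::Eq(value) = args {
                out.push_str(" = ");
                out.push_str(value);
            }
        }
    }
    out
}

// After: one arm per variant; `print_path` is repeated, but `Eq` is tested once.
fn print_after(path: &str, args: &MacArgs) -> String {
    let mut out = String::new();
    match args {
        MacArgs::Delimited(tokens) => {
            print_path(&mut out, path);
            out.push_str(tokens);
        }
        MacArgs::Empty => print_path(&mut out, path),
        MacArgs::Eq(value) => {
            print_path(&mut out, path);
            out.push_str(" = ");
            out.push_str(value);
        }
    }
    out
}
```

Both functions produce the same output for every variant; the second form simply trades one repeated call for a flat, exhaustive `match`.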
Because `Nonterminal` is the type it visits.
The `token` is always an interpolated non-terminal expression, and always a literal in valid code. This commit simplifies the processing accordingly, by directly extracting and using the literal.
The two paths are equivalent -- they both end up calling `visit_expr()`. I have kept the more restrictive path, the one that requires that `token` be an expression nonterminal. (The next commit will simplify this function further.)
Force-pushed from dde727e to 671cb71.
I don't think it's a right direction to fine-tune Conceptually the value in I'll look at the changes in detail a bit later.
The grammar says that As I mentioned above, my main motivation here is to get rid of one of the
It might still be possible to get rid of the
A token stream that must fit into the expression grammar. We actually have something pretty similar in the language, in `#[unexpanded_attribute_macro] EXPR`.
In practice, while parsing such a stream we can build an So we can do the same thing with expressions in key-value attributes and represent them as If (*) The use of
So in this PR we generally need to get rid of the assumption that the expression is always a single literal. One exception is
Isn't that trivially true for any AST fragment kind? E.g. an item is created from a token stream that must fit in the item grammar, a statement is created from a token stream that must fit into the statement grammar, etc. Or is there something special about expressions here that I'm overlooking?
Thank you for the detailed explanation. I've uploaded a new commit that addresses some of these comments.
However, to ease the transition to a possible future state that allows more than literals, I added a comment to
Force-pushed from e7c90ba to 0560567.
Yes, I meant any node under a macro attribute: `#[unexpanded_attribute_macro] NODE`. I used expression node as an example because key-value attributes also use expressions.
Thanks for the detailed review comments. I have addressed them all, except where I have explained above.
Force-pushed from ad77aae to 767ad48.
/// historical reasons. We'd like to get rid of it, for multiple reasons.
/// - It's conceptually very strange. Saying a token can contain an AST
///   node is like saying, in natural language, that a word can contain a
///   sentence.
`proc_macro`-style tokens contain "sentences" (delimited groups) in them all the time, but for flattened `rustc_parse`-style tokens it's unusual, I guess.
When parser librarification was still in plans, one of the alternatives was to switch `rustc_parse` to the `proc_macro`-style token model.
The main reason to get rid of `Interpolated` tokens, from my point of view, is that we want to define everything happening during macro expansion in terms of token streams. So we have to treat the AST pieces as token streams at least in some cases; moreover, the token stream representation is the source of truth for them.
So these AST pieces are supposed to be a parser caching mechanism at most, which is highly non-obvious when they are represented like this.
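To make the "caching mechanism" point concrete, here is a minimal sketch, assuming toy `Expr` and `Token` types invented for this illustration (they are not the rustc definitions). An `Interpolated` token holds an already-parsed expression, and `flatten` recovers the plain token stream that the fragment stands for. Parentheses and operator precedence are deliberately ignored.

```rust
// Hypothetical, heavily simplified types for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(String),
    Add(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Token {
    Lit(String),
    Plus,
    // A "word that contains a sentence": a cached, already-parsed AST fragment.
    Interpolated(Box<Expr>),
}

// The token stream that an expression stands for (no parentheses or precedence handling).
fn expr_to_tokens(expr: &Expr) -> Vec<Token> {
    match expr {
        Expr::Lit(text) => vec![Token::Lit(text.clone())],
        Expr::Add(lhs, rhs) => {
            let mut tokens = expr_to_tokens(lhs);
            tokens.push(Token::Plus);
            tokens.extend(expr_to_tokens(rhs));
            tokens
        }
    }
}

// Replace every interpolated fragment with the tokens it caches.
fn flatten(stream: &[Token]) -> Vec<Token> {
    let mut out = Vec::new();
    for token in stream {
        match token {
            Token::Interpolated(expr) => out.extend(expr_to_tokens(expr)),
            other => out.push(other.clone()),
        }
    }
    out
}
```

In this toy model, flattening a stream that contains an interpolated `1 + 2` yields the three plain tokens, which is the sense in which the token stream, not the AST fragment, is the source of truth.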
You're talking about token trees, rather than tokens, right?
> The main reason to get rid of Interpolated tokens from my point of view ...

That's a good reason as well :)
r=me after squashing commits (all of this is basically changing representation of |
The value in `MacArgs::Eq` is currently represented as a `Token`. Because of `TokenKind::Interpolated`, `Token` can be either a token or an arbitrary AST fragment. In practice, a `MacArgs::Eq` starts out as a literal or macro call AST fragment, and then is later lowered to a literal token. But this is very non-obvious. `Token` is a much more general type than what is needed.

This commit restricts things, by introducing a new type `MacArgsEqKind` that is either an AST expression (pre-lowering) or an AST literal (post-lowering). The downside is that the code is a bit more verbose in a few places (though also shorter in some other places). The benefit is that it makes it much clearer what the possibilities are. Also, it removes one use of `TokenKind::Interpolated`, taking us a step closer to removing that variant, which will let us make `Token` impl `Copy` and remove many "handle Interpolated" code paths in the parser.

Things to note:
- Error messages have improved. Messages like this:
  ```
  unexpected token: `"bug" + "found"`
  ```
  now say "unexpected expression", which makes more sense. Although arbitrary expressions can exist within tokens thanks to `TokenKind::Interpolated`, that's not obvious to anyone who doesn't know compiler internals.
- In `parse_mac_args_common`, we no longer need to collect tokens for the value expression.
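A minimal sketch of the idea, under stated assumptions: `Lit` and `Expr` are toy stand-ins for the AST types, the variant names `Ast` and `Hir` are chosen here for illustration, and `render` and `lower` are hypothetical helpers rather than functions from the compiler. Macro calls, which the commit message says can also appear before lowering, are left out to keep the example small.

```rust
// Toy stand-ins for illustration; not the real rustc AST types.
#[derive(Debug, Clone, PartialEq)]
struct Lit(String);

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(Lit),
    Add(Box<Expr>, Box<Expr>),
}

// Either an expression (pre-lowering) or a literal (post-lowering).
// The variant names are an assumption made for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum MacArgsEqKind {
    Ast(Expr),
    Hir(Lit),
}

// Hypothetical pretty-printer used only to build the error message.
fn render(expr: &Expr) -> String {
    match expr {
        Expr::Lit(Lit(text)) => text.clone(),
        Expr::Add(lhs, rhs) => format!("{} + {}", render(lhs), render(rhs)),
    }
}

// Hypothetical lowering step: a literal expression becomes a literal,
// anything else is reported as an unexpected expression.
fn lower(value: MacArgsEqKind) -> Result<MacArgsEqKind, String> {
    match value {
        MacArgsEqKind::Ast(Expr::Lit(lit)) => Ok(MacArgsEqKind::Hir(lit)),
        MacArgsEqKind::Ast(other) => {
            Err(format!("unexpected expression: `{}`", render(&other)))
        }
        MacArgsEqKind::Hir(lit) => Ok(MacArgsEqKind::Hir(lit)),
    }
}
```

Because the enum has only two variants, every `match` on it spells out exactly which states are possible, which is the readability gain the commit message describes compared with matching on a general `Token`.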
Force-pushed from 767ad48 to baa18c0.
I have squashed the relevant commits. @bors r=petrochenkov
📌 Commit baa18c0 has been approved by
☀️ Test successful - checks-actions
Finished benchmarking commit (4c60a0e): comparison url. Summary:
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression
…chenkov

Overhaul `MacArgs`

Motivation:
- Clarify some code that I found hard to understand.
- Eliminate one of the three places where `TokenKind::Interpolated` values are created.

r? `@petrochenkov`
@@ -1536,11 +1536,20 @@ pub enum MacArgs {
    Eq(
        /// Span of the `=` token.
        Span,
        /// "value" as a nonterminal token.