[RFC] Named macro capture groups #3649
Co-authored-by: Jacob Lifshay <[email protected]>
Aren't there still some subtle questions around nesting one has to worry about? Specifically, the RFC doesn't seem to say anything about nested cases like
Although I cannot remember the exact use-case, I remember wanting something like this to name an optional trailing comma so that I could forward it on to another macro as part of the expansion. 👍
Great point, the scope part is tricky - I think what you said is correct. It is probably best if I just add an appendix of examples that do and don't work, with reasoning, especially given your comments on Zulip about the existing macro mental model not being very clear (I completely agree).
A bunch of examples would be great, but they are not a substitute for also describing the general rules that they all follow from.
( $group1:( $a:ident ),+ ) => {
    $group1( println!("{}", $a); )+
}
When you write this, I thought you were defining a metavariable $group1 which matches a comma-separated list of $a:ident, and that using $group1 would expand to foo,bar,baz,etc. But no, the $group1 is just a label 🤔
The rationale mentions that using : is meant "to be more consistent with existing fragment specifiers", but IMO similarity with fragment specifiers is exactly why this RFC should not use the current syntax.
I prefer getting rid of : (my main source of confusion) and using a label syntax like
macro_rules! make_functions {
    (
        names: [ $'names $($name:ident),+ ],
        //       ^~~~~~~~~
        $'greetings $( greeting: $greeting:literal, )?
    ) => {
        $'names $(
            fn $name() {
                println!("function {} called", stringify!($name));
                $'greetings $(println!("{}", $greeting) )?
            }
        )+
    }
}

macro_rules! innermost1 {
    ( $'outer_rep $($a:ident: $'inner_rep $($b:literal),* );+ ) => {
        [$'outer_rep $( $'inner_rep $( ${index($'outer_rep)}, )* )+]
    };
}
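For comparison, here is a minimal sketch of nested repetition as it works in today's `macro_rules!`, where groups have no names and each repetition in the expansion is driven by the metavariables it contains. The macro name `flatten_pairs` and the helper `flattened` are made up for illustration and are not part of the RFC.

```rust
// Stable-Rust sketch: nested repetitions without labels.
// `flatten_pairs` and `flattened` are hypothetical names.
macro_rules! flatten_pairs {
    // Outer group: `name: lit, lit, ...` entries separated by `;`.
    // Inner group: the comma-separated literals of one entry.
    ( $( $a:ident : $( $b:literal ),* );+ ) => {
        vec![
            // The outer repetition is tied to `$a`, the inner one to `$b`.
            // There is no way to refer to either group by name.
            $( $( (stringify!($a), $b), )* )+
        ]
    };
}

fn flattened() -> Vec<(&'static str, i32)> {
    flatten_pairs!(x: 1, 2; y: 3)
}
```

Calling `flattened()` yields one pair per inner literal, each tagged with the name from its outer entry. The labels in the snippets above would make the outer/inner association explicit instead of inferring it from which metavariables appear inside each `$( ... )`.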
Metavariables matching some parts of the input are an interesting idea in general.
If we also introduce "match exactly once repetitions", then we'll also be able to capture just arbitrary token streams.
It would be nice to also be able to bind a name to "repeated exactly once" groups, aka just arbitrary substreams in the input.
// macro LHS, capture "exactly once" repetition
$my_tokens:(a b $var c d)①
// macro RHS, reemit `a b value_for_var c d`
$my_tokens
Thanks for the ideas - I updated the default syntax to be $group1(...) (no colon) and added the label syntax as a possibility.
It would be nice to also be able to bind a name to "repeated exactly once" groups, aka just arbitrary substreams in the input.
Is there a standard Kleene operator for a single capture, or could we get by with no Kleene operator at all? I could probably include a proposal for single captures in this RFC, to be accepted as part of it or rejected separately.
In regexes, a "capture exactly once" group is just (...).
After writing some more examples, it seems like "capture once" and "emit the entire capture" would be pretty useful, so I just included them as part of the proposal. They are especially useful for code that just needs to be validated and passed elsewhere, or for matching and reemitting optional keywords/tts like mut, &, or pub(crate) (just allowing capture groups that don't contain metavars is enough for that, but $pub_crate in the expansion is nicer to read than $pub_crate(pub(crate))).
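As a rough illustration of the "optional keyword" case in today's `macro_rules!`: a repetition in the expansion must contain a metavariable, so a bare `$(mut)?` cannot be reemitted, and one common workaround is to duplicate the arm. The macro `declare` and the function `counter` below are hypothetical names used only for this sketch.

```rust
// Stable-Rust sketch of the workaround; `declare` and `counter` are
// hypothetical names, not taken from the RFC.
macro_rules! declare {
    // Arm for the case where the optional `mut` keyword is present.
    (mut $name:ident = $value:expr) => {
        let mut $name = $value;
    };
    // Arm for the case where it is absent; the body is otherwise identical.
    ($name:ident = $value:expr) => {
        let $name = $value;
    };
}

fn counter() -> i32 {
    declare!(mut n = 1);
    n += 1;
    declare!(step = 40);
    n + step
}
```

With a named capture for the optional `mut`, the two arms could presumably collapse into one that reemits the captured keyword, which is the readability gain described above.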
One advantage of the reemitting is that it will use spans from the input, instead of spans from the macro body.
One example I recently observed in rust-lang/rust#119412 is the mir! macro in the standard library.
It will match on something like let $expr = $expr; and then reemit $expr = $expr; as an assignment expression.
If = is emitted manually from the macro, it will get the macro span and may mess up the combined span of the assignment expression.
If = is taken from the input, then the expression will get a precise span from the input, which is good for any kind of DSL.
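A much-simplified sketch of that match-and-reemit shape in today's `macro_rules!` follows. It is not the actual `mir!` implementation; `assign` and `reassigned` are hypothetical names. Spans cannot be observed from ordinary code, so the comments only mark which tokens come from the macro body.

```rust
// Simplified sketch, not the real `mir!` macro.
// `assign` and `reassigned` are hypothetical names.
macro_rules! assign {
    // Matches `let <place> = <value>;` from the input.
    (let $place:ident = $value:expr;) => {
        // `$place` and `$value` are tokens from the input, but this `=`
        // is written in the macro body, so it carries the macro's span.
        $place = $value;
    };
}

fn reassigned() -> i32 {
    let mut x = 1;
    assign!(let x = x + 4;);
    x
}
```

If the `=` could be captured from the input and reemitted, all three parts of the assignment would carry input spans, which is the diagnostic benefit described above.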
cc @markbt
Drop the `:`. Also add an alternative proposed syntax.
Skimmed this: I feel positive on this direction.
…r-e2024, r=petrochenkov

Make `missing_fragment_specifier` an error in edition 2024

`missing_fragment_specifier` has been a future compatibility warning since 2017. Uplifting it to an unconditional hard error was attempted in 2020, but eventually reverted due to fallout. Make it an error only in edition >= 2024, leaving the lint for older editions. This change will make it easier to support more macro syntax that relies on usage of `$`.

Fixes <rust-lang#40107>

---

It is rather late for the edition but since this change is relatively small, it seems worth at least bringing up. This follows a brief [Zulip discussion](https://rust-lang.zulipchat.com/#narrow/stream/268952-edition/topic/.60.20DBD.20-.3E.20hard.20error) (cc `@tmandry`).

Making this an edition-dependent lint has come up before but there was not a strong motivation. I am proposing it at this time because this would simplify the [named macro capture groups](rust-lang/rfcs#3649) RFC, which has had mildly positive response, and makes use of new `$` syntax in the matcher. The proposed syntax currently parses as metavariables without a fragment specifier; this warning is raised, but there are no errors.

It is obviously not known that this specific RFC will eventually be accepted, but forbidding `missing_fragment_specifier` should make it easier to support any new syntax in the future that makes use of `$` in different ways. The syntax conflict is also not impossible to overcome, but making it clear that unnamed metavariables are rejected makes things more straightforward and should allow for better diagnostics.

`@Mark-Simulacrum` suggested making this forbid-by-default instead of an error at rust-lang#40107 (comment), but I don't think this would allow the same level of syntax flexibility. It is also possible to reconsider making this an unconditional error since four years have elapsed since the previous attempt, but this seems likely to hit the same pitfalls. (Possibly worth a crater run?)

Tracking:
- rust-lang#128143
This RFC proposes optional names for repetition groups in macros:
Rendered
Small Pre-RFC: https://internals.rust-lang.org/t/pre-rfc-named-capture-groups-for-macros/20883