Skip to content

Commit

Permalink
Document compatibility between declarative and procedural macro tokens
Browse files Browse the repository at this point in the history
  • Loading branch information
petrochenkov committed Mar 31, 2022
1 parent c97d14f commit 26acee2
Showing 1 changed file with 61 additions and 0 deletions.
61 changes: 61 additions & 0 deletions src/procedural-macros.md
Original file line number Diff line number Diff line change
Expand Up @@ -271,12 +271,73 @@ fn invoke4() {}
// out: item: "fn invoke4() {}"
```

### Declarative macro tokens and procedural macro tokens

Declarative `macro_rules` macros and procedural macros use similar, but
different definitions for tokens (or rather [`TokenTree`s].)

Token trees in `macro_rules` (corresponding to `tt` matchers) are defined as
- Delimited groups (`(...)`, `{...}`, etc)
- All operators supported by the language, both single-character and
multi-character ones (`+`, `+=`).
- Note that this set doesn't include the single quote `'`.
- Literals (`"string"`, `1`, etc)
- Note that negation (e.g. `-1`) is never a part of such literal tokens,
but a separate operator token.
- Identifiers, including keywords (`ident`, `r#ident`, `fn`)
- Lifetimes (`'ident`)
- Metavariable substitutions in `macro_rules` (e.g. `$my_expr` in
`macro_rules! mac { ($my_expr: expr) => { $my_expr } }` after the `mac`'s
expansion, which will be considered a single token tree regardless of the
passed expression)

Token trees in procedural macros are defined as
- Delimited groups (`(...)`, `{...}`, etc)
- All punctuation characters used in operators supported by the language (`+`,
but not `+=`), and also the single quote `'` character (typically used in
lifetimes, see below for lifetime splitting and joining behavior)
- Literals (`"string"`, `1`, etc)
- Negation (e.g. `-1`) is supported as a part of integer
and floating point literals.
- Identifiers, including keywords (`ident`, `r#ident`, `fn`)

Mismatches between these two definitions are accounted for when token streams
are passed to and from procedural macros. \
Note that the conversions below may happen lazily, so they might not happen if
the tokens are not actually inspected.

When passed to a proc-macro
- All multi-character operators are broken into single characters.
- Lifetimes are broken into a `'` character and an identifier.
- All metavariable substitutions are represented as their underlying token
streams.
- Such token streams may be wrapped into delimited groups ([`Group`]) with
implicit delimiters ([`Delimiter::None`]) when it's necessary for
preserving parsing priorities.
- `tt` and `ident` substitutions are never wrapped into such groups and
always represented as their underlying token trees.

When emitted from a proc macro
- Punctuation characters are glued into multi-character operators
when applicable.
- Single quotes `'` joined with identifiers are glued into lifetimes.
- Negative literals are converted into two tokens (the `-` and the literal)
possibly wrapped into a delimited group ([`Group`]) with implicit delimiters
([`Delimiter::None`]) when it's necessary for preserving parsing priorities.

Note that neither declarative nor procedural macros support doc comment tokens
(e.g. `/// Doc`), so they are always converted to token streams representing
their equivalent `#[doc = r"str"]` attributes when passed to macros.

[Attribute macros]: #attribute-macros
[Cargo's build scripts]: ../cargo/reference/build-scripts.html
[Derive macros]: #derive-macros
[Function-like macros]: #function-like-procedural-macros
[`Delimiter::None`]: ../proc_macro/enum.Delimiter.html#variant.None
[`Group`]: ../proc_macro/struct.Group.html
[`TokenStream`]: ../proc_macro/struct.TokenStream.html
[`TokenStream`s]: ../proc_macro/struct.TokenStream.html
[`TokenTree`s]: ../proc_macro/enum.TokenTree.html
[`compile_error`]: ../std/macro.compile_error.html
[`derive` attribute]: attributes/derive.md
[`extern` blocks]: items/external-blocks.md
Expand Down

0 comments on commit 26acee2

Please sign in to comment.