format macros #8

Open
nrc opened this issue Mar 9, 2015 · 32 comments

@nrc
Member

nrc commented Mar 9, 2015

This will be interesting...

@nrc
Member Author

nrc commented Apr 29, 2015

The strategy here will be to get the text for the macro, replace every $foo in the body of the macro with a string of the same length, e.g., xfoo. Then re-parse it and format it and convert the xfoos back to $foos (obvs we need to check that there is no xfoo in the body before substituting).

We'll need to format the decl of the macro, and find the body, I suppose, using token trees.
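The substitution step of this strategy can be sketched as follows. This is a minimal illustration, not rustfmt's actual code: the function names `mask_metavars` and `unmask_metavars` are hypothetical, the marker character is a parameter, and the parse-and-format step in the middle is left out entirely.

```rust
use std::collections::HashSet;

fn is_ident_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Replace each `$name` with `<marker>name`, which keeps the text the same
/// length. Returns None if a placeholder would collide with an identifier
/// that already appears in the body.
fn mask_metavars(body: &str, marker: char) -> Option<(String, HashSet<String>)> {
    let chars: Vec<char> = body.chars().collect();
    let mut out = String::with_capacity(body.len());
    let mut placeholders = HashSet::new();
    let mut plain_idents = HashSet::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        let is_var = c == '$' && i + 1 < chars.len() && is_ident_char(chars[i + 1]);
        if is_var || is_ident_char(c) {
            // Skip the `$` itself when this is a metavariable.
            let start = if is_var { i + 1 } else { i };
            let mut end = start;
            while end < chars.len() && is_ident_char(chars[end]) {
                end += 1;
            }
            let name: String = chars[start..end].iter().collect();
            if is_var {
                let placeholder = format!("{marker}{name}");
                out.push_str(&placeholder);
                placeholders.insert(placeholder);
            } else {
                out.push_str(&name);
                plain_idents.insert(name);
            }
            i = end;
        } else {
            out.push(c);
            i += 1;
        }
    }
    if placeholders.is_disjoint(&plain_idents) {
        Some((out, placeholders))
    } else {
        None
    }
}

/// Turn every placeholder identifier back into its `$name` form.
fn unmask_metavars(formatted: &str, placeholders: &HashSet<String>) -> String {
    let chars: Vec<char> = formatted.chars().collect();
    let mut out = String::with_capacity(formatted.len());
    let mut i = 0;
    while i < chars.len() {
        if is_ident_char(chars[i]) {
            let mut end = i;
            while end < chars.len() && is_ident_char(chars[end]) {
                end += 1;
            }
            let word: String = chars[i..end].iter().collect();
            if placeholders.contains(&word) {
                out.push('$');
                out.extend(chars[i + 1..end].iter());
            } else {
                out.push_str(&word);
            }
            i = end;
        } else {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}

fn main() {
    let body = "let y = $value+1; $callback(y)";
    let (masked, placeholders) = mask_metavars(body, 'x').expect("no collision");
    assert_eq!(masked, "let y = xvalue+1; xcallback(y)");
    // A real pipeline would parse and reformat `masked` at this point.
    assert_eq!(unmask_metavars(&masked, &placeholders), body);
    // `xfoo` already exists in the body, so masking `$foo` must be refused.
    assert!(mask_metavars("let xfoo = $foo;", 'x').is_none());
}
```

The sketch only handles plain `$name` metavariables; repetition groups such as `$( ... )*` pass through unchanged and would still need separate handling.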

@lambda-fairy

lambda-fairy commented Sep 27, 2020

Google led me here. Is it worth closing this issue now, since rustfmt does have basic macro support?

EDIT: rustfmt has a couple heuristics, but it doesn't really format macros properly in general. So this bug should stay open.

@entropylost

Would it be possible to like annotate the macro like

#[rustfmt(struct_init)]
macro_rules! foo...

where the syntax for foo! would be

foo! {
  foo: "abcd",
  bar: 1,
};

?

@calebcartwright
Member

Would it be possible to like annotate the macro like

Suppose it could be possible for those defined within the project being formatted, but practicality would be questionable since I imagine rustfmt would have to do an upfront walk of the entire tree to search for such attributed defs. Wouldn't be possible at all for defs residing outside the current project, as rustfmt wouldn't have insight into those attributes.

That bifurcation would result in different formatting in cases like a crate's documentation vs. what consumers would see in the formatted version of the instances where they call those macros.

@drbartling

Fairly new to Rust, and I'm curious: why is formatting macros in Rust hard? How does it compare to formatting macros or templates in C++?

@lambda-fairy

@drbartling A Rust macro defines its own custom syntax. So rustfmt has to either hard-code support for it, or somehow expand it to figure out what it does, or have the developer annotate the macro to teach rustfmt how to format it.

Only the latter two options can work for all use cases. But macros can be imported, so rustfmt will have to be extended to resolve names across modules/crates. This is a big step up from its current approach, where it only looks at a single file in isolation.

@timothee-haudebourg

@lambda-fairy a fourth intermediate option would be to annotate each macro invocation, instead of the macro definition, with hints on how to format it. I guess most of the time a basic "add/remove 1 tabulation level after {/}" + "insert a new line after a semicolon" could really improve readability even if it is not perfect. I agree it is not as ideal as inferring formatting from the macro definition, but it would be easier to actually implement.

@timothee-haudebourg

While I'm thinking about it, why not just leave the formatting of macro invocations untouched? As @lambda-fairy said, a Rust macro defines its own custom syntax, and rustfmt is a Rust syntax formatting tool. Why not let the writer be in charge of formatting what is inside a macro invocation?

@entropylost

Not all macros define custom syntax, e.g. println!.
Hand-formatting is especially inconvenient when the entire definition of a function is wrapped in a macro.

@dvc94ch

dvc94ch commented Jan 10, 2023

Is there a way for proc macro developers to provide rustfmt with a formatting routine?

@ytmimi
Contributor

ytmimi commented Jan 10, 2023

@dvc94ch there is no way for a macro author to tell rustfmt how to format their macro. If you'd like, you can open a new issue to discuss your specific use case in more detail.

@jkelleyrtp

As far as I know, dioxus fmt is the only project to provide macro formatting.

https://github.com/DioxusLabs/dioxus/tree/master/packages/autofmt

It would be great to allow dioxus fmt to hook into cargo fmt somehow.

@GilShoshan94

GilShoshan94 commented Mar 15, 2023

Would it be possible for a library author to add some kind of instructions telling rustfmt how to format invocations of its macros?

Maybe something like macro_rules! but for formatting. Let's call it rustfmt_rules!, defined just after the definition of the macro.

For the instructions, I was thinking of reusing some of the fragment specifiers from the macro system itself.

For example for tokio::select! it could be:

/// In tokio/src/macros/select.rs

#[macro_export]
#[cfg_attr(docsrs, doc(cfg(feature = "macros")))]
macro_rules! select {
    ...
} 

rustfmt_rules! select {
    $(biased;\n):?
    $($:pat = $:expr$(, if $:expr):? => $:expr,\n)*
    $(else => $:expr):?
}

In the rustfmt_rules! we proceed in order.

First, we can see a capturing group $( ) that is qualified with :?, which would mean an optional group that may not be present. Inside the capturing group, we have a regex biased;\n, so the formatting should be just "biased;" and a newline.

Next, we have a zero-or-more repetition group $( )*. Inside it we have a Rust pattern $:pat followed by " = " and a valid Rust expression $:expr, followed by an optional group (", if " + Rust expression), then a literal " => ", a Rust expression, "," and a newline.

On the last line, we have an optional group: "else => " + a valid Rust expression.

So this is kind of a reuse of the macro system, but instead of parsing, it would be used to inform rustfmt how to format (literal, newline, valid Rust code...)

@calebcartwright
Member

@GilShoshan94 that suggestion was previously made above, and the challenges/rationale against has similarly already been shared

@tgross35
Contributor

tgross35 commented May 25, 2023

What sort of issues would there be with only adjusting spacing around punctuation? Not applying wrapping or anything else to the content, but I think a simple subset might handle the most common mistakes:

  • No spaces before or after: . and ::
  • Space after but not before: , : ] )
  • Space before but not after: [ (
  • Space on both sides: => { } and operators
  • Replace multiple spaces with a single space
  • Don't adjust indentation within the section, but make sure the lowest indentation level within the section is one more than the macro (if multiline)
  • Collapse the entire macro call to a single line if the content is a single line and it fits within the limit; expand to macro!(\n /* content */ \n) otherwise
// before
macro!(
Baz:qux
foo =>bar;
    [ "quux",corge]);

// after
macro!(
    Baz: qux
    foo => bar;
        ["quux", corge]
);
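A rough sketch of what the per-line spacing part of this proposal could look like, to make the rules concrete. Everything here is an assumption for illustration: the `tokenize` and `respace` helpers are hypothetical, the tokenizer is deliberately naive (string literals, identifiers, `=>`, `::`, and single punctuation characters), `;` is treated like `,` even though the list does not mention it, and indentation and line wrapping are ignored.

```rust
/// Split a line into coarse tokens, dropping all whitespace.
fn tokenize(line: &str) -> Vec<String> {
    let chars: Vec<char> = line.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
            continue;
        }
        let start = i;
        if c == '"' {
            // String literal: scan to the closing quote, honoring backslash escapes.
            i += 1;
            while i < chars.len() && chars[i] != '"' {
                if chars[i] == '\\' {
                    i += 1;
                }
                i += 1;
            }
            i = (i + 1).min(chars.len());
        } else if c.is_alphanumeric() || c == '_' {
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
        } else if i + 1 < chars.len() && matches!((c, chars[i + 1]), ('=', '>') | (':', ':')) {
            i += 2;
        } else {
            i += 1;
        }
        tokens.push(chars[start..i].iter().collect());
    }
    tokens
}

/// Re-join the tokens with a single space, except where a rule forbids it.
fn respace(line: &str) -> String {
    const NO_SPACE_BEFORE: [&str; 7] = [",", ":", ";", "]", ")", ".", "::"];
    const NO_SPACE_AFTER: [&str; 4] = ["[", "(", ".", "::"];
    let mut out = String::new();
    let mut prev: Option<String> = None;
    for tok in tokenize(line) {
        if let Some(p) = &prev {
            let glue = NO_SPACE_BEFORE.contains(&tok.as_str())
                || NO_SPACE_AFTER.contains(&p.as_str());
            if !glue {
                out.push(' ');
            }
        }
        out.push_str(&tok);
        prev = Some(tok);
    }
    out
}

fn main() {
    assert_eq!(respace("Baz:qux"), "Baz: qux");
    assert_eq!(respace("foo =>bar;"), "foo => bar;");
    assert_eq!(respace("[ \"quux\",corge]"), "[\"quux\", corge]");
}
```

Taken literally, the "space before `(`" rule would also turn `call(x)` into `call (x)`, which hints at how hard it is to pick one set of by-token rules that suits every macro.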

@calebcartwright
Member

Contextual reminder: rustfmt operates on the AST, and does not directly work with input files. For macro calls, rustfmt doesn't really directly process the arg token streams either; it chucks the tokens back to rustc_parse, and if rustc_parse says those tokens look like some other type of valid Rust syntax (e.g. an expression), then rustfmt is able to apply the associated rules. This is important to keep in mind because it's not a question of "adjusting" things like existing whitespace/indentation in an input file.
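To illustrate the distinction, here is a small self-contained example with two made-up macros (`sum!` and `table!` are hypothetical and not taken from any crate). The arguments of the first call are ordinary comma-separated Rust expressions, i.e. the kind of token stream that can be re-parsed as valid Rust syntax. The arguments of the second call use a custom `key => value;` syntax that does not correspond to any standard Rust construct.

```rust
// Hypothetical macro whose arguments are ordinary comma-separated expressions.
macro_rules! sum {
    ($($x:expr),* $(,)?) => { 0 $(+ $x)* };
}

// Hypothetical macro with its own syntax: a list of `key => value;` entries.
macro_rules! table {
    ($($key:ident => $value:expr;)*) => {
        vec![$((stringify!($key), $value)),*]
    };
}

fn main() {
    // These arguments parse as a list of expressions.
    let total = sum!(1, 2 * 3, 4);
    // These arguments only make sense to the macro itself.
    let rows = table! { width => 80; depth => 4; };
    assert_eq!(total, 11);
    assert_eq!(rows, vec![("width", 80), ("depth", 4)]);
}
```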

I don't think that's behavior we'd want to drop in lieu of more simplified token-by-token processing (even if feasible), because many users/many call sites want the args to be formatted just like regular Rust code. I'm also not sure if/how well those two models could coexist, or even if one could truly define a singular set of by-token rules that would work unequivocally across all macros (my gut says that at a minimum there would be pairs of macros with conflicting needs e.g. what makes sense for html! may be contrary to the needs of some other macro).

TBH I think the only feasible path forward that maintains a cohesive formatting experience with the rest of the code is to have better support for macros that do work with valid Rust syntax (there are some challenges in the current model that are solvable, e.g. calls with token streams that can be parsed as multiple types of valid syntax, designating/handling specified macros with args that are mostly valid syntax, etc.) and then potentially starting to special-case individual macros, preferably with the majority being able to utilize a relatively small set of formatting patterns.

However, I don't expect any changes/improvements on this front for the foreseeable future; we've too little bandwidth and too many cases of valid Rust syntax that rustfmt doesn't yet support, which take priority.

@narodnik

narodnik commented Apr 8, 2024

Why not just add rust formatting for info!() macros? Or macros which are simply like function calls (except using ! in their name).

@baxterjo

What about a way for individual macro creators to dictate how their macros are formatted? Something like a trait, but specific to cargo fmt that has an optional implementation if the macro creator so chooses to implement it. I could see large framework creators like tokio-rs taking the time to implement this, while smaller crates don't have to.

@chipnertkj

I have opened a discussion thread in the Rust Internals Forum related to this issue.
If there is something constructive you could add to the discussion (criticism, possible solutions, concerns, use cases), please head through the link below.
https://internals.rust-lang.org/t/discussion-adding-grammar-information-to-procedural-macros-for-proper-custom-syntax-support-in-the-toolchain/21496

@calebcartwright
Member

calebcartwright commented Sep 6, 2024

Want to reiterate that the suggestion to let macro authors define their own formatting asked in #8 (comment) and alluded to again in the IRLO thread noted #8 (comment) has been suggested and responded to multiple times in this thread (e.g. #8 (comment))

Different tools operate at different stages of the process for varying and valid reasons based on their own respective purposes and contexts/constraints.

rustfmt operates at an early stage in the process, directly on the AST, which allows it to do things like format code that doesn't compile. Even if we momentarily presume there's a mechanism that makes it both possible and desirable for macro authors to dictate their own formatting, there's a technical problem: that information would not be accessible in the earlier stages like lexing & AST generation. It would require rustfmt to shift to some later stage (post expansion, resolution, etc.), which would in turn make certain features/capabilities no longer possible.

Furthermore, I'd just pose the question as to whether or not it would truly be desirable for consumers of a macro.

Imagine a scenario where you're using multiple 3rd party macros, each of which has authors that have widely diverging formatting that starkly contradict each other (e.g. within your own codebase you've got standard rustfmt'ed code using 4 space indents immediately followed by one macro callsite where the author forced 8 space indents that's then followed by another macro call that's got 2 space indents, etc.)

Each macro author would also ostensibly have the autonomy to change their formatting whenever they wanted (i.e. I'm not aware of any semver spec requirement that would force macro authors specifying their own formatting rules to have to do a major version bump if they changed their formatting rules). So I think you'd be forced to pin the relevant dependencies to an exact version, both to keep formatting consistent for all your contributors and to keep CI checks from failing because an ephemeral CI environment picked up a different version of a transitive dependency. Upgrades of dependencies containing macros would also potentially introduce code diffs, etc.

What about configuration options? Macro authors could conceivably want to enable configurable options for their bespoke macro formatting, and if so, where would those be defined? Would they want their users to be able to specify those in the standard rustfmt config file? How would rustfmt (and the rustfmt team) be able to marry those options correctly with whatever version of the dependency defines the macro & macro formatting rules? What if macro authors wanted their own config file?

I ask these questions somewhat rhetorically to convey some skepticism. I'm sure there's people smarter than me that could devise grand solutions for all of these, but I'm still not convinced it would be the right solution to the problem.

A big part of the approach and bounding constraints (e.g. stability guarantee) for rustfmt are centered around tenets like consistency and minimizing formatting-driven code churn. I feel like positioning rustfmt as a general purpose formatting platform would run counter to that or at a minimum create a very conceivable surface for those tenets to be directly contradicted.

@chipnertkj

chipnertkj commented Sep 6, 2024

@calebcartwright
First of all, thank you for your detailed reply and summarizing the discussion so far for me. Let's see.

that's not information that would be accessible in the earlier stages like lexing & AST generation

I understand the current constraints of rustfmt with respect to early processing stages, but aren't proc macros processed in a compilation unit separate from consuming code? Perhaps this is a misunderstanding on my part, but wouldn't defining syntax metadata/parser/etc. in the same compilation unit allow potential consumers of said data, like rustfmt, to access it at any stage of processing the inputs to a macro, as long as it is exposed somehow?

Imagine a scenario where you're using multiple 3rd party macros, each of which has authors that have widely diverging formatting that starkly contradict each other (e.g. within your own codebase you've got standard rustfmt'ed code using 4 space indents immediately followed by one macro callsite where the author forced 8 space indents that's then followed by another macro call that's got 2 space indents, etc.)

I feel it would be the responsibility of the library developer to provide a formatting experience that does not conflict with the user's needs. E.g., rustfmt doesn't force you to use specific indentation, so why should they? Though I suppose there could be edge cases where a certain kind of formatting is genuinely needed for the macro to work, in which case... This seems more like a functionality issue than a formatting one, as the mismatch would exist independently of formatting tools. I hope I understood your concern correctly.

Each macro author would also ostensibly have the autonomy to change their formatting whenever they wanted (i.e. I'm not aware of any semver spec requirement that would force macro authors specifying their own formatting rules to have to do a major version bump if they changed their formatting rules)

It is impossible to force or expect a developer to interpret semver in a specific way, but this would directly impact the user experience when working with a macro. I think it would be reasonable to expect stability guarantees similar to those provided by rustfmt. Naturally, developers will avoid introducing changes that disrupt CI processes, as stability is a key factor for adoption.

I understand and share the concern of formatting and macro definition versions being entangled, and it's not something I quite have a good solution to. Maybe establishing formal guidelines or best practices around versioning for macro formatting rules would help clarify these expectations without enforcing rigid rules. If this isn't satisfactory, perhaps a community discussion could help find a common ground. 😉

What about configuration options?

If rustfmt were to adopt such an interface, external files could be one approach, potentially with a hardcoded filename and a discovery algorithm similar to rustfmt’s own. This would avoid cluttering the existing configuration schema, and it’s just one example of how these challenges could be addressed.

I’m not advocating for a specific solution here, but rather exploring possibilities to demonstrate that practical solutions are out there. Of course, these are just initial thoughts, and community feedback would be essential in refining any approach.

I feel like positioning rustfmt as a general purpose formatting platform would run counter to that or at a minimum create a very conceivable surface for those tenets to be directly contradicted.

I sympathize with that concern. This is one of the thoughts that have been on my mind. Expanding it into a general-purpose formatting platform could indeed lead to complexities that might compromise consistency and stability.

To clarify, rustfmt is mentioned here mainly because it’s been the focal point for much of the community's requests and discussions on the topic. I’m not entirely convinced that adapting rustfmt is necessarily the best or only approach, but there’s clearly significant interest in improving the toolchain's capabilities when working with macros.

Different use cases and values exist across the community. I believe it would be beneficial to create a space where these discussions can take place. I am interested in exploring how we can address these needs without compromising on the goals of any part of the toolchain.

@calebcartwright
Member

but wouldn't defining syntax metadata/parser/etc. in the same compilation unit allow potential consumers of said data, like rustfmt, to access it at any stage of processing the inputs to a macro, as long as it is exposed somehow?

If Rust's AST contained the information then yes, any tool that works with Rust's AST would have access to that information.

Whether Rust's AST should or feasibly could contain that type of metadata, and what the associated impacts could be, is a different discussion for different people (and I know you're just trying to facilitate such a discussion, I'm just noting that we can't really speak to that beyond personal thoughts & opinions).

@ytmimi
Contributor

ytmimi commented Sep 12, 2024

I understand the current constraints of rustfmt with respect to early processing stages, but aren't proc macros processed in a compilation unit separate from consuming code? Perhaps this is a misunderstanding on my part, but wouldn't defining syntax metadata/parser/etc. in the same compilation unit allow potential consumers of said data, like rustfmt, to access it at any stage of processing the inputs to a macro, as long as it is exposed somehow?

To add to Caleb's comment above, rustfmt can really only reliably and consistently format your code based on information in the AST. rustfmt is capable of formatting code that doesn't compile (as long as it parses), and it's possible to format code for a project before it's ever been compiled. Relying on the output from compilation and potentially introducing formatting differences between pre / post compilation would not be ideal.

jeromerobert added a commit to tucanos/pytucanos that referenced this issue Nov 29, 2024