Proposal
Problem Statement
Currently, proc-macros that handle string literals receive the literal's source text, with escape sequences unprocessed and the surrounding quotes included. For example:

```rust
#[my_macro]
#[my_attr("\u{78} blabla")]
pub struct B;
```

In the `my_attr` proc-macro, the received value is `"\u{78} blabla"`, including the escape sequence and the quotes, instead of the parsed equivalent (`x blabla`). This makes working with string literals cumbersome, as proc-macro authors need to reimplement unescape logic that already exists within the Rust compiler.
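To make the burden concrete, here is a minimal sketch of the kind of unescaping a proc-macro author ends up writing by hand today. The function name `unescape_str_literal` is hypothetical and is not part of `proc_macro` or `syn`. The sketch assumes a plain (non-raw) string literal and does not handle line continuations (a backslash followed by a newline).

```rust
/// Hypothetical helper: turns the source text of a plain string literal,
/// quotes included, into the value it denotes.
/// Returns `None` for anything this sketch does not understand.
fn unescape_str_literal(raw: &str) -> Option<String> {
    // Strip the surrounding double quotes.
    let inner = raw.strip_prefix('"')?.strip_suffix('"')?;
    let mut out = String::with_capacity(inner.len());
    let mut chars = inner.chars();

    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // `c` was a backslash, so the next character selects the escape.
        match chars.next()? {
            'n' => out.push('\n'),
            'r' => out.push('\r'),
            't' => out.push('\t'),
            '0' => out.push('\0'),
            '\\' => out.push('\\'),
            '\'' => out.push('\''),
            '"' => out.push('"'),
            // `\xNN`: exactly two hex digits, ASCII range only.
            'x' => {
                let hi = chars.next()?.to_digit(16)?;
                let lo = chars.next()?.to_digit(16)?;
                let value = hi * 16 + lo;
                if value > 0x7F {
                    return None;
                }
                out.push(char::from_u32(value)?);
            }
            // `\u{...}`: one to six hex digits, underscores allowed after the first digit.
            'u' => {
                if chars.next()? != '{' {
                    return None;
                }
                let mut value: u32 = 0;
                let mut digits = 0;
                loop {
                    match chars.next()? {
                        '}' => break,
                        '_' if digits > 0 => {}
                        d => {
                            value = value * 16 + d.to_digit(16)?;
                            digits += 1;
                            if digits > 6 {
                                return None;
                            }
                        }
                    }
                }
                if digits == 0 {
                    return None;
                }
                // Rejects surrogates and values above U+10FFFF.
                out.push(char::from_u32(value)?);
            }
            // Unknown escape.
            _ => return None,
        }
    }
    Some(out)
}
```

Even this reduced version has several rules that are easy to get subtly wrong (the ASCII limit on `\x`, the digit count in `\u{...}`, surrogate rejection), which is the duplication the proposal aims to remove.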
Motivating Examples or Use Cases
- Simplifying the `syn` Library: Libraries like `syn` need to manually reimplement string literal unescaping. Having the unescape functionality available in the `proc_macro` crate would allow `syn::LitStr::value()` to use the standardized unescape function directly, leading to simplified and more reliable code.
- Consistency Across Tools: The Rust compiler already provides unescape functionality in `rustc_lexer::unescape`. Making this available publicly would ensure that tools and proc-macros handle escape sequences consistently.
- Reducing Code Duplication: Many proc-macro authors currently need to implement their own logic to handle escape sequences, resulting in duplicated code and potential inconsistencies. Exposing the compiler's unescape functionality would reduce redundancy.
Solution Sketch
- Expose Unescape Functionality in the `proc_macro` Crate: The unescape functionality from `rustc_lexer::unescape` should be exposed in the `proc_macro` crate, making it accessible for use in proc-macros.
- Public API for Literal Processing: A new API can be added to the `proc_macro` crate that allows developers to parse and unescape string literals in an ergonomic and standardized way. This would significantly simplify the process of handling string literals in attributes and proc-macros.
Alternatives
- Reimplement in Libraries: The current approach is for libraries like `syn` to reimplement the unescape logic. This is not ideal due to code duplication, maintenance burdens, and the potential for inconsistencies.
- External Crate: Instead of adding the unescape functionality to the `proc_macro` crate, another option would be to create an external crate. However, considering that this functionality is tied to parsing Rust literals, adding it to the standard library seems more suitable.
- Leave as Is: Another alternative is to continue requiring proc-macro authors to implement their own unescape logic. However, this is not desirable due to the associated complexity and inconsistency.
Additional Considerations
- Extend to All Literals: Extending this functionality to all literal kinds, such as C-strings, integers, and floats, would give proc-macro authors one consistent way to turn any literal's source text into its value, instead of a separate hand-written parser per kind.
- Refactoring to Work Outside Compiler: The `proc_macro` crate is being refactored to work even when run outside of the compiler. Therefore, the unescape functionality should be implemented in a way that does not depend on the compiler being available, so that it can be used independently of the compiler context.
- Library-First Approach: The unescape function can likely be developed as a standalone library to avoid code duplication. This suggests an opportunity to make it reusable without tight coupling to compiler internals, and to make it broadly available.
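As an illustration of the "Extend to All Literals" point, the sketch below shows what value extraction could involve for integer literals: handling the radix prefix, `_` separators, and the optional type suffix. The name `parse_int_literal` is hypothetical. The sketch depends only on the standard library, in the spirit of the library-first approach, and it drops the suffix without checking that the value fits the named type.

```rust
/// Hypothetical helper: parses the source text of an integer literal
/// such as `1_000`, `0xFF_u8`, or `0b1010i8` into its numeric value.
/// The type suffix is dropped rather than validated.
fn parse_int_literal(raw: &str) -> Option<u128> {
    // Split off the radix prefix, if any.
    let (radix, rest) = if let Some(r) = raw.strip_prefix("0x") {
        (16, r)
    } else if let Some(r) = raw.strip_prefix("0o") {
        (8, r)
    } else if let Some(r) = raw.strip_prefix("0b") {
        (2, r)
    } else {
        (10, raw)
    };

    // A suffix (`u8`, `i64`, `usize`, ...) starts at the first `u` or `i`;
    // neither letter is a digit in any of the supported radixes.
    let digits_end = rest
        .find(|c: char| c == 'u' || c == 'i')
        .unwrap_or(rest.len());

    // Remove the `_` separators before handing the digits to the parser.
    let digits: String = rest[..digits_end].chars().filter(|&c| c != '_').collect();

    // Fails on an empty digit string or on digits invalid for the radix.
    u128::from_str_radix(&digits, radix).ok()
}
```

A float or C-string counterpart would need its own rules, which is part of the argument for providing all of them from one shared, compiler-independent implementation.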