Trait ergonomics str implementation (#4233)
* feat: Implement custom string formatting for PyClass

This update brings custom string formatting for PyClass with a #[pyclass(str = "format string")] attribute. It allows users to specify how their PyClass objects are converted to string in Python. The implementation includes additional tests and parsing logic.

* update: removed debug print statements

* update: added members to ToTokens implementation.

* update: reverted to display

* update: initial tests

* update: made STR public for pyclass default implementations

* update: generalizing str implementation

* update: remove redundant test

* update: implemented compile test to validate that manually implemented str is not allowed when automated str is requested

* update: updated compile time error check

* update: rename test file and code cleanup

* update: format cleanup

* update: added news fragment

* fix: corrected clippy findings

* update: fixed mixed formatting case and improved test coverage

* update: improved test coverage

* refactor: generalized formatting function to accommodate __repr__ in a future implementation since it will use the same shorthand formatting logic

* update: Add support for rename formatting in PyEnum3

Implemented the capacity to handle renamed variants in enum string representation. Now, custom Python names for enum variants will be correctly reflected when calling the __str__() method on an enum instance. Additionally, the related test has been updated to reflect this change.

* fix: fixed clippy finding

* update: fixed test function names

* Update pyo3-macros-backend/src/pyclass.rs

Co-authored-by: Bruno Kolenbrander <[email protected]>

* Update newsfragments/4233.added.md

Co-authored-by: Bruno Kolenbrander <[email protected]>

* update: implemented hygienic calls and added hygiene tests.

* update: cargo fmt

* update: retained LitStr usage in the quote in order to preserve a more targeted span for the format string.

* update: retained LitStr usage in the quote in order to preserve a more targeted span for the format string.

* update: added compile time error check for invalid fields (looking to reduce span of invalid member)

* update: implemented a subspan to improve errors in format string on nightly, verified additional test cases on both nightly and stable

* update: updated test output

* update: updated with clippy findings

* update: added doc entries.

* update: corrected error output for compile errors after updating from main.

* update: added support for raw identifiers used in field names

* update: aligning branch with main

* update: added compile time error when `rename_all` or `name` pyclass, field, or variant args are mixed with a str shorthand formatter.

* update: removed self option from str format shorthand, restricted str shorthand format to structs only, updated docs with changes, refactored renaming incompatibility check with str shorthand.

* update: removed checks for shorthand and renaming for enums and simplified back to inline check for structs

* update: added additional test case to increase coverage in match branch

* fix: updated pyclass heighten check to validate for eq and ord, fixing Ok issue in eq implementation.

* Revert "fix: updated pyclass heighten check to validate for eq and ord, fixing Ok issue in eq implementation."

This reverts commit a37c24b.

* update: improved error comments, naming, and added reference to the PR for additional details regarding the implementation of `str`

* update: fixed merge conflict

---------

Co-authored-by: Michael Gilbert <[email protected]>
Co-authored-by: Bruno Kolenbrander <[email protected]>
Co-authored-by: MG <[email protected]>
4 people committed Jul 17, 2024
1 parent c1f524f commit 5ac5cef
Showing 10 changed files with 794 additions and 84 deletions.
1 change: 1 addition & 0 deletions guide/pyclass-parameters.md
@@ -19,6 +19,7 @@
| `rename_all = "renaming_rule"` | Applies renaming rules to all getters and setters of a struct, or all variants of an enum. Possible values are: "camelCase", "kebab-case", "lowercase", "PascalCase", "SCREAMING-KEBAB-CASE", "SCREAMING_SNAKE_CASE", "snake_case", "UPPERCASE". |
| `sequence` | Inform PyO3 that this class is a [`Sequence`][params-sequence], and so leave its C-API mapping length slot empty. |
| `set_all` | Generates setters for all fields of the pyclass. |
| `str` | Implements `__str__` using the `Display` implementation of the underlying Rust datatype or by passing an optional format string `str="<format string>"`. *Note: The optional format string is only allowed for structs. `name` and `rename_all` are incompatible with the optional format string. Additional details can be found in the discussion on this [PR](https://github.com/PyO3/pyo3/pull/4233).* |
| `subclass` | Allows other Python classes and `#[pyclass]` to inherit from this class. Enums cannot be subclassed. |
| <span style="white-space: pre">`text_signature = "(arg1, arg2, ...)"`</span> | Sets the text signature for the Python class' `__new__` method. |
| `unsendable` | Required if your struct is not [`Send`][params-3]. Rather than using `unsendable`, consider implementing your struct in a threadsafe way by e.g. substituting [`Rc`][params-4] with [`Arc`][params-5]. By using `unsendable`, your class will panic when accessed by another thread. Also note that Python's GC is multi-threaded, and while unsendable classes will not be traversed on foreign threads to avoid UB, this can lead to memory leaks. |
40 changes: 40 additions & 0 deletions guide/src/class/object.md
@@ -70,6 +70,46 @@ impl Number {
}
```

To automatically generate the `__str__` implementation using a `Display` trait implementation, pass the `str` argument to `pyclass`.

```rust
# use std::fmt::{Display, Formatter};
# use pyo3::prelude::*;
#
#[pyclass(str)]
struct Coordinate {
    x: i32,
    y: i32,
    z: i32,
}

impl Display for Coordinate {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        write!(f, "({}, {}, {})", self.x, self.y, self.z)
    }
}
```

For convenience, a shorthand format string can be passed to `str` as `str="<format string>"` for **structs only**. It expands and is passed into the `format!` macro in the following ways:

* `"{x}"` -> `"{}", self.x`
* `"{0}"` -> `"{}", self.0`
* `"{x:?}"` -> `"{:?}", self.x`

*Note: Depending upon the format string you use, this may require implementation of the `Display` or `Debug` traits for the given Rust types.*
*Note: the pyclass args `name` and `rename_all` are incompatible with the shorthand format string and will raise a compile time error.*

```rust
# use pyo3::prelude::*;
#
#[pyclass(str="({x}, {y}, {z})")]
struct Coordinate {
    x: i32,
    y: i32,
    z: i32,
}
```
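
As a rough illustration of the expansion rules listed above, the following plain-Rust sketch writes out by hand the `format!` calls that the shorthand is described as producing. It does not use PyO3 at all: the `Meters` and `Wrapper` types and the `str_equivalent` method name are hypothetical and exist only for this example, and the code actually generated by the macro may differ in detail.

```rust
// Hypothetical named-field struct mirroring the guide's `Coordinate`.
struct Coordinate {
    x: i32,
    y: i32,
    z: i32,
}

// Hypothetical tuple struct, used to illustrate positional members like `{0}`.
struct Meters(f64);

// Hypothetical struct whose field implements `Debug` but not `Display`.
struct Wrapper {
    inner: Vec<i32>,
}

impl Coordinate {
    // Hand-written equivalent of `str="({x}, {y}, {z})"`:
    // each `{name}` becomes `{}` with `self.name` as the argument.
    fn str_equivalent(&self) -> String {
        format!("({}, {}, {})", self.x, self.y, self.z)
    }
}

impl Meters {
    // Hand-written equivalent of `str="{0}"` -> `"{}", self.0`.
    fn str_equivalent(&self) -> String {
        format!("{}", self.0)
    }
}

impl Wrapper {
    // Hand-written equivalent of `str="{inner:?}"` -> `"{:?}", self.inner`.
    // The format spec after `:` is kept, so the field type needs `Debug`.
    fn str_equivalent(&self) -> String {
        format!("{:?}", self.inner)
    }
}

fn main() {
    let c = Coordinate { x: 1, y: 2, z: 3 };
    println!("{}", c.str_equivalent());
    println!("{}", Meters(2.5).str_equivalent());
    println!("{}", Wrapper { inner: vec![1, 2] }.str_equivalent());
}
```

The third case shows why the note about `Display` and `Debug` matters: `Vec<i32>` has no `Display` implementation, so a `{inner}` shorthand would presumably fail to compile, while `{inner:?}` works.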

#### Accessing the class name

In the `__repr__`, we used a hard-coded class name. This is sometimes not ideal,
1 change: 1 addition & 0 deletions newsfragments/4233.added.md
@@ -0,0 +1 @@
Added `#[pyclass(str="<format string>")]` option to generate `__str__` based on a `Display` implementation or format string.
153 changes: 151 additions & 2 deletions pyo3-macros-backend/src/attributes.rs
@@ -1,12 +1,13 @@
 use proc_macro2::TokenStream;
-use quote::ToTokens;
+use quote::{quote, ToTokens};
+use syn::parse::Parser;
 use syn::{
     ext::IdentExt,
     parse::{Parse, ParseStream},
     punctuated::Punctuated,
     spanned::Spanned,
     token::Comma,
-    Attribute, Expr, ExprPath, Ident, LitStr, Path, Result, Token,
+    Attribute, Expr, ExprPath, Ident, Index, LitStr, Member, Path, Result, Token,
 };

pub mod kw {
@@ -36,6 +37,7 @@ pub mod kw {
    syn::custom_keyword!(set);
    syn::custom_keyword!(set_all);
    syn::custom_keyword!(signature);
    syn::custom_keyword!(str);
    syn::custom_keyword!(subclass);
    syn::custom_keyword!(submodule);
    syn::custom_keyword!(text_signature);
@@ -44,12 +46,137 @@ pub mod kw {
    syn::custom_keyword!(weakref);
}

fn take_int(read: &mut &str, tracker: &mut usize) -> String {
    let mut int = String::new();
    for (i, ch) in read.char_indices() {
        match ch {
            '0'..='9' => {
                *tracker += 1;
                int.push(ch)
            }
            _ => {
                *read = &read[i..];
                break;
            }
        }
    }
    int
}

fn take_ident(read: &mut &str, tracker: &mut usize) -> Ident {
    let mut ident = String::new();
    if read.starts_with("r#") {
        ident.push_str("r#");
        *tracker += 2;
        *read = &read[2..];
    }
    for (i, ch) in read.char_indices() {
        match ch {
            'a'..='z' | 'A'..='Z' | '0'..='9' | '_' => {
                *tracker += 1;
                ident.push(ch)
            }
            _ => {
                *read = &read[i..];
                break;
            }
        }
    }
    Ident::parse_any.parse_str(&ident).unwrap()
}

// shorthand parsing logic inspiration taken from https://github.com/dtolnay/thiserror/blob/master/impl/src/fmt.rs
fn parse_shorthand_format(fmt: LitStr) -> Result<(LitStr, Vec<Member>)> {
    let span = fmt.span();
    let token = fmt.token();
    let value = fmt.value();
    let mut read = value.as_str();
    let mut out = String::new();
    let mut members = Vec::new();
    let mut tracker = 1;
    while let Some(brace) = read.find('{') {
        tracker += brace;
        out += &read[..brace + 1];
        read = &read[brace + 1..];
        if read.starts_with('{') {
            out.push('{');
            read = &read[1..];
            tracker += 2;
            continue;
        }
        let next = match read.chars().next() {
            Some(next) => next,
            None => break,
        };
        tracker += 1;
        let member = match next {
            '0'..='9' => {
                let start = tracker;
                let index = take_int(&mut read, &mut tracker).parse::<u32>().unwrap();
                let end = tracker;
                let subspan = token.subspan(start..end).unwrap_or(span);
                let idx = Index {
                    index,
                    span: subspan,
                };
                Member::Unnamed(idx)
            }
            'a'..='z' | 'A'..='Z' | '_' => {
                let start = tracker;
                let mut ident = take_ident(&mut read, &mut tracker);
                let end = tracker;
                let subspan = token.subspan(start..end).unwrap_or(span);
                ident.set_span(subspan);
                Member::Named(ident)
            }
            '}' | ':' => {
                let start = tracker;
                tracker += 1;
                let end = tracker;
                let subspan = token.subspan(start..end).unwrap_or(span);
                // we found a closing bracket or formatting ':' without finding a member; formatting the instance itself is not supported, so we report an error
                bail_spanned!(subspan.span() => "No member found, you must provide a named or positionally specified member.")
            }
            _ => continue,
        };
        members.push(member);
    }
    out += read;
    Ok((LitStr::new(&out, span), members))
}

#[derive(Clone, Debug)]
pub struct StringFormatter {
    pub fmt: LitStr,
    pub args: Vec<Member>,
}

impl Parse for crate::attributes::StringFormatter {
    fn parse(input: ParseStream<'_>) -> Result<Self> {
        let (fmt, args) = parse_shorthand_format(input.parse()?)?;
        Ok(Self { fmt, args })
    }
}

impl ToTokens for crate::attributes::StringFormatter {
    fn to_tokens(&self, tokens: &mut TokenStream) {
        self.fmt.to_tokens(tokens);
        tokens.extend(quote! {self.args})
    }
}

#[derive(Clone, Debug)]
pub struct KeywordAttribute<K, V> {
    pub kw: K,
    pub value: V,
}

#[derive(Clone, Debug)]
pub struct OptionalKeywordAttribute<K, V> {
    pub kw: K,
    pub value: Option<V>,
}

/// A helper type which parses the inner type via a literal string
/// e.g. `LitStrValue<Path>` -> parses "some::path" in quotes.
#[derive(Clone, Debug, PartialEq, Eq)]
@@ -178,6 +305,7 @@ pub type FreelistAttribute = KeywordAttribute<kw::freelist, Box<Expr>>;
pub type ModuleAttribute = KeywordAttribute<kw::module, LitStr>;
pub type NameAttribute = KeywordAttribute<kw::name, NameLitStr>;
pub type RenameAllAttribute = KeywordAttribute<kw::rename_all, RenamingRuleLitStr>;
pub type StrFormatterAttribute = OptionalKeywordAttribute<kw::str, StringFormatter>;
pub type TextSignatureAttribute = KeywordAttribute<kw::text_signature, TextSignatureAttributeValue>;
pub type SubmoduleAttribute = kw::submodule;

@@ -198,6 +326,27 @@ impl<K: ToTokens, V: ToTokens> ToTokens for KeywordAttribute<K, V> {
    }
}

impl<K: Parse + std::fmt::Debug, V: Parse> Parse for OptionalKeywordAttribute<K, V> {
    fn parse(input: ParseStream<'_>) -> Result<Self> {
        let kw: K = input.parse()?;
        let value = match input.parse::<Token![=]>() {
            Ok(_) => Some(input.parse()?),
            Err(_) => None,
        };
        Ok(OptionalKeywordAttribute { kw, value })
    }
}

impl<K: ToTokens, V: ToTokens> ToTokens for OptionalKeywordAttribute<K, V> {
    fn to_tokens(&self, tokens: &mut TokenStream) {
        self.kw.to_tokens(tokens);
        if self.value.is_some() {
            Token![=](self.kw.span()).to_tokens(tokens);
            self.value.to_tokens(tokens);
        }
    }
}

pub type FromPyWithAttribute = KeywordAttribute<kw::from_py_with, LitStrValue<ExprPath>>;

/// For specifying the path to the pyo3 crate.