Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pyo3 ClassB referencing ClassA instead of Py<ClassA> disables ClassA pyo3(set) pymethod #1358

Open
aagit opened this issue Jan 4, 2021 · 4 comments

Comments

@aagit
Copy link

aagit commented Jan 4, 2021

🐛 Bug Reports

Hello,

During the holidays I spent some spare time experimenting with pyo3. I found something didn't work as I would have expected, fom a non-rust programmer point of view, this looks really strange. Since I run out of holiday spare time, I setup a reproducer and I send this bugreport to notify about this. Overall I'm curious to hear why it is happening, if it's a fixable bug, if it's a unfixable bug or if it's just a bug in my test code.

What feels like is that python side or the rust side are silently losing writes. At the very least I'd expect to see some runtime error from either side and not just silent data loss.

Taking a look at similar cases like in tests/test_class_conversion.rs I see Py is used, that's how I got the classB2 setup and then it works, but it's not clear if it's just working by luck in such case or if Py is used precisely for that purpose of activating all BaseClass methods like pyo3(set).

🌍 Environment

  • Your python version: 3.8.6
  • How did you install python (e.g. apt or pyenv)? Did you use a virtualenv?: system python on Gentoo
  • Your Rust version (rustc --version): 1.47.0
  • Your PyO3 version: 0.13.0
  • Have you tried using latest PyO3 master (replace version = "0.x.y" with git = "https://github.com/PyO3/pyo3")?: yes

💥 Reproducing

$ git clone https://github.com/aagit/pyo3-reproducer.git
$ cd pyo3-reproducer
$ firejail --private=pwd --seccomp # optional
$ cargo build
$ ./test.py
case1 B 0
0 False
case2 A 0
1 True
case3 B 0
0 False
case4 B 0
1 True
1 False
case1 B2 0
1 True
case3 B2 0
1 True
case4 B2 0
1 True
2 True
case4 B workaround 0
1 True

All lines with False show the unexpected silent write lost.

In all False cases the pyo3(set) for member "x" of ClassA is silently not invoked when python sets b.a.x .

ClassB2 shows how Py stabilizes the python object lifetime and doing so also happen to re-activate pyo3(set) method of ClassA and then everything works as expected.

It's just not clear if Py should be necessary to make it work and why "set" doesn't trigger an error if cannot hit the memory of the real ClassA rust side.

The workaround method of using a setter for b.a.x, but implemented in ClassB, also works, showing "get" of ClassA works fine and ClassA is referenced correctly on the rust side, it's just "set" failing silently, if invoked through ClassB.

Interestingly even a = b.a ; a.x = 1 isn't enough to invoke ClassA set method on b.a.x, despite a.x modified a successfully.

ClassB2 works, but Py then prevents to derive serialize/deserialize so it's far from ideal. Or would it be possible to filter out Py<> from the serializer and put Py<> back on in the deserializer? Isn't that task part of what pyo3 can do without having to code it on the rust side?

Thanks!

@davidhewitt
Copy link
Member

Hi @aagit, thanks for the comments.

Your observations of ClassA vs Py<ClassA> and its interaction with #[pyo3(get, set)] is unfortunately expected behaviour at the moment. It is something I am aware of and very much plan to improve in the future - it's just waiting for the right contribution to PyO3 to be designed.

What happens is that #[pyo3(get, set)] must currently copy data from Rust to Python and vice versa.

This means that #[pyo3(get)] on a ClassA field will create a new Python ClassA object each time this field is read. Similarly #[pyo3(set)] will copy from the Python object into Rust.

With your chained assignment b.a.x - what happens is that b.a will create a new ClassA object, where the assignment a.x will be set, and then unfortunately it's immediately lost, as you say.

A Py<ClassA> field already represents a Python object. So in this case #[pyo3(get)] and #[pyo3(set)] just copy the Py reference and everything always refers to the same Python object. This is why using Py<ClassA> works correctly. b.a references the existing object, and so assignment to b.a.x is actually modifying that existing object.

One day I would like for #[pyo3(get)] with the ClassA field to bind to the underlying Rust data, rather than create a copy. This is technically possible but needs very careful design in pyo3's internals. The internals of PyO3 will have to use a significant amount of unsafe to transfer the borrowed data into Python without leading memory safety issues. We have not successfully designed this API yet. Eventually we'll get there.


As for Serialize / Deserialize for Py<T> - I'm beginning to think that PyO3 should have an optional serde feature to support this...

@aagit
Copy link
Author

aagit commented Jan 5, 2021

Hello,

thanks for the quick and clear explanation!

Not copying would provide all kind of memory saving and performance advantages too, so it would be a great ultimate goal and it would solve it all.

I wonder if a smaller change would be possible that would still fix the b.a.x=1 without requiring a major overhaul, otherwise the setter feature looks a slippery slope as things are now. It'd be safer not have a setter at all and consider the pyclasses are always readonly on the python side.

When you allocate the python object for ClassA with a copy of rust ClassA inside, is it possible to set a back pointer to classB python object so then you could invoke its setter to write the updated classA copy back to its original location?

The Serialize / Deserialize of Py<> would look more attractive as a short term solution if it was impossible to make setter work right without Py<> in the long term.

Thanks!

@daniil-konovalenko
Copy link
Contributor

daniil-konovalenko commented Jan 5, 2021

FWIW I ended up writing my own de/serializing module for Py<T>:

https://gist.github.com/daniil-konovalenko/41ce420a9bd1d714050bebbafcbfb7aa
(It seems to work, but I am not sure that it is correctly implemented, so any comments are very welcome)

and then told serde to use it:

#[pyclass]
#[derive(Serialize, Deserialize)]
pub struct Wallet {
   ...
}

#[pyclass(module = "models", name = "Profile")]
#[derive(Serialize, Deserialize)]
pub struct Profile {
    ...
    #[serde(with = "serde_pyo3")]
    #[pyo3(get, set)]
    wallet: Py<Wallet>,
}

@davidhewitt
Copy link
Member

Also, serde support for Py<T> is now on master - thanks @daniil-konovalenko ! If anyone is willing to try it out and send any feedback before the next PyO3 release (probably a month or so away still) that'd be very welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants