Updated memory management #543
@mhsmith, care to have a first look? Please note the TODOs still listed above.
89a4d62 to 931c352
This turned out to be a lot less invasive than I thought it would be.
I've checked the toga-chart issue that triggered this change, and this PR seems to resolve that problem.
I also did a quick check with Toga's testbed suite on macOS; that code has a bunch of manual retains and autorelease/releases. I thought they should all be balanced though - worst case, objects would be over-retained - so I was a little surprised that the testbed segfaults almost immediately (and the stack trace doesn't give any obvious pointers to what is causing the issue).
If I remove all the retains and releases, the testbed segfaults; but on inspection, some of the uses are for objects that are created in Python, then handed to ObjC to manage (e.g., Toolbar items created here, or the copyWithZone handler here). But I guess those uses of memory handling make sense - and they're a lot closer aligned to the "spirit" of ObjC memory handling. Plus, in at least the ToolbarItem case, it could be avoided by keeping the toolbar instance in the cache of items.
Related - if we land this, I suspect a version bump to 0.5 might be called for. This is just backwards incompatible enough that I think it's worth flagging the significance of the change.
Thanks for the thorough checks! I've updated the PR description to give a better summary of the change and also discuss why this should be non-breaking for most users. I'll have a closer look at the segfaults that you encountered, later. They might be caused by the usage of Maybe there are also ways to prevent users from shooting themselves in the foot, e.g., raise an exception on manual release calls if there is only a single reference left. |
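The idea of raising on a manual release that would drop the last reference can be sketched roughly as follows. This is a toy model only: `GuardedObject` and its `retain_count` attribute are hypothetical names, not Rubicon API, and real Objective-C retain counts are not reliably observable, so an actual guard would need a different bookkeeping mechanism.

```python
class GuardedObject:
    """Toy object whose only initial reference is the one owned by the Python wrapper."""

    def __init__(self):
        # Hypothetical counter: starts at 1, the reference held by the wrapper itself.
        self.retain_count = 1

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        # Refuse a manual release that would drop the wrapper's own reference.
        if self.retain_count <= 1:
            raise RuntimeError("manual release would drop the last reference")
        self.retain_count -= 1


obj = GuardedObject()
obj.retain()   # balanced manual retain ...
obj.release()  # ... and release: fine
try:
    obj.release()  # unbalanced: would free the object under the wrapper
except RuntimeError as exc:
    guard_message = str(exc)
```

Balanced retain/release pairs pass through unchanged; only the release that would leave the wrapper pointing at a freed object is rejected.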
Thanks, this looks great. I'm busy today, but I'll take a look at this as soon as I can.
I've had a closer look now at the segfaults and could identify two cases where they happen:
beeware/toga#2978 contains all the changes that I found to (1) prevent segfaults and (2) remove now unneeded manual memory management.
Co-authored-by: Malcolm Smith <[email protected]>
6648cd0 to f0edb5b
If we do things right, i.e., the way that ARC would handle it, it would indeed remove the need for special handling on the NSImage "init failed" issue.
Agreed, I think we would represent different objects by different Python wrappers instead of swapping out the pointer under the hood.
I've looked through ARC docs and found the relevant places to explain the Basically, all methods in the
Getting those semantics wrong seems to me the root cause of beeware/toga#2978.
Nice find - that's a helpful reference. Probably worth adding a link to that in a docstring somewhere as well. If I'm reading that right (and especially the "Rationale" block):
(1) seems like an edge case, but it's also not hard to accommodate. (2) also seems like an edge case, but one that's a little harder to implement, so if that was bumped to a "known limitation" with documentation and an open feature ticket, I'd be OK with that.
Speaking with @mhsmith in person, he pointed out that if we create a new Python object for the This is all true; my question is how much we care, balanced against the complexity and efficiency of the implementation. Rubicon is a Python wrapper around ObjC; I don't think it's unreasonable to expect our users to need a basic understanding of ObjC, especially if we document these edge cases. "Don't use the alloc ptr once it's been init'd" is one of those edge cases. And the use case for keeping a pointer to the The only thing that gives me pause is the performance aspect. If every If the fix to rewrite the pointer for those cases is <10 lines of code with no real other risks, then we might as well take the win. But if it's more complex than that, or there's any question over whether there are edge cases in getting the caching right, then the simple solution of "just create 2 objects and document the limitation" is something I can live with. |
That's a good point. Sending messages to an uninitialized object will segfault for most selectors (with a few useful exceptions such as Another option could be to explicitly track / flag uninitialized objects.
I am trying out a solution at the moment that would indeed only create 2 Python objects if the pointer changes, keeping with what ObjCInstances are supposed to represent. I am however still running into test failures in Python 3.12 + macOS 15 which are a bit concerning.
I think you are reading this correctly. Though my first stab is certainly less sophisticated than that.
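The "one wrapper per pointer, a second wrapper only if init returns a different pointer" idea could look roughly like this. It is a simplified sketch: `ToyWrapper` and `toy_init` are hypothetical names, and plain integers stand in for object pointers.

```python
import weakref


class ToyWrapper:
    """Toy wrapper cached per pointer value; the pointer never changes after creation."""

    _cache = weakref.WeakValueDictionary()

    def __new__(cls, ptr):
        existing = cls._cache.get(ptr)
        if existing is not None:
            # Same pointer: reuse the existing Python object.
            return existing
        self = super().__new__(cls)
        self.ptr = ptr
        cls._cache[ptr] = self
        return self


def toy_init(allocated, returned_ptr):
    """Simulate sending init to an alloc'd object; init may return another pointer."""
    return ToyWrapper(returned_ptr)


allocated = ToyWrapper(0x1000)
same = toy_init(allocated, 0x1000)       # init returned the same object
replaced = toy_init(allocated, 0x2000)   # init returned a different object
```

In the common case where init returns the pointer it was given, no second Python object is created; the alloc'd wrapper is never rewritten to point elsewhere.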
I'm not familiar enough with Rubicon to suggest the best way to implement this, but yes, this is generally what I had in mind.
Nothing should be
It looks like this is answered by the specification Sam found, along with the converse question of "are there methods that do start with
Again, we shouldn't call
It's a little more complicated than that:
Or for a more realistic example, any method beginning with the string The spec also gives a few other conditions for deciding whether a method is in a family, but I don't know whether it's possible to check them at runtime, or how likely they are to matter in practice. |
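A name-only check along the lines of the ARC selector-family rule might be sketched like this: ignoring leading underscores, the first selector component must either equal the family name or start with it followed by a character that is not a lowercase letter. The function name `selector_family` is hypothetical, and the sketch deliberately ignores the spec's additional non-name conditions mentioned above.

```python
# Method families whose names carry ownership semantics under ARC.
FAMILIES = ("alloc", "copy", "mutableCopy", "new", "init")


def selector_family(selector):
    """Return the family a selector name falls into, or None (name-based check only)."""
    first = selector.split(":")[0].lstrip("_")
    for family in FAMILIES:
        if first.startswith(family):
            rest = first[len(family):]
            # Either an exact match, or the next character is not a lowercase letter.
            if not rest or not rest[0].islower():
                return family
    return None


examples = {
    name: selector_family(name)
    for name in ("init", "initWithFrame:", "_initPrivate", "initialize", "copyWithZone:", "newton")
}
```

Under this rule `initWithFrame:` is in the init family, while `initialize` and `newton` are in no family because the family name is followed by a lowercase letter.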
I don't have enough experience with Objective-C to know this either, so I've implemented the name-based checks for now. Name-based rules seem sufficient for selector families, which is required but not sufficient for the method family. I've found what I believe is a relatively lightweight solution now that does not change ObjCInstance pointers under the hood and keeps the current semantics of "if we have a Python wrapper, the Obj-C object must still exist". Admittedly, this object is much less useful when not yet initialized, but that's a problem that I don't want to solve in this PR. Why I like the current solution:
if self.superclass.methods_ptr is None:
    with self.superclass.cache_lock:
        self.superclass._load_methods()
# Traverse superclasses and load methods.
I am not entirely sure why a full traversal of superclasses is required now but wasn't previously for tests on Python 3.12 + macOS 15 to pass. But I do believe that a class hierarchy traversal makes sense regardless.
Agreed this is weird; I can only assume we're hitting some weird cache initialisation order thing (e.g., the test was previously initializing the super class before the class that was causing a problem). However, the solution here makes sense.
Running the toga testbed suite reveals a similar issue for TogaSlider.setValue(value, animated=True) where the superclass method cannot be found. This is despite the fix here and could be related to the TogaSlider being a subclass defined in Python.
Note that the direct method call TogaSlider.setValue_animated_(value, True) succeeds: https://github.com/beeware/toga/actions/runs/12089448592/job/33714837506?pr=2978
I've managed to fix that by allowing direct method lookup if there is no cached partial method, similar to what we already do for the old style syntax.
This is very likely a race condition somewhere, but the code around ObjCClass._load_methods, ObjCClass._cache_method and ObjCPartialMethod is quite complex and hard to follow.
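The fallback described here, going straight to a selector lookup when no cached partial method exists, might be sketched as below. All names (`build_selector`, `ToyInstance`, `cached_partials`, `runtime_methods`) are hypothetical stand-ins for illustration and do not mirror Rubicon's internals.

```python
def build_selector(name, args, kwargs):
    """Map call syntax to a selector name: setValue(v, animated=True) -> 'setValue:animated:'."""
    if not args and not kwargs:
        return name
    return ":".join([name, *kwargs]) + ":"


class ToyInstance:
    def __init__(self, cached_partials, runtime_methods):
        # name -> {selector: callable}; may be incomplete (the problem case).
        self.cached_partials = cached_partials
        # selector -> callable; stands in for asking the runtime directly.
        self.runtime_methods = runtime_methods

    def call(self, name, *args, **kwargs):
        selector = build_selector(name, args, kwargs)
        method = self.cached_partials.get(name, {}).get(selector)
        if method is None:
            # No cached partial method: fall back to a direct lookup.
            method = self.runtime_methods.get(selector)
        if method is None:
            raise AttributeError(f"no method for selector {selector!r}")
        return method(*args, *kwargs.values())


runtime = {"setValue:animated:": lambda value, animated: ("set", value, animated)}
slider = ToyInstance(cached_partials={}, runtime_methods=runtime)
result = slider.call("setValue", 0.5, animated=True)
```

With an empty partial-method cache, the keyword-style call still resolves because the selector name is built from the call and looked up directly.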
But all those changes to method lookup are making this PR a bit unwieldy. I've reverted some of the unneeded commits and am happy to split this off into an entirely different PR if that makes it easier.
I agree that the method lookup isn't strictly related to the memory retention issue, but I'm OK with the level of complexity it adds to this PR in the interest of addressing some issues that we know exist when this PR is used in Toga.
Well, something is still very fishy about my solution in this PR. _load_methods should have already been recursive, with the main difference that it was hard-stopping the recursion if one of the superclasses had already called _load_methods. This assumes that the superclass chain does not change during init.
The new solution forces recursion to continue, but in a horribly hacky way. I do still want to find a more elegant solution here.
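The difference between hard-stopping at an already-loaded superclass and always walking the whole chain can be illustrated with a toy model. `ToyClass` and its methods are hypothetical; the inconsistent state in the demo (a loaded class above an unloaded one) is constructed by hand purely to show where the two strategies diverge, not to reproduce the actual failure.

```python
class ToyClass:
    """Toy class wrapper with lazily loaded methods (not Rubicon's ObjCClass)."""

    def __init__(self, name, superclass=None, own_methods=()):
        self.name = name
        self.superclass = superclass
        self._own_methods = tuple(own_methods)
        self.methods = None  # None means "not loaded yet"

    def _load_own(self):
        self.methods = set(self._own_methods)

    def load_hard_stop(self):
        # Recurse only while the direct superclass is unloaded.
        sup = self.superclass
        if sup is not None and sup.methods is None:
            sup.load_hard_stop()
        self._load_own()

    def load_full_traversal(self):
        # Visit every ancestor, loading whichever ones are still unloaded.
        cls = self
        while cls is not None:
            if cls.methods is None:
                cls._load_own()
            cls = cls.superclass

    def find(self, selector):
        cls = self
        while cls is not None:
            if cls.methods and selector in cls.methods:
                return cls.name
            cls = cls.superclass
        return None


base = ToyClass("Base", own_methods=["setValue:animated:"])
mid = ToyClass("Mid", base)
mid.methods = set()  # hand-made state: Mid counts as loaded while Base is not
leaf = ToyClass("Leaf", mid)

leaf.load_hard_stop()
found_after_hard_stop = leaf.find("setValue:animated:")
leaf.load_full_traversal()
found_after_traversal = leaf.find("setValue:animated:")
```

The hard-stop variant relies on the invariant that a loaded class implies loaded ancestors; the traversal variant does not, at the cost of walking the chain every time.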
This looks really good to me. A couple of comments inline, mostly about the comments :-)
Agreed - I like the solution you've presented here. It may still have some interesting behavior if you try to use an alloc'd object... but then, so will ObjC, so it shouldn't be surprising.
Co-authored-by: Russell Keith-Magee <[email protected]>
This reverts commit c6096c2.
This reverts commit 2e4eccb. # Conflicts: # src/rubicon/objc/api.py
This reverts commit b78c41d.
I'm happy with this PR as it stands now; in addition to solving the originally reported problem (beeware/toga-chart#191), it solves the bigger problem of memory management (#256); it addresses the remaining iOS leakage issues (see beeware/toga#2853); and apparently also fixes a method caching issue that has been unreported to date.
I've left one comment about the argument used when constructing arguments, but that's a minor cleanup suggestion, and one that I'll gladly back down from if you disagree.
Before we merge, I'd also like @mhsmith's final review in case he can think of any edge cases or other issues I might have missed.
This PR changes the memory management model of Rubicon. Previously, we would release objects on Python __del__ calls only if we created them ourselves and own them from alloc and similar calls.
In this PR, we always ensure that we own objects when we create a Python wrapper, by explicitly calling retain if we did not get them from alloc etc. and always calling autorelease when the Python wrapper is garbage collected. This has a few advantages:
This change should be backward compatible for most users because existing manual retain and release calls don't cause any issues if balanced and would have already caused segfaults if there are more releases than retains.
TODO:
PR Checklist:
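The ownership model from the PR description above can be illustrated with a small pure-Python simulation. Everything here is a hypothetical stand-in: `FakeObjCObject` models a retain count, `OWNING_CALLS` is a simplified set of calls that return an owned reference, and `dispose` plays the role of `__del__` (the PR uses autorelease there; this sketch models it as an immediate release).

```python
class FakeObjCObject:
    """Stand-in for an Objective-C object with a manual retain count."""

    def __init__(self):
        self.retain_count = 1  # whoever created it holds one reference

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        self.retain_count -= 1


# Simplified assumption: calls that hand ownership to the caller.
OWNING_CALLS = {"alloc", "new", "copy", "mutableCopy"}


class Wrapper:
    """Toy Python wrapper that always owns exactly one reference to its object."""

    def __init__(self, obj, returned_from):
        self.obj = obj
        if returned_from not in OWNING_CALLS:
            # We were not given ownership, so take it explicitly.
            obj.retain()

    def dispose(self):
        # Give up the single reference this wrapper owns.
        self.obj.release()


owned = FakeObjCObject()                 # as if returned by alloc
w1 = Wrapper(owned, "alloc")             # count stays at 1
borrowed = FakeObjCObject()              # reference held by someone else
w2 = Wrapper(borrowed, "objectAtIndex:") # wrapper retains: count becomes 2
w2.dispose()                             # back to the other owner's reference
```

Because every wrapper owns exactly one reference regardless of where the object came from, cleanup no longer needs to know how the object was obtained.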