-
-
Notifications
You must be signed in to change notification settings - Fork 689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make upstream window and app references weak to avoid reference cycles #2061
Conversation
I can't argue with the results, it does fix the test case from #1215 (which I've updated to account for recent API changes).
It is indeed strange that GC does not pick up the reference cycles here. My first instinct would have been a reference cycle involving ObjC instances, but those should be prevented by the explicit weak ObjC properties that we have introduced some time ago. Another possibility could be that we are indeed holding on to an additional reference somewhere that is not part of a closed cycle. Replacing it by a weakref enables collecting the cycle, but might return an unexpected I'm a bit cautious about this fix because it may be only addressing the symptom and not the cause. The cause being (1) potentially a leftover non-cyclic reference somewhere and (2) a complex reference graph that makes it difficult to reason about. But clearly you have gone through most of the reference graph, it does not seem too complex after all, and one or multiple of references that you replaced with a weekref fixed the issue. I'd be curious to see a minimal fix that addresses it though, just to maybe narrow down which references were problematic.
Maybe holding on to a weak reference to a Window in a test, forcing a GC run, and then asserting that the reference was collected, would work? |
That was my thought too - and my "but it can't be that" for the same reason.
That also occurred to me; but if it's true, it's a subtle one that I haven't been able to discover. I tried mapping out all the relationships in a big spiderweb of a diagram, but that has rapidly converged on being one of those "red string between photos on the wall" conspiracy maps, rather than actual enlightenment :-)
100% agreed on this. As a fix, there's at least a consistent rationale (i.e., "upstreamness") behind why the specific references are weakrefs, but it's not a very satisfying answer specifically because circular references should be resolved by the GC.
Honestly - AFAICT, this is the minimal fix (or, at least, a minimal fix). The only part that is maybe optional is the references to |
Ok, you have convinced me that weakrefs are indeed a pragmatic solution :) Do you know if the window deletion issue is limited to macOS or also occurs on Windows and Gtk backends? The window and app reference fixes will apply to all backends but your interface reference fix is macOS only. Since the general structure of backrefs to the interface layer is more general, it may be worth extending the weakrefs to other backends as well. I don't currently have access to a machine with VMs to test this myself. Curiously, widget implementations have similar backrefs to the interface layer. You haven't introduced weakrefs here, and it seems that those are indeed not needed. A similar example as in #1215 but for Widgets works just fine with the current public release 0.3.1. Admittedly, the reference graph is a lot simpler here. Widgets memleak testimport gc
import toga
from toga.constants import COLUMN
from toga.style import Pack
class Window(toga.Window):
instances = 0
def __init__(self):
super().__init__(title=f'Window: {Window.instances}', position=(200, 200))
Window.instances += 1
def __del__(self):
Window.instances -= 1
class WidgetRefcountExample(toga.App):
def startup(self):
self.window = None
self.button_create = toga.Button(
'Create window',
style=Pack(padding=10, width=120),
on_press=self.on_create,
)
self.button_delete = toga.Button(
'Close and delete window',
style=Pack(padding=10, width=120),
on_press=self.on_delete,
)
self.label = toga.Label(
f'Instances: {Window.instances}',
style=Pack(padding=10, width=120),
)
self.button_create.enabled = True
self.button_delete.enabled = False
self.box = toga.Box(
children=[
self.button_create,
self.button_delete,
self.label,
],
style=Pack(direction=COLUMN),
)
self.main_window = toga.MainWindow(title="Toga")
self.main_window.content = self.box
self.main_window.show()
def on_create(self, sender):
if not self.window:
self.window = Window()
self.app.windows += self.window
self.window.content = toga.Box()
self.window.show()
self.label.text = f'Instances: {Window.instances}'
self.button_create.enabled = False
self.button_delete.enabled = True
def on_delete(self, sender):
if self.window:
self.window.close()
self.window = None
gc.collect() # clean up ref cycles before displaying instance count
self.label.text = f'Instances: {Window.instances}'
self.button_create.enabled = True
self.button_delete.enabled = False
def main():
return WidgetRefcountExample('Example', 'org.beeware.refcount')
if __name__ == '__main__':
app = main()
app.main_loop() |
cocoa/src/toga_cocoa/window.py
Outdated
self._interface = weakref.ref(value) | ||
|
||
def __del__(self): | ||
print("DEL WINDOW IMPL") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover debug statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went spelunking around a bit and removed as many of the weak refs as I could without re-introducing the memory leak.
I could remove all but one: The weakref that a toga.Widget
holds to the toga.Window
it belongs to. That lead me to conclude: The window is kept alive by widgets that still reference it, therefore the widgets themselves must be kept alive by something even though they are no longer required when the window is closed.
The obvious reference cycles are all closed loops, but one point stood out: The registries toga.App.windows
and toga.App.widgets
. When a window is closed, it is removed from toga.App.windows
by toga.Window.close()
:
Lines 273 to 275 in 9a482d3
def close(self) -> None: | |
self.app.windows -= self | |
self._impl.close() |
However, the window's widgets still remain in toga.App.widgets
add each one of them still holds a reference to the window. There are our leftover non-cycling references!
I've drafted a minimal fix as a PR at #2066.
My original diagnosis still stands, all the cross references and the different parts of code that manage them are too complicated to track and reason about easily. Even though my suggestion fixes the memory leak, your weak ref solution would make regressions more difficult (though note that it only cleans up the references to windows, references to their widgets would have still lived on forever). If both of us are struggling, imagine someone new to the codebase.
I don't have immediate suggestions to improve things though :(
Maybe remove all cross-references that are not strictly required? The registries are very convenient though. Or centralise and simplify the code when:
- Windows are added to or removed from an app.
- Widgets are added to or removed from a window, a parent widget (such as a box) or a container.
There is a lot of very similar code for the above cases and a regression in any of those places can be problematic.
It's not limited to macOS; the same test case causes problems on GTK and Windows, and the same fix resolves the issue. I've added that to the PR. I guess conceptually the same issue exists on iOS and Android, but you can't create and destroy additional windows, so it's a moot point. I've added the weakrefs for consistency.
Yeah - I did some poking around this and came to the same conclusion. I do wonder if a more complicated widget graph, with widget references held by the app and closures on widget event handlers would manufacture a problem, but at least simple tests of this didn't reveal a problem for me.
This is indeed an obvious source of leakage; however, as noted on that PR, it seems to be an incomplete fix.
The problem is that it's difficult to judge what is strictly required. I guess now that we've got a (near 100%) functional test suite, we'd be in a position to prune the references that aren't actually used in practice;
Agreed there's some similarities; @mhsmith has also remarked that there are some commonalities that could be cleaned up. Conceptually, I'm in favor of any cleanup that reduces duplication and turns the backends into an even more "dumb" wrapper around native widgets; however, I'm not sure I have a clear vision on exactly what that simplification would look like in practice. There's enough differences between widget lifecycle on different platforms that I'm not sure there is a completely clean abstraction of widget management. I'd definitely be happy to be proven wrong on this, though. |
I've added a test case on the testbed to verify that both the window interface and impl is deleted on each platform. I've done this on the testbed rather than core/dummy because:
|
Hm, from the last CI run, I still see the Windows test |
I think that failure is a red herring (at least for now). There's no Window test backend implemented for Winforms yet, so all the windows tests are failing because they can't find the probe for a Window. The secondary window reference test fails in a slightly different way specifically because it doesn't have a fixture, so it's not failing during fixture creation like all the other tests. @mhsmith and I have been splitting the work of the audit process; I do a first pass handling Cocoa, iOS, and GTK; he follows behind handling Windows and Android. As a side effect of this distribution of labor, the tests for Android and Winforms tend to be explosive until he's started work on whatever feature is being audited. |
3c21869
to
abd9397
Compare
I'm not so sure about that. From the test logs for
This looks like a test case failure to me. I also finally spun up a windows VM (pythonnet works on Apple Silicon!), and my manual example from #1215 still fails with this branch. It does pass however after removing circular references from Pythonnet to the |
9c03537
to
73714f6
Compare
I'm going to close this in favor of #2066; the approach there is a lot simpler, and isn't platform dependent. |
Fixes #1215.
The manual test case for this is in #1215. Windows won't be deleted immediately, but once the garbage collector kicks in, both the window interface and impl will be freed. I'm not sure how we'd manufacture an automated test, since it's entirely dependent on garbage collection behavior.
The underlying problem appears to be that the relations between widgets, App and Window are complex.
Python's garbage collection should pick up closed cycles between objects, but there's clearly something in this nest that is preventing collection.
This PR corrects the problem by making all "upstream" references (i.e., references from impl to interface, references from window to the app that owns the window, and references to app or window from a widget) weak. This is done by:
app
andwindow
on the base widget into a wrapper around a weak property, rather than a full object reference,interface
on the Cocoa implementation of App and Window so that these properties are a wrapper around a weak property, rather than a full object referenceThis PR builds on #2058 so that the full GUI test suite for Window, plus all the changes to Container are incorporated. For convenience this is a link to the diff that is specific to this bug fix
There is almost certainly an analogous set of changes needed on the iOS backend (although it's less critical there because you can't create secondary windows).
PR Checklist: