Memory leak on simpledrm #3710
Comments
I've been able to reproduce this locally (Alder Lake) by way of:
Can probably be simplified further, and I doubt it has to be WPE to trigger it, but at least So we seem to be leaking on client disconnection. |
Confirmed with a slight variation (running on my laptop):
The leak appears to be on the order of 0.1 MB per iteration |
Another thing came to mind - worth checking whether the atomic-kms platform is also affected. |
As it affects both the gbm-kms and x11 display platforms, I doubt the display platform is a factor |
Whatever is going on, it is far more pronounced (~0.1 MB per iteration) with the |
OK, there's a lot of "noise" in a valgrind report, but these seem to be the most significant sources of leaks in this scenario:
and
(Other sources of leaks are in the single digits, i.e. not corresponding to five minutes of five second iterations) |
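A rough sanity check for the reasoning above, as a minimal sketch: a five-minute run with one client restart every five seconds gives 60 iterations, so a per-iteration leak should show up roughly 60 times in the report, and at the ~0.1 MB per iteration seen earlier it would add up to about 6 MB. The function names and the 25% tolerance are illustrative assumptions, not part of any tooling used in this thread.

```python
def iterations(run_seconds: int, period_seconds: int) -> int:
    """Number of complete client start/stop cycles in a run."""
    return run_seconds // period_seconds


def expected_growth_mb(run_seconds: int, period_seconds: int, leak_mb: float) -> float:
    """Total growth if every cycle leaks `leak_mb` megabytes."""
    return iterations(run_seconds, period_seconds) * leak_mb


def looks_per_iteration(leak_count: int, cycles: int, tolerance: float = 0.25) -> bool:
    """True when a leak record's count is close to the number of cycles."""
    return abs(leak_count - cycles) <= tolerance * cycles
```

A leak record with a single-digit count fails `looks_per_iteration` for a 60-cycle run, which is why such records are set aside here.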
OK, the first one is easy to fix.
The latter has some variations:
|
Here's a summary of the leak information after a 15min run:
I've tried a |
I think the leaks I mention above are a "red herring"; they occur without ever starting a client:
Afternote: digging a bit further into the "red herring", most of this is that we leak one |
OP here from the Ubuntu forum. Thanks for your attention so far on this issue @AlanGriffiths @Saviq. Given that we're seeing this issue on Meteor Lake, which hasn't specifically been reproduced above, I'm happy to provide similar valgrind logs to the above on our hardware (with some guidance) to catch any potential differences, if that would be useful? |
@thmsclrk that's, we _did_ reproduce on Meteor Lake initially. But it was more interesting that it's not specific to that hw, which is why I didn't mention it :) Thanks for the help! |
OK, using different tooling[
|
@thmsclrk you can check whether the fix I've proposed addresses your scenario:
It definitely fixes a problem, and it would be good to know if that includes what you are seeing |
@AlanGriffiths doesn't seem to affect what I'm seeing: |
@AlanGriffiths didn't fix our particular issue, |
@thmsclrk and just to confirm - if you turn off your screenshotting service, does that avoid the problem? |
Every time an app supporting text-input-v3 starts we allocate a "virtual-keyboard"; we should also deallocate it as appropriate. This was identified while investigating #3710, and is part of the fix (possibly the whole fix - still testing)
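The allocate-on-start, release-on-disconnect pairing described above can be sketched as follows. This is an illustration only: Mir itself is C++, and `VirtualKeyboardRegistry`, its method names, and the client ids are hypothetical and do not reflect Mir's actual classes.

```python
class VirtualKeyboardRegistry:
    """Hypothetical owner of one virtual keyboard per connected client."""

    def __init__(self) -> None:
        self._keyboards: dict[str, object] = {}

    def on_client_connected(self, client_id: str) -> None:
        # Allocate a keyboard for a client that supports text input.
        self._keyboards[client_id] = object()

    def on_client_disconnected(self, client_id: str) -> None:
        # Release the keyboard; without this step every restart leaks one.
        self._keyboards.pop(client_id, None)

    def live_count(self) -> int:
        return len(self._keyboards)
```

With the disconnect handler in place, repeatedly starting and stopping a client leaves `live_count()` at zero instead of growing by one per cycle.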
@Saviq yes, just re-tested and still seeing issues on a clean server 24.04 install with nothing installed/running other than
Complete steps taken below:
Some public urls that produce leaks: |
@thmsclrk thanks for these details, that shows the plot thickens even more. Locally (Alder Lake) the only way I was able to reproduce was the (unlikely in production) cyclic restarting of WPE. Using the websites above didn't change anything. We're looking into things on Meteor Lake, if it's a leak just displaying the above websites, that would hopefully help pinpoint the issue. |
I now have (remote) access to Meteor Lake: Testing with:
and
I see no memory growth. Maybe what remains is Frame-specific? But first I'll try the above websites |
Unfortunately, not reproduced with these either (still using miral-kiosk as that is simpler to instrument) |
Still on Meteor Lake... It seems from the above that tracking heap memory doesn't help as it remains stable. However...
Hypothesis: whatever is increasing the memory footprint, isn't heap memory allocations running wild, and is specific to Frame. [edit] I can reproduce these findings on the X11 backend on my AMD system. (Which is obviously not Meteor Lake, so may be yet another false lead) |
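One way to watch the overall footprint rather than heap allocations is to sample the resident set size that Linux reports in `/proc/<pid>/status`. The sketch below is an assumption about how that could be done, not the tooling used in this thread; `parse_status_kb` and `read_rss_kb` are illustrative names.

```python
from pathlib import Path


def parse_status_kb(status_text: str, field: str = "VmRSS") -> int:
    """Extract a kB-valued field from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name == field:
            # The value looks like "   123456 kB"; keep only the number.
            return int(value.split()[0])
    raise KeyError(field)


def read_rss_kb(pid: int) -> int:
    """Current resident set size of a process in kB (Linux only)."""
    return parse_status_kb(Path(f"/proc/{pid}/status").read_text())
```

Calling `read_rss_kb` on the compositor's pid at regular intervals and comparing the samples shows whether the footprint keeps climbing or plateaus, even when heap tracking looks stable.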
@thmsclrk I've another test you can help with:
This disables one of the Frame subsystems (which paints the background and shows diagnostics). In my testing that results in the memory footprint being much more stable, especially on the websites you mention above (I only checked the first). I see the same growth of the resident set size on other devices, so I am still not convinced this is the same issue you identify with Meteor Lake, but I tested both the restart and sitting on weather.com scenarios on Meteor Lake with it. (Disabling this subsystem clearly isn't a long-term solution, but can at least prove whether I'm looking at the right thing) |
The fix for that is in |
Yes, but won't be picked up automatically in |
Yeah, but the fix isn't Meteor Lake specific. So I worry there's something else |
Hi @AlanGriffiths, Neither |
@thmsclrk by "didn't work", you mean it never started up? Or that
Possibly |
@Saviq meaning the memory usage patterns remain unchanged/unimproved, refreshing
|
@thmsclrk thanks for that… unfortunately that doesn't reproduce here. But what I noticed is that you have two displays connected / overlapping. I wonder if that affects things. Can you disconnect / disable one of them? I've now found a device in our lab with the 125H CPU and dual displays, will report back on what I manage to find out. |
Hi @Saviq, if by two displays you are referring to the 800x600 display - I don't actually know where or what this display is. There is definitely only 1x screen connected to the device, right now via DisplayPort. Looks like maybe a phantom display? Running
I can manually override as such:
But the second screen remains in the logs...
|
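To see which outputs the kernel itself reports, independent of the compositor logs, the DRM connectors under `/sys/class/drm` can be listed with their `status` files. This is a minimal sketch assuming the standard Linux sysfs layout; `connector_statuses` is an illustrative name, and it is not the command used above.

```python
from pathlib import Path


def connector_statuses(drm_root: str = "/sys/class/drm") -> dict[str, str]:
    """Map each DRM connector directory (e.g. card0-DP-1) to its status."""
    statuses = {}
    for status_file in sorted(Path(drm_root).glob("card*-*/status")):
        # The file holds a single word such as "connected" or "disconnected".
        statuses[status_file.parent.name] = status_file.read_text().strip()
    return statuses
```

A connector reported as connected on a card that has no physical screen attached would be a candidate for the phantom display.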
@Saviq following the 'phantom display' lead... I think I may have actually solved my problem. Using the steps here, more specifically adding |
And, more pertinent to Meteor Lake: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2084046 Can confirm that disabling the phantom output reins the memory usage in (where it plateaus). We'll add a quirk to disable simpledrm usage in Mir and put it on the backlog to track the leak down further. |
They should never be there in the first place. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2084046 #3710
Thanks @thmsclrk for the help and persistence ;) Alan did plug a couple smaller leaks while hunting for this ;D |
We were failing to release mmapped memory (doing it wrong and at the wrong time). That led to the memory footprint growing without bound. Cf. canonical/mir#3710
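The general pattern of pairing every mapping with a release can be sketched as below. This only illustrates the idea in Python; the actual fix is in Mir's C++ code and is not shown here, and `fill_and_release` is a hypothetical helper.

```python
import mmap


def fill_and_release(size: int) -> tuple[int, bool]:
    """Map anonymous memory, use it, and unmap it before returning."""
    with mmap.mmap(-1, size) as buf:  # -1 requests an anonymous mapping
        buf[:4] = b"\x00\x01\x02\x03"
        checksum = sum(buf[:4])
    # Leaving the `with` block closes the mapping, so repeated calls
    # do not accumulate mapped memory.
    return checksum, buf.closed
```

Tying the release to the scope that created the mapping avoids both failure modes mentioned above: forgetting to unmap, and unmapping at the wrong time.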
They should never be there in the first place. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2084046 #3710 Closes #3710
Thanks @AlanGriffiths and @Saviq for your help with this! |
@thmsclrk thank you for bringing this to our attention, it improved our code |
We have reports of problems:
https://discourse.ubuntu.com/t/ubuntu-frame-memory-leak-on-meteor-lake-cpu/51778/