Avoid deadlock in button renderer by acquiring read lock twice. #5115
Conversation
Acquiring a read lock twice should not cause a deadlock - RWMutex allows for multiple read locks at the same time. Are you sure this PR is actually solving a deadlock or have you just not seen it because of timing luck when you try to reproduce it again? |
Yes, it will cause a deadlock - it's a bit hard for me to explain, but this answer does a better job: https://stackoverflow.com/a/30549188/1139197 The TL;DR is that you can never safely acquire the same read lock twice in the same goroutine. |
@dweymouth these can be tricky to spot. Do you think there might be any other double read locks hidden in the engine? Coincidentally, I think I hit this deadlock using the demo app yesterday and was coming to report it, which is why I felt obliged to comment :) However, I lost the stack trace and am not 100% sure it was the same deadlock. But here is some clarity for anyone else who stumbles upon this and asks why you can't take RLock() twice in the same goroutine without a potential deadlock. Taking the read lock (RLock) twice in the same goroutine won't cause issues unless another goroutine calls Lock on the same mutex between the two RLocks. Go uses a writer-preferring RW lock strategy, so a pending Lock() forces all subsequent RLock() calls to block until that Lock() has been obtained and released. The problem here is with this timeline:
1. go1 calls RLock() and acquires the read lock.
2. go2 calls Lock(); it blocks waiting for go1's read lock to be released, and from this point on new RLock() calls also block.
3. go1 calls RLock() a second time; it blocks behind go2's pending Lock().
Deadlock because:
- go1's second RLock() is waiting for go2's Lock() to complete.
- go2's Lock() is waiting for go1 to release its first read lock.
- go1 can never release its first read lock, because it is blocked on the second RLock().
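A minimal standalone sketch of that timeline (a hypothetical repro, not code from this PR or from Fyne) hangs and then the runtime reports the deadlock:

```go
package main

import (
	"sync"
	"time"
)

func main() {
	var mu sync.RWMutex
	done := make(chan struct{})

	// go1: takes the read lock, gives go2 time to ask for the write lock,
	// then tries to take the read lock a second time.
	go func() {
		mu.RLock()                         // step 1: first read lock acquired
		time.Sleep(100 * time.Millisecond) // let go2 call Lock() in the meantime
		mu.RLock()                         // step 3: blocks behind go2's pending Lock()
		mu.RUnlock()
		mu.RUnlock()
		close(done)
	}()

	// go2: asks for the write lock while go1 still holds the read lock.
	go func() {
		time.Sleep(50 * time.Millisecond)
		mu.Lock() // step 2: waits for go1's read lock, and makes new RLocks wait too
		mu.Unlock()
	}()

	<-done // never reached: the runtime reports "all goroutines are asleep - deadlock!"
}
```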
If go2's Lock() lands anywhere else in that sequence, Go won't deadlock, so this kind of issue can hide for quite a while. I suspect this is only seen in Fyne when someone has built a GUI that updates the button icon asynchronously while some other mechanism forces a re-render. Most UI systems get around this issue (and locks in general) by requiring all UI property state changes to happen on a single thread, or goroutine in the case of Go. Data binding is a nice new feature which I think helps prevent these bugs thanks to the channels; maybe that can be extended to more properties? One last thing, I thought maybe |
@beeblebrox I opened this one last year: #3886 and doing a quick review of the code I see it's still an issue. |
I see.
It looks like one option, as implied in those comments, might be to make each public API simply grab and defer the lock the operation requires and call a non-public version, with a strict rule that the non-public version may never call an exported method and must always call other private ones, to prevent this error. The private ones should probably also be named to state whether they expect the write lock or the read lock, to also prevent unintentional write races in the stack. I have seen other issues like this already.
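Something like this sketch of the convention (hypothetical names and fields, not the actual Fyne widget code):

```go
package widget

import "sync"

// iconButton is a hypothetical widget, just to illustrate the locking convention.
type iconButton struct {
	mu   sync.RWMutex
	icon string
}

// Icon is the exported accessor: it takes the read lock exactly once and
// delegates to the lock-free unexported implementation.
func (b *iconButton) Icon() string {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return b.iconUnlocked()
}

// SetIcon is the exported mutator: it takes the write lock exactly once.
func (b *iconButton) SetIcon(name string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.setIconUnlocked(name)
}

// iconUnlocked assumes the caller already holds at least the read lock.
// It must never call an exported method, only other *Unlocked helpers,
// so the same goroutine can never acquire the lock twice.
func (b *iconButton) iconUnlocked() string {
	return b.icon
}

// setIconUnlocked assumes the caller already holds the write lock.
func (b *iconButton) setIconUnlocked(name string) {
	b.icon = name
}
```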
I'm guessing that without a significant design shift/discipline like the above it is really easy to introduce more instances of this double-lock scenario. They are easily missed because you'd have to understand all possible call hierarchies in the system to know it can occur, and -race won't detect it without actually hitting the scenario with the intervening Lock.
For now the easier thing to do is to keep reporting the issues we find and submitting PRs to fix them to help out the team. You're doing well in that regard, and I think what you did here was a good approach for this specific case.
What do you think if, in future PRs, we fix the private functions that call the exported ones by breaking the exported function out into one that only takes and defers the lock and calls a new private function that does the implementation? We should probably only do this where it actually runs into the double-lock scenario because of a call further up the stack, and do it incrementally as we see the issue. That way maybe we can slowly improve the state of things and hopefully make future changes less fragile once there is no longer any need to call the exported versions from the private ones. We'd need to be careful though, because this naturally has the trade-off of coarser-grained read locks, which could affect performance. I know the team already had a plan to fix a lot of this in late 2024; maybe they already have a best practice or a redesign of the locking pattern? Thoughts?
Fyne seems to be one of the better GUI frameworks for Go and I hope we, the community, can help make it better. I am also using Windows 10 and, like you, I seem to be getting unlucky and hitting the locks more frequently than perhaps the team is used to seeing on other platforms.
|
@beeblebrox as I understand it, the Fyne team is working on a single-threaded approach, which will likely resolve most race conditions. I wasn't able to wait on this one because it was actively affecting my customers. |
That's right. More info to come, will be discussed in Fyne Conf 2024 at the latest :) |
Description:
Fixes #5114