Avoid deadlock in button renderer by acquiring read lock twice. #5115
Conversation
Acquiring a read lock twice should not cause a deadlock - RWMutex allows for multiple read locks at the same time. Are you sure this PR is actually solving a deadlock or have you just not seen it because of timing luck when you try to reproduce it again? |
Yes, it will cause a deadlock - it's a bit hard for me to explain, but this answer does a better job: https://stackoverflow.com/a/30549188/1139197 The TL;DR is that you can never safely acquire the same read lock twice in the same goroutine. |
@dweymouth these can be tricky to spot. Do you think there might be any other double read locks hidden in the engine? Coincidentally, I think I hit this deadlock using the demo app yesterday and was coming to report it, which is why I felt obliged to comment :) However, I lost the stack trace and am not 100% sure it was the same deadlock. But here is some clarity for anyone else who stumbles upon this and asks why you can't take RLock() twice in the same goroutine without a potential deadlock. Taking the read lock (RLock) twice in the same goroutine won't cause issues unless another goroutine calls Lock on the same mutex between the two RLocks. Go uses a writer-preferring RW lock strategy, so a pending Lock() forces all subsequent RLock() calls to block until that Lock() has been obtained and released. The problem here is with this timeline:
1. go1 calls RLock() and acquires the read lock.
2. go2 calls Lock(); it blocks waiting for go1's read lock to be released, and from this point on new RLock() calls also block.
3. go1 calls RLock() a second time; it blocks behind go2's pending Lock().
Deadlock because:
- go1's second RLock() is waiting for go2's Lock() to complete.
- go2's Lock() is waiting for go1 to release its first read lock.
- go1 can never release its first read lock, because it is blocked on the second RLock().
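A minimal standalone sketch of that timeline (a hypothetical repro, not code from this PR or from Fyne) hangs and then the runtime reports the deadlock:

```go
package main

import (
	"sync"
	"time"
)

func main() {
	var mu sync.RWMutex
	done := make(chan struct{})

	// go1: takes the read lock, gives go2 time to ask for the write lock,
	// then tries to take the read lock a second time.
	go func() {
		mu.RLock()                         // step 1: first read lock acquired
		time.Sleep(100 * time.Millisecond) // let go2 call Lock() in the meantime
		mu.RLock()                         // step 3: blocks behind go2's pending Lock()
		mu.RUnlock()
		mu.RUnlock()
		close(done)
	}()

	// go2: asks for the write lock while go1 still holds the read lock.
	go func() {
		time.Sleep(50 * time.Millisecond)
		mu.Lock() // step 2: waits for go1's read lock, and makes new RLocks wait too
		mu.Unlock()
	}()

	<-done // never reached: the runtime reports "all goroutines are asleep - deadlock!"
}
```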
If go2's Lock() lands anywhere else in that sequence, Go won't deadlock, so this kind of issue can hide for quite a while. I suspect this is only seen in Fyne when someone has built a GUI that updates the button icon asynchronously while some other mechanism forces a re-render. Most UI systems get around this issue (and locks in general) by requiring all UI property state changes to happen on a single thread, or goroutine in the case of Go. Data binding is a nice new feature which I think helps prevent these bugs thanks to the channels; maybe that can be extended to more properties? One last thing, I thought maybe |
@beeblebrox I opened this one last year: #3886 and doing a quick review of the code I see it's still an issue. |
I see.
It looks like one option, as implied in those comments, might be to make each public API simply grab and defer the lock the operation requires and call a non-public version, with a strict rule that the non-public version may never call an exported method and must always call other private ones, to prevent this error. The private ones should probably also be named to state whether they expect the write lock or the read lock, to also prevent unintentional write races in the stack. I have seen other issues like this already.
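Something like this sketch of the convention (hypothetical names and fields, not the actual Fyne widget code):

```go
package widget

import "sync"

// iconButton is a hypothetical widget, just to illustrate the locking convention.
type iconButton struct {
	mu   sync.RWMutex
	icon string
}

// Icon is the exported accessor: it takes the read lock exactly once and
// delegates to the lock-free unexported implementation.
func (b *iconButton) Icon() string {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return b.iconUnlocked()
}

// SetIcon is the exported mutator: it takes the write lock exactly once.
func (b *iconButton) SetIcon(name string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.setIconUnlocked(name)
}

// iconUnlocked assumes the caller already holds at least the read lock.
// It must never call an exported method, only other *Unlocked helpers,
// so the same goroutine can never acquire the lock twice.
func (b *iconButton) iconUnlocked() string {
	return b.icon
}

// setIconUnlocked assumes the caller already holds the write lock.
func (b *iconButton) setIconUnlocked(name string) {
	b.icon = name
}
```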
I'm guessing that without a significant design shift/discipline like the above it is really easy to introduce more instances of this double-lock scenario. They are easily missed because you'd have to understand all possible call hierarchies in the system to know it can occur, and -race won't detect it without actually hitting the scenario with the intervening Lock.
For now the easier thing to do is to keep reporting the issues we find and submitting PRs to fix them to help out the team. You're doing well in that regard, and I think what you did here was a good approach for this specific case.
What do you think if, in future PRs, we fix the private functions that call the exported ones by breaking the exported function out into one that only takes and defers the lock and calls a new private function that does the implementation? We should probably only do this where it actually runs into the double-lock scenario because of a call further up the stack, and do it incrementally as we see the issue. That way maybe we can slowly improve the state of things and hopefully make future changes less fragile once there is no longer any need to call the exported versions from the private ones. We'd need to be careful though, because this naturally has the trade-off of coarser-grained read locks, which could affect performance. I know the team already had a plan to fix a lot of this in late 2024; maybe they already have a best practice or a redesign of the locking pattern? Thoughts?
Fyne seems to be one of the better GUI frameworks for Go and I hope we, the community, can help make it better. I am also using Windows 10 and, like you, I seem to be getting unlucky and hitting the locks more frequently than perhaps the team is used to seeing on other platforms.
|
@beeblebrox as I understand it, the Fyne team is working on a single-threaded approach, which will likely resolve most race conditions. I wasn't able to wait on this one because it was actively affecting my customers. |
That's right. More info to come, will be discussed in Fyne Conf 2024 at the latest :) |
Description:
Fixes #5114