Possible race conditions on Intel macOS #15730
Labels: bug, core-team, E:Desktop Keycard Bug, wallet-team
Bug Report
Description
status-desktop often crashes during the onboarding stage on Intel macOS.
While investigating the cause in #15134, I found that the issue was often fixed by doing nothing.
All the signs point to a possible race condition in the thread pool setup, where a Nim service calls some business logic in status-go or status-keycard-go.
I also discovered that for the Nim interop with Go, we free the memory by calling a Go method, which is itself called via a Nim interface.
Exhibit A:
Let's take a look at how we consume keycardInitFlow, which lives in vendor/status-keycard-go/shared/main.go.
The source of keycardInitFlow is:
This function is wrapped by another wrapper in vendor/nim-keycard-go/keycard_go.nim, and the source looks like this:
It is also important to look at the source of go_shim:
The source lives here: vendor/nim-keycard-go/keycard_go/impl.nim
Finally, this code is consumed in service.nim like this:
service.nim lives here: src/app_service/service/keycard/service.nim
I tried to create a minimal reproduction in the repo below, but I was unable to reproduce the crash:
https://github.com/siddarthkay/status-desktop-intel-crash-reproducer
However, my attempt did not include a thread pool, which could be the key to reproducing the race condition.
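For discussion, a reproducer that does include a pool might have roughly the shape below. This is only a Go-side sketch under assumptions: callFlow is a hypothetical stand-in for the real exported function, and in the actual setup the workers would be Nim threads calling through the C interface rather than goroutines. It models the calling pattern only and is not expected to reproduce the crash by itself.

```go
package main

import (
	"fmt"
	"sync"
)

// callFlow is a hypothetical stand-in for a call that crosses into the Go
// library. It is NOT the real keycardInitFlow.
func callFlow(id int) string {
	return fmt.Sprintf("flow-%d", id)
}

// stress runs `calls` invocations of fn across `workers` goroutines and
// returns how many invocations completed.
func stress(workers, calls int, fn func(int) string) int {
	jobs := make(chan int)
	var (
		mu   sync.Mutex
		done int
		wg   sync.WaitGroup
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				_ = fn(id)
				mu.Lock()
				done++ // counter is guarded so the harness itself is race-free
				mu.Unlock()
			}
		}()
	}
	for i := 0; i < calls; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return done
}

func main() {
	fmt.Println(stress(8, 1000, callFlow))
}
```

Running a harness like this with go run -race would only flag races on the Go side; a race that involves memory shared with Nim threads would need the pool to live on the Nim side of the boundary.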
Another key factor in discovering this race condition was upgrading Go to 1.21.
Go 1.21 brought significant changes to its garbage collector, and the crash we saw would often point to code related to garbage collection.
Error message:
Reference in the Go source:
https://github.com/golang/go/blob/8f5c6904b616fd97dde4a0ba2f5c71114e588afd/src/runtime/mcache.go#L325
At the moment this issue is mitigated by introducing some sleep time in PR #15194.
However, this is not a proper solution, and we may run into race conditions elsewhere in the future.
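To illustrate why a sleep is fragile, here is a minimal sketch of the alternative: callers block on an explicit readiness signal instead of waiting for a guessed duration. It assumes the race is an ordering problem between initialization and the first call, which this issue has not confirmed. The backend type and its Init and Call methods are hypothetical and do not exist in status-go or status-keycard-go.

```go
package main

import (
	"fmt"
	"sync"
)

// backend is a hypothetical service whose Init must finish before Call is safe.
type backend struct {
	once  sync.Once
	ready chan struct{}
	state string
}

func newBackend() *backend {
	return &backend{ready: make(chan struct{})}
}

// Init performs the setup exactly once and then signals readiness.
func (b *backend) Init() {
	b.once.Do(func() {
		b.state = "initialized"
		close(b.ready) // closing the channel releases every waiting caller
	})
}

// Call blocks until Init has completed, instead of sleeping and hoping.
func (b *backend) Call() string {
	<-b.ready
	return b.state
}

func main() {
	b := newBackend()
	results := make([]string, 4)
	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = b.Call() // may start before Init; waits either way
		}(i)
	}
	b.Init()
	wg.Wait()
	fmt.Println(results)
}
```

With a signal like this, correctness no longer depends on how fast the machine is, which is the main weakness of a fixed sleep.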
Steps to reproduce
Expected behaviour
Actual behaviour