-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"fatal error: concurrent map writes" when submitting transaction #570
Comments
Thanks for reporting, we'll look into this 👍 |
I think we need locks here https://github.com/getsentry/sentry-go/blob/master/integrations.go#L87 and in all other places. |
Spent some time on this. Tried to reproduce it with the following test (runs for 10 minutes), but it passes: // Running like:
// go test -run ^TestSetContextAndFinishTransaction$ -count 1 -race
//
func TestSetContextAndFinishTransaction(t *testing.T) {
ctx := NewTestContext(ClientOptions{
EnableTracing: true,
TracesSampleRate: 1.0,
})
hub := GetHubFromContext(ctx)
scope := hub.Scope()
transaction := StartTransaction(ctx, "op")
go func() {
for {
transaction.Finish()
}
}()
go func() {
for {
scope.SetContext("device", map[string]interface{}{
"name": "value",
})
}
}()
time.Sleep(600 * time.Second)
fmt.Println("Done!")
} For comparison, the following test with concurrent map writes fails within 1 second: func TestConcurrentMapWrite(t *testing.T) {
m := map[string]string{}
go func() {
for {
m["key1"] = "value1"
}
}()
go func() {
for {
m["key2"] = "value2"
}
}()
time.Sleep(1 * time.Second)
fmt.Println("Done!")
} For now I can think of the following alternative scenarios that may cause this:
@SupaHam if at some point you have any additional information (e.g. more stack trace examples, ideally where the traces of both goroutines are present), that would help us a lot 🙏 |
Hi @tonyo, sorry for the wait. Your test will not trigger the issue as SetContext is locked, so that will always be safe to run asynchronously. The reported issue is, specifically step 2, is multiple spans (transactions) modifying the same scope asynchronously. Apologies if the issue wasn't clearly described, here is a test that should trigger it for you locally - increase func TestAB(t *testing.T) {
ctx := NewTestContext(sentry.ClientOptions{
EnableTracing: true,
TracesSampleRate: 1,
})
hub := sentry.GetHubFromContext(ctx)
c := make(chan *sentry.Span)
const runs = 100
for i := 0; i < runs; i++ {
go func() {
span := sentry.StartSpan(ctx, "test", sentry.TransactionName("test"))
c <- span
go hub.Scope().SetContext("device", map[string]interface{}{"name": "value"})
}()
}
var wg sync.WaitGroup
defer wg.Wait()
for i := 0; i < 4; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for span := range c {
go span.Finish()
}
}()
}
} |
Thanks for the follow-up. We normally recommend that each go routine uses its own hub, as described here: https://docs.sentry.io/platforms/go/concurrency/ |
Hi @cleptric. Firstly, I'd like to apologise for not being able to expose our internal proprietary code, but I was trying to demonstrate an example of how it fails. Just to clarify, there is one hub per player in our game, but the loading of player data is done asynchronously. Changing that to be synchronous would add serious wait times and we cannot afford that solution. Yet, the problem is that there is that sliver of a chance for a panic, and it disconnects players from their gaming experience which is very abrupt and ugly for user retention. I'm not quite sure I see an issue with the usage of asynchronous loading, i think the locking can be justified. EDIT: I would like to update you on this issue, we have prevented the panics by forcefully setting the properties when creating the context so that this specific faulty map write doesn't ever happen again. But that doesn't seem user friendly, and I can imagine more users coming across this issue. |
Summary
When
span.Finish
is called, there's a chance a fatal concurrent write could occur due to the forceful setting of device parametersarch
andcpu
.Steps To Reproduce
scope.SetContext
fatal error: concurrent map writes
Expected Behavior
Don't crash the go application because integrations have no concept of context safety.
Screenshots
Sentry.io Event
Environment
SDK
sentry-go
version: v0.16.0Sentry
Additional context
This issue has been around for around at least a year. This environment is a game server that gains a lot of unique connections and requires has multiple goroutines running at the initialisation of the connection that are spawning spans, while asynchronously manipulating context as new data comes in.
The text was updated successfully, but these errors were encountered: