Fix data race in MaybeNotifyAPMAgent#469

Merged
fabled merged 1 commit into open-telemetry:main from zystem-io:joel/fix-apm-data-race on May 26, 2025

athre0z (Member) commented May 26, 2025

Previously, MaybeNotifyAPMAgent could read the interpreter map while another goroutine was updating it during PID event processing. This commit takes the lock before accessing the map, which prevents the race.

The race is rare during normal operation but becomes increasingly likely at higher sampling rates.
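
For readers unfamiliar with this class of bug, here is a minimal sketch of the pattern, not the actual ProcessManager code. The `registry` type, its fields, and the `update`/`count` methods are hypothetical names chosen for illustration, and the use of `sync.RWMutex` is an assumption about the kind of lock involved. The point is only that the reader takes the same lock the writer already holds.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for the process manager: a map that is
// written by one goroutine (PID events) and read by another (trace handling).
type registry struct {
	mu           sync.RWMutex
	interpreters map[int][]string // PID -> interpreter names (illustrative only)
}

func newRegistry() *registry {
	return &registry{interpreters: make(map[int][]string)}
}

// update is the writer side; it holds the exclusive lock while mutating the map.
func (r *registry) update(pid int, name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.interpreters[pid] = append(r.interpreters[pid], name)
}

// count is the reader side. Without the RLock/RUnlock pair, this read could
// overlap a write, and the Go runtime may abort with
// "fatal error: concurrent map read and map write".
func (r *registry) count(pid int) int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.interpreters[pid])
}

// run starts one writer and one reader goroutine and returns the final
// number of entries recorded for PID 1.
func run(n int) int {
	r := newRegistry()
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			r.update(1, "interp")
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			_ = r.count(1)
		}
	}()
	wg.Wait()
	return r.count(1)
}

func main() {
	fmt.Println(run(1000)) // prints 1000
}
```

Running a sketch like this with `go run -race` is one way to check that the locked version is clean; removing the read lock from `count` should make the race detector report the conflict.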

Example crash

fatal error: concurrent map read and map write

goroutine 201 [running]:
go.opentelemetry.io/ebpf-profiler/processmanager.(*ProcessManager).MaybeNotifyAPMAgent(0xc006792000?, 0xc006180820, {{0xcf22de1f18bae9f9?, 0x0?}}, 0x1)
        /root/zymtrace/services/profiler/ebpf-profiler/processmanager/manager.go:323 +0x51
go.opentelemetry.io/ebpf-profiler/tracehandler.(*traceHandler).HandleTrace(0xc006780120, 0xc006180820)
        /root/zymtrace/services/profiler/ebpf-profiler/tracehandler/tracehandler.go:150 +0x2a2
go.opentelemetry.io/ebpf-profiler/tracehandler.Start.func1()
        /root/zymtrace/services/profiler/ebpf-profiler/tracehandler/tracehandler.go:196 +0x173
created by go.opentelemetry.io/ebpf-profiler/tracehandler.Start in goroutine 1
        /root/zymtrace/services/profiler/ebpf-profiler/tracehandler/tracehandler.go:185 +0x174

`MaybeNotifyAPMAgent` could previously access the interpreter map while
it was being updated in another goroutine during PID event processing.
@athre0z athre0z requested review from a team as code owners May 26, 2025 11:31
@athre0z athre0z added the bug Something isn't working label May 26, 2025
@athre0z athre0z changed the title Fix data race in process manager Fix data race in MaybeNotifyAPMAgent May 26, 2025
athre0z (Member, Author) commented May 26, 2025

@fabled can you please merge? It seems that approvers no longer have permissions to do so themselves.

@fabled fabled merged commit 1c6d398 into open-telemetry:main May 26, 2025
25 checks passed
@athre0z athre0z deleted the joel/fix-apm-data-race branch May 26, 2025 15:12
Labels

bug Something isn't working

4 participants