
Fix SIGBUS/SIGSEGV crash in TemperaturesWithContext on macOS ARM64 #2063

Merged
shirou merged 1 commit into shirou:master from lubeschanin:fix/sensors-darwin-arm64-crash
Mar 29, 2026

@lubeschanin
Contributor

Summary

Initialize IOKit and CoreFoundation libraries once via sync.Once instead of opening/closing them on every call to TemperaturesWithContext.

Problem

TemperaturesWithContext in sensors_darwin_arm64.go calls common.NewLibrary() + defer Close() on every invocation. Close() calls purego.Dlclose() which invalidates the library handles. The Go runtime (GC, timers, finalizers) can still reference these handles after close, causing:

  • SIGBUS (bus error)
  • SIGSEGV (segmentation fault)
  • unexpected return pc runtime panics
  • Corrupted memory (null bytes in subsequent JSON output)

This affects all macOS ARM64 (Apple Silicon) systems: M1, M2, M4 confirmed.

Fix

var (
    sensorLibOnce sync.Once
    sensorIOKit   *common.Library
    sensorCF      *common.Library
)

func TemperaturesWithContext(_ context.Context) ([]TemperatureStat, error) {
    sensorLibOnce.Do(initSensorLibraries)  // open once, never close
    // ...
}

The libraries are kept open for the process lifetime. They are small (~1 handle each) and macOS expects them to stay loaded.
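
The open-once pattern described above can be illustrated in isolation. The following is a minimal sketch, not the code from this PR: `fakeLibrary`, `openLibrary`, `initLibraries`, and `readTemperatures` are hypothetical stand-ins, and the real `common.NewLibrary` presumably returns an error that would also need to be stored and checked.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fakeLibrary is a hypothetical stand-in for a dlopen handle.
// It is not gopsutil's common.Library.
type fakeLibrary struct {
	name string
}

// openCount records how many times openLibrary has run.
var openCount atomic.Int32

// openLibrary simulates an expensive dlopen-style call.
func openLibrary(name string) *fakeLibrary {
	openCount.Add(1)
	return &fakeLibrary{name: name}
}

var (
	libOnce        sync.Once
	ioKit          *fakeLibrary
	coreFoundation *fakeLibrary
)

// initLibraries opens both handles. It is only ever invoked through libOnce,
// and nothing closes the handles afterwards.
func initLibraries() {
	ioKit = openLibrary("IOKit")
	coreFoundation = openLibrary("CoreFoundation")
}

// readTemperatures mimics the shape of the fixed function: ensure the
// handles exist, then use them without a deferred Close.
func readTemperatures() string {
	libOnce.Do(initLibraries)
	return ioKit.name + "+" + coreFoundation.name
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			readTemperatures()
		}()
	}
	wg.Wait()
	fmt.Println("opens:", openCount.Load()) // prints "opens: 2"
}
```

Eight concurrent callers still produce only two opens (one per library), because `sync.Once` blocks late arrivals until the first initialization has finished.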

Test

Tested on Mac mini M2 Pro, macOS Tahoe 26.3.1. Without the fix, the agent crashes within hours. With the fix, it runs stable for 24h+.

Related

Owner

@shirou shirou left a comment


Thank you! The fix for sensors is correct and well-motivated. Two follow-up concerns:

  1. The same open/close-on-every-call pattern exists in cpu/cpu_darwin_arm64.go and disk/disk_darwin.go — those should be fixed too (either in this PR or as a tracked follow-up).
  2. With shared library handles, access to library.fnMap inside getFunc (internal/common/common_darwin.go:44-54) becomes a potential data race under concurrent calls. Consider whether fnMap needs synchronization, or whether the function pointers should be resolved eagerly in initSensorLibraries.
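
The eager-resolution option mentioned in the second point could look roughly like the sketch below. It is an illustration under assumptions, not gopsutil code: `lookupFunc`, `sensorFuncs`, `resolveAll`, `getFuncs`, and the symbol names are all hypothetical, and the addresses are made up.

```go
package main

import (
	"fmt"
	"sync"
)

// lookupFunc is a hypothetical stand-in for a dlsym-style symbol lookup.
type lookupFunc func(symbol string) (uintptr, bool)

// sensorFuncs holds every address the caller needs, resolved up front.
// The field and symbol names are placeholders.
type sensorFuncs struct {
	symbolA uintptr
	symbolB uintptr
}

// resolveAll looks up each required symbol once and fails fast if any is missing.
func resolveAll(lookup lookupFunc) (sensorFuncs, error) {
	var out sensorFuncs
	targets := []struct {
		name string
		dst  *uintptr
	}{
		{"symbolA", &out.symbolA},
		{"symbolB", &out.symbolB},
	}
	for _, t := range targets {
		addr, ok := lookup(t.name)
		if !ok {
			return sensorFuncs{}, fmt.Errorf("symbol %q not found", t.name)
		}
		*t.dst = addr
	}
	return out, nil
}

var (
	funcsOnce sync.Once
	funcs     sensorFuncs
	funcsErr  error
)

// getFuncs resolves everything on first use. Later calls only read the
// already-populated values, so no lock is needed on the hot path.
func getFuncs(lookup lookupFunc) (sensorFuncs, error) {
	funcsOnce.Do(func() {
		funcs, funcsErr = resolveAll(lookup)
	})
	return funcs, funcsErr
}

// fakeLookup serves made-up addresses from a fixed table.
func fakeLookup(symbol string) (uintptr, bool) {
	table := map[string]uintptr{"symbolA": 0x1000, "symbolB": 0x2000}
	addr, ok := table[symbol]
	return addr, ok
}

func main() {
	f, err := getFuncs(fakeLookup)
	fmt.Println(f.symbolA != 0, f.symbolB != 0, err) // prints "true true <nil>"
}
```

The trade-off in this sketch is that a failed resolution is cached by `sync.Once` along with its error, so every later call reports the same failure instead of retrying.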

@lubeschanin
Contributor Author

Updated the PR to address both concerns:

  1. cpu + disk fixed: Applied the same sync.Once pattern to cpu/cpu_darwin_arm64.go and disk/disk_darwin.go.

  2. fnMap data race fixed: Added sync.RWMutex with double-checked locking to getFunc in internal/common/common_darwin.go. Fast path (read lock) avoids contention after first resolution. go test -race ./sensors/ passes clean.

4 files changed, tested on Mac mini M2 Pro — agent stable for 4+ days with the fix.
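
For readers unfamiliar with double-checked locking around a map cache, here is a minimal sketch of the idea. It is not the actual getFunc from internal/common/common_darwin.go: the `library` type, `resolve`, and the `lookups` counter are hypothetical and exist only to make the behaviour observable.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// library is a hypothetical stand-in for a shared handle with a symbol cache.
type library struct {
	mu      sync.RWMutex
	fnMap   map[string]uintptr
	lookups atomic.Int32 // number of slow-path resolutions, for demonstration
}

func newLibrary() *library {
	return &library{fnMap: make(map[string]uintptr)}
}

// resolve simulates a dlsym-style lookup; the returned address is made up.
func (l *library) resolve(symbol string) uintptr {
	l.lookups.Add(1)
	return uintptr(len(symbol)) + 0x1000
}

// getFunc returns the cached address for symbol, resolving it at most once.
func (l *library) getFunc(symbol string) uintptr {
	// Fast path: a shared read lock is enough once the symbol is cached.
	l.mu.RLock()
	addr, ok := l.fnMap[symbol]
	l.mu.RUnlock()
	if ok {
		return addr
	}

	// Slow path: take the write lock and check again, because another
	// goroutine may have filled the entry between RUnlock and Lock.
	l.mu.Lock()
	defer l.mu.Unlock()
	if addr, ok := l.fnMap[symbol]; ok {
		return addr
	}
	addr = l.resolve(symbol)
	l.fnMap[symbol] = addr
	return addr
}

func main() {
	lib := newLibrary()
	var wg sync.WaitGroup
	for i := 0; i < 16; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			lib.getFunc("symbolA")
		}()
	}
	wg.Wait()
	fmt.Println("slow-path lookups:", lib.lookups.Load()) // prints "slow-path lookups: 1"
}
```

The second check under the write lock is what keeps the resolution count at one: without it, several goroutines that all missed on the fast path would each resolve and overwrite the same entry.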

…a race

Two changes:

1. sensors/sensors_darwin_arm64.go, cpu/cpu_darwin_arm64.go, disk/disk_darwin.go:
   Initialize IOKit and CoreFoundation libraries once via sync.Once instead
   of opening/closing them on every call. Dlclose invalidates library handles
   that the Go runtime (GC, timers, finalizers) may still reference, causing
   SIGBUS or SIGSEGV crashes.

2. internal/common/common_darwin.go:
   Make getFunc thread-safe via sync.RWMutex with double-checked locking.
   With shared library handles, concurrent calls to getFunc race on fnMap
   reads and writes. The fast path (read lock) avoids contention after
   function pointers are resolved on first call.

The libraries are kept open for the process lifetime. They are small
(~1 handle each) and macOS expects them to stay loaded.

Tested on Mac mini M2 Pro, macOS Tahoe 26.3.1. Without the fix, the agent
crashes within hours. With the fix, stable for 4+ days.

Fixes shirou#1832

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lubeschanin lubeschanin force-pushed the fix/sensors-darwin-arm64-crash branch from ea57bfc to 76137fe on March 27, 2026 16:50
Owner

@shirou shirou left a comment


LGTM. Root cause is clear and the fix is correct — keeping framework handles open for the process lifetime is the right approach. The sync.RWMutex addition to getFunc properly addresses the data race from shared handles. Thanks!


Development

Successfully merging this pull request may close these issues.

Occasional crash when using sensors on MacOS arm64
