fix: resolve multiple memory leaks causing unbounded growth #16695
binarydoubling wants to merge 6 commits into anomalyco:dev
- Cap SDK event queue to prevent unbounded growth during high event throughput
- Clean up message parts when trimming excess messages in sync store
- Evict previous session data from memory when switching sessions
- Bound LSP diagnostics map with LRU eviction and clear on shutdown
- Reject pending callbacks on session cancel to prevent promise/closure leaks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
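The LRU eviction mentioned for the diagnostics map can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `LruMap` is a hypothetical helper name, and it relies only on the fact that a JavaScript `Map` iterates keys in insertion order.

```typescript
// Hypothetical helper (not from the PR): a Map with a fixed capacity and
// least-recently-used eviction. Map preserves insertion order, so the first
// key returned by keys() is always the least recently used one.
class LruMap<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly capacity: number

  constructor(capacity: number) {
    this.capacity = capacity
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // Re-insert so the key becomes the most recently used entry.
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    // Deleting first moves an existing key to the most recent position.
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.capacity) {
      // Evict the least recently used entry (first in insertion order).
      const oldest = this.entries.keys().next().value as K
      this.entries.delete(oldest)
    }
  }

  clear(): void {
    this.entries.clear()
  }

  get size(): number {
    return this.entries.size
  }
}
```

With a capacity of 200, as in the PR description, a map of this shape would hold diagnostics for at most 200 files, and `clear()` would be the call made on shutdown.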
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please:
See CONTRIBUTING.md for details.
The following comment was made by an LLM, it may be inaccurate:

Based on my search, I found several related PRs that address memory leaks:

Potentially Related PRs:

These PRs may be addressing overlapping memory leak issues. You should verify if PR #16695 duplicates work from any of these, particularly #16346, #16241, #7050, or #13514.
Thanks for updating your PR! It now meets our contributing guidelines. 👍
- Add onCleanup for all sdk.event.on() listeners in app.tsx, session route, and prompt component to prevent listener accumulation on re-render
- Add onCleanup for process SIGUSR2 handler in theme provider
- Add onCleanup for leader timeout in keybind provider
- Clear event queue on SDK context cleanup
- Remove empty subscription arrays in Bus and RPC listener maps
- Clear warning timeout in state disposal after completion
- Clear models refresh interval on process exit
- Add dispose() to ShareNext to clear pending sync timeouts
- Clear PTY session buffer on removal to free memory immediately

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
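The listener cleanup and the removal of empty subscription arrays can be illustrated with a small event bus. This is a sketch under assumptions: `Bus`, `subscribe`, `publish`, and `topicCount` are hypothetical names chosen for the example and are not the project's actual Bus API.

```typescript
type Handler<T> = (event: T) => void

// Hypothetical minimal bus (illustrative only). subscribe() returns an
// unsubscribe function, and the per-topic array is deleted once it is empty
// so that no tombstone entries remain in the map.
class Bus<T> {
  private readonly subscriptions = new Map<string, Handler<T>[]>()

  subscribe(type: string, handler: Handler<T>): () => void {
    const list = this.subscriptions.get(type) ?? []
    list.push(handler)
    this.subscriptions.set(type, list)

    let active = true
    return () => {
      // Guard against double unsubscription.
      if (!active) return
      active = false
      const current = this.subscriptions.get(type)
      if (!current) return
      const index = current.indexOf(handler)
      if (index !== -1) current.splice(index, 1)
      // Drop the empty array instead of leaving it in the map.
      if (current.length === 0) this.subscriptions.delete(type)
    }
  }

  publish(type: string, event: T): void {
    // Iterate over a copy so handlers may unsubscribe during dispatch.
    for (const handler of [...(this.subscriptions.get(type) ?? [])]) handler(event)
  }

  get topicCount(): number {
    return this.subscriptions.size
  }
}
```

In a Solid component, the function returned by a subscription of this kind is what would be handed to `onCleanup`, so the listener is removed when the component is disposed instead of accumulating on every re-render.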
Addresses the remaining memory leaks identified in anomalyco#16697 by consolidating the best fixes from 23+ open community PRs into a single coherent changeset.

Fixes consolidated from PRs: anomalyco#16695, anomalyco#16346, anomalyco#14650, anomalyco#15646, anomalyco#13186, anomalyco#10392, anomalyco#7914, anomalyco#9145, anomalyco#9146, anomalyco#7049, anomalyco#16616, anomalyco#16241

- Plugin subscriber stacking: unsub before re-subscribing in init()
- Subagent deallocation: Session.remove() after task completion
- SSE stream cleanup: centralized cleanup with done guard (3 endpoints)
- Compaction data trimming: clear output/attachments on prune
- Process exit cleanup: Instance.disposeAll() with 5s timeout
- Serve cmd: graceful shutdown instead of blocking forever
- Bash tool: ring buffer with 10MB cap instead of O(n²) concat
- LSP index teardown: clear clients/broken/spawning on dispose
- LSP open-files cap: evict oldest when >1000 tracked files
- Format subscription: store and cleanup unsub handle
- Permission/Question clearSession: reject pending on session delete
- Session.remove() cleanup chain: FileTime, Permission, Question
- ShareNext subscription cleanup: store unsub handles, cleanup on dispose
- OAuth transport: close existing before replacing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
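The idea behind capping bash tool output can be sketched as below. Note the assumptions: this is a chunk list that drops the oldest data rather than a true fixed-array ring buffer, it counts UTF-16 code units rather than bytes, and `CappedOutput` is a hypothetical name, not the class used in the PR.

```typescript
// Hypothetical sketch: keeps only the most recent `cap` characters of output.
// Chunks are stored in an array and joined once on read, which avoids
// rebuilding one ever-growing string on every append.
class CappedOutput {
  private chunks: string[] = []
  private total = 0
  private readonly cap: number

  constructor(cap: number) {
    this.cap = cap
  }

  append(chunk: string): void {
    if (chunk.length >= this.cap) {
      // A single oversized chunk replaces everything; keep only its tail.
      this.chunks = [chunk.slice(chunk.length - this.cap)]
      this.total = this.cap
      return
    }
    this.chunks.push(chunk)
    this.total += chunk.length
    // Trim from the front until the retained output fits under the cap.
    while (this.total > this.cap) {
      const overflow = this.total - this.cap
      const head = this.chunks[0] as string
      if (head.length <= overflow) {
        this.chunks.shift()
        this.total -= head.length
      } else {
        this.chunks[0] = head.slice(overflow)
        this.total -= overflow
      }
    }
  }

  get size(): number {
    return this.total
  }

  toString(): string {
    return this.chunks.join("")
  }
}
```

A buffer constructed as `new CappedOutput(10 * 1024 * 1024)` would approximate the 10MB cap described above, with the caveat that it measures characters, not bytes.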
Update: Consolidated fixes from 23+ community PRs

I've reviewed every open memory leak PR referenced in #16697 and consolidated the best version of each fix into this branch. The latest commit (

New fixes added (from community PRs)
What's NOT in this PR (and why)

Some fixes from the community PRs were intentionally excluded:
Verification
Port robust process exit detection from PR anomalyco#15757 to fix zombie/stuck child processes in containers where Bun fails to deliver exit events.

- Add polling watchdog to bash tool and Process.spawn that detects process exit via kill(pid, 0) when event-loop events are missed
- Add process registry (active map) with stale/reap exports for server-level watchdog to detect and clean up stuck bash processes
- Improve Shell.killTree with alive() helper and proper SIGKILL escalation after SIGTERM timeout
- Add session-level watchdog interval in prompt loop to periodically reap stale bash processes

Based on the work in anomalyco#15757.

Co-Authored-By: Nacho F. Lizaur <NachoFLizaur@users.noreply.github.com>
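The polling approach described above, checking liveness with signal 0, might look roughly like this. It is a sketch, not the ported implementation: the `kill` function is injected (in Node or Bun one would pass `process.kill`), and `watch`, `onExit`, and the one-second default interval are assumptions made for the example.

```typescript
// Shape of Node's process.kill(pid, signal); injected so the sketch does not
// depend on a particular runtime and can be exercised with a fake.
type Kill = (pid: number, signal: number) => unknown

// Signal 0 delivers nothing; it only checks whether the process can be signalled.
function alive(pid: number, kill: Kill): boolean {
  try {
    kill(pid, 0)
    return true
  } catch (error) {
    // EPERM means the process exists but belongs to someone else.
    // Any other error (typically ESRCH) is treated as "gone".
    return (error as { code?: string }).code === "EPERM"
  }
}

// Hypothetical watchdog: polls until the process is gone, then calls onExit
// exactly once. Returns a function that stops the polling.
function watch(pid: number, kill: Kill, onExit: () => void, intervalMs = 1000): () => void {
  const timer = setInterval(() => {
    if (alive(pid, kill)) return
    clearInterval(timer)
    onExit()
  }, intervalMs)
  return () => clearInterval(timer)
}
```

The function returned by `watch` should be called when the normal exit event does arrive, so the interval itself does not become another leak.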
Complete the port of PR anomalyco#15757 with remaining pieces:

- Add stdio end event redundancy as third fallback for exit detection (fires when pipe file descriptors close, independent of exit events)
- Add diagnostic log.info calls at spawn, abort, timeout, and each exit detection path for debugging container issues
- Add comprehensive tests: defensive patterns, polling watchdog isolation, Shell.killTree, server-level watchdog (stale/reap), stdio end events, and Process.spawn defensive patterns
- Skip truncation tests on Windows (matching upstream)

Co-Authored-By: Nacho F. Lizaur <NachoFLizaur@users.noreply.github.com>
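With three independent exit detection paths (exit event, polling watchdog, stdio end), some guard is needed so that only the first one to fire settles the result. One possible shape for such a guard is sketched below; `settleOnce` is a hypothetical helper name and is not taken from the PR.

```typescript
// Hypothetical helper: wraps a callback so that only the first caller wins.
// The returned function reports whether this call was the one that settled.
function settleOnce<T>(settle: (value: T) => void): (value: T) => boolean {
  let done = false
  return (value: T) => {
    if (done) return false
    done = true
    settle(value)
    return true
  }
}
```

Each detection path would call the same wrapped function with a label such as "exit", "poll", or "stdio-end"; later calls become no-ops, and the boolean return value makes it easy to log which path actually detected the exit.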
On Windows, stdio pipe end events can fire before the exit event populates proc.exitCode, causing it to be null in the result metadata. Fall back to 0 (or 1 if signalCode is set) when exitCode is null, matching the same pattern used in Process.spawn. Co-Authored-By: Nacho F. Lizaur <NachoFLizaur@users.noreply.github.com>
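The fallback described in this commit message can be written as a small pure function. This is a sketch of the stated rule only; `resolveExitCode` is a hypothetical name, and the parameter types mirror the nullable `exitCode` and `signalCode` properties of a Node-style child process.

```typescript
// Hypothetical helper: picks an exit code when the exit event has not yet
// populated exitCode. A set signalCode is treated as a failure (1),
// otherwise the process is assumed to have exited cleanly (0).
function resolveExitCode(exitCode: number | null, signalCode: string | null): number {
  if (exitCode !== null) return exitCode
  return signalCode ? 1 : 0
}
```

The explicit `null` check matters here: a real exit code of 0 must be returned as-is rather than falling through to the signal check.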
Issue for this PR
Closes #16697
Type of change
What does this PR do?
Fixes multiple sources of unbounded memory growth across the TUI, core subsystems, and server-side components. These leaks compound during extended usage, causing RAM to climb monotonically.
TUI event listener leaks:
- app.tsx: 6 sdk.event.on() calls had no cleanup, so listeners accumulated on re-render. Now collected and unsubscribed via onCleanup.
- session/index.tsx: message.part.updated listener accumulated every time a session was opened. Added onCleanup.
- prompt/index.tsx: PromptAppend listener leaked on prompt remount. Added onCleanup.
- theme.tsx: process.on("SIGUSR2") handler never removed. Now cleaned up via onCleanup.
- keybind.tsx: Leader mode timeout never cleared on provider unmount. Added onCleanup.
- sdk.tsx: Event queue not cleared on context cleanup. Now zeroed out.

Core memory leaks:
- sdk.tsx: Event queue grew unbounded. Now capped at 1000 with oldest-event eviction.
- sync.tsx: Message trimming only removed 1 message at a time and leaked associated parts. Now removes all excess and cleans up parts. Session switching now frees previous session's data.
- lsp/client.ts: Diagnostics map grew without bound. Added 200-file cap with LRU eviction, early return on empty diagnostics, and clear on shutdown.
- session/prompt.ts: Pending callbacks not rejected on cancel. Now rejected to prevent promise/closure leaks.
- project/state.ts: Warning timeout in disposal never cleared. Now cleared after disposal completes.
- provider/models.ts: Module-level setInterval for model refresh never cleared. Now cleared on process exit.
- share/share-next.ts: Added dispose() to clear pending sync timeouts.

Collection cleanup:
- bus/index.ts: Empty subscription arrays left as tombstones after unsubscribe. Now deleted.
- util/rpc.ts: Empty listener Sets left in map after handler removal. Now deleted.
- pty/index.ts: Session buffer not cleared on removal. Now zeroed to free memory immediately.

How did you verify your code works?
Ran the TUI against a large monorepo, switched between sessions repeatedly, and monitored RSS over ~30 minutes. Memory stabilized instead of growing monotonically. Typechecks pass cleanly.
Screenshots / recordings
N/A — not a UI change
Checklist