fix(core): Free WASM resources and cap MCP caches to prevent memory leaks#1379
fix(core): Free WASM resources and cap MCP caches to prevent memory leaks#1379
Conversation
…eaks Two independent fixes for memory leaks in long-running or heavy usage: Tree-sitter WASM leak: parser.parse() returns a Tree object backed by WASM heap memory that JS GC cannot reclaim. Added tree.delete() in a finally block and query.delete() in LanguageParser.dispose(). MCP cache growth: outputFileRegistry and fileChangeCountsCache are module-level Maps that grow without limit in long-running MCP servers. Added size caps with FIFO eviction. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
⚡ Performance Benchmark
Details
|
📝 WalkthroughWalkthroughThe PR introduces memory management improvements by adding fixed upper bounds to two caches/registries with LRU-style eviction (fileChangeCountsCache and outputFileRegistry), and ensuring explicit deletion of Tree-sitter Query and Tree objects during cleanup to prevent memory leaks. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request addresses potential memory leaks and unbounded resource growth in long-running processes. Key changes include implementing FIFO eviction strategies for caches in outputSort.ts and mcpToolRuntime.ts, and ensuring native WASM memory is properly released in tree-sitter components by calling .delete() on queries and trees. Review feedback highlights that gitAvailabilityCache should also be capped to prevent memory growth and suggests implementing disk cleanup for temporary files when they are evicted from the output registry.
| // Capped to prevent unbounded growth in long-running MCP server processes. | ||
| // Key format: `${cwd}:${maxCommits}` | ||
| const MAX_CACHE_SIZE = 50; | ||
| const fileChangeCountsCache = new Map<string, Record<string, number>>(); |
There was a problem hiding this comment.
The gitAvailabilityCache (declared at line 15) is also a module-level Map that grows without limit as different directories are processed. To fully address the memory leak concerns in long-running processes, this cache should also be capped using a similar FIFO eviction strategy as fileChangeCountsCache. This is particularly relevant because cwd changes for every remote repository packing operation.
| if (outputFileRegistry.size >= MAX_REGISTRY_SIZE) { | ||
| const oldestKey = outputFileRegistry.keys().next().value; | ||
| if (oldestKey !== undefined) { | ||
| outputFileRegistry.delete(oldestKey); | ||
| } | ||
| } |
There was a problem hiding this comment.
While evicting entries from outputFileRegistry prevents memory growth, the corresponding temporary files on disk are not deleted. In long-running scenarios with heavy usage, this could lead to significant disk space consumption in the system's temporary directory. Consider implementing a best-effort cleanup of the file when an entry is evicted (e.g., using fs.unlink). Note that registerOutputFile is currently synchronous, so this would need to be handled carefully to avoid blocking or complex async logic.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1379 +/- ##
==========================================
- Coverage 87.37% 87.24% -0.13%
==========================================
Files 115 115
Lines 4371 4383 +12
Branches 1015 1019 +4
==========================================
+ Hits 3819 3824 +5
- Misses 552 559 +7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/core/output/outputSort.ts (1)
14-15: Consider boundinggitAvailabilityCachefor consistency.While
fileChangeCountsCacheis now bounded,gitAvailabilityCache(line 15) remains unbounded. In long-running MCP server processes with many distinctcwdvalues, this could also grow without limit.The impact is likely minimal since it only stores boolean values, but for consistency with the PR's goal of preventing unbounded cache growth, you might consider applying the same eviction pattern here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/output/outputSort.ts` around lines 14 - 15, gitAvailabilityCache is currently an unbounded Map while fileChangeCountsCache was changed to a bounded LRU-style cache; make gitAvailabilityCache use the same bounded eviction strategy to prevent unbounded growth. Replace the Map<string, boolean> gitAvailabilityCache with the same bounded cache implementation used for fileChangeCountsCache (reuse the same MAX_CACHE_ENTRIES constant and eviction logic or helper class), ensure lookups/sets use the new cache API, and keep the key as cwd and value as boolean so all existing call sites (references to gitAvailabilityCache) continue to work.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/core/output/outputSort.ts`:
- Around line 14-15: gitAvailabilityCache is currently an unbounded Map while
fileChangeCountsCache was changed to a bounded LRU-style cache; make
gitAvailabilityCache use the same bounded eviction strategy to prevent unbounded
growth. Replace the Map<string, boolean> gitAvailabilityCache with the same
bounded cache implementation used for fileChangeCountsCache (reuse the same
MAX_CACHE_ENTRIES constant and eviction logic or helper class), ensure
lookups/sets use the new cache API, and keep the key as cwd and value as boolean
so all existing call sites (references to gitAvailabilityCache) continue to
work.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0a08b6b7-5abe-4b97-bcdf-bee084b6476f
📒 Files selected for processing (4)
src/core/output/outputSort.tssrc/core/treeSitter/languageParser.tssrc/core/treeSitter/parseFile.tssrc/mcp/tools/mcpToolRuntime.ts
Code ReviewOverall this is a clean, well-motivated PR. The WASM resource cleanup is correct and the cache caps are a sensible safeguard for long-running MCP servers. Approve with minor suggestions. Highlights
Suggestions1. Missing tests for eviction behavior (recommended)The FIFO eviction logic is the core new behavior in this PR but has no test coverage. Two tests would be straightforward:
These would catch regressions if the eviction logic is ever refactored. 2.
|
|
Closing — no measurable memory improvement in benchmarks (compress mode 300-iteration test showed identical RSS/heap between this branch and main). The WASM trees for typical source files are too small to produce visible impact, and MCP cache growth is negligible in practice. |
Two independent fixes for memory leaks that affect long-running or heavy usage scenarios:
1. Tree-sitter WASM memory leak (compress/removeComments mode)
parser.parse()returns aTreeobject backed by WASM heap memory. Unlike JS objects, WASM allocations are not managed by the garbage collector — they must be explicitly freed viatree.delete(). This call was missing, causing WASM heap to grow with every parsed file.Similarly,
Queryobjects were not freed inLanguageParser.dispose().Fix: Added
tree.delete()in afinallyblock inparseFile(), andquery.delete()inLanguageParser.dispose().2. MCP server unbounded cache growth
Two module-level
Mapcaches grow without limit in long-running MCP server processes:outputFileRegistryinmcpToolRuntime.ts— accumulates an entry perpack_codebase/pack_remote_repositorycall, never evictedfileChangeCountsCacheinoutputSort.ts— accumulates per uniquecwd:maxCommitscombinationFix: Added size caps (100 and 50 respectively) with FIFO eviction using
Map's insertion-order iteration.Checklist
npm run testnpm run lint