Fix file watcher robustness: dir watching, cleanup, suppression#124
Conversation
…ppression Addresses all review feedback on #121: 1. Watch the data directory instead of individual files, so watchers work even when config.json/trust.json don't exist at daemon startup (P1). Also fixes atomic rename writes (trust-store uses renameSync) that would break per-file watchers on many platforms. 2. Cancel pending debounce timers in stopFileWatchers() to prevent use-after-teardown callbacks firing after server shutdown. 3. Suppress redundant config reload when handleModelSet writes config.json — the method already does the full reload sequence, and a delayed watcher callback would incorrectly evict sessions created in the 200ms debounce window. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| handlers[filename](); | ||
| }, 200); | ||
| debounceTimers.set(filename, timer); | ||
| this.debounceTimers.push(timer); |
There was a problem hiding this comment.
🟡 debounceTimers array grows without bound (memory leak)
Every file-change event pushes a new timer to this.debounceTimers (line 131), but completed or replaced timers are never removed from the array. The local debounceTimers Map correctly deduplicates per filename, but the instance-level array only accumulates.
Root Cause and Impact
When a watched file changes, the handler at assistant/src/daemon/server.ts:125-131 creates a new setTimeout and pushes it to this.debounceTimers. If the same file changes again within the debounce window, the old timer is cleared via the local Map (debounceTimers.delete / clearTimeout), but its reference remains in this.debounceTimers. Over the lifetime of a long-running daemon, every single fs event adds an entry:
const timer = setTimeout(() => {
debounceTimers.delete(filename);
log.info({ file: filename }, 'File changed, reloading');
handlers[filename]();
}, 200);
debounceTimers.set(filename, timer); // local Map — properly replaces
this.debounceTimers.push(timer); // instance array — only growsAdditionally, stopFileWatchers() calls clearTimeout on every entry including already-fired timers (harmless but wasteful), and the array is only cleared on stop(). For a daemon that runs for days/weeks, this is an unbounded memory leak proportional to the number of file-change events received.
Impact: Memory leak in long-running daemon processes. Each fs event leaks a timer ID reference that is never reclaimed until server stop.
Prompt for agents
Replace the this.debounceTimers array with a different approach. The simplest fix is to store the local debounceTimers Map as an instance field (e.g. this.debounceTimers = new Map<string, ReturnType<typeof setTimeout>>()) instead of using both a local Map and an instance array. Then in stopFileWatchers(), iterate the Map values to clearTimeout, and clear the Map. This eliminates the unbounded growth since the Map naturally deduplicates by filename. Specifically:
1. Change the class field from `private debounceTimers: ReturnType<typeof setTimeout>[] = []` to `private debounceTimers = new Map<string, ReturnType<typeof setTimeout>>()`
2. In startFileWatchers(), remove the local `debounceTimers` Map and use `this.debounceTimers` directly
3. In stopFileWatchers(), change to: `for (const timer of this.debounceTimers.values()) { clearTimeout(timer); } this.debounceTimers.clear();`
Was this helpful? React with 👍 or 👎 to provide feedback.
| // would incorrectly evict sessions created after this method returns. | ||
| this.suppressConfigReload = true; | ||
| saveRawConfig(raw); | ||
| setTimeout(() => { this.suppressConfigReload = false; }, 300); |
There was a problem hiding this comment.
🟡 suppressConfigReload reset timer not tracked for cleanup on stop
The setTimeout at line 372 that resets suppressConfigReload to false is not added to this.debounceTimers, so it is not cancelled when stopFileWatchers() is called during server shutdown.
Detailed Explanation
In handleModelSet at assistant/src/daemon/server.ts:370-372:
this.suppressConfigReload = true;
saveRawConfig(raw);
setTimeout(() => { this.suppressConfigReload = false; }, 300);This PR specifically aims to fix use-after-teardown issues from timers firing after stopFileWatchers(). The debounce timers in startFileWatchers are tracked and cancelled on stop, but this 300ms timer in handleModelSet is not tracked at all. If stop() is called within 300ms of a model_set command, this timer fires after teardown.
The practical impact is minimal (it just sets a boolean to false on a stopped server), but it contradicts the PR's own stated goal and pattern. More importantly, if the server were ever restarable or if future code adds teardown logic that depends on timer quiescence, this would be a real problem.
Impact: Minor inconsistency — timer fires after server stop, but only sets a boolean with no side effects on a stopped server.
| setTimeout(() => { this.suppressConfigReload = false; }, 300); | |
| const resetTimer = setTimeout(() => { this.suppressConfigReload = false; }, 300); | |
| this.debounceTimers.push(resetTimer); | |
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 21745c9fbb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| this.suppressConfigReload = true; | ||
| saveRawConfig(raw); |
There was a problem hiding this comment.
Reset reload suppression on config write failure
handleModelSet sets suppressConfigReload = true before calling saveRawConfig, but the reset is only scheduled afterward; if saveRawConfig throws (for example due to permissions or disk errors), the flag stays true indefinitely and later config.json watcher events are ignored by the early return in the watcher handler. That leaves hot-reload effectively disabled until daemon restart after a single failed model update.
Useful? React with 👍 / 👎.
Address review feedback on PR #124: - Replace debounceTimers array with Map instance field to prevent unbounded growth — the Map naturally deduplicates by filename - Track the suppressConfigReload reset timer in the Map so it gets cancelled on server stop - Reset suppressConfigReload immediately if saveRawConfig throws, preventing hot-reload from staying permanently disabled - Remove unused `join` import Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address review feedback on PR #124: - Replace debounceTimers array with Map instance field to prevent unbounded growth — the Map naturally deduplicates by filename - Track the suppressConfigReload reset timer in the Map so it gets cancelled on server stop - Reset suppressConfigReload immediately if saveRawConfig throws, preventing hot-reload from staying permanently disabled - Remove unused `join` import Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
handleModelSetwrites config.json (it already does the full reload)Context
Addresses all 5 review comments on #121 from Devin and Codex:
watchFileskipped when file doesn't exist at startup — new users who haven't created config.json/trust.json yet would never get hot reloadrenameSyncin trust-store) break per-filefs.watch()on many platformsstopFileWatchers(), causing use-after-teardownhandleModelSet→saveRawConfigtriggers the file watcher, causing a redundant reload that evicts sessions created in the 200ms debounce windowThe directory-watching approach solves #1 and #2 together. Debounce timer tracking solves #3. The
suppressConfigReloadflag solves #4.Test plan
npx tsc --noEmitpassesbun testpasses (212 tests)vellum config set model ...— verify hot reload triggers/model— verify no duplicate reload in daemon logs🤖 Generated with Claude Code