Skip to content

Fix file watcher robustness: dir watching, cleanup, suppression#124

Merged
siddseethepalli merged 1 commit into
mainfrom
fix/file-watcher-robustness
Feb 8, 2026
Merged

Fix file watcher robustness: dir watching, cleanup, suppression#124
siddseethepalli merged 1 commit into
mainfrom
fix/file-watcher-robustness

Conversation

@siddseethepalli
Copy link
Copy Markdown
Contributor

@siddseethepalli siddseethepalli commented Feb 8, 2026

Summary

  • Watch the data directory instead of individual files — handles files that don't exist at startup and atomic rename writes
  • Cancel pending debounce timers on server stop to prevent use-after-teardown
  • Suppress redundant config reload when handleModelSet writes config.json (it already does the full reload)

Context

Addresses all 5 review comments on #121 from Devin and Codex:

  1. P1 (both bots): watchFile skipped when file doesn't exist at startup — new users who haven't created config.json/trust.json yet would never get hot reload
  2. P2 (Codex): Atomic rename writes (renameSync in trust-store) break per-file fs.watch() on many platforms
  3. P2 (Devin): Debounce timers fire after stopFileWatchers(), causing use-after-teardown
  4. P2 (Devin): handleModelSetsaveRawConfig triggers the file watcher, causing a redundant reload that evicts sessions created in the 200ms debounce window

The directory-watching approach solves #1 and #2 together. Debounce timer tracking solves #3. The suppressConfigReload flag solves #4.

Test plan

  • Verify npx tsc --noEmit passes
  • Verify bun test passes (212 tests)
  • Manual: start daemon without config.json, then vellum config set model ... — verify hot reload triggers
  • Manual: call /model — verify no duplicate reload in daemon logs
  • Manual: stop daemon shortly after config change — verify no errors from stale timers

🤖 Generated with Claude Code


Open with Devin

…ppression

Addresses all review feedback on #121:

1. Watch the data directory instead of individual files, so watchers work
   even when config.json/trust.json don't exist at daemon startup (P1).
   Also fixes atomic rename writes (trust-store uses renameSync) that
   would break per-file watchers on many platforms.

2. Cancel pending debounce timers in stopFileWatchers() to prevent
   use-after-teardown callbacks firing after server shutdown.

3. Suppress redundant config reload when handleModelSet writes
   config.json — the method already does the full reload sequence,
   and a delayed watcher callback would incorrectly evict sessions
   created in the 200ms debounce window.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@siddseethepalli siddseethepalli merged commit a147915 into main Feb 8, 2026
@siddseethepalli siddseethepalli deleted the fix/file-watcher-robustness branch February 8, 2026 22:27
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 5 additional findings in Devin Review.

Open in Devin Review

handlers[filename]();
}, 200);
debounceTimers.set(filename, timer);
this.debounceTimers.push(timer);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 debounceTimers array grows without bound (memory leak)

Every file-change event pushes a new timer to this.debounceTimers (line 131), but completed or replaced timers are never removed from the array. The local debounceTimers Map correctly deduplicates per filename, but the instance-level array only accumulates.

Root Cause and Impact

When a watched file changes, the handler at assistant/src/daemon/server.ts:125-131 creates a new setTimeout and pushes it to this.debounceTimers. If the same file changes again within the debounce window, the old timer is cleared via the local Map (debounceTimers.delete / clearTimeout), but its reference remains in this.debounceTimers. Over the lifetime of a long-running daemon, every single fs event adds an entry:

const timer = setTimeout(() => {
  debounceTimers.delete(filename);
  log.info({ file: filename }, 'File changed, reloading');
  handlers[filename]();
}, 200);
debounceTimers.set(filename, timer);  // local Map — properly replaces
this.debounceTimers.push(timer);       // instance array — only grows

Additionally, stopFileWatchers() calls clearTimeout on every entry including already-fired timers (harmless but wasteful), and the array is only cleared on stop(). For a daemon that runs for days/weeks, this is an unbounded memory leak proportional to the number of file-change events received.

Impact: Memory leak in long-running daemon processes. Each fs event leaks a timer ID reference that is never reclaimed until server stop.

Prompt for agents
Replace the this.debounceTimers array with a different approach. The simplest fix is to store the local debounceTimers Map as an instance field (e.g. this.debounceTimers = new Map<string, ReturnType<typeof setTimeout>>()) instead of using both a local Map and an instance array. Then in stopFileWatchers(), iterate the Map values to clearTimeout, and clear the Map. This eliminates the unbounded growth since the Map naturally deduplicates by filename. Specifically:

1. Change the class field from `private debounceTimers: ReturnType<typeof setTimeout>[] = []` to `private debounceTimers = new Map<string, ReturnType<typeof setTimeout>>()`
2. In startFileWatchers(), remove the local `debounceTimers` Map and use `this.debounceTimers` directly
3. In stopFileWatchers(), change to: `for (const timer of this.debounceTimers.values()) { clearTimeout(timer); } this.debounceTimers.clear();`
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

// would incorrectly evict sessions created after this method returns.
this.suppressConfigReload = true;
saveRawConfig(raw);
setTimeout(() => { this.suppressConfigReload = false; }, 300);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 suppressConfigReload reset timer not tracked for cleanup on stop

The setTimeout at line 372 that resets suppressConfigReload to false is not added to this.debounceTimers, so it is not cancelled when stopFileWatchers() is called during server shutdown.

Detailed Explanation

In handleModelSet at assistant/src/daemon/server.ts:370-372:

this.suppressConfigReload = true;
saveRawConfig(raw);
setTimeout(() => { this.suppressConfigReload = false; }, 300);

This PR specifically aims to fix use-after-teardown issues from timers firing after stopFileWatchers(). The debounce timers in startFileWatchers are tracked and cancelled on stop, but this 300ms timer in handleModelSet is not tracked at all. If stop() is called within 300ms of a model_set command, this timer fires after teardown.

The practical impact is minimal (it just sets a boolean to false on a stopped server), but it contradicts the PR's own stated goal and pattern. More importantly, if the server were ever restarable or if future code adds teardown logic that depends on timer quiescence, this would be a real problem.

Impact: Minor inconsistency — timer fires after server stop, but only sets a boolean with no side effects on a stopped server.

Suggested change
setTimeout(() => { this.suppressConfigReload = false; }, 300);
const resetTimer = setTimeout(() => { this.suppressConfigReload = false; }, 300);
this.debounceTimers.push(resetTimer);
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 21745c9fbb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +370 to 371
this.suppressConfigReload = true;
saveRawConfig(raw);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reset reload suppression on config write failure

handleModelSet sets suppressConfigReload = true before calling saveRawConfig, but the reset is only scheduled afterward; if saveRawConfig throws (for example due to permissions or disk errors), the flag stays true indefinitely and later config.json watcher events are ignored by the early return in the watcher handler. That leaves hot-reload effectively disabled until daemon restart after a single failed model update.

Useful? React with 👍 / 👎.

siddseethepalli added a commit that referenced this pull request Feb 8, 2026
Address review feedback on PR #124:
- Replace debounceTimers array with Map instance field to prevent
  unbounded growth — the Map naturally deduplicates by filename
- Track the suppressConfigReload reset timer in the Map so it gets
  cancelled on server stop
- Reset suppressConfigReload immediately if saveRawConfig throws,
  preventing hot-reload from staying permanently disabled
- Remove unused `join` import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
siddseethepalli added a commit that referenced this pull request Feb 8, 2026
Address review feedback on PR #124:
- Replace debounceTimers array with Map instance field to prevent
  unbounded growth — the Map naturally deduplicates by filename
- Track the suppressConfigReload reset timer in the Map so it gets
  cancelled on server stop
- Reset suppressConfigReload immediately if saveRawConfig throws,
  preventing hot-reload from staying permanently disabled
- Remove unused `join` import

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant