Skip to content

fix(fs.watch): fix second watcher not receiving events after close#27044

Closed
robobun wants to merge 1 commit into
mainfrom
claude/fix-fs-watch-after-close-18919
Closed

fix(fs.watch): fix second watcher not receiving events after close#27044
robobun wants to merge 1 commit into
mainfrom
claude/fix-fs-watch-after-close-18919

Conversation

@robobun

@robobun robobun commented Feb 15, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Fixes fs.watch not receiving events when a new watcher is created for the same file after a previous watcher is closed (fs.watch does not work after previous .close() #18919)
  • Fixes a use-after-free (ASAN: use-after-poison) in the file watcher thread when path memory was freed while still referenced by watchlist entries

Root cause

When fs.watch is closed, its watchlist entry is marked for deferred eviction (not immediately removed). When a new fs.watch is created for the same file, Watcher.addFile finds the stale entry by hash and returns early — skipping inotify/kqueue registration for the new file descriptor. The watcher thread eventually evicts the old entry and closes its fd, leaving the new watcher with no active platform watch.

Additionally, _decrementPathRefNoLock freed the path string while it was still referenced by the watcher thread's watchlist entries (including parent directory entries whose file_path is a substring of the freed path), causing use-after-free on the watcher thread.

Fix

  1. Watcher.addFile: When an existing entry with matching hash is found and it's pending eviction, revive it by removing from the eviction list and updating in-place with the new fd, file path, and platform watch registration (reviveEvictedEntry).
  2. _decrementPathRefNoLock: Don't free the path string — it's still referenced by the watchlist. Accept the small bounded leak (one path string per close/re-watch cycle) rather than risk memory corruption.

Test plan

  • New regression test test/regression/issue/18919.test.ts — creates a watcher, triggers an event, closes it, creates a second watcher on the same file, and verifies the second watcher receives events
  • Test fails with system bun (USE_SYSTEM_BUN=1) — confirms the bug exists
  • Test passes with debug build (bun bd test) — confirms the fix works
  • No ASAN errors in debug build
  • Existing test/js/node/watch/ tests pass (same pre-existing failures as before)

Fixes #18919.

🤖 Generated with Claude Code

When an `fs.watch` watcher was closed and a new one created for the same
file, the second watcher would never receive events. This happened because:

1. Closing a watcher marked the watchlist entry for eviction but didn't
   immediately remove it.
2. When the new watcher called `addFile`, `indexOf(hash)` found the stale
   (pending-eviction) entry and returned early, skipping inotify/kqueue
   registration for the new file descriptor.
3. The watcher thread would eventually evict the old entry and close its
   fd, leaving the new watcher with no active platform watch.

The fix adds `reviveEvictedEntry` to `Watcher.addFile`: when an existing
entry is found and it's pending eviction, the entry is removed from the
eviction list and updated in-place with the new fd, file path, and
platform watch registration (inotify on Linux, kqueue on macOS).

Also fixes a use-after-free in `_decrementPathRefNoLock` where the path
string was freed while still referenced by the watcher thread's watchlist
entries (including parent directory entries whose file_path was a substring
of the freed path).

Fixes #18919.

Co-Authored-By: Claude <noreply@anthropic.com>
@robobun

robobun commented Feb 15, 2026

Copy link
Copy Markdown
Collaborator Author
Updated 6:12 PM PT - Feb 14th, 2026

❌ Your commit 9940a4c3 has 1 failures in Build #37318 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 27044

That installs a local version of the PR into your bun-27044 executable, so you can run:

bun-27044 --bun

@coderabbitai

coderabbitai Bot commented Feb 15, 2026

Copy link
Copy Markdown
Contributor

Walkthrough

The changes implement a mechanism to revive previously evicted watchlist entries when a file is re-watched after being closed, preventing loss of watch capability. This includes a new helper function for entry revival with platform-specific re-registration and adjustments to path reference management to prevent use-after-free during concurrent watcher operations.

Changes

Cohort / File(s) Summary
Watcher Entry Revival
src/Watcher.zig
Introduces private reviveEvictedEntry helper that removes entries from the eviction list, closes stale file descriptors, updates watchlist entries in-place with new fd and file path, and re-registers with platform watchers (kqueue/inotify). Integrates revival logic into addFile to handle hash collisions by attempting entry revival before insertion.
Path Reference Management
src/bun.js/node/path_watcher.zig
Modifies _decrementPathRefNoLock to defer path string deallocation; path remains referenced by the Watcher's watchlist and is freed later during eviction or manager destruction to avoid use-after-free in the watcher thread.
Regression Test
test/regression/issue/18919.test.ts
Adds integration test validating sequential fs.watch creation after closing a previous watcher on the same file, ensuring subsequent watchers correctly detect file changes after the first watcher is closed.
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (6 files):

⚔️ src/Watcher.zig (content)
⚔️ src/bun.js/ModuleLoader.zig (content)
⚔️ src/bun.js/VirtualMachine.zig (content)
⚔️ src/bun.js/node/path_watcher.zig (content)
⚔️ src/bundler/entry_points.zig (content)
⚔️ test/js/bun/http/bun-serve-html-entry.test.ts (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely describes the primary fix—addressing the issue where a second fs.watch fails to receive events after the first watcher is closed.
Description check ✅ Passed The PR description comprehensively covers root causes, fixes implemented, and test validation, aligning with the required template structure.
Linked Issues check ✅ Passed The code changes directly address both core issues from #18919: reviveEvictedEntry handles the stale entry problem, and avoiding path string deallocation prevents use-after-free errors.
Out of Scope Changes check ✅ Passed All changes are tightly scoped to fixing the file watcher eviction and memory management issues described in #18919, with no unrelated modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/Watcher.zig`:
- Around line 789-793: The code unsafely assumes new_file_path is
null-terminated by casting new_file_path.ptr[0..new_file_path.len :0] for
watchPath; make the null-termination requirement explicit by changing the
parameter type of the function that takes new_file_path to [:0]const u8 (or
otherwise validate/ensure the sentinel before casting), update call sites such
as addFile and uses of WatchItem.file_path to either pass a [:0]const u8 (e.g.,
from PathInfo.path) or perform an explicit check/cast with sentinel
verification, and keep the watchPath invocation using the safe [:0]const u8
slice so you no longer rely on an unchecked sentinel cast in Watcher.zig.
- Around line 757-797: reviveEvictedEntry mutates eviction state and replaces
the fd/file_path before attempting platform registration, so on Linux a failed
platform.watchPath leaves a live entry with no watch; fix by performing platform
registration first (call this.platform.watchPath(...) on Linux and
addFileDescriptorToKQueueWithoutChecks(...) on mac) and only after it succeeds
remove the index from evict_list, close the old fd, and update watchlist slice
items (.fd, .file_path, .eventlist_index). Alternatively, if you prefer minimal
change, capture the original evict_list_i, evict_list contents, original fd and
file_path and restore them (and re-add the index to evict_list) if
platform.watchPath/.addFileDescriptorToKQueueWithoutChecks fails so the entry
remains pending eviction; reference reviveEvictedEntry, evict_list/evict_list_i,
watchlist.slice(), slice.items(.fd)/.file_path/.eventlist_index, and
platform.watchPath/addFileDescriptorToKQueueWithoutChecks when making the
change.

In `@test/regression/issue/18919.test.ts`:
- Around line 44-48: The test captures stdout, stderr, and exitCode from proc
but never asserts on stderr; add an assertion that stderr is empty (or contains
no error text) after reading it to ensure no errors were logged—e.g., add
expect(stderr).toBe("") (or a more specific check) alongside the existing
expect(stdout)... and expect(exitCode)... checks so that variables stdout,
stderr, exitCode (and proc) are validated.

Comment thread src/Watcher.zig
Comment on lines +757 to +797
fn reviveEvictedEntry(this: *Watcher, index: WatchItemIndex, new_fd: bun.FileDescriptor, new_file_path: string) bool {
// Check if this index is in the eviction list
var found = false;
var write: WatchItemIndex = 0;
for (this.evict_list[0..this.evict_list_i]) |item| {
if (item == index) {
found = true;
continue; // remove from eviction list
}
this.evict_list[write] = item;
write += 1;
}
if (!found) return false;
this.evict_list_i = write;

// Close the old fd (it's stale since the previous watcher was closed)
var slice = this.watchlist.slice();
const fds = slice.items(.fd);
if (!Environment.isWindows) {
if (index < fds.len and fds[index].isValid()) {
fds[index].close();
}
}

// Update the entry in-place with new fd and file_path
fds[index] = new_fd;
slice.items(.file_path)[index] = new_file_path;

// Re-register with the platform watcher (inotify/kqueue)
if (comptime Environment.isMac) {
this.addFileDescriptorToKQueueWithoutChecks(new_fd, index);
} else if (comptime Environment.isLinux) {
const path_z: [:0]const u8 = new_file_path.ptr[0..new_file_path.len :0];
slice.items(.eventlist_index)[index] = switch (this.platform.watchPath(path_z)) {
.err => return false,
.result => |r| r,
};
}

return true;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Partial update risk when Linux inotify registration fails.

If the entry is found in the eviction list, the function:

  1. Removes it from the eviction list (lines 761-770)
  2. Closes the old fd (lines 775-778)
  3. Updates fd and file_path (lines 782-783)
  4. Attempts inotify registration (lines 788-793)

If step 4 fails, the function returns false, but steps 1-3 have already been applied. The entry is now:

  • No longer pending eviction
  • Has the new fd and path
  • Has a stale eventlist_index (no valid inotify watch)

This leaves a zombie entry that won't receive events and won't be cleaned up.

🛡️ Suggested fix: restore eviction on failure
     // Re-register with the platform watcher (inotify/kqueue)
     if (comptime Environment.isMac) {
         this.addFileDescriptorToKQueueWithoutChecks(new_fd, index);
     } else if (comptime Environment.isLinux) {
         const path_z: [:0]const u8 = new_file_path.ptr[0..new_file_path.len :0];
         slice.items(.eventlist_index)[index] = switch (this.platform.watchPath(path_z)) {
-            .err => return false,
+            .err => {
+                // Registration failed; re-add to eviction list to clean up
+                if (this.evict_list_i < max_eviction_count) {
+                    this.evict_list[this.evict_list_i] = index;
+                    this.evict_list_i += 1;
+                }
+                return false;
+            },
             .result => |r| r,
         };
     }
🤖 Prompt for AI Agents
In `@src/Watcher.zig` around lines 757 - 797, reviveEvictedEntry mutates eviction
state and replaces the fd/file_path before attempting platform registration, so
on Linux a failed platform.watchPath leaves a live entry with no watch; fix by
performing platform registration first (call this.platform.watchPath(...) on
Linux and addFileDescriptorToKQueueWithoutChecks(...) on mac) and only after it
succeeds remove the index from evict_list, close the old fd, and update
watchlist slice items (.fd, .file_path, .eventlist_index). Alternatively, if you
prefer minimal change, capture the original evict_list_i, evict_list contents,
original fd and file_path and restore them (and re-add the index to evict_list)
if platform.watchPath/.addFileDescriptorToKQueueWithoutChecks fails so the entry
remains pending eviction; reference reviveEvictedEntry, evict_list/evict_list_i,
watchlist.slice(), slice.items(.fd)/.file_path/.eventlist_index, and
platform.watchPath/addFileDescriptorToKQueueWithoutChecks when making the
change.

Comment thread src/Watcher.zig
Comment on lines +789 to +793
const path_z: [:0]const u8 = new_file_path.ptr[0..new_file_path.len :0];
slice.items(.eventlist_index)[index] = switch (this.platform.watchPath(path_z)) {
.err => return false,
.result => |r| r,
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Potential type safety issue with sentinel cast on Linux path.

The cast new_file_path.ptr[0..new_file_path.len :0] assumes new_file_path is null-terminated, but the parameter type is string (i.e., []const u8), not [:0]const u8. While current callers pass null-terminated paths (from PathInfo.path which is [:0]const u8), this assumption isn't enforced by the type system.

Consider changing the parameter type to [:0]const u8 to make the null-termination requirement explicit:

♻️ Suggested type change
-fn reviveEvictedEntry(this: *Watcher, index: WatchItemIndex, new_fd: bun.FileDescriptor, new_file_path: string) bool {
+fn reviveEvictedEntry(this: *Watcher, index: WatchItemIndex, new_fd: bun.FileDescriptor, new_file_path: [:0]const u8) bool {

And at the call site in addFile, the file_path parameter would need to be cast or the function signature adjusted. Since WatchItem.file_path is string, you may need to verify the sentinel at the call site or accept the current implicit contract.

🤖 Prompt for AI Agents
In `@src/Watcher.zig` around lines 789 - 793, The code unsafely assumes
new_file_path is null-terminated by casting
new_file_path.ptr[0..new_file_path.len :0] for watchPath; make the
null-termination requirement explicit by changing the parameter type of the
function that takes new_file_path to [:0]const u8 (or otherwise validate/ensure
the sentinel before casting), update call sites such as addFile and uses of
WatchItem.file_path to either pass a [:0]const u8 (e.g., from PathInfo.path) or
perform an explicit check/cast with sentinel verification, and keep the
watchPath invocation using the safe [:0]const u8 slice so you no longer rely on
an unchecked sentinel cast in Watcher.zig.

Comment on lines +44 to +48
const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);

expect(stdout).toContain("File changed");
expect(stdout).toContain("File changed again");
expect(exitCode).toBe(0);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider checking stderr for better test diagnostics.

The test correctly checks stdout before exitCode per guidelines. However, stderr is captured but never examined. If the fix has issues, errors might be logged to stderr while stdout still contains both messages (from partial execution), causing the test to pass misleadingly.

💡 Suggested improvement
   const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);

+  expect(stderr).toBe("");
   expect(stdout).toContain("File changed");
   expect(stdout).toContain("File changed again");
   expect(exitCode).toBe(0);
🤖 Prompt for AI Agents
In `@test/regression/issue/18919.test.ts` around lines 44 - 48, The test captures
stdout, stderr, and exitCode from proc but never asserts on stderr; add an
assertion that stderr is empty (or contains no error text) after reading it to
ensure no errors were logged—e.g., add expect(stderr).toBe("") (or a more
specific check) alongside the existing expect(stdout)... and expect(exitCode)...
checks so that variables stdout, stderr, exitCode (and proc) are validated.

@claude

claude Bot commented Feb 15, 2026

Copy link
Copy Markdown
Contributor

Code Review

Newest first

🟡 9940a — 1 issue(s) found

Issue Severity File
Test uses setTimeout instead of awaiting conditions 🟡 test/regression/issue/18919.test.ts

Powered by Claude Code Review

Comment on lines +28 to +35
setTimeout(() => {
fs.writeFileSync(file, 'C');
}, 200);
});

setTimeout(() => {
fs.writeFileSync(file, 'B');
}, 200);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: The setTimeout calls in the -e fixture script (lines 28-30 and 33-35) use a fixed 200ms delay to trigger file writes for fs.watch. While unlikely to cause real flakiness since fs.watch registers synchronously, the existing codebase convention in test/js/node/watch/fs.watch.test.ts uses setInterval with repeated writes (the repeat() helper) to avoid race conditions on macOS where a watcher can be returned before it is actually watching.

Why this is a problem

Issue Description

The test file test/regression/issue/18919.test.ts uses setTimeout(() => { fs.writeFileSync(file, 'C'); }, 200) and setTimeout(() => { fs.writeFileSync(file, 'B'); }, 200) inside the spawned -e fixture script to trigger file writes that an fs.watch watcher should detect. The CLAUDE.md rule states: "Do not use setTimeout in tests. Instead, await the condition to be met."

Specific Code

The two setTimeout calls appear at lines 28-30 and 33-35 of the test file, both inside the inline -e script passed to Bun.spawn. The first triggers the initial file change that the outer fs.watch watcher detects, and the second triggers a change for a nested watcher created after the first one closes.

Why Existing Code Does Not Fully Prevent This

The outer test code itself is well-structured: it properly uses await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]) to await the spawned process completing, which is the correct pattern. The setTimeout usage is confined to the child process fixture script, not the test runner. This makes the concern more nuanced than a straightforward CLAUDE.md violation in the test body.

Assessment

All three verifiers agree this is at most a style nit. The setTimeout calls are inside a spawned child process fixture, not in the test runner itself, which softens the CLAUDE.md violation. The test is unlikely to be flaky in practice because fs.watch on Linux uses inotify which registers synchronously, and 200ms is generous. However, the codebase already has a well-established pattern (setInterval with repeated writes via the repeat() helper in fs.watch.test.ts) designed to handle the exact race condition that single-write setTimeout could theoretically hit on macOS. Following this convention would make the test more robust and consistent.

Suggested Fix

Replace the two setTimeout + single writeFileSync calls in the fixture script with setInterval loops that repeatedly write the file every 10-20ms, and clear the interval inside the fs.watch callback (similar to the repeat() pattern in test/js/node/watch/fs.watch.test.ts). This ensures the write will be detected even if the watcher is not immediately ready.

@github-actions

Copy link
Copy Markdown
Contributor

Closing this PR because it has been inactive for more than 90 days.

@github-actions github-actions Bot closed this May 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fs.watch does not work after previous .close()

1 participant