fix(fs.watch): fix second watcher not receiving events after close#27044
fix(fs.watch): fix second watcher not receiving events after close#27044robobun wants to merge 1 commit into
Conversation
When an `fs.watch` watcher was closed and a new one created for the same file, the second watcher would never receive events. This happened because: 1. Closing a watcher marked the watchlist entry for eviction but didn't immediately remove it. 2. When the new watcher called `addFile`, `indexOf(hash)` found the stale (pending-eviction) entry and returned early, skipping inotify/kqueue registration for the new file descriptor. 3. The watcher thread would eventually evict the old entry and close its fd, leaving the new watcher with no active platform watch. The fix adds `reviveEvictedEntry` to `Watcher.addFile`: when an existing entry is found and it's pending eviction, the entry is removed from the eviction list and updated in-place with the new fd, file path, and platform watch registration (inotify on Linux, kqueue on macOS). Also fixes a use-after-free in `_decrementPathRefNoLock` where the path string was freed while still referenced by the watcher thread's watchlist entries (including parent directory entries whose file_path was a substring of the freed path). Fixes #18919. Co-Authored-By: Claude <noreply@anthropic.com>
|
Updated 6:12 PM PT - Feb 14th, 2026
❌ Your commit
🧪 To try this PR locally: bunx bun-pr 27044That installs a local version of the PR into your bun-27044 --bun |
WalkthroughThe changes implement a mechanism to revive previously evicted watchlist entries when a file is re-watched after being closed, preventing loss of watch capability. This includes a new helper function for entry revival with platform-specific re-registration and adjustments to path reference management to prevent use-after-free during concurrent watcher operations. Changes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@src/Watcher.zig`:
- Around line 789-793: The code unsafely assumes new_file_path is
null-terminated by casting new_file_path.ptr[0..new_file_path.len :0] for
watchPath; make the null-termination requirement explicit by changing the
parameter type of the function that takes new_file_path to [:0]const u8 (or
otherwise validate/ensure the sentinel before casting), update call sites such
as addFile and uses of WatchItem.file_path to either pass a [:0]const u8 (e.g.,
from PathInfo.path) or perform an explicit check/cast with sentinel
verification, and keep the watchPath invocation using the safe [:0]const u8
slice so you no longer rely on an unchecked sentinel cast in Watcher.zig.
- Around line 757-797: reviveEvictedEntry mutates eviction state and replaces
the fd/file_path before attempting platform registration, so on Linux a failed
platform.watchPath leaves a live entry with no watch; fix by performing platform
registration first (call this.platform.watchPath(...) on Linux and
addFileDescriptorToKQueueWithoutChecks(...) on mac) and only after it succeeds
remove the index from evict_list, close the old fd, and update watchlist slice
items (.fd, .file_path, .eventlist_index). Alternatively, if you prefer minimal
change, capture the original evict_list_i, evict_list contents, original fd and
file_path and restore them (and re-add the index to evict_list) if
platform.watchPath/.addFileDescriptorToKQueueWithoutChecks fails so the entry
remains pending eviction; reference reviveEvictedEntry, evict_list/evict_list_i,
watchlist.slice(), slice.items(.fd)/.file_path/.eventlist_index, and
platform.watchPath/addFileDescriptorToKQueueWithoutChecks when making the
change.
In `@test/regression/issue/18919.test.ts`:
- Around line 44-48: The test captures stdout, stderr, and exitCode from proc
but never asserts on stderr; add an assertion that stderr is empty (or contains
no error text) after reading it to ensure no errors were logged—e.g., add
expect(stderr).toBe("") (or a more specific check) alongside the existing
expect(stdout)... and expect(exitCode)... checks so that variables stdout,
stderr, exitCode (and proc) are validated.
| fn reviveEvictedEntry(this: *Watcher, index: WatchItemIndex, new_fd: bun.FileDescriptor, new_file_path: string) bool { | ||
| // Check if this index is in the eviction list | ||
| var found = false; | ||
| var write: WatchItemIndex = 0; | ||
| for (this.evict_list[0..this.evict_list_i]) |item| { | ||
| if (item == index) { | ||
| found = true; | ||
| continue; // remove from eviction list | ||
| } | ||
| this.evict_list[write] = item; | ||
| write += 1; | ||
| } | ||
| if (!found) return false; | ||
| this.evict_list_i = write; | ||
|
|
||
| // Close the old fd (it's stale since the previous watcher was closed) | ||
| var slice = this.watchlist.slice(); | ||
| const fds = slice.items(.fd); | ||
| if (!Environment.isWindows) { | ||
| if (index < fds.len and fds[index].isValid()) { | ||
| fds[index].close(); | ||
| } | ||
| } | ||
|
|
||
| // Update the entry in-place with new fd and file_path | ||
| fds[index] = new_fd; | ||
| slice.items(.file_path)[index] = new_file_path; | ||
|
|
||
| // Re-register with the platform watcher (inotify/kqueue) | ||
| if (comptime Environment.isMac) { | ||
| this.addFileDescriptorToKQueueWithoutChecks(new_fd, index); | ||
| } else if (comptime Environment.isLinux) { | ||
| const path_z: [:0]const u8 = new_file_path.ptr[0..new_file_path.len :0]; | ||
| slice.items(.eventlist_index)[index] = switch (this.platform.watchPath(path_z)) { | ||
| .err => return false, | ||
| .result => |r| r, | ||
| }; | ||
| } | ||
|
|
||
| return true; | ||
| } |
There was a problem hiding this comment.
Partial update risk when Linux inotify registration fails.
If the entry is found in the eviction list, the function:
- Removes it from the eviction list (lines 761-770)
- Closes the old fd (lines 775-778)
- Updates fd and file_path (lines 782-783)
- Attempts inotify registration (lines 788-793)
If step 4 fails, the function returns false, but steps 1-3 have already been applied. The entry is now:
- No longer pending eviction
- Has the new fd and path
- Has a stale
eventlist_index(no valid inotify watch)
This leaves a zombie entry that won't receive events and won't be cleaned up.
🛡️ Suggested fix: restore eviction on failure
// Re-register with the platform watcher (inotify/kqueue)
if (comptime Environment.isMac) {
this.addFileDescriptorToKQueueWithoutChecks(new_fd, index);
} else if (comptime Environment.isLinux) {
const path_z: [:0]const u8 = new_file_path.ptr[0..new_file_path.len :0];
slice.items(.eventlist_index)[index] = switch (this.platform.watchPath(path_z)) {
- .err => return false,
+ .err => {
+ // Registration failed; re-add to eviction list to clean up
+ if (this.evict_list_i < max_eviction_count) {
+ this.evict_list[this.evict_list_i] = index;
+ this.evict_list_i += 1;
+ }
+ return false;
+ },
.result => |r| r,
};
}🤖 Prompt for AI Agents
In `@src/Watcher.zig` around lines 757 - 797, reviveEvictedEntry mutates eviction
state and replaces the fd/file_path before attempting platform registration, so
on Linux a failed platform.watchPath leaves a live entry with no watch; fix by
performing platform registration first (call this.platform.watchPath(...) on
Linux and addFileDescriptorToKQueueWithoutChecks(...) on mac) and only after it
succeeds remove the index from evict_list, close the old fd, and update
watchlist slice items (.fd, .file_path, .eventlist_index). Alternatively, if you
prefer minimal change, capture the original evict_list_i, evict_list contents,
original fd and file_path and restore them (and re-add the index to evict_list)
if platform.watchPath/.addFileDescriptorToKQueueWithoutChecks fails so the entry
remains pending eviction; reference reviveEvictedEntry, evict_list/evict_list_i,
watchlist.slice(), slice.items(.fd)/.file_path/.eventlist_index, and
platform.watchPath/addFileDescriptorToKQueueWithoutChecks when making the
change.
| const path_z: [:0]const u8 = new_file_path.ptr[0..new_file_path.len :0]; | ||
| slice.items(.eventlist_index)[index] = switch (this.platform.watchPath(path_z)) { | ||
| .err => return false, | ||
| .result => |r| r, | ||
| }; |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Potential type safety issue with sentinel cast on Linux path.
The cast new_file_path.ptr[0..new_file_path.len :0] assumes new_file_path is null-terminated, but the parameter type is string (i.e., []const u8), not [:0]const u8. While current callers pass null-terminated paths (from PathInfo.path which is [:0]const u8), this assumption isn't enforced by the type system.
Consider changing the parameter type to [:0]const u8 to make the null-termination requirement explicit:
♻️ Suggested type change
-fn reviveEvictedEntry(this: *Watcher, index: WatchItemIndex, new_fd: bun.FileDescriptor, new_file_path: string) bool {
+fn reviveEvictedEntry(this: *Watcher, index: WatchItemIndex, new_fd: bun.FileDescriptor, new_file_path: [:0]const u8) bool {And at the call site in addFile, the file_path parameter would need to be cast or the function signature adjusted. Since WatchItem.file_path is string, you may need to verify the sentinel at the call site or accept the current implicit contract.
🤖 Prompt for AI Agents
In `@src/Watcher.zig` around lines 789 - 793, The code unsafely assumes
new_file_path is null-terminated by casting
new_file_path.ptr[0..new_file_path.len :0] for watchPath; make the
null-termination requirement explicit by changing the parameter type of the
function that takes new_file_path to [:0]const u8 (or otherwise validate/ensure
the sentinel before casting), update call sites such as addFile and uses of
WatchItem.file_path to either pass a [:0]const u8 (e.g., from PathInfo.path) or
perform an explicit check/cast with sentinel verification, and keep the
watchPath invocation using the safe [:0]const u8 slice so you no longer rely on
an unchecked sentinel cast in Watcher.zig.
| const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]); | ||
|
|
||
| expect(stdout).toContain("File changed"); | ||
| expect(stdout).toContain("File changed again"); | ||
| expect(exitCode).toBe(0); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider checking stderr for better test diagnostics.
The test correctly checks stdout before exitCode per guidelines. However, stderr is captured but never examined. If the fix has issues, errors might be logged to stderr while stdout still contains both messages (from partial execution), causing the test to pass misleadingly.
💡 Suggested improvement
const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
+ expect(stderr).toBe("");
expect(stdout).toContain("File changed");
expect(stdout).toContain("File changed again");
expect(exitCode).toBe(0);🤖 Prompt for AI Agents
In `@test/regression/issue/18919.test.ts` around lines 44 - 48, The test captures
stdout, stderr, and exitCode from proc but never asserts on stderr; add an
assertion that stderr is empty (or contains no error text) after reading it to
ensure no errors were logged—e.g., add expect(stderr).toBe("") (or a more
specific check) alongside the existing expect(stdout)... and expect(exitCode)...
checks so that variables stdout, stderr, exitCode (and proc) are validated.
Code ReviewNewest first 🟡 9940a — 1 issue(s) found
Powered by Claude Code Review |
| setTimeout(() => { | ||
| fs.writeFileSync(file, 'C'); | ||
| }, 200); | ||
| }); | ||
|
|
||
| setTimeout(() => { | ||
| fs.writeFileSync(file, 'B'); | ||
| }, 200); |
There was a problem hiding this comment.
🟡 Minor: The setTimeout calls in the -e fixture script (lines 28-30 and 33-35) use a fixed 200ms delay to trigger file writes for fs.watch. While unlikely to cause real flakiness since fs.watch registers synchronously, the existing codebase convention in test/js/node/watch/fs.watch.test.ts uses setInterval with repeated writes (the repeat() helper) to avoid race conditions on macOS where a watcher can be returned before it is actually watching.
Why this is a problem
Issue Description
The test file test/regression/issue/18919.test.ts uses setTimeout(() => { fs.writeFileSync(file, 'C'); }, 200) and setTimeout(() => { fs.writeFileSync(file, 'B'); }, 200) inside the spawned -e fixture script to trigger file writes that an fs.watch watcher should detect. The CLAUDE.md rule states: "Do not use setTimeout in tests. Instead, await the condition to be met."
Specific Code
The two setTimeout calls appear at lines 28-30 and 33-35 of the test file, both inside the inline -e script passed to Bun.spawn. The first triggers the initial file change that the outer fs.watch watcher detects, and the second triggers a change for a nested watcher created after the first one closes.
Why Existing Code Does Not Fully Prevent This
The outer test code itself is well-structured: it properly uses await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]) to await the spawned process completing, which is the correct pattern. The setTimeout usage is confined to the child process fixture script, not the test runner. This makes the concern more nuanced than a straightforward CLAUDE.md violation in the test body.
Assessment
All three verifiers agree this is at most a style nit. The setTimeout calls are inside a spawned child process fixture, not in the test runner itself, which softens the CLAUDE.md violation. The test is unlikely to be flaky in practice because fs.watch on Linux uses inotify which registers synchronously, and 200ms is generous. However, the codebase already has a well-established pattern (setInterval with repeated writes via the repeat() helper in fs.watch.test.ts) designed to handle the exact race condition that single-write setTimeout could theoretically hit on macOS. Following this convention would make the test more robust and consistent.
Suggested Fix
Replace the two setTimeout + single writeFileSync calls in the fixture script with setInterval loops that repeatedly write the file every 10-20ms, and clear the interval inside the fs.watch callback (similar to the repeat() pattern in test/js/node/watch/fs.watch.test.ts). This ensures the write will be detected even if the watcher is not immediately ready.
|
Closing this PR because it has been inactive for more than 90 days. |
Summary
fs.watchnot receiving events when a new watcher is created for the same file after a previous watcher is closed (fs.watch does not work after previous .close() #18919)Root cause
When
fs.watchis closed, its watchlist entry is marked for deferred eviction (not immediately removed). When a newfs.watchis created for the same file,Watcher.addFilefinds the stale entry by hash and returns early — skipping inotify/kqueue registration for the new file descriptor. The watcher thread eventually evicts the old entry and closes its fd, leaving the new watcher with no active platform watch.Additionally,
_decrementPathRefNoLockfreed the path string while it was still referenced by the watcher thread's watchlist entries (including parent directory entries whosefile_pathis a substring of the freed path), causing use-after-free on the watcher thread.Fix
Watcher.addFile: When an existing entry with matching hash is found and it's pending eviction, revive it by removing from the eviction list and updating in-place with the new fd, file path, and platform watch registration (reviveEvictedEntry)._decrementPathRefNoLock: Don't free the path string — it's still referenced by the watchlist. Accept the small bounded leak (one path string per close/re-watch cycle) rather than risk memory corruption.Test plan
test/regression/issue/18919.test.ts— creates a watcher, triggers an event, closes it, creates a second watcher on the same file, and verifies the second watcher receives eventsUSE_SYSTEM_BUN=1) — confirms the bug existsbun bd test) — confirms the fix workstest/js/node/watch/tests pass (same pre-existing failures as before)Fixes #18919.
🤖 Generated with Claude Code