Skip to content

Treat unregistering an already-removed poll registration as success#31821

Merged
dylan-conway merged 2 commits into
mainfrom
claude/kqueue-unregister-already-gone
Jun 4, 2026
Merged

Treat unregistering an already-removed poll registration as success#31821
dylan-conway merged 2 commits into
mainfrom
claude/kqueue-unregister-already-gone

Conversation

@dylan-conway

@dylan-conway dylan-conway commented Jun 4, 2026

Copy link
Copy Markdown
Member

What

FilePoll::unregister_with_fd_impl returns an error when its EV_DELETE (kqueue) or EPOLL_CTL_DEL fails with ENOENT/EBADF. Those errnos mean the kernel-side registration is already gone — which is a routine teardown outcome, not a failure. This PR counts them as successful deregistration on all three platform branches (macOS kevent, Linux epoll, FreeBSD kevent).

Why

Two ordinary ways a registration disappears underneath a live FilePoll:

  • fd closed while registered: close() removes an fd's kevents, and epoll drops closed fds automatically. The forced unregister during reader teardown then gets EBADF/ENOENT. This is the scenario from Fix panic when FilePoll unregister fails on macOS #31701.
  • macOS pty teardown: closing a pty master marks the slave's knotes EV_EOF|EV_ONESHOT, so the kernel deletes them when the hangup event is delivered (verified: the delivered event carries flags=0x8095, i.e. EV_EOF|EV_ONESHOT|EV_DISPATCH|EV_ENABLE|EV_ADD, even though we registered with EV_DISPATCH). The reader's teardown EV_DELETE then finds nothing → ENOENT. This fires on every terminal window/tab close while a tty is polled, e.g. raw-mode stdin.

#31701 fixed the errno decoding for these entries (they previously fed an errno value through the −1-sentinel decoder and panicked on the unwrap), but still surfaces them as errors. That has a real cost: the early error return skips the registration-flag clearing at the end of unregister_with_fd_impl, so the poll still claims poll_readable and a later teardown call re-issues another doomed EV_DELETE (observed via instrumentation: the same poll failing the delete twice during one pty teardown). Every in-tree caller discards the unregister result, so nothing recovers the state.

libuv handles this the same way: uv__kqueue_delete (src/unix/kqueue.c) returns silently for exactly EBADF/ENOENT on EV_DELETE (and aborts on anything else), and the epoll side ignores EPOLL_CTL_DEL failures for closed fds (uv__platform_invalidate_fd discards the return value; the io_uring batch path comments "Failed submissions are either EPOLL_CTL_DEL commands for file descriptors that have been closed ... Ignore the former").

How verified

  • pty harness: bun under script(1) polling raw-mode stdin (plus subprocess churn), then SIGHUP + revoke(2) + pty-master SIGKILL — 5/5 runs tear down cleanly, stdin delivers end/close, process exits normally
  • test/js/bun/spawn/spawn-pipe-stale-fd-unregister.test.ts passes (exercises the modified macOS kevent and Linux epoll branches)
  • test/cli/run/run-crash-handler.test.ts passes

No new test: callers discard the unregister result, so the change has no JS-observable surface; the modified branches are covered by the existing stale-fd fixture above.

A FilePoll's EV_DELETE (kqueue) or EPOLL_CTL_DEL can fail with
ENOENT/EBADF when the kernel-side registration is already gone. Two
routine producers:

- the fd was closed while the poll was still registered: close()
  removes an fd's kevents, and epoll drops closed fds automatically
- on macOS, closing a pty master marks the slave's knotes
  EV_EOF|EV_ONESHOT, so the kernel deletes them when the hangup event
  is delivered; the reader's teardown EV_DELETE then finds nothing.
  This happens on every terminal window/tab close while a tty is
  polled (e.g. raw-mode stdin)

Both mean the kernel state already matches what unregistration wants,
so count them as success instead of returning an error. The error
return also skipped the flag-clearing at the end of
unregister_with_fd_impl, leaving the poll claiming to be registered and
re-issuing doomed deletes on later teardown calls. libuv ignores the
same errnos for its kqueue/epoll delete operations.

Follow-up to #31701, which fixed the errno decoding for these entries
but still surfaced them as errors.

Verified with a pty harness (bun under script(1) polling raw stdin,
then SIGHUP + revoke + pty master kill): teardown is clean, stdin
delivers end/close, and the process exits normally. Existing coverage:
spawn-pipe-stale-fd-unregister.test.ts exercises the modified macOS
kevent and Linux epoll branches.
@robobun

robobun commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Updated POSIX event loop deregistration error handling to treat EBADF and ENOENT as successful outcomes. Added a documented helper function and applied the concept across Linux, macOS, and FreeBSD unregister paths to tolerate expected teardown races.

Changes

Deregistration error handling

Layer / File(s) Summary
Deregistration-already-gone helper and documentation
src/io/posix_event_loop.rs
Introduced deregistration_already_gone(errno: sys::E) -> bool helper with documentation defining EBADF/ENOENT conditions that occur during teardown (fd closed while registered, kqueue knotes deleted on EOF). Fixed an inline comment clarifying errno_sys behavior.
Linux and Android unregister error handling
src/io/posix_event_loop.rs
Replaced errno_sys(ctl, ...) with direct sys::get_errno(ctl) matching: accepts SUCCESS, ignores deregistration_already_gone errnos, returns error-tagged result for other cases.
macOS kqueue unregister error handling
src/io/posix_event_loop.rs
Changed from fixed-index EV_ERROR checks to iterating populated changelist entries (up to 2), skipping errors that decode to deregistration_already_gone and falling back to kevent_change_error for others.
FreeBSD unregister error handling
src/io/posix_event_loop.rs
Replaced errno_sys(rc, sys::Tag::kevent) with direct sys::get_errno(rc) matching: accepts SUCCESS, ignores deregistration_already_gone errnos, returns error-tagged result for other cases.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: treating already-removed poll registrations as successful deregistrations, which is the core objective of this PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description is comprehensive and follows the template structure with clear 'What', 'Why', and 'How verified' sections.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find any issues — the change is small, well-reasoned, and matches libuv's handling — but since it alters error semantics in the core POSIX event loop deregistration path across three platforms, it's worth a maintainer's quick look.

Extended reasoning...

Overview

This PR modifies FilePoll::unregister_with_fd_impl in src/io/posix_event_loop.rs so that ENOENT/EBADF from EPOLL_CTL_DEL / EV_DELETE are treated as successful deregistration rather than errors. A new deregistration_already_gone helper encodes the predicate, and each of the three platform branches (Linux epoll, macOS kevent64, FreeBSD kevent) now falls through to the flag-clearing tail instead of early-returning. The macOS per-entry EV_ERROR check is also refactored from two unrolled conditionals into a loop over 0..rc.min(2) — semantically equivalent (the rc < 0 global-failure case is already handled above, so the usize::try_from cannot fail).

Security risks

None. This is internal event-loop bookkeeping with no user-facing input, auth, crypto, or data-exposure surface. Treating already-gone registrations as success cannot grant new capabilities; at worst it would mask a programming error, but the PR description and libuv precedent make a convincing case that these errnos are routine teardown outcomes.

Level of scrutiny

Moderate-to-high. The diff is small (~40 lines net) and the logic is straightforward, but it lives in the core POSIX event loop and changes error-path control flow on every supported Unix platform. The key behavioral effect — that the PollReadable/PollWritable/etc. flags now get cleared even when the kernel-side registration was already gone — is the whole point and looks strictly beneficial, but a maintainer familiar with the FilePoll lifecycle should confirm there's no caller that relies on the old error return to detect a stale fd.

Other factors

  • All in-tree callers reportedly discard the unregister result, so the only observable change is the flag clearing (avoids redundant doomed EV_DELETEs on later teardown).
  • Matches libuv's handling of the same errnos on the same syscalls.
  • Existing tests (spawn-pipe-stale-fd-unregister.test.ts, run-crash-handler.test.ts) exercise the modified branches; no new test added since the change has no JS-observable surface.
  • No CODEOWNERS entry covers this file.
  • No prior reviews or outstanding reviewer comments on the PR.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find any issues with the logic, but this touches core event-loop fd teardown across three platform branches and the CI run shows a terminal.test.ts failure on macOS 14 x64 (the exact pty/terminal teardown scenario this PR discusses), so it's worth a human look before merging.

Extended reasoning...

Overview

This PR modifies FilePoll::unregister_with_fd_impl in src/io/posix_event_loop.rs so that ENOENT/EBADF from EPOLL_CTL_DEL (Linux), EV_DELETE via kevent64 (macOS), and EV_DELETE via kevent (FreeBSD) are treated as successful deregistration rather than errors. A new deregistration_already_gone helper encodes the errno set, and the macOS branch is refactored from two unrolled index checks into a bounded loop over 0..rc.min(2) that skips already-gone entries before falling back to kevent_change_error. The rationale (kernel already removed the registration; libuv does the same; early-return was leaving stale poll_* flags set) is well-researched and clearly documented.

Security risks

None identified. This is errno-classification logic in the teardown path; no untrusted input, auth, crypto, or data exposure surface is involved.

Level of scrutiny

High. This is the core POSIX event-loop fd-lifecycle code that every pipe/socket/pty/process poll in the runtime flows through. Swallowing errnos in this layer is correct here per the libuv precedent and the kernel-behavior analysis in the PR description, but mistakes in this area can manifest as hangs, leaked registrations, or interactions with fd reuse that are hard to reproduce. The macOS branch also includes a small structural refactor (loop + usize::try_from(rc.min(2))), not just an added guard. This is not a mechanical or config-level change.

Other factors

  • The CI build shows test/js/bun/terminal/terminal.test.ts failing on macOS 14 x64. The PR's motivating scenario is specifically macOS pty/terminal teardown, so this failure may be related and should be checked rather than assumed flaky. (The widespread bunx.test.ts failures, including on Windows where this file doesn't compile, look unrelated.)
  • No new test is added; the author notes the change has no JS-observable surface since callers discard the unregister result, and points to the existing spawn-pipe-stale-fd-unregister.test.ts for coverage of the modified branches.
  • No CODEOWNERS entry covers this path.

Given the criticality of the code path and the possibly-related macOS terminal test failure, I'm deferring to a human reviewer rather than approving.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants