FilePoll: support readable+writable on one poll (fixes c-ares TCP DNS assertion)#29955
Conversation
…socket state to it c-ares can request both readable and writable on a TCP DNS fd (writable while connecting / send queue pending, readable for the response). dns.zig registered both on one FilePoll, which set both .poll_readable and .poll_writable. On unregister this tripped the "!(poll_readable && poll_writable)" assertion; on epoll the second register was a CTL_MOD that silently dropped the first direction. posix_event_loop.zig: - registerWithFd (epoll): when the other direction is already registered, OR its mask into the CTL_MOD events so both stay armed. - unregisterWithFd (kqueue mac/freebsd): when both directions are set, submit two EV_DELETE changes (READ + WRITE). epoll's CTL_DEL already removes both. Drop the both-directions assertion. - needs_rearm early-return: also clear .poll_writable. dns.zig onDNSSocketState: when the desired direction set differs from what's registered, unregister first then re-register each requested direction.
|
Updated 7:04 PM PT - Apr 30th, 2026
❌ @robobun, your commit 5eb10fc has 1 failures in
🧪 To try this PR locally: bunx bun-pr 29955That installs a local version of the PR into your bun-29955 --bun |
|
Found 5 issues this PR may fix:
🤖 Generated with Claude Code |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughEpoll/kevent registration/unregistration logic was updated to preserve opposite-direction interests, fix flag removal bugs, and correct per-entry error detection. DNS socket poll state now fully resynchronizes with c-ares’ desired readable/writable set. A new UDP+TCP DNS test fixture and its test were added. Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Review rate limit: 0/5 reviews remaining, refill in 56 minutes and 7 seconds. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/async/posix_event_loop.zig`:
- Around line 836-848: The change in posix_event_loop.zig that updates the epoll
flag logic (variables: flags, one_shot_flag, linux.EPOLL, and checks
this.flags.contains(.poll_writable)/(.poll_readable)) affects Linux, macOS, and
FreeBSD branches — run the cross-platform Zig compile pass (bun run
zig:check-all) and fix any platform-specific compile failures by synchronizing
equivalent logic in the macOS/FreeBSD branches (preserve platform-specific
constants and guards or add proper conditional compilation), updating any
mismatched symbol names or flags so all platform builds succeed.
In `@src/bun.js/api/bun/dns.zig`:
- Around line 2563-2569: The resync branch that calls poll.unregister and
poll.register can fail partway and leave the polls map holding a FilePoll that
no longer matches kernel state; change the logic in the block that compares
have_readable/have_writable to readable/writable so that you check each
poll.unregister() and poll.register() return value and on the first error you
abort further register/unregister attempts, remove/deinit the corresponding
FilePoll entry from the polls map, and surface/return the error; specifically,
stop on any failing call to poll.unregister(vm.event_loop_handle.?, false) or
poll.register(vm.event_loop_handle.?, .readable/.writable, false), remove the
poll entry from polls (the FilePoll for this fd) and run its cleanup to avoid
leaving a stale entry.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 65eb24b6-c9da-4b6a-b91b-63183546f39b
📒 Files selected for processing (2)
src/async/posix_event_loop.zigsrc/bun.js/api/bun/dns.zig
There was a problem hiding this comment.
I didn't find any bugs, but this touches core FilePoll register/unregister logic across epoll, macOS kqueue, and FreeBSD kqueue and removes a long-standing assertion — worth a human pass to sanity-check the platform-specific paths.
Extended reasoning...
Overview
This PR enables a single FilePoll to be registered for both readable and writable events simultaneously, fixing an assertion failure when c-ares uses TCP DNS. It modifies registerWithFd (epoll: OR the other direction into the CTL_MOD mask) and unregisterWithFd (kqueue: submit two EV_DELETE changes; drop the both-directions assertion; fix the needs_rearm early-return to clear .poll_writable instead of double-removing .poll_process). dns.zig:onDNSSocketState now fully resyncs the poll (unregister + re-register) when c-ares's desired direction set changes.
Security risks
None identified. This is internal event-loop bookkeeping with no untrusted input parsing, auth, or boundary crossing.
Level of scrutiny
High. FilePoll underpins subprocess I/O, file sinks/readers, DNS, terminal, and process watching across all POSIX targets. The change has three separate platform code paths (Linux epoll, Darwin kevent64, FreeBSD kevent) that each need to be correct, and it removes a defensive assertion that previously guarded an unsupported state. Subtle questions a human should confirm: whether the asymmetric EPOLL.ERR handling between readable/writable masks is intentional when combined, whether the dns.zig unregister-then-register sequence can drop an edge-triggered event between syscalls, and whether the changelist[1] error check on macOS is reachable given KEVENT_FLAG_ERROR_EVENTS semantics.
Other factors
The PR description is thorough, tests pass locally (139 DNS + 111 async_hooks), and the needs_rearm fix is clearly a pre-existing copy-paste bug. The change was split out of #29897 per prior review feedback, so it's already been somewhat scoped down. Still, event-loop polling correctness is load-bearing enough that auto-approval isn't appropriate.
|
@robobun resume this PR, its not correct. y sound, but there are a few real problems and one design wrinkle worth flagging. s correct |
…yscalls - unregisterWithFd (macOS): check rc<0 before reading per-entry EV_ERROR results, and guard changelist[i] on rc>=i+1 (error events are packed from index 0 with KEVENT_FLAG_ERROR_EVENTS, not positionally matched to changes). - registerWithFd (epoll): debugAssert against one_shot + bidirectional (EPOLLONESHOT disarms the whole fd on the first event either way). - dns.zig onDNSSocketState: only full-unregister when a direction is being dropped; when purely adding, register() alone suffices (CTL_MOD on epoll, separate knote on kqueue). - FreeBSD unregister: note that nevents=0 surfaces the first failing change via errno; silent second EV_DELETE miss is harmless.
|
✅ No merge conflicts detected when merging into Your branch is good to go! |
|
Addressed in 30af659:
|
Exercises the FilePoll bidirectional path: ares_open_connection() for a TCP DNS socket reports (readable=1, writable=1) via sock_state_cb, so onDNSSocketState registers both directions on one poll. Without the fix, the second register()'s CTL_MOD drops EPOLLIN on Linux (the response is never read and the query busy-loops until c-ares gives up), and unregisterWithFd asserts !(poll_readable && poll_writable) on close in debug builds on all POSIX platforms.
…ts precedence - registerWithFd/unregisterWithFd (macOS): xnu sets kev.flags |= EV_ERROR, so a failed EV_DELETE comes back as EV_DELETE|EV_ERROR (0x4002) and the previous equality test never matched. Use (flags & EV_ERROR) != 0 to match packages/bun-usockets/src/eventing/epoll_kqueue.c. - dns.zig onDNSSocketState (Windows): Zig parses 'if (r) A else 0 | if (w) B else 0' as 'if (r) A else (0 | ...)', so readable=writable=true yielded UV_READABLE only. Parenthesize each if-expression so uv_poll_start gets both bits for c-ares TCP DNS.
… assertion) (oven-sh#29955) c-ares can request both readable and writable on the same TCP DNS fd — writable while connecting or when a send hits EAGAIN, readable for the response. `dns.zig:onDNSSocketState` registered both directions on a single `FilePoll`, which set both `.poll_readable` and `.poll_writable`; on close, `unregisterWithFd` tripped `bun.assert(!(poll_readable && poll_writable))`. On epoll the second `register` was also a `CTL_MOD` that silently overwrote the first direction's mask. ### Changes `src/async/posix_event_loop.zig`: - **`registerWithFd` (epoll):** when the other direction is already registered on this poll, OR its event mask into the `CTL_MOD` so both stay armed. `debugAssert` against bidirectional + one-shot (EPOLLONESHOT disarms the whole fd on the first event in either direction — not hit by the DNS path, which passes one_shot=false). - **`registerWithFd` (kqueue):** unchanged. kqueue keys on `(ident, filter)`, so the second `register()` call's `EV_ADD` creates a separate knote for the new filter without touching the existing one — there is no "already registered → modify" branch on kqueue. - **`unregisterWithFd` (kqueue mac/freebsd):** when both directions are set, submit two `EV_DELETE` changes (`EVFILT_READ` + `EVFILT_WRITE`) in one kevent call. epoll's `CTL_DEL` keys on fd alone so already removes both. Dropped the both-directions assertion. - macOS: changelist is `[2]kevent64_s`. The rc<0 global-error check now runs before per-entry `EV_ERROR` checks, which are guarded on `rc >= N` (with `KEVENT_FLAG_ERROR_EVENTS` error events are packed from index 0, not positionally matched to changes). - FreeBSD: `nevents=0`, so per-entry errors surface as rc=-1/errno for the first failing change; a silent miss on the second `EV_DELETE` (ENOENT) is harmless. - `needs_rearm` early-return now also clears `.poll_writable` (was only clearing readable/process/machport). `src/bun.js/api/bun/dns.zig:onDNSSocketState` (POSIX): when c-ares's desired direction set differs from what's registered: - **Adding only** (e.g. W → R+W): just `register()` the new direction — one `CTL_MOD` on epoll, one `EV_ADD` on kqueue. - **Dropping a direction** (e.g. W → R): `unregister()` then re-`register()` the remaining direction(s). FilePoll has no per-direction unregister, and leaving writable armed on a level-triggered connected socket would busy-loop. c-ares DNS fds are short-lived so the extra syscall is acceptable. `activate`/`deactivate` are already idempotent via `has_incremented_poll_count`, so registering twice keeps the loop refcount correct. ### Tests - `bun run zig:check-all`: all platforms compile (Linux/macOS/Windows × debug/release, x64/arm64). - `test/js/node/dns/` + `test/js/bun/dns/`: 137 pass / 2 fail (both pre-existing env-dependent: `dns.prefetch` cache-hit timing, `dns.getServers` resolv.conf — fail identically on parent of this commit). - `test/js/node/async_hooks/`: 110 pass / 3 todo / 1 fail (pre-existing `http-clientrequest` timeout, unrelated). - `async-context-dns-resolveTxt.js` fixture (original repro): exit 0. Split out of oven-sh#29897 per review. --------- Co-authored-by: robobun <robobun@users.noreply.github.com> Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
c-ares can request both readable and writable on the same TCP DNS fd — writable while connecting or when a send hits EAGAIN, readable for the response.
dns.zig:onDNSSocketStateregistered both directions on a singleFilePoll, which set both.poll_readableand.poll_writable; on close,unregisterWithFdtrippedbun.assert(!(poll_readable && poll_writable)). On epoll the secondregisterwas also aCTL_MODthat silently overwrote the first direction's mask.Changes
src/async/posix_event_loop.zig:registerWithFd(epoll): when the other direction is already registered on this poll, OR its event mask into theCTL_MODso both stay armed.debugAssertagainst bidirectional + one-shot (EPOLLONESHOT disarms the whole fd on the first event in either direction — not hit by the DNS path, which passes one_shot=false).registerWithFd(kqueue): unchanged. kqueue keys on(ident, filter), so the secondregister()call'sEV_ADDcreates a separate knote for the new filter without touching the existing one — there is no "already registered → modify" branch on kqueue.unregisterWithFd(kqueue mac/freebsd): when both directions are set, submit twoEV_DELETEchanges (EVFILT_READ+EVFILT_WRITE) in one kevent call. epoll'sCTL_DELkeys on fd alone so already removes both. Dropped the both-directions assertion.[2]kevent64_s. The rc<0 global-error check now runs before per-entryEV_ERRORchecks, which are guarded onrc >= N(withKEVENT_FLAG_ERROR_EVENTSerror events are packed from index 0, not positionally matched to changes).nevents=0, so per-entry errors surface as rc=-1/errno for the first failing change; a silent miss on the secondEV_DELETE(ENOENT) is harmless.needs_rearmearly-return now also clears.poll_writable(was only clearing readable/process/machport).src/bun.js/api/bun/dns.zig:onDNSSocketState(POSIX): when c-ares's desired direction set differs from what's registered:register()the new direction — oneCTL_MODon epoll, oneEV_ADDon kqueue.unregister()then re-register()the remaining direction(s). FilePoll has no per-direction unregister, and leaving writable armed on a level-triggered connected socket would busy-loop. c-ares DNS fds are short-lived so the extra syscall is acceptable.activate/deactivateare already idempotent viahas_incremented_poll_count, so registering twice keeps the loop refcount correct.Tests
bun run zig:check-all: all platforms compile (Linux/macOS/Windows × debug/release, x64/arm64).test/js/node/dns/+test/js/bun/dns/: 137 pass / 2 fail (both pre-existing env-dependent:dns.prefetchcache-hit timing,dns.getServersresolv.conf — fail identically on parent of this commit).test/js/node/async_hooks/: 110 pass / 3 todo / 1 fail (pre-existinghttp-clientrequesttimeout, unrelated).async-context-dns-resolveTxt.jsfixture (original repro): exit 0.Split out of #29897 per review.