io: read SO_ERROR when EPOLLERR fires instead of passing 0 errno #30230

coderabbitai · 2026-05-04T06:22:20Z

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make the EPOLLERR repro observable instead of sleep-based.

Line 109's `await Bun.sleep(100)` only increases the chance that `Bun.file(fd).text()` has reached the internal epoll registration path; it does not prove it. Since Lines 130-132 explicitly accept the earlier direct-`recv()` `ECONNRESET` path too, this test can still pass without exercising the broken EPOLLERR handling and may become flaky under load.

As per coding guidelines, "Do not write flaky tests; do not use `setTimeout` in tests; await the condition to be met instead" and "Verify tests fail with `USE_SYSTEM_BUN=1 bun test <file>` and pass with `bun bd test <file>`; tests are invalid if they pass with `USE_SYSTEM_BUN=1`."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/js/bun/util/bun-file-fd-read.test.ts` around lines 109 - 115, Replace the fragile await Bun.sleep(100) with an active wait that ensures the client's FD is actually registered with epoll before forcing the RST close: poll in a tight loop (with a short backoff) until you observe the client FD being present in the epoll interest set (e.g., via a small helper that calls libc.epoll_wait on the test's epoll FD or an isEpollRegistered helper), then run libc.symbols.setsockopt(serverSocket._handle.fd, SOL_SOCKET, SO_LINGER, ptr(linger), 8) and serverSocket.destroy(); target the check around the same client FD used by Bun.file(fd).text() so the EPOLLERR repro is guaranteed to exercise the epoll path rather than relying on a timed sleep.

robobun · 2026-05-04

There is no observable signal for "ReadFile has registered with the `io.zig` epoll". Registration happens on a WorkPool thread with no JS-visible side effect until the read completes, so there is nothing to await on.

The test is still deterministic in the fail-before/pass-after sense regardless of which branch the race takes:

- Before the fix, EPOLLERR path: a debug build asserts in `errnoToZigErr(0)`; a release build returns `error.Unexpected` with errno 0, so the rejection has no `.code` → `toEqual({ code: "ECONNRESET" })` fails.
- Before the fix, direct `recv()` path (RST beats registration): the rejection has `syscall: "recv"`, `code: "ECONNRESET"` → would pass. In practice the 100 ms window makes this branch vanishingly unlikely; I ran 100 iterations locally and against `USE_SYSTEM_BUN=1`, and every single one hit the EPOLLERR path.
- After the fix, both paths reject with `code: "ECONNRESET"` → always passes.

Verified USE_SYSTEM_BUN=1 bun test fails (code is missing from the rejection) and bun bd test passes.
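
The three cases above can be written out as a small truth table. This is only a hypothetical model of the outcomes described in this thread (a release build is assumed for the pre-fix EPOLLERR case); `observed` and `passes` are illustrative names, not code from the PR.

```typescript
// Hypothetical model of the outcomes described above; not code from the PR.
type Path = "epollerr" | "recv";
type Result = { ok: false; code: string | undefined };

// What the subprocess would report for each (build, race branch) pair,
// following the description in this thread (release build assumed).
function observed(fixed: boolean, path: Path): Result {
  // Direct recv() branch: ECONNRESET is reported with or without the fix.
  if (path === "recv") return { ok: false, code: "ECONNRESET" };
  // EPOLLERR branch: before the fix the rejection carried no `.code`.
  return { ok: false, code: fixed ? "ECONNRESET" : undefined };
}

// Mirrors the test's expectation of { ok: false, code: "ECONNRESET" }.
function passes(r: Result): boolean {
  return r.ok === false && r.code === "ECONNRESET";
}

const matrix = (["epollerr", "recv"] as Path[]).flatMap(path =>
  [false, true].map(fixed => ({ path, fixed, pass: passes(observed(fixed, path)) })),
);
console.log(matrix);
```

Only the pre-fix EPOLLERR row fails, which is why the test distinguishes the two builds only when the race lands on that branch.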

coderabbitai · 2026-05-04

@robobun, that's a compelling and well-substantiated explanation. You're right: since epoll registration is an internal side effect of Bun's WorkPool thread with no JS-visible signal, there is no observable condition to await, and my suggested fix was infeasible. The fail-before/pass-after semantics hold regardless of which branch the race takes, and the empirical 100-iteration validation plus the `USE_SYSTEM_BUN=1` failure check satisfies the test validity requirement. I'll withdraw my comment.

✏️ Learnings added

Learnt from: robobun Repo: oven-sh/bun PR: 30230 File: test/js/bun/util/bun-file-fd-read.test.ts:109-117 Timestamp: 2026-05-04T06:28:25.345Z Learning: In oven-sh/bun tests that exercise internal IO-thread behavior (e.g., the `io.zig` epoll registration path for `ReadFile`/`WriteFile`), it is acceptable to use `await Bun.sleep(N)` (e.g., 100 ms) as a timing window when the relevant internal event (WorkPool epoll registration) produces no JS-observable side effect and therefore has no condition to `await`. The test is still semantically valid if its fail-before/pass-after semantics hold across both possible race outcomes (e.g., EPOLLERR path vs. direct `recv()` path). Do NOT flag such `Bun.sleep(N)` uses as "flaky" or "fragile" when the author has verified empirically (multiple iterations) and via `USE_SYSTEM_BUN=1` that the test reliably exercises the target path and correctly fails without the fix.


🧠 Learnings used

Learnt from: robobun Repo: oven-sh/bun PR: 27056 File: test/bundler/standalone.test.ts:281-324 Timestamp: 2026-02-16T04:26:25.185Z Learning: In Bun test files that exercise Bun.build(), assertions for configuration-validation errors thrown synchronously by JSBundler.fromJS() (via globalThis.throwInvalidArguments()) should use toThrow, e.g., expect(() => Bun.build({...})).toThrow()). Do not use rejects.toThrow() since rejections occur only for asynchronous build errors.

Learnt from: cirospaciari Repo: oven-sh/bun PR: 27385 File: test/js/bun/http/tls-keepalive.test.ts:115-140 Timestamp: 2026-02-24T21:02:00.725Z Learning: In Bun's test suites, avoid adding tests for trivial environment/fixture script validation (e.g., checking if env vars exist) within test fixtures. Focus test coverage on actual behavior being tested (e.g., TLS keepalive, memory leaks) rather than auxiliary fixture validation. If a test file is primarily for fixtures, skip or limit tests that validate simple JS behavior like if (!env) throw; prioritize meaningful end-to-end or unit behavior instead.

Learnt from: LawoodDev Repo: oven-sh/bun PR: 27855 File: test/cli/run/concurrency-filter.test.ts:32-32 Timestamp: 2026-03-06T16:21:42.189Z Learning: In Bun's test runner, describe.concurrent is supported (since Bun v1.2.23). Use describe.concurrent/test.concurrent for concurrent tests. Be aware of limitations: expect.assertions() and expect.hasAssertions() are not supported; toMatchSnapshot() is not supported (toMatchInlineSnapshot() is); and beforeAll/afterAll hooks are not executed concurrently. The broader guideline to prefer concurrent tests over sequential tests using test.concurrent or describe.concurrent remains valid and should be applied to test files such as test/cli/run/concurrency-filter.test.ts and similar test files.

Learnt from: LawoodDev Repo: oven-sh/bun PR: 27855 File: test/cli/run/concurrency-filter.test.ts:32-32 Timestamp: 2026-03-06T16:22:55.570Z Learning: In test/cli/run/concurrency-filter.test.ts and similar test files, timing-sensitive tests that assert on wall-clock elapsed time to verify concurrency behavior (e.g., expect(elapsed).toBeGreaterThan(800)) must remain in a sequential describe block rather than describe.concurrent. Running such tests concurrently can cause CPU contention and skew timing assertions, leading to flaky results. The guideline to prefer describe.concurrent does NOT apply for timing-based correctness verification.

Learnt from: robobun Repo: oven-sh/bun PR: 28214 File: test/regression/issue/18115.test.ts:1-158 Timestamp: 2026-03-18T15:19:38.407Z Learning: In Bun test files, when a resource like tempDir is a DisposableString implementing both Symbol.dispose (sync) and Symbol.asyncDispose, prefer plain using over await using. Do not recommend converting to await using for tempDir in Bun test files. This keeps tests idiomatic and avoids unnecessary async disposal. If a resource only supports asyncDispose, use await using.

Learnt from: robobun Repo: oven-sh/bun PR: 28425 File: test/regression/issue/28422.test.ts:65-79 Timestamp: 2026-03-22T10:12:05.719Z Learning: In oven-sh/bun test files matching test/**/*.test.{ts,js,jsx,tsx,mjs,cjs}, follow CLAUDE.md by asserting the command exit code LAST—after all other assertions such as stdout/stderr checks and filesystem validation. Do not assert exitCode earlier than those checks. Also, avoid asserting stdout for commands like bun install whose output can vary between runs.

Learnt from: dylan-conway Repo: oven-sh/bun PR: 28863 File: scripts/build/deps/webkit.ts:149-161 Timestamp: 2026-04-04T19:43:49.607Z Learning: When reviewing Node/TypeScript code that uses `node:path.join()`, do not treat a later path segment that starts with `/` as a Windows/absolute-path override bug. `path.join()` concatenates segments and normalizes; it only resets the root when using `path.resolve()` (e.g., when it encounters an absolute-looking segment). Therefore, patterns like `join(base, "/relPath")` or `join(homedir(), env.slice(1))` where `env.slice(1)` becomes `"/WebKit"` are expected to produce `base/relPath` (cross-platform). Only flag cases where `path.resolve()` (or other root-resetting logic) is used in a way that could unintentionally ignore the base path.

Learnt from: robobun Repo: oven-sh/bun PR: 28923 File: test/regression/issue/28921.test.ts:0-0 Timestamp: 2026-04-06T19:19:08.790Z Learning: In oven-sh/bun tests, prefer `tempDir` (from the `harness` module) over `tempDirWithFiles` when using the `using` statement for automatic cleanup. `tempDirWithFiles(...)` returns a plain `string`, so `using tempDirWithFiles(...)` is effectively a no-op and will not trigger disposal/cleanup. `tempDir` returns a `DisposableString` that implements `Symbol.dispose`, so it will correctly trigger cleanup on scope exit.

Learnt from: robobun Repo: oven-sh/bun PR: 29050 File: test/regression/issue/29042.test.ts:60-94 Timestamp: 2026-04-08T21:22:00.840Z Learning: In this repo’s Bun environment, `Bun.RedisClient` does not implement `Symbol.dispose` or `Symbol.asyncDispose`, so you cannot rely on `using` / `await using` for automatic cleanup. When creating a `Bun.RedisClient` in tests, close it explicitly with `try/finally`, calling `client.close()` in the `finally` block.

Learnt from: robobun Repo: oven-sh/bun PR: 29322 File: test/js/web/workers/worker-terminate-after-exit.test.ts:38-43 Timestamp: 2026-04-15T01:57:52.469Z Learning: In oven-sh/bun test files (matching `test/**/*.test.ts`), when you spawn a subprocess in a bun:test and you assert on its exit code, follow the CLAUDE.md house style: write `if (exitCode !== 0) { expect(stderr).toBe(""); }` immediately before `expect(exitCode).toBe(0)`. This is intentional so that, on failure, bun:test surfaces the full `stderr` content in the diff output. Do not replace this with a custom/second assertion that formats stderr into the exit-code expectation (e.g., `expect(exitCode, \\`stderr: ${stderr}\\`).toBe(0)` or any single-assertion equivalent).

Learnt from: robobun Repo: oven-sh/bun PR: 29389 File: test/js/bun/util/v8-heap-snapshot-large-strings.test.ts:4-152 Timestamp: 2026-04-17T02:55:14.338Z Learning: In oven-sh/bun, do not enforce the `test/regression/issue/${issueNumber}.test.ts` placement rule based solely on PR descriptions that include a speculative GitHub issue link like “might fix `#NNNNN`” without a confirmed regression (e.g., no verifying stack trace/reproduction). If the issue is not confirmed per CLAUDE.md (“confirmed numbered issue” only), the test should be placed next to the closest related existing test file for the affected feature/module (e.g., alongside `test/js/bun/util/v8-heap-snapshot.test.ts`) and should not be flagged as a guideline violation. Likewise, tests that validate a broader behavioral invariant (e.g., V8-matching 1024-char string truncation in heap snapshots) are not purely issue regressions and should live with the feature’s existing test suite rather than under `test/regression/issue/`.

Learnt from: robobun Repo: oven-sh/bun PR: 29426 File: test/js/node/tls/node-tls-root-certs-concurrent-init.test.ts:80-82 Timestamp: 2026-04-18T00:50:38.905Z Learning: In oven-sh/bun Jest/Bun test files under `test/js/` that spawn subprocesses using `bunEnv` from the `harness` module, it’s safe and intentional to assert `expect(stderr).toBe("")` unconditionally. `bunEnv` sets `BUN_DEBUG_QUIET_LOGS=1`, which suppresses ASAN/debug-build stderr noise, so an unexpected stderr value should fail the test and show useful diagnostics. Do not gate `expect(stderr).toBe("")` behind `if (exitCode !== 0)` for these `bunEnv`-based subprocess tests—follow the established pattern used in similar tests (e.g., `test/js/node/tls/test-use-system-ca.test.ts`).

Learnt from: robobun Repo: oven-sh/bun PR: 29441 File: test/js/web/broadcastchannel/broadcast-channel-worker-gc.test.ts:10-14 Timestamp: 2026-04-18T10:36:45.033Z Learning: In oven-sh/bun test files that spawn subprocesses using `bunEnv`, suppress the known ASAN startup noise in the subprocess stderr before asserting it is empty. Use the repo’s established convention: split stderr into lines and filter with `.filter(line => !line.startsWith("WARNING: ASAN interferes"))`, then assert the remaining stderr lines are empty. Do not switch to an alternative like `str.replace(...)`; the filter-based approach is the repo convention. This is safe because `ZigGlobalObject.cpp` emits that warning via `std::call_once`, so at most one matching line appears per process.

Learnt from: robobun Repo: oven-sh/bun PR: 29538 File: test/js/bun/resolve/lower-using-bun-target.test.ts:80-82 Timestamp: 2026-04-21T09:47:19.303Z Learning: In Bun JavaScript/TS tests under `test/js/bun/**` that run runtime subprocesses by spawning `bunExe()` with `bunEnv`, do not add strict `expect(stderr).toBe("")` assertions. In debug ASAN builds, stderr will include `WARNING: ASAN interferes with JSC signal handlers…` on every JS-process launch and it is not suppressed by `bunEnv` / `BUN_DEBUG_QUIET_LOGS=1`. Use the regression guards that are already effective for this area: assert an exact match on `stdout` and `expect(exitCode).toBe(0)`. If you must validate stderr, follow the repo’s filter-based convention: ignore/filter out lines starting with `"WARNING: ASAN interferes"`. If stdout + exitCode provide sufficient coverage, leaving stderr unchecked is acceptable.

Learnt from: robobun Repo: oven-sh/bun PR: 29538 File: test/js/bun/resolve/lower-using-bun-target.test.ts:133-142 Timestamp: 2026-04-21T09:54:56.748Z Learning: When testing `bun build` subprocesses in `test/js/bun/**/*.test.ts`, it is acceptable to assert `expect(stderr).toBe("")` (or otherwise expect no stderr noise). `bun build` is compiler-only and does not start a JS VM, so it should not emit the ASAN warning about interfering with JSC signal handlers. Only JS-executing subprocesses (e.g., `bun -e`, running built output like `bun out.js`) are expected to produce that warning, so do not treat empty-stderr assertions as brittle specifically for `bun build` in these tests.

Learnt from: robobun Repo: oven-sh/bun PR: 29564 File: test/regression/issue/29513.test.ts:51-51 Timestamp: 2026-04-22T02:58:30.645Z Learning: In oven-sh/bun TypeScript test files, it is acceptable to use `Bun.sleep(0)` specifically as a macrotask barrier to deterministically drain the pending microtask queue before asserting. Do NOT flag `Bun.sleep(0)` as a timing-wait violation. The “do not use setTimeout/Bun.sleep in tests” guideline is intended to prevent load-sensitive wall-clock delays (e.g., `Bun.sleep(100)` or other timing windows). Use `Bun.sleep(0)` only when you need to observe a fully settled Promise/microtask chain (e.g., after deferred resolution and multiple internal `.then()` hops) where a single `await Promise.resolve()` would not advance far enough; `Bun.sleep(0)` resumes in a later macrotask after pending microtasks complete, without relying on elapsed time.

Learnt from: dylan-conway Repo: oven-sh/bun PR: 29581 File: src/bun.js/modules/NodeModuleModule.cpp:663-681 Timestamp: 2026-04-22T20:47:10.896Z Learning: In oven-sh/bun code reviews, do not recommend adding standalone regression tests that depend on setting `BUN_JSC_validateExceptionChecks=1` to exercise JSC throw-scope/exception-scope validator paths (e.g., PropertyCallback/reify interactions like `reifyAllStaticProperties`). Per `CLAUDE.md`, tests are expected to pass with `USE_SYSTEM_BUN=1`, and `BUN_JSC_validateExceptionChecks` is a no-op on release/system Bun builds. Instead, treat this class of validator coverage issue as covered by: (1) the x64-asan CI shard that enables the validator automatically, and (2) the `test/no-validate-exceptions.txt` opt-out list for tests that hit pre-existing throw-scope assertion failures unrelated to the change under review. If helpful, add an in-source comment pointing to the specific existing exerciser (e.g., the relevant `tsgo/bun-types` test) to document the intent without relying on the env var.

Learnt from: robobun Repo: oven-sh/bun PR: 29656 File: test/js/bun/s3/s3-path-double-free.test.ts:49-61 Timestamp: 2026-04-23T23:39:21.333Z Learning: In Bun test files under `test/js/bun/**/*.test.ts`, prefer `test.each()` over `describe.each()` when each parameter value results in a single `test`/`it` assertion. Using `describe.each()` to wrap a single `test` adds unnecessary nesting. Only use `describe.each()` when you need multiple `test`/`it` blocks per parameter value.

Learnt from: robobun Repo: oven-sh/bun PR: 29820 File: test/js/node/process/process-execve.test.ts:47-52 Timestamp: 2026-04-28T11:35:58.257Z Learning: In oven-sh/bun test files under `test/**/*.test.ts`, when a test uses the `tempDir` fixture and spawns a subprocess via `await using proc = Bun.spawn(...)` (i.e., the embedded script runs as a spawned subprocess), do not recommend adding a fixture-level or embedded-script `setTimeout` watchdog to prevent hangs. The `await using` scope exit should terminate the subprocess automatically, and Bun test per-test timeouts already bound execution time. Also, avoid embedded `setTimeout` watchdog patterns that violate Bun’s “no setTimeout in tests” guideline. If the worker/subprocess exits silently without posting, rely on the test’s stdout/exitCode assertions plus Bun’s outer timeout rather than a watchdog, even when the embedded fixture script uses `worker_threads` or other async constructs.

Learnt from: robobun Repo: oven-sh/bun PR: 29874 File: test/js/web/websocket/websocket-proxy-tunnel-upgrade-leak.test.ts:15-16 Timestamp: 2026-04-28T21:34:23.491Z Learning: In oven-sh/bun, when a test is intentionally validating native refcount leak detection using Bun debug-only instrumentation (e.g., `BUN_DEBUG_alloc=1` and `[alloc] new(...)/destroy(...)` log lines produced only by debug builds when `Environment.enable_logs` is set), use `test.skipIf(!isDebug)` as the correct/intentional guard. Do not flag this `test.skipIf(!isDebug)` as a guideline violation for this class of tests. The debug-only `[alloc] ...` lines are absent in release and ASAN builds, and there is no equivalent observable system-Bun hook to assert a leak when only debug-build instrumentation exists (so the `USE_SYSTEM_BUN=1` rule in `CLAUDE.md` does not apply in this situation).

Learnt from: robobun Repo: oven-sh/bun PR: 29876 File: test/js/bun/ffi/cc.test.ts:0-0 Timestamp: 2026-04-29T00:09:18.937Z Learning: In oven-sh/bun tests, when using the `harness` module’s `tempDir`, prefer the overload that accepts an optional second argument: `tempDir(prefix, fileTree)`, where `fileTree` is an object in the same shape as `tempDirWithFiles` (e.g., `{ "file.c": "..." }`). This creates a disposable temp directory pre-populated with files. If the `tempDir` file-tree overload is available, don’t recommend a separate manual `fs.writeFile`/write step for pre-populating files (e.g., when using `using dir = tempDir("prefix", { ... })`).

Learnt from: robobun Repo: oven-sh/bun PR: 29876 File: test/js/bun/ffi/cc.test.ts:205-231 Timestamp: 2026-04-29T00:24:38.784Z Learning: In oven-sh/bun’s Bun test files under test/js/bun/, do not treat explicit per-test timeouts as a guideline violation when the test is an RSS-leak regression that spawns a subprocess and performs many iterations (subprocess-heavy leak tests). For these cases, Bun’s default per-test timeout (5s locally) is insufficient—especially under debug+ASAN where these tests may take ~5–14s—so reviewers should expect and accept an explicit, larger per-test timeout (e.g., 60_000). Concretely, tests like the cc() option-string leak test (test/js/bun/ffi/cc.test.ts) and glob-leak tests (e.g., test/js/bun/glob/leak.test.ts) should be reviewed as exceptions: allow explicit timeouts when the intent is to cover RSS-leak/subprocess-heavy regression workloads.

Learnt from: robobun Repo: oven-sh/bun PR: 29919 File: test/js/bun/util/filesystem_router.test.ts:613-628 Timestamp: 2026-05-02T00:35:55.819Z Learning: In oven-sh/bun tests under test/js/bun/**, prefer strict stderr assertions like `expect(stderr).toBe("")` for subprocesses spawned with `bunExe()` when you pass a `bunEnv` that already propagates `ASAN_OPTIONS=allow_user_segv_handler=1` from the parent `bun bd` build environment (this suppresses the `WARNING: ASAN interferes with JSC signal handlers` message). On CI ASAN lanes where `isASAN` is true, `bunEnv` sets `isASAN` explicitly as well—so strict stderr expectations are still safe. Only relax/skip strict stderr assertions (e.g., avoid `toBe("")`) when `ASAN_OPTIONS=allow_user_segv_handler=1` is *not* propagated into the subprocess environment.

Learnt from: robobun Repo: oven-sh/bun PR: 30115 File: test/js/bun/glob/scan.test.ts:877-882 Timestamp: 2026-05-02T17:49:10.214Z Learning: In oven-sh/bun regression tests for UAFs tied to Bun’s threadpool/event-loop interaction (e.g., WalkTask pending activity), keep the intended repro timing: use `Bun.sleepSync(N)` inside a spawned subprocess to hold the JS event loop without yielding/draining pending tasks, then trigger `Bun.gc(true)` (after the threadpool task has been given time to complete `run()`), and finally drive the result with the corresponding `for await`/iterator consumption to make the UAF observable. Do not replace `Bun.sleepSync(N)` with `await Bun.sleep(0)` or any other event-loop-yielding construct, since it can drain pending concurrent tasks and cause callbacks/`then()` work to run before the GC call, making the bug unobservable. This “sleepSync → gc(true) → for await” sequence is the correct 3-step UAF repro pattern for this bug class.

Learnt from: robobun Repo: oven-sh/bun PR: 30142 File: test/js/bun/http/bun-serve-html-abort-leak-fixture.ts:28-38 Timestamp: 2026-05-03T01:29:10.031Z Learning: In oven-sh/bun tests/fixtures that spawn subprocesses with `BUN_DEBUG_alloc` (or `BUN_DEBUG_ALL`) set to a non-zero value (e.g., `"1"`), the `[alloc]` log scope is effectively enabled at runtime for all `bun.new`/`bun.destroy`-allocated types. Because the runtime check in `src/output.zig` forces `really_disable = false` when `BUN_DEBUG_<tagname>` is not `"0"`, such fixtures may emit `[alloc] new(T)` / `[alloc] destroy(T)` lines even when `T` does not declare `log_allocations = true`. In this context, do not flag missing `log_allocations` declarations as a bug in the test fixture or the involved fixture types.

Learnt from: robobun Repo: oven-sh/bun PR: 30153 File: test/bundler/plugin-sync-exception-fallback.test.ts:75-91 Timestamp: 2026-05-03T01:53:50.441Z Learning: In this repo’s Bun test files that use `Bun.spawn`, don’t “parse/assert stdout before checking `exitCode`” when the expected failure mode is a crash (e.g., SIGSEGV or UBSan abort) that may produce empty stdout. Parsing/validating empty stdout first can mask the more useful signal/stderr. Instead, assert the spawned-process result by including `stdout` in the object passed to `toMatchObject` alongside `exitCode`, `signalCode`, and `stderr`, so stdout/stderr/signal all appear together in the failure diff (same pattern as `test/bundler/plugin-error-nested-throw.test.ts`).

@@ -647,9 +647,16 @@ pub const Poll = struct {
                 inline else => |t| {
                     var this: *Pollable.Tag.Type(t) = @alignCast(@fieldParentPtr("io_poll", poll));
                     if (event.events & linux.EPOLL.ERR != 0) {
-                        const errno = bun.sys.getErrno(event.events);
-                        log("error() = {s}", .{@tagName(errno)});
-                        this.onIOError(bun.sys.Error.fromCode(errno, .epoll_ctl));
+                        var so_error: c_int = 0;
+                        var size: std.c.socklen_t = @sizeOf(c_int);
+                        const rc = std.c.getsockopt(this.opened_fd.cast(), std.posix.SOL.SOCKET, std.posix.SO.ERROR, @ptrCast(&so_error), &size);
+                        if (rc == 0 and so_error != 0) {
+                            log("error() = {d}", .{so_error});
+                            this.onIOError(bun.sys.Error.fromCodeInt(so_error, .epoll_ctl));
+                        } else {
+                            log("ready() (EPOLLERR)", .{});
+                            this.onReady();
+                        }
                     } else {
                         log("ready()", .{});
                         this.onReady();
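
Why the removed `getErrno(event.events)` call could not produce a useful errno can be illustrated with a simplified model. The sketch below assumes a decoder that, like the raw Linux syscall convention, treats only return values in the range -4095..-1 as errors; `errnoFromRawReturn` is a hypothetical stand-in, not Bun's actual `getErrno`. The `EPOLL*` constants are the Linux values, and `ECONNRESET = 104` is the value on common architectures such as x86-64 and arm64.

```typescript
// Simplified model; `errnoFromRawReturn` is a hypothetical stand-in for a
// getErrno-style decoder and is not taken from Bun's source.
const EPOLLIN = 0x001;
const EPOLLERR = 0x008;
const EPOLLHUP = 0x010;
const ECONNRESET = 104; // Linux value on x86-64 and arm64 (assumed here)

// Raw Linux syscalls report failure as a return value in [-4095, -1].
// Any other value decodes as "no error" (0).
function errnoFromRawReturn(rc: number): number {
  return rc < 0 && rc >= -4095 ? -rc : 0;
}

// An epoll event mask is a non-negative bitmask, so it never decodes to an errno.
const events = EPOLLIN | EPOLLERR | EPOLLHUP;
console.log(errnoFromRawReturn(events)); // 0
console.log(errnoFromRawReturn(-ECONNRESET)); // 104
```

Under that assumption the event mask carries no error code at all, so the pending error has to be fetched separately, which is what the `getsockopt(..., SO_ERROR, ...)` call in the hunk above does.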

@@ -1,6 +1,6 @@
     import { describe, expect, test } from "bun:test";
     import { closeSync, openSync } from "fs";
-    import { isWindows, tempDir } from "harness";
+    import { bunEnv, bunExe, isLinux, isWindows, libcPathForDlopen, tempDir } from "harness";
     import { join } from "path";
     // Reading a Bun.file() backed by a file descriptor goes through
@@ -50,3 +50,87 @@ describe.skipIf(isWindows)("Bun.file(fd) read", () => {
         expect((await withFd(path, fd => Bun.file(fd).arrayBuffer())).byteLength).toBe(0);
       });
     });
+    // When epoll reports EPOLLERR for a ReadFile/WriteFile fd, onUpdateEpoll
+    // previously called getErrno(event.events) — but event.events is an epoll
+    // flag bitmask, not a syscall return value, so getErrno() always returned
+    // .SUCCESS (0). That zero errno reached errnoToZigErr() which asserts on
+    // non-zero, crashing the IO thread. This test provokes EPOLLERR by sending a
+    // TCP RST to a socket that ReadFile is polling on; the fix queries SO_ERROR
+    // for the real errno and surfaces it as a rejection.
+    test.skipIf(!isLinux)("Bun.file(fd) read rejects (does not crash) when EPOLLERR fires", async () => {
+      await using proc = Bun.spawn({
+        cmd: [
+          bunExe(),
+          "-e",
+          `
+    const { dlopen, FFIType, ptr } = require("bun:ffi");
+    const net = require("net");
+    const libc = dlopen(process.env.LIBC_PATH, {
+      socket: { args: [FFIType.i32, FFIType.i32, FFIType.i32], returns: FFIType.i32 },
+      connect: { args: [FFIType.i32, FFIType.ptr, FFIType.u32], returns: FFIType.i32 },
+      setsockopt: { args: [FFIType.i32, FFIType.i32, FFIType.i32, FFIType.ptr, FFIType.u32], returns: FFIType.i32 },
+    });
+    const AF_INET = 2, SOCK_STREAM = 1, SOL_SOCKET = 1, SO_LINGER = 13;
+    function sockaddr_in(port) {
+      const buf = new Uint8Array(16);
+      const dv = new DataView(buf.buffer);
+      dv.setUint16(0, AF_INET, true);
+      dv.setUint16(2, port, false);
+      buf[4] = 127; buf[7] = 1;
+      return buf;
+    }
+    const server = net.createServer();
+    await new Promise(r => server.listen(0, "127.0.0.1", r));
+    const port = server.address().port;
+    let serverSocket;
+    const gotConn = new Promise(r => server.on("connection", s => { serverSocket = s; r(); }));
+    // Raw client socket owned only by the io.zig epoll loop (not usockets), so
+    // nothing else drains the pending error before ReadFile sees EPOLLERR.
+    const fd = libc.symbols.socket(AF_INET, SOCK_STREAM, 0);
+    if (fd < 0) throw new Error("socket() failed");
+    const addr = sockaddr_in(port);
+    if (libc.symbols.connect(fd, ptr(addr), 16) !== 0) throw new Error("connect() failed");
+    await gotConn;
+    serverSocket.pause();
+    // ReadFile fstat()s the fd, sees a socket, sets could_block=true, polls for
+    // readable and finds nothing, then registers with the io.zig epoll.
+    const read = Bun.file(fd).text().then(
+      v => ({ ok: true, v }),
+      e => ({ ok: false, code: e?.code }),
+    );
+    await Bun.sleep(100);
+    // SO_LINGER with l_linger=0 makes the close() send RST instead of FIN. The
+    // client's epoll entry then reports EPOLLERR with a pending ECONNRESET.
+    const linger = new Int32Array([1, 0]);
+    if (libc.symbols.setsockopt(serverSocket._handle.fd, SOL_SOCKET, SO_LINGER, ptr(linger), 8) !== 0) {
+      throw new Error("setsockopt(SO_LINGER) failed");
+    }
+    serverSocket.destroy();
+    const result = await read;
+    server.close();
+    console.log(JSON.stringify(result));
+    `,
+        ],
+        env: { ...bunEnv, LIBC_PATH: libcPathForDlopen() },
+        stdout: "pipe",
+        stderr: "pipe",
+      });
+      const [stdout, stderr, exitCode] = await Promise.all([proc.stdout.text(), proc.stderr.text(), proc.exited]);
+      expect(stderr).toBe("");
+      // If the RST lands before ReadFile registers with epoll, recv() on the
+      // worker thread observes ECONNRESET directly — same user-visible result.
+      expect(JSON.parse(stdout.trim())).toEqual({ ok: false, code: "ECONNRESET" });
+      expect(exitCode).toBe(0);
+    });
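
The two raw buffers the fixture hands to libc are easy to get wrong, so here is a standalone sketch of their layout. It re-implements the packing from the test for illustration only: `packSockaddrIn` and `packLinger` are hypothetical names, and a little-endian Linux host is assumed (as in the test's `setUint16(0, AF_INET, true)`).

```typescript
// Illustrative re-implementation of the packing used in the test above;
// the function names are hypothetical and not part of the PR.
const AF_INET = 2;

// struct sockaddr_in on Linux: sin_family (u16, host order), sin_port (u16,
// network order), sin_addr (4 bytes, network order), then 8 bytes of padding.
function packSockaddrIn(port: number, ip: [number, number, number, number]): Uint8Array {
  const buf = new Uint8Array(16);
  const dv = new DataView(buf.buffer);
  dv.setUint16(0, AF_INET, true); // host order; little-endian host assumed
  dv.setUint16(2, port, false); // network order (big-endian)
  buf.set(ip, 4);
  return buf;
}

// struct linger { int l_onoff; int l_linger; }: lingering enabled with a zero
// timeout, which makes close() send RST instead of FIN.
function packLinger(): Int32Array {
  return new Int32Array([1, 0]);
}

const addr = packSockaddrIn(8080, [127, 0, 0, 1]);
console.log(Array.from(addr.subarray(0, 8)));
console.log(packLinger().byteLength);
```

The `16` passed to `connect()` and the `8` passed to `setsockopt()` in the fixture correspond to the byte lengths of these two buffers.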
