1925d7 fix(server): add kJSTypeBigInt to JSC C API JSType enum (#29758)
What does this PR do?
server.fetch() panicked with index out of bounds: index 7, len 7
(debug) or segfaulted (release) when passed a BigInt argument.
The Zig binding for the JavaScriptCore C API JSType enum was missing
the kJSTypeBigInt variant. JSValueGetType can return this value, and
the result is used as an index into an EnumArray of error messages in onFetch. When a BigInt was passed, the index (7) was out of bounds for
the 7-element array.
Add kJSTypeBigInt to the JSType enum in javascript_core_c_api.zig to match JSValueRef.h
Add the corresponding entries to fetch_type_error_names / fetch_type_error_string_values / fetch_type_error_strings
Now rejects with TypeError: fetch() expects a string, but received BigInt instead of crashing.
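The failure mode can be sketched in TypeScript. This is an illustration only: the real code is Zig and indexes an EnumArray, and the names TYPE_TAGS, tagOf, staleNames, and fixedNames are hypothetical. The tag order is chosen so that BigInt lands on index 7, as in the panic message.

```typescript
// Type tags in declaration order; a tag's position doubles as its table index.
const TYPE_TAGS = [
  "Undefined", "Null", "Boolean", "Number", "String", "Object", "Symbol", "BigInt",
] as const;
type TypeTag = (typeof TYPE_TAGS)[number];

// Classify a value roughly the way a type query on the engine side would.
function tagOf(value: unknown): TypeTag {
  if (value === null) return "Null";
  switch (typeof value) {
    case "undefined": return "Undefined";
    case "boolean": return "Boolean";
    case "number": return "Number";
    case "string": return "String";
    case "symbol": return "Symbol";
    case "bigint": return "BigInt";
    default: return "Object"; // objects and functions
  }
}

// Look up the display name for the value's tag, failing loudly when the
// table is shorter than the tag list (the bug described above).
function fetchTypeError(names: readonly string[], value: unknown): string {
  const index = TYPE_TAGS.indexOf(tagOf(value));
  if (index >= names.length) {
    throw new RangeError(`index out of bounds: index ${index}, len ${names.length}`);
  }
  return `fetch() expects a string, but received ${names[index]}`;
}

// Stale table: written before the BigInt tag existed, so only 7 entries.
const staleNames: readonly string[] = TYPE_TAGS.slice(0, 7);
// Fixed table: one entry per tag.
const fixedNames: readonly string[] = TYPE_TAGS;
```

With the stale table a BigInt argument indexes past the end; with the fixed table it produces the TypeError message quoted above.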
How did you verify your code works?
Added a regression test in test/js/bun/http/bun-server.test.ts
covering BigInt, Symbol, Boolean, and Number arguments. Verified it
crashes on main and passes with this fix.
9a6bf8 build: drop top-level await from fetch-cli.ts (#29771)
fetch-cli.ts is imported by source.ts/zig.ts as a library (for fetchCliPath) and also runs as a CLI. The guarded await main() marks
the module HasTLA, which forces every importer — and the {config,webkit,flags,source} cycle — onto the spec's async-evaluation
path for code that's dead on import. Replace with main().catch(...) so
the module stays sync when imported.
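A minimal sketch of the pattern, assuming a hypothetical main() and runAsCli() (the real fetch-cli.ts entry point and its error handling are not shown here):

```typescript
// Hypothetical stand-ins for the CLI's observable effects.
const log: string[] = [];
let exitCode = 0;

async function main(args: string[]): Promise<void> {
  if (args.indexOf("--fail") !== -1) throw new Error("fetch failed");
  log.push(`fetched ${args.length} item(s)`);
}

// Before (the `await` at module scope marks the module as using top-level
// await, even when the guard is false at import time):
//   if (isEntryPoint) await main(argv);
//
// After: nothing is awaited at module scope, so the module evaluates
// synchronously for importers. Failures are handled in the catch callback.
function runAsCli(args: string[]): void {
  main(args).catch((err: unknown) => {
    log.push(`error: ${String(err)}`);
    exitCode = 1;
  });
}
```

runAsCli returns immediately; a rejection from main() is reported later from the catch callback rather than by suspending module evaluation.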
This is the immediate trigger for the ReferenceError: Cannot access 'webkit' before initialization crash that a freshly built bun hits when running scripts/build.ts, which several open farm/* branches (#29725,
#29731, #29733, #29749, #29756, #28512) each work around by relocating WEBKIT_VERSION. Supersedes the scripts/build/ portions of those.
Verified: build/debug/bun-debug scripts/build.ts --help (was crashing,
now works), fetch-cli.ts CLI usage and BuildError/non-BuildError exit
codes unchanged.
f19819 http2: heap-allocate Stream so *Stream survives map rehash during re-entrant JS (#29765)
What does this PR do?
H2FrameParser.streams stored Stream by value in a bun.U32HashMap. Any *Stream obtained from getPtr / getEntry().value_ptr / valueIterator() pointed into the map's
backing storage and dangled whenever a re-entrant JS callback (an
options getter, a write callback, a forEachStream callback, onStreamStart) called session.request() and triggered a rehash.
Under ASAN this surfaces as heap-use-after-free at several distinct call
sites.
This changes the map to bun.U32HashMap(*Stream) with Stream
heap-allocated via bun.TrivialNew in handleReceivedStreamID and
freed in H2FrameParser.deinit. *Stream is now stable for the
lifetime of the parser regardless of map growth, which fixes every
current and future call site that holds a *Stream across a JS
dispatch.
Streams are never individually removed from the map today
(freeResources() clears resources in place but the entry remains until
parser deinit), so no ref-counting is required — the parser is already
ref-counted and owns the streams for its full lifetime. Reclaiming
closed-stream entries is a pre-existing concern unchanged by this PR and
will be addressed separately.
forEachStream and detachFromJS additionally switch from valueIterator() to StreamResumableIterator since they call into JS
while walking buckets; the bucket walk itself can still be invalidated
by rehash even though the values it yields are now stable.
How did you verify your code works?
test/js/node/http2/node-http2-streams-rehash.test.ts covers the three
known repros (forEachStream timeout listener, request() options getter,
flushQueue write callback). All three trip ASAN heap-use-after-free on a
debug build of main and pass with this change. The existing 245-test node-http2.test.js suite continues to pass.
Closes #29754
Closes #29756
Closes #29759
35825a Fix abort in structuredClone with ArrayBuffer >= 2GiB (#29764)
What
structuredClone() (and bun:jsc serialize()) aborted the process
when serializing an ArrayBuffer, SharedArrayBuffer,
resizable/growable variant, or typed array view whose backing buffer was
~2GiB or larger.
structuredClone(new ArrayBuffer(2 ** 31)); // SIGABRT
Why
The structured-clone serialization buffer is a WTF::Vector<uint8_t>,
whose capacity is capped at UINT32_MAX >> 1 (2GiB - 1). Writing the
raw bytes of a large ArrayBuffer into it via Vector::append pushes
the total size past that cap, and append uses FailureAction::Crash,
so allocateBuffer hits CRASH().
Fix
Before writing ArrayBuffer contents in the ArrayBufferTag / ResizableArrayBufferTag paths, call m_buffer.tryReserveCapacity(...)
for the tag + length header(s) + payload. If the reservation fails
(capacity limit or OOM), fail serialization with DataCloneError
instead of crashing. Because reservation happens before the tag is
written, the serialized stream is not left in a partially-written state,
and because the reservation only succeeds for sizes that fit the vector,
the subsequent write(const uint8_t*, unsigned) call (which takes a
32-bit length) cannot receive a truncated length either.
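The reserve-before-write idea can be sketched as follows. This is not the WebKit code: CappedWriter, tryReserve, and the tag value are hypothetical, and a small cap stands in for the vector's real limit.

```typescript
const ARRAY_BUFFER_TAG = 0x15; // hypothetical tag value, for illustration only

// A byte sink with a hard capacity limit, standing in for the serializer's buffer.
class CappedWriter {
  private data: number[] = [];
  private readonly maxCapacity: number;

  constructor(maxCapacity: number) {
    this.maxCapacity = maxCapacity;
  }

  get length(): number {
    return this.data.length;
  }

  // Report whether `extra` more bytes fit, without writing anything.
  tryReserve(extra: number): boolean {
    return this.data.length + extra <= this.maxCapacity;
  }

  writeByte(b: number): void {
    this.data.push(b & 0xff);
  }

  // Little-endian 32-bit length header.
  writeU32(n: number): void {
    for (let shift = 0; shift < 32; shift += 8) this.writeByte(n >>> shift);
  }

  writeBytes(bytes: Uint8Array): void {
    for (let i = 0; i < bytes.length; i++) this.writeByte(bytes[i]);
  }
}

// Check tag + length header + payload up front, so a failure leaves the
// stream untouched instead of partially written.
function writeArrayBuffer(out: CappedWriter, payload: Uint8Array): void {
  const needed = 1 + 4 + payload.length;
  if (!out.tryReserve(needed)) {
    throw new Error("DataCloneError: ArrayBuffer is too large to serialize");
  }
  out.writeByte(ARRAY_BUFFER_TAG);
  out.writeU32(payload.length);
  out.writeBytes(payload);
}
```

Because the check covers the whole record before the tag is emitted, an oversized payload is rejected with an error and the writer's length does not change.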
Found by Fuzzilli. Fingerprint 5a52c34b9bc49de3.
Also in this PR
scripts/build/deps/index.ts (and its 4 callers) changes allDeps from
a module-level const array to a cached function allDeps(). This is
unrelated to the crash fix but was required to land it: deps/webkit.ts
→ flags.ts/source.ts → config.ts → deps/webkit.ts forms an
import cycle, and depending on which module the graph is entered
through, the webkit binding can still be in TDZ when deps/index.ts
eagerly constructs the allDeps array. Rebuilding with a bun built from
current main trips this and bun bd / bun run build:release fail
with Cannot access 'webkit' before initialization. Deferring the array
construction to first call sidesteps the ordering dependency without
changing the list contents or link order.
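A minimal sketch of the deferral, with hypothetical Dep objects standing in for the real dependency modules:

```typescript
interface Dep {
  name: string;
}

let cachedDeps: Dep[] | undefined;

// Build the list on first call instead of at module evaluation time, so the
// bindings it reads only need to be initialized by the time someone asks.
function allDeps(): Dep[] {
  if (cachedDeps === undefined) {
    cachedDeps = [zlib, webkit];
  }
  return cachedDeps;
}

// These stand in for bindings that arrive through the import cycle. They are
// declared after allDeps() on purpose: an eager module-level
// `const allDeps = [zlib, webkit]` placed above them would read the bindings
// while they are still in the temporal dead zone and throw a ReferenceError.
const zlib: Dep = { name: "zlib" };
const webkit: Dep = { name: "webkit" };
```

The list contents and order are the same as in the eager version; only the moment of construction moves.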
b989ff Preserve Handlers.active_connections across socket/listener reload() (#29752)
Problem
socket.reload() (on TCPSocket/TLSSocket) and listener.reload()
replace the Handlers struct wholesale with the value returned by Handlers.fromJS(), which always initialises active_connections = 0.
But the socket's own markActive() contribution — and, when called from
inside a callback, the live Handlers.Scope — are still counted against
the old value.
Consequences
reload() inside a data handler → counter overwritten to 0; the
enclosing scope.exit() then hits 0 - 1 on a u32 → integer-overflow
panic in safe builds.
reload() outside any handler (client socket) → counter drops to 0;
the next callback's enter()/exit() cycle takes it 0→1→0, and the
client-mode branch of markInactive frees the heap Handlers
allocation while socket.handlers still points at it →
heap-use-after-free on the following callback (segfault in release).
Listener.reload() with live accepted sockets → counter zeroed;
closing any of them underflows.
Repro
using server = Bun.listen({
  hostname: "127.0.0.1",
  port: 0,
  socket: { open(s) { s.write("x"); }, data() {} },
});
const c = await Bun.connect({
  hostname: "127.0.0.1",
  port: server.port,
  socket: {
    data(s) { s.reload({ socket: { data() {}, drain() {} } }); },
    drain() {},
  },
});
// debug build: panic: integer overflow in Handlers.markInactive
// release build: SIGSEGV on the second data event
Fix
Save active_connections before the deinit() + struct assignment and
restore it afterwards in both NewSocket.reload
(src/bun.js/api/bun/socket.zig) and Listener.reload
(src/bun.js/api/bun/socket/Listener.zig).
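The save/restore can be modelled in a few lines. This is a sketch only: SketchSocket, handlersFromOptions, and the underflow check are hypothetical stand-ins for the Zig types.

```typescript
interface SocketOptions {
  data: (chunk: string) => void;
}

interface Handlers {
  data: (chunk: string) => void;
  activeConnections: number;
}

// Like the described Handlers.fromJS(): a fresh struct always starts at 0.
function handlersFromOptions(options: SocketOptions): Handlers {
  return { data: options.data, activeConnections: 0 };
}

class SketchSocket {
  private handlers: Handlers;

  constructor(options: SocketOptions) {
    this.handlers = handlersFromOptions(options);
  }

  activeCount(): number {
    return this.handlers.activeConnections;
  }

  markActive(): void {
    this.handlers.activeConnections += 1;
  }

  markInactive(): void {
    // Stand-in for the u32 underflow panic in safe builds.
    if (this.handlers.activeConnections === 0) {
      throw new RangeError("active_connections underflow");
    }
    this.handlers.activeConnections -= 1;
  }

  reload(options: SocketOptions): void {
    const saved = this.handlers.activeConnections; // save before replacing
    this.handlers = handlersFromOptions(options);  // counter is reset to 0 here
    this.handlers.activeConnections = saved;       // restore afterwards
  }
}
```

Without the two lines that save and restore the counter, the markInactive() that follows a reload would hit the underflow branch.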
Verification
New subprocess fixture test/js/bun/net/socket-reload-fixture.ts drives
all three sequences (inside-callback reload, outside-callback reload
with two separate onData events, listener reload with a live accepted
socket then close). Wired into socket.test.ts.
Adds FreeBSD as a cross-compile target, following the same model as
#29675 (Android): host clang + --target=x86_64-unknown-freebsd14.3 --sysroot=<base.txz>.
Closes #29675
Stacked on #29675 — this PR includes the Android commits since both
share the crossTarget/sysroot build infrastructure. The
FreeBSD-specific diff starts at d892fcae0a.
FreeBSD is a separate OS (not a Linux abi like Android), so it goes
in type OS = ...|"freebsd", Environment.isFreeBSD, OperatingSystem.freebsd. It shares kqueue with macOS but uses plain kevent/struct kevent (not Darwin's kevent64_s), and FreeBSD 13+
has eventfd(2) and copy_file_range(2).
Builtins: FreeBSD ships compiler-rt as /usr/lib/libgcc.a and clang's
freebsd driver finds it via --sysroot — no resource-dir symlinking
needed (unlike NDK).
aarch64-unknown-freebsd is a Rust Tier 3 target (no prebuilt std), so
lolhtml uses -Zbuild-std for that arch.
Host-GCC include leak: on amazonlinux, clang's driver injects /usr/include/c++/N even with --sysroot, breaking #include_next in
the sysroot's libc++. Fixed with -nostdlibinc + explicit -isystem
for the two sysroot dirs.
Prior art
Builds on lwhsu/bun claude/freebsd-support (Zig source changes,
adapted from old CMake build to scripts/build/*.ts) and nektro's af85c02f6d (zig build check-freebsd).
f97aa6 stdio: skip exit-time tcsetattr unless Bun modified termios (#29593)
What / why
Fixes #29592.
When a Bun process runs with a TTY on fd 0/1/2, it snapshots termios at
startup and unconditionally writes the snapshot back at exit. Termios is
a property of the /dev/pts/* device, not the fd, so when a downstream
pipeline consumer (less, fzf, fx, ...) has since opened /dev/tty
and entered raw mode, our exit-time tcsetattr clobbers their raw-mode
state mid-flight. The user-visible symptom is a pager that appears alive
but unresponsive to keypresses — everything is echoed and line-buffered
until the user hits Enter.
bun /tmp/bun-tty-bug.js | less # hit 'q' → nothing happens
The race is timing-dependent but the mechanism is deterministic: the TCSETS calls happen on every run. Scripts that exit fast enough can
win the race against less's own setRawMode, but anything heavier
loses reliably.
Fix
When Bun is a pipeline producer (stdout is a pipe, not a TTY — the bun foo.js | less shape), gate bun_restore_stdio on a new bun_stdio_modified[fd] flag so fds Bun never touched are left alone at
exit. The flag is a volatile sig_atomic_t (signal-context-safe, since bun_restore_stdio runs from both the atexit path and the SIGINT/SIGTERM handler) and is set inside Bun__ttySetMode before uv__tcsetattr so a signal landing mid-transition still triggers
restoration.
When stdout is a TTY (interactive wrapper — bun run <tui> after a
child crash, the crash-handler banner, --watch reload, the
signal-death re-raise in run_command/bunx/lifecycle_script_runner), keep the unconditional
restore so the shell prompt comes back cooked. Those paths don't route
through Bun__ttySetMode so a child that took termios raw via
FFI/stty/ioctl and died would otherwise leave the terminal raw.
Fds Bun modified via process.stdin.setRawMode(true) still get restored
through the existing uv_tty_reset_mode atexit hook in wtf-bindings.cpp, which runs before bun_restore_stdio and holds the
pre-setRawMode termios snapshot.
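The gating rule can be modelled as below. This is a simplified sketch with hypothetical names (TtyDevice, StdioSketch, otherProcessEntersRaw). It tracks one flag instead of one per fd, and it folds the uv_tty_reset_mode path into the same restore for brevity.

```typescript
type Mode = "cooked" | "raw";

// Termios lives on the device, so every process with the device open shares it.
interface TtyDevice {
  mode: Mode;
}

// Simulates a downstream consumer (a pager, say) entering raw mode.
function otherProcessEntersRaw(device: TtyDevice): void {
  device.mode = "raw";
}

class StdioSketch {
  private readonly device: TtyDevice;
  private readonly snapshot: Mode;
  private readonly stdoutIsTty: boolean;
  private modified = false;

  constructor(device: TtyDevice, stdoutIsTty: boolean) {
    this.device = device;
    this.snapshot = device.mode; // taken at startup
    this.stdoutIsTty = stdoutIsTty;
  }

  setRawMode(): void {
    this.modified = true; // flag first, so an interrupt mid-change still restores
    this.device.mode = "raw";
  }

  restoreAtExit(): void {
    // Interactive wrapper: always restore. Pipeline producer: only if we changed it.
    if (this.stdoutIsTty || this.modified) {
      this.device.mode = this.snapshot;
    }
  }
}
```

A producer that never touched the device leaves the consumer's raw mode alone, while the interactive and self-modified cases still come back cooked.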
Verification
Three tests in test/js/bun/terminal/terminal-spawn.test.ts:
pipeline producer exit does not clobber raw mode on shared tty device — opens a PTY via bun:ffi's openpty, spawns a child with stdin/stderr on the PTY slave and stdout as a pipe (the real bun foo.js | less shape), flips PTY termios raw from the parent, lets the
child exit. Fails before the fix (ICANON restored to cooked), passes
after.
interactive wrapper (stdout tty) restores cooked termios on child exit — all three stdio on the PTY (stdout is a TTY), parent flips
raw, child exits without touching termios; asserts ICANON/ECHO come back
cooked. Guards against over-gating regressing bun run <tui> /
crash-handler / watch-reload restore.
child that called setRawMode restores termios on exit — child
calls setRawMode(true), writes RAW, blocks on stdin; parent observes
ICANON/ECHO cleared while live (proves setRawMode took effect), sends a
byte, asserts termios back to cooked (proves uv_tty_reset_mode atexit
still runs).
All three are handshake-driven (no setTimeout), platform-aware (Darwin tcflag_t offset, glibc/musl/darwin soname differences in dlopen),
and assert termios bits before exit code so a regression surfaces first
in the failure diff.
923991 Bun.spawn: downgrade JSRef after async stdout/stderr drain (#29748)
What
Subprocess.onCloseIO now calls updateHasPendingActivity() after it
converts a stdout/stderr .pipe to .buffer/.ignore (or clears
stdin).
Why
If the child process exits while a stdout/stderr PipeReader is still
pending — e.g. a grandchild still holds the write end of the pipe, or
the exit notification races ahead of pipe EOF — onProcessExit's
deferred updateHasPendingActivity() observes hasPendingActivityStdio() == true and keeps this_value as a Strong JSRef.
Later, the pipe drains asynchronously on the event loop and PipeReader.onReaderDone → Subprocess.onCloseIO flips the Readable to .buffer/.ignore. Previously nothing re-evaluated pending activity at
that point, so the JSRef stayed Strong forever and the JSSubprocess
object plus its buffered output leaked for the lifetime of the parent
process.
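The ordering problem can be sketched with a toy state machine. SubprocessSketch and its fields are hypothetical; only the method names mirror the ones mentioned above.

```typescript
type Readable = "pipe" | "buffer" | "ignore";

class SubprocessSketch {
  private stdout: Readable = "pipe";
  private exited = false;
  private ref: "strong" | "weak" = "strong";

  isStrong(): boolean {
    return this.ref === "strong";
  }

  // Pending while the process is alive or a pipe is still being read.
  private hasPendingActivity(): boolean {
    return !this.exited || this.stdout === "pipe";
  }

  private updateHasPendingActivity(): void {
    this.ref = this.hasPendingActivity() ? "strong" : "weak";
  }

  onProcessExit(): void {
    this.exited = true;
    this.updateHasPendingActivity(); // stdout may still be a pending pipe here
  }

  onCloseIO(): void {
    this.stdout = "buffer";
    this.updateHasPendingActivity(); // the re-evaluation this change adds
  }
}
```

If onCloseIO() did not re-evaluate, the reference would stay strong after the late drain and the wrapper could never be collected.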
Repro
The new test/js/bun/spawn/spawn-unread-stdout-gc.test.ts spawns a
child that immediately spawns a detached grandchild inheriting stdout,
then exits. The grandchild writes after a short delay, so onProcessExit in the parent hits EAGAIN on the stdout read and the
drain completes asynchronously via onCloseIO. A FinalizationRegistry
then checks that the Subprocess wrappers are collectable.
Verification
Without the fix: 0/10 Subprocess objects collected (consistently).
With the fix: 10/10 collected (consistently across multiple runs).
git stash -- src/ → test fails; git stash pop → test passes.
d644b4 hot: defer reload while a rejected module is unreported (#29740)
What
Runtime fix for the --hot sourcemap race that #29735 works around at
the test level. Two changes:
VirtualMachine.reload() now also defers when pending_internal_promise is .rejected but its error hasn't been
printed yet (pending_internal_promise_reported_at != hot_reload_counter), not just when .pending. The deferred reload runs
from reportExceptionInHotReloadedModuleIfNeeded() after the error is
remapped and printed against its own sourcemap.
SavedSourceMap.putMappings() keeps the existing table entry when
the incoming InternalSourceMap has zero mappings. A 0-mapping map can
never answer a lookup, so dropping it is never worse than installing it;
this defends against any other path that re-transpiles a comment-only
partial read.
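The putMappings() rule from the second change can be sketched as follows, using a hypothetical SourceMapSketch shape and a Map keyed by path rather than the real path-hash table:

```typescript
interface SourceMapSketch {
  mappings: number[]; // stand-in for decoded mapping entries
}

class SavedSourceMaps {
  private table = new Map<string, SourceMapSketch>();

  putMappings(path: string, incoming: SourceMapSketch): void {
    // A map with zero mappings can never answer a lookup, so never let it
    // replace (or create) an entry.
    if (incoming.mappings.length === 0) return;
    this.table.set(path, incoming);
  }

  // Number of mappings stored for `path`, or -1 when there is no entry.
  mappingCount(path: string): number {
    const entry = this.table.get(path);
    return entry === undefined ? -1 : entry.mappings.length;
  }
}
```

A comment-only partial read therefore leaves the previous, usable map in place, and a later full transpile still overwrites it as before.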
Why
vm.source_mappings is a path-hash → blob table overwritten in place on
every transpile. The event-loop tick drains microtasks between tasks, so
a watcher event that arrives after a module's eval rejects but before reportExceptionInHotReloadedModuleIfNeeded() prints it can run another reload() — which re-reads the file (possibly mid-rewrite, since a 2MB writeFileSync is truncate + several write()s) and overwrites the
table entry. The still-unreported error is then either remapped against
the wrong map (transpiled coords leak through, e.g. :1:12 when the
source is line 1003) or dropped entirely when the new pending_internal_promise replaces the old one.
This was flaking on aarch64 in should work with sourcemap generation
(see #29735) and can also affect users whose editors save
non-atomically.
Test
The new test in test/cli/hot/hot.test.ts makes the window
deterministic: the hot file truncates itself to a comment-only stub
immediately before throwing, guaranteeing a fresh watcher event lands
between reject and report.
bun bd test test/cli/hot/hot.test.ts -t "should not remap against a stale sourcemap" — 1 pass, 41 expect() calls (20 iterations)
USE_SYSTEM_BUN=1 bun test test/cli/hot/hot.test.ts -t "should not remap against a stale sourcemap" — fails (the error is dropped and the
test times out waiting for it)
bun bd test test/cli/hot/hot.test.ts -t sourcemap — all 4
sourcemap tests pass
bun run zig:check-all — clean
Note: the self-rewriting hot file in the new test occasionally
exposes a separate, pre-existing edge case where an entry that rewrites
itself mid-eval and then throws can lose the error (reproduces on
released bun too). I observed this as a rare hang on debug builds when
running under heavy parallel load; it did not reproduce in serial bun bd test runs. If it surfaces in CI it's worth a follow-up — the root
cause is independent of this change.
86e6ab Use wtf/MathExtras.h for double→int conversions; fjcvtzs/fcvtzs on arm64 (#29746)
What does this PR do?
C++ (correctness): replace static_cast<intN_t>(double) with truncateDoubleTo{Int32,Int64,Uint32,Uint64} from wtf/MathExtras.h at
argument-parsing sites where the input can be NaN/±Inf/out-of-range — JSBuffer.cpp (parseArrayIndex, toString start, write length, BigInt
offset, Buffer.from(ArrayBuffer, offset, length)), perf_hooks
Histogram constructor/prototype, JSHTTPParserPrototype.cpp, wrapAnsi.cpp. Same hardware instruction is emitted; this removes UB — static_cast of an out-of-range double is undefined, which lets the
optimizer assume the input was in range and potentially elide the bounds
check that follows. The MathExtras helpers go through intrinsics/inline
asm so no such assumption is possible.
Zig (perf, arm64):
JSValue.tryConvertToStrictInt32(f64) ?i32 mirrors WTF::tryConvertToStrictInt32. On targets with .jsconv (apple_m1 →
v8.4a → v8.3a) it lowers to fjcvtzs + cset instead of
isnan/isinf/convert/round-trip/signbit, and jsNumberWithType(f64)
reuses the returned int instead of converting again. On x86_64 / Linux
arm64 it falls through to a range-gated @intFromFloat (which also
fixes a pre-existing out-of-range fptosi in the old canBeStrictInt32).
coerceJSValueDoubleTruncatingTT for i32/i64 on aarch64 is now a
single fcvtzs via inline asm — its NaN→0 / overflow→min/max saturation
is bit-identical to the branchy fallback. The i52 case keeps the
branches.
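For reference, the two conversions can be written out in portable form. This is a sketch of the semantics as described above (NaN to 0, overflow to min/max, strict conversion only for exact int32 values), not the Zig or C++ code. The function names are hypothetical, and rejecting -0 in the strict path is an assumption based on the signbit step mentioned above.

```typescript
const INT32_MIN = -2147483648;
const INT32_MAX = 2147483647;

// Truncate toward zero, saturating at the int32 bounds; NaN becomes 0.
function saturatingTruncateToInt32(x: number): number {
  if (Number.isNaN(x)) return 0;
  if (x <= INT32_MIN) return INT32_MIN;
  if (x >= INT32_MAX) return INT32_MAX;
  return Math.trunc(x) | 0; // in range here; `| 0` also normalizes -0 to 0
}

// Return the value only if the double is exactly an int32; otherwise null.
function tryStrictInt32(x: number): number | null {
  if (!Number.isInteger(x)) return null; // rejects NaN, ±Infinity, fractions
  if (x < INT32_MIN || x > INT32_MAX) return null;
  if (Object.is(x, -0)) return null; // assumed: -0 must stay a double
  return x;
}
```

The range checks run before any integer conversion, which is the property the range-gated fallback relies on.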
How did you verify your code works?
bun bd test test/js/node/buffer.test.js — 457 pass
bun bd test test/js/bun/perf_hooks/histogram.test.ts — 38 pass
bun bd test test/js/node/http/node-http-parser — 9 pass
bun run zig:check-all — 61/61
objdump -d build/debug/bun-debug | grep fjcvtzs confirms the new
sites
Bun.color(NaN | ±Infinity | 1e300 | 2147483647.9, "ansi-256")
matches system bun (probes Zig toInt64 saturation)
Buffer.readDoubleLE round-trip of -0.0, NaN, ±Infinity, INT32_MIN/MAX±1 unchanged
Adds correctness coverage for the new JSC GetModuleNamespace
single-BFS fast path:
deep export * from chain — every transitive binding is present
two siblings with the same local — name is excluded (ambiguous), but a
local in the importing module shadows correctly
indirect re-export through a star chain — still resolves via the slow
path
These pass on both the current and patched JSC (the optimization
preserves observable behavior).
Bench impact from the WebKit change (500 modules, 30k star edges, local
release without LTO):
before: ~694ms
after: ~320ms
node 24: ~525ms, deno 2.7: ~157ms
Separately found while auditing: a fresh import() that reaches a
module whose previous load failed resolves on Bun (all versions back to
1.3.11) but rejects on Node/Deno — pre-existing, not touched by this PR.
e2017e ws: respect perMessageDeflate: false in upgrade request (#29685)
What
Stop emitting Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits on the upgrade request when the caller passes perMessageDeflate: false — matching Node + ws.
Why
new WebSocket(url, { perMessageDeflate: false }) (directly or through
the ws package) was a no-op in Bun. The upgrade builder hardcoded the
extension offer; the ws compat shim in src/js/thirdparty/ws.js never
looked at options.perMessageDeflate; and the JS→C++ option parser had
no field for it. This broke gateway paths that reject upgrade requests
advertising extensions the deployment does not want.
After this change: prints undefined, matching Node.
Changes
src/js/thirdparty/ws.js — read options.perMessageDeflate and
forward perMessageDeflate: false into the native WebSocket options.
src/bun.js/bindings/webcore/JSWebSocket.cpp — parse the perMessageDeflate property from the options object.
src/bun.js/bindings/webcore/WebSocket.{h,cpp} — new m_offerPerMessageDeflate field, setter, create() overloads take it, connect() passes it to the Zig layer.
src/bun.js/bindings/headers.h — extend the Bun__WebSocketHTTP{,S}Client__connect signatures with the new bool offerPerMessageDeflate trailing arg.
src/http/websocket_client/WebSocketUpgradeClient.zig — thread
the flag through connect → buildRequestBody, emit the extensions
line only when the flag is true, store the flag on the client so we can
also ignore a server-side permessage-deflate response when we did not
offer it (RFC 6455 §9.1).
test/js/first_party/ws/ws.test.ts — three tests under a new perMessageDeflate upgrade header describe: opt-out via ws API,
baseline still advertises extensions, opt-out via globalThis.WebSocket. All use a captive TCP listener that reads the
raw upgrade request bytes.
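The request-building side of the change can be sketched like this. The real builder is Zig; UpgradeOptions, buildUpgradeRequest, and negotiatedDeflate are hypothetical names, and the key in the usage example is the sample nonce from RFC 6455.

```typescript
interface UpgradeOptions {
  host: string;
  path: string;
  key: string;
  offerPerMessageDeflate: boolean;
}

// Build the raw HTTP upgrade request, offering the extension only on request.
function buildUpgradeRequest(o: UpgradeOptions): string {
  const lines = [
    `GET ${o.path} HTTP/1.1`,
    `Host: ${o.host}`,
    "Upgrade: websocket",
    "Connection: Upgrade",
    `Sec-WebSocket-Key: ${o.key}`,
    "Sec-WebSocket-Version: 13",
  ];
  if (o.offerPerMessageDeflate) {
    lines.push("Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits");
  }
  return lines.join("\r\n") + "\r\n\r\n";
}

// Compression is only in effect if we offered it and the server accepted it;
// a server-side permessage-deflate we never offered is ignored.
function negotiatedDeflate(offered: boolean, serverExtensions: string): boolean {
  return offered && serverExtensions.indexOf("permessage-deflate") !== -1;
}
```

For example, buildUpgradeRequest({ host: "example.test", path: "/", key: "dGhlIHNhbXBsZSBub25jZQ==", offerPerMessageDeflate: false }) yields a request with no Sec-WebSocket-Extensions line.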
Verification
bun bd test test/js/first_party/ws/ws.test.ts -t "perMessageDeflate"
3 pass, 0 fail
And the failure mode from before — src/ stashed, tests run against
baseline — the two opt-out tests fail as expected, the baseline-behavior
test passes. Pre-existing unrelated failures in the same file (30 tests
timing out at 1000ms) are unchanged in count.
res.setTimeout(msecs) on a client-side IncomingMessage was creating
a ref'd setTimeout which kept the event loop alive for the full
timeout duration even after the response completed. In Node.js, IncomingMessage.prototype.setTimeout delegates to socket.setTimeout(), which uses an unref'd timer.
Repro
const http = require("http");
const agent = new http.Agent({ keepAlive: true });
http.get({ host, port, agent }, res => {
  res.setTimeout(90000);
  res.resume();
  res.on("end", () => { /* nothing left to do */ });
});
Node exits immediately after end. Bun (before) waits the full 90s.
Cause
_http_client.ts overrides res.setTimeout on the IncomingMessage
with an inline implementation that called setTimeout(fn, msecs)
without .unref(), without returning this, and without treating msecs === 0 as "clear the timeout".
Fix
.unref() the timer so it doesn't keep the event loop alive
return res (Node returns this)
msecs <= 0 clears the existing timer instead of scheduling setTimeout(fn, 0)
attach the callback via res.on("timeout", cb) so repeated setTimeout calls stack listeners the same way Node does
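The four points above can be put together in a standalone sketch. ResponseSketch is a hypothetical stand-in for the IncomingMessage override, with a minimal listener list instead of a real EventEmitter. Clearing the previous timer on every call is an assumption of this sketch, not something stated above.

```typescript
type Listener = () => void;

class ResponseSketch {
  private timer: ReturnType<typeof setTimeout> | undefined;
  private timeoutListeners: Listener[] = [];

  on(event: "timeout", cb: Listener): this {
    this.timeoutListeners.push(cb);
    return this;
  }

  listenerCount(): number {
    return this.timeoutListeners.length;
  }

  hasTimer(): boolean {
    return this.timer !== undefined;
  }

  setTimeout(msecs: number, cb?: Listener): this {
    // Callbacks are attached as listeners, so repeated calls stack them.
    if (cb) this.on("timeout", cb);

    // Assumption: any previously scheduled timer is replaced.
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }

    // msecs <= 0 means "clear the timeout", so nothing is scheduled.
    if (msecs > 0) {
      const timer = setTimeout(() => {
        for (const listener of this.timeoutListeners.slice()) listener();
      }, msecs);
      // Node and Bun timers expose unref(); browser-style numeric ids do not.
      const maybeUnref = timer as unknown as { unref?: () => void };
      if (typeof maybeUnref.unref === "function") maybeUnref.unref();
      this.timer = timer;
    }
    return this; // chainable
  }
}
```

Calling setTimeout(0) afterwards drops the timer but leaves the listeners in place, in line with the listener semantics noted under verification.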
Relation to "idle keep-alive sockets block exit"
Investigated whether Bun's HTTP keep-alive pool itself refs the event
loop — it doesn't. http.ClientRequest goes through native fetch;
pooled sockets live on the HTTP thread's own loop and the only JS-loop
ref is FetchTasklet.poll_ref, which is released when the response body
finishes. Plain http.get / https.get / axios / agentkeepalive with keepAlive: true all exit immediately. The one path that produces the
"process waits N seconds after the last request" symptom is this ref'd res.setTimeout timer.
How did you verify your code works?
New test in test/js/node/http/node-http.test.ts spawns a child that
makes a keep-alive http.get, calls res.setTimeout(60000), consumes
the body, and asserts the child exits without hitting a 2s sentinel.
USE_SYSTEM_BUN=1 bun test → fails (BAD_RETURN + STILL_ALIVE)
bun bd test → passes
git stash push -- src/ && bun bd test → fails; git stash pop && bun bd test → passes
Also verified:
timeout still fires when the server is slow to respond
listener semantics (on, not removed by setTimeout(0)) match Node
existing test/js/node/http/client-timeout-error.test.ts and node-http.test.ts -t setTimeout still pass
4ddc37 test: deflake cpu-prof.test.ts on Windows (time-bound all workloads to 100ms) (#29741)
What does this PR do?
Deflakes test/cli/run/cpu-prof.test.ts on Windows, which flaked in 9
of the last 200 builds, e.g. build 48023 on Windows 2019 x64-baseline
at line 187:
expect(functionNames.some((name: string) => name !== "(root)" && name !== "(program)")).toBe(true);
Expected: true Received: false
and again on this PR's first push (build 48042) at lines 241/285
with Received: "No samples collected.\n".
Root cause: On Windows, JSC's SamplingProfiler effectively ticks at
the ~15.6ms default timer quantum (WTF::sleep is bounded by it without timeBeginPeriod). After #29393 (WebKit module-loader rewrite),
entry-module evaluation is async — loadAndEvaluateModule returns a
pending promise and user code runs after fetch→link→evaluate microtask
checkpoints — so the first sampler tick can land in loader internals
before user code starts. The previous workloads were either
iteration-bounded (for (i < 1000000), JITs to <1ms) or time-bounded
for only 32/50ms, which is no longer enough margin to guarantee even one
sample lands in user code.
7 of the 9 historical flakes are post-#29393; the other 2 predate it.
The test was always borderline; the rewrite made it ~7× flakier.
Jarred's earlier deflake (8058d78b6a) bumped 16→32ms and time-bounded
the first test, but that's no longer sufficient.
Fix: Time-bound every workload in the file for 100ms (~6 sampler
ticks on Windows), including the previously iteration-bounded myFunction(). Under describe.concurrent the wall-clock cost is the
max not the sum, so the suite stays at ~1.9s.
How did you verify your code works?
bun bd test test/cli/run/cpu-prof.test.ts — 30/30 consecutive passes
on Windows after each commit
BuildKite history scan of last 200 builds correlating flake commits
with WebKit upgrade ancestry
c91c61 test(inspect): deflake test-reporter.test.ts retroactive-enable test (#29734)
What
Replaces wall-clock Bun.sleep synchronization in the "retroactively
reports tests when TestReporter.enable is called after tests are
discovered" test with a deterministic Console.messageAdded signal +
gate file.
Why
The test was flaking with Timeout waiting for 3 ended tests, got 2
(e.g. build 47992, 47990, 47989, 47988, 47986, 47982).
TestReporter.enable is delivered cross-thread via postTaskTo onto
the debuggee's JS event loop. The old test assumed Bun.sleep(200)
(client) + Bun.sleep(500) (test A1) guaranteed the enable task would
be processed before A1 finished. Under CI load that ordering doesn't
hold: A1's 500ms timer can win, onSequenceCompleted(A1) runs with test_id_for_debugger == 0 and the agent still disabled
(Execution.zig:589-591), and A1's TestReporter.end is silently
dropped — retroactive replay (Debugger.zig:351-422) only emits found, never end.
How
Test A1 now calls console.log("__A1_RUNNING__") and polls for a gate file
instead of sleeping 500ms.
The client enables the Console domain, waits for that message
(collection done, A1 executing), then sends TestReporter.enable.
Once the 5 retroactive found events arrive (which proves enable
actually ran on the JS thread), the client writes the gate file,
releasing A1.
A1 completes with the agent enabled, so all 3 end events fire
deterministically.
Test plan
bun bd test test/cli/inspect/test-reporter.test.ts passes
30 sequential runs: 30/30 pass
30 runs in parallel under 4 CPU-spinner background jobs: 30/30
pass
ad5d33 test: avoid TLA self-cycle in bun-main dynamic-import test (#29738)
Follow-up to #29719.
bun:main statically imports the entry file, so await import("bun:main") at the entry's top level is a TLA self-cycle: bun:main waits for entry.mjs (async dep), and entry.mjs waits for bun:main's evaluation promise. Per spec that promise never settles.
The old JSC module loader broke these cycles early, which is why #29719
passed locally (tested against 892042c2, pre-#29393). After #29393
(WebKit module-loader rewrite) the loader correctly leaves the promise
unsettled, so the test now hangs and times out at 90s — see build 48023
(debian-asan, win x64, win x64-baseline).
Fix: drop the top-level await and use import("bun:main").then(...)
so entry.mjs finishes synchronously, bun:main finishes, and the
import resolves on the next microtask. The preload and --hot tests are
unaffected (preload isn't in bun:main's dep graph).
Note: this also surfaced that Bun now hangs forever on any unsettled-TLA
cycle instead of exiting 13 like Node — separate PR coming for that.
How did you verify your code works?
bun bd test test/js/bun/resolve/bun-main-entry-point.test.ts — 3
pass, 20 consecutive runs on Windows
USE_SYSTEM_BUN=1 bun test ... — 2 fail (still catches the alias bug
on unfixed bun)
6829d1 build(windows): use -Xclang -include instead of /FI in pch rule (#29736)
Summary
clang-cl's /FI<header> auto-promotes itself to -include-pch <pch>
when a .pch already exists at the /Fp path — and it does this for both internal cc1 jobs of /Yc, including the -emit-pch job
that's supposed to overwrite the PCH. So when cxxflags change and ninja
re-runs the pch rule, the create-PCH step ends up validating the stale
PCH instead of overwriting it.
I hit this after pulling #29653 (/EHsc → /EHs-c-) into a build dir
that had a pre-existing PCH:
[499/864] pch pch\root-pch.h.hxx.pch
FAILED: pch/root-pch.h.hxx.pch pch/root-pch.h.hxx.cxx.obj
error: exception handling was enabled in precompiled file 'pch\root-pch.h.hxx.pch' but is currently disabled
Minimal repro (no ccache, no Bun headers — clang 21.1.8):
clang-cl -v on the second line shows the -emit-pch cc1 invocation
receiving -include-pch h.hxx.pch instead of -include h.hxx — -###
with the .pch deleted shows -include, so it's filesystem-dependent
driver behavior.
Fix: swap /FI$pch_header for -Xclang -include -Xclang $pch_header. The cc1-level -include bypasses the driver's PCH
auto-detection, so -emit-pch always reads the header source. The Unix pch rule and the Windows cxx_pch consumer rule already use this
form, so this also makes the two platforms' force-include spelling
consistent.
Test plan
bunx tsc --noEmit -p scripts/build/tsconfig.json
Minimal repro: /Yc + -Xclang -include + stale-flag PCH on disk
→ exit 0, PCH overwritten (size changes)
-### confirms -emit-pch job gets -include (not -include-pch) with PCH on disk
ninja pch\root-pch.h.hxx.pch builds 156MB PCH; a cxx_pch
consumer compiles against it
Windows CI green
f225c8 ci:errors: surface flaky (retried) tests from BuildKite annotations (#29742)
Summary
bun run ci:errors only parsed style=error annotations, so the context=flaky style=warning annotation (tests that failed once and
passed on retry) was silently dropped.
Split that bundled flaky annotation into one synthetic entry per test
path and render them after the hard failures, tagged [flaky] in
yellow. Multi-platform flakes are grouped under a single heading and
reuse the existing per-platform body dedup.
renderAnnotation now takes the formatted tag string directly; --all help text updated since flaky no longer needs that flag.
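The grouping step can be sketched as below. The FlakySection shape and the groupFlaky name are hypothetical; the real annotation parsing and the per-platform body dedup are not reproduced here.

```typescript
interface FlakySection {
  testPath: string;
  platform: string;
}

interface FlakyGroup {
  heading: string;
  platforms: string[];
}

// Collapse per-platform flaky sections into one heading per test path,
// keeping first-seen order for both paths and platforms.
function groupFlaky(sections: FlakySection[]): FlakyGroup[] {
  const byPath = new Map<string, string[]>();
  for (const section of sections) {
    let platforms = byPath.get(section.testPath);
    if (platforms === undefined) {
      platforms = [];
      byPath.set(section.testPath, platforms);
    }
    if (platforms.indexOf(section.platform) === -1) {
      platforms.push(section.platform);
    }
  }

  const groups: FlakyGroup[] = [];
  byPath.forEach((platforms, testPath) => {
    groups.push({ heading: `[flaky] ${testPath}`, platforms });
  });
  return groups;
}
```

Each group then renders as one heading with a sub-section per platform, which is how many raw sections can collapse into fewer test headings.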
Test plan
bun run ci:errors 48023 --no-compare lists test/cli/run/cpu-prof.test.ts under a [flaky] heading
26 raw flaky sections collapse to 18 unique test headings; dev-and-prod.test.ts and hot.test.ts each show 4 platform
sub-sections
Hard-failure annotations still render first with [new]/[pre-existing] tags
30478d test: deflake 29240.test.ts on Windows (cpu-prof sampling) (#29720)
What does this PR do?
Deflakes test/regression/issue/29240.test.ts on Windows, where it
failed ~17/199 builds with anotherNodes.length === 0.
Root cause: JSC's SamplingProfiler effectively ticks at the ~15.6ms
default timer quantum on Windows (WTF::sleep is bounded by it without timeBeginPeriod). The script's anotherFunction() ran a
fixed-iteration Math.sqrt loop that finishes in <1ms once JIT'd, so it
was never on top of stack when the sampler fired.
Fix: Replace the iteration-bounded loops in doWork() and anotherFunction() with time-bounded loops (for (const end = performance.now() + 100; performance.now() < end; )). Each function now
occupies the CPU for a contiguous 100ms, guaranteed to span ~6 sampler
ticks even at the Windows quantum. The outer 200ms driver loop is
dropped — total runtime stays ~200ms.
Function definition line numbers are deliberately preserved
(fibonacci=1, doWork=6, anotherFunction=14) so the existing callFrame.lineNumber assertions don't change. Only the positionTicks
upper bound moves 27→24 to match the new script length.
The second test (script.ts / hot()) was already time-bounded and
needed no changes.
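A minimal sketch of the time-bounded loop pattern described in the fix, using the same `for` header quoted above. The `busyFor` name and the `Math.sqrt` body are illustrative assumptions, not the test script's actual contents.

```typescript
// Spin for roughly `ms` milliseconds of wall-clock time and return the
// number of iterations completed. Unlike a fixed iteration count, the
// duration does not shrink once the JIT optimizes the loop body.
function busyFor(ms: number): number {
  let iterations = 0;
  let sink = 0;
  for (const end = performance.now() + ms; performance.now() < end; ) {
    sink += Math.sqrt(iterations++);
  }
  // Use `sink` in the result so the loop's work stays observable.
  return sink >= 0 ? iterations : 0;
}

const start = performance.now();
const iterations = busyFor(100);
const elapsed = performance.now() - start;
console.log(`ran ${iterations} iterations in ${elapsed.toFixed(1)}ms`);
```

At a ~15.6ms sampling interval, a contiguous 100ms window covers about six ticks, which is why the function reliably shows up in the profile.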
How did you verify your code works?
bun bd test test/regression/issue/29240.test.ts — 25/25 consecutive
passes on Windows
c94252 test: deflake FUSE tests by mounting once with a longer poll budget (#29718)
What does this PR do?
Fixes the glob-on-fuse.test.ts flake on Alpine CI (79 occurrences
across 44 of the last 70 builds, e.g. build
47922).
Root cause
The test mounts a FUSE filesystem via python3 fuse-fs.py once per
test (4×), polling up to 250 × 5ms = 1.25s for the mount to appear.
On Alpine, this file's deterministic shard slot happens to run while docker compose is still extracting Redis/MinIO images in the
background. With disk I/O saturated, the first python3/libfuse
cold-start exceeds the 1.25s budget and the assertion at line 41 fails.
Tests 2-4 in the same file then pass (warm page cache, ~170ms per
mount), and the retry passes (docker has finished). run-file-on-fuse.test.ts has the identical pattern but never flakes
because it lands in a different shard whose tests #1-12 are slower, so
it runs ~50s after docker finishes.
Mount once in beforeAll / unmount in afterAll instead of per-test
(4× → 1× mount cycles).
Raise the poll budget from 1.25s to 8s; still exits early if the
python process crashes.
afterAll runs even if beforeAll throws, so cleanup is guaranteed.
Applied the same change to run-file-on-fuse.test.ts since it has the
same latent issue.
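The effect of the larger poll budget, and of exiting early when the helper process dies, can be illustrated with a small sketch. Everything here is an assumption for illustration: `pollForMount`, `PollOptions`, and the injected `now`/`sleep` are hypothetical names, and the clock is faked so the example is deterministic (a real test would await a timer and check the actual mount point and python process).

```typescript
interface PollOptions {
  budgetMs: number; // total time allowed for the mount to appear
  intervalMs: number; // delay between checks
  now: () => number; // injected clock (hypothetical, for determinism)
  sleep: (ms: number) => void; // injected delay (hypothetical)
}

type PollResult = "ready" | "process-exited" | "timed-out";

function pollForMount(
  isMounted: () => boolean,
  isAlive: () => boolean,
  opts: PollOptions,
): PollResult {
  const deadline = opts.now() + opts.budgetMs;
  while (true) {
    if (isMounted()) return "ready";
    // Exit early instead of burning the whole budget if the helper crashed.
    if (!isAlive()) return "process-exited";
    if (opts.now() >= deadline) return "timed-out";
    opts.sleep(opts.intervalMs);
  }
}

// Fake clock: the simulated mount takes 3 seconds to appear (cold start).
let clock = 0;
const fake = {
  now: () => clock,
  sleep: (ms: number) => {
    clock += ms;
  },
};
const slowMount = () => clock >= 3000;
const alive = () => true;

clock = 0;
const oldBudget = pollForMount(slowMount, alive, { budgetMs: 1250, intervalMs: 5, ...fake });
clock = 0;
const newBudget = pollForMount(slowMount, alive, { budgetMs: 8000, intervalMs: 5, ...fake });
console.log(oldBudget, newBudget); // timed-out ready
```

Because the loop returns as soon as the mount is visible, the larger budget costs nothing on a warm machine; it only matters on a cold start.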
How did you verify your code works?
bun bd test test/cli/run/glob-on-fuse.test.ts test/cli/run/run-file-on-fuse.test.ts → 6 pass, 0 fail
20 consecutive runs of glob-on-fuse.test.ts and 10 of both files
together → all pass, no leaked mounts
Passes under simulated cold-cache + I/O contention locally
Verified afterAll runs when beforeAll throws in Bun's test runner
da74c5 resolver: add bun:main to HardcodedModule.Alias (fix flaky test) (#29719)
What
Add bun:main to HardcodedModule.Alias.bun_extra_alias_kvs so the
runtime transpiler stops rewriting import("bun:main") into import("main").
Why (the flake)
bun-main-entry-point.test.ts has been flaky since it landed in #29450.
The test was never exercising the code path it claimed to:
bun:main is in HardcodedModule.map but was missing from the Alias map
so RuntimeTranspilerStore.zig:534 stripped the bun: prefix,
leaving import("main") in the emitted JS
at runtime that fell through to resolveAndAutoInstall, which fetched the npm main package (main@1000.0.1) over the network
the test's typeof m !== "object" check passed against the npm
package, so it "passed"
when the registry fetch was slow or failed on CI, stdout was empty →
flake
Confirmed by observing require.cache gain ~/.bun/install/cache/main@1000.0.1@@@1/index.js after the import, and
by reproducing the failure deterministically with [install] auto = "disable".
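The failure mode above can be modelled with a toy rewrite function. This is only a sketch of the described behaviour, not Bun's transpiler code: `rewriteSpecifier`, the alias sets, and the `bun:example` entry are hypothetical.

```typescript
// Toy model: a specifier found in the alias set is kept verbatim;
// anything else with a "bun:" prefix has the prefix stripped.
const PREFIX = "bun:";

function rewriteSpecifier(specifier: string, aliases: ReadonlySet<string>): string {
  if (aliases.has(specifier)) return specifier;
  return specifier.startsWith(PREFIX) ? specifier.slice(PREFIX.length) : specifier;
}

// Before the fix: "bun:main" is missing from the (hypothetical) alias set.
const aliasesBefore = new Set<string>(["bun:example"]);
// After the fix: "bun:main" has been added.
const aliasesAfter = new Set<string>(["bun:example", "bun:main"]);

console.log(rewriteSpecifier("bun:main", aliasesBefore)); // main
console.log(rewriteSpecifier("bun:main", aliasesAfter)); // bun:main
```

In the "before" case the emitted specifier is the bare name `main`, which is what allowed resolution to fall through to a package lookup.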
Test changes
Add package.json: "{}" to the temp dirs so auto-install is off — any
future regression fails loudly with "Cannot find package 'main'" instead
of silently passing via npm
Assert Object.keys(m).length === 0 (the real wrapper exports
nothing; the npm package exports default,length,name,prototype)
Collapse assertions into one toEqual({stdout, stderr, exitCode, signalCode}) so a failure shows the child's stderr
Update the preload test's expected order — now that import("bun:main") actually evaluates the wrapper, entry.mjs runs
before the preload's await resumes (ENTRY_OK\nPRELOAD_OK\n)
How did you verify your code works?
bun bd test test/js/bun/resolve/bun-main-entry-point.test.ts — 3
pass, 30 consecutive runs on Windows
USE_SYSTEM_BUN=1 bun test ... — 2 fail (correctly catches the bug on
unfixed bun)
bun bd test test/js/bun/http/bun-serve-html-entry.test.ts -t "bun:main" — pass
bun bd test test/js/bun/resolve/import-meta*.test.* — 47 pass
bun bd test test/js/node/module/node-module-module.test.js — 28 pass
bun run zig:check-all — pass
73e888 Upgrade WebKit to 87fd0daba19a (module-loader rewrite) (#29393)
Upgrades WebKit to upstream aac4aed489d1 (2026-04-24) via oven-sh/WebKit#199.
Promise.prototype.finally was split fast/slow upstream; Bun's ALS
context wrapping moved to the fast path. Slow path may need coverage if
ALS-through-finally tests fail.
node:vm link() now passes nullptr for scriptFetcher (was a
JSValue). If node:vm tests rely on threading it, needs a ScriptFetcher
subclass.
migration
Upstream is deprecating the JSC-specific cast helpers in favor of WTF's
generic TypeCasts.h. Bun's C++ bindings use jsCast / jsDynamicCast
heavily — expect deprecation warnings or follow-up migration.
195397957f97 Make downcast/dynamicDowncast/uncheckedDowncast work
with JSCell subclasses
a6df2880b331 Drop jsDynamicCast<>() in favor of dynamicDowncast<>()
a20b2c96bcb4 Drop jsCast<T*>() in favor of uncheckedDowncast<T>()
178cea00b798 Replace jsSecureCast<>() with downcast<>()
Module loader
407d0feac1cd Remove JSScriptFetcher and JSScriptFetchParameters: Bun's loader bridge wraps these; the
rerere-resolved files for both headers signal the previous merge already
adapted, but verify ModuleLoader.cpp
e236b9dd9455 Fix null-env deref in CyclicModuleRecord::initializeEnvironment for Wasm modules
Header reorganization (drives ~60 of the 75 merge conflicts)
Aggressive include-minimization sweep across core JSC headers. Conflicts
are mechanical: upstream rewrote include blocks while Bun keeps
quote-style "Foo.h" instead of <JavaScriptCore/Foo.h>.
9f2eb90301dc Minimize includes in CodeBlock.h, JSCJSValue.h, JSCJSValueInlines.h, VM.h
744271668d05 Expensive header files slow full build
5ff0c08af8bf Reduce cost of StructureInlines.h
405a323e0a80 Use pre-compiled headers consistently for all ports
ed1c48 test(intl): assert date values, not format string, in regress-1451943 (#29708)
On macOS Bun links the system libicucore (via WebKit), so CLDR data
varies by OS release. The en-US date pattern for calendar: "iso8601"
is "1/1/1582" on macOS ≤15 and "1582-01-01" on macOS 26 (which
matches what upstream V8's test now asserts). The calendar behaviour the
test guards — that iso8601 doesn't apply a Julian offset to pre-Oct-1582
dates — is correct on both.
Compare year/month/day via formatToParts (numerically, so zero-padding
is irrelevant) instead of the rendered string.
Verified formatToParts output on both ICU variants:
macOS 26: [{year:"1582"},{month:"01"},{day:"01"}]
macOS 15: [{month:"1"},{day:"1"},{year:"1582"}]
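A minimal sketch of comparing the date fields numerically rather than as a rendered string. The `ymdParts` helper is a hypothetical name, the `timeZone: "UTC"` option is an assumption made to keep the result independent of the host zone, and the calendar is requested through the `-u-ca-iso8601` locale extension, which selects the same calendar as the `calendar: "iso8601"` option.

```typescript
interface Ymd {
  year: number;
  month: number;
  day: number;
}

// Extract year/month/day as numbers, ignoring part order, separators,
// and zero-padding, all of which vary with the ICU/CLDR data in use.
function ymdParts(date: Date): Ymd {
  const fmt = new Intl.DateTimeFormat("en-US-u-ca-iso8601", { timeZone: "UTC" });
  const out: Ymd = { year: NaN, month: NaN, day: NaN };
  for (const part of fmt.formatToParts(date)) {
    if (part.type === "year" || part.type === "month" || part.type === "day") {
      out[part.type] = Number(part.value);
    }
  }
  return out;
}

// The pre-October-1582 date the regression test is concerned with.
const historic = ymdParts(new Date("1582-01-01T00:00:00Z"));
console.log(historic);

// A modern date, where "03" and "3" both parse to the same number.
const modern = ymdParts(new Date(Date.UTC(2024, 2, 5)));
console.log(modern);
```

Comparing the resulting object against `{ year, month, day }` asserts the calendar behaviour without pinning the locale's display pattern.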
75b747 ci: don't tier-target darwin x64 (single entry, any Intel box) (#29705)
Tier split bottlenecked the smaller x64 pool. arm64 keeps
latest+previous; x64 goes back to a single entry routed to any of the 5
Intel boxes. x64 jobs/PR: 4 → 2, throughput ~3.0 → ~7.5 PR/hr.
64c585 scripts/agent.mjs: macOS launchd install + start support (#29672)
Summary
Ports macOS launchd support into scripts/agent.mjs so new macOS test
runners can be provisioned with one command. The current macOS CI fleet
runs a hand-patched fork of this script that never landed in the repo;
this brings the repo version up to parity.
macOS paths under ~/Library/{Services,Caches,Logs,Preferences}/buildkite-agent
install on macOS writes buildkite-agent.cfg (token + queue, no spawn= — the test suite assumes it owns the machine), the launchd
plist, the daily-reboot cleanup plist, and launchctl bootstraps both.
Re-running preserves an existing token.
start / exec passes --config <cfg> and emits the tags ci.mjs
targets (release=<macOS major>, posix/windows, ephemeral=false).
342928 ci: drop redundant darwin release=13 test platforms (#29703)
Summary
getTestAgent() for darwin targets jobs by queue + os + arch only,
not release. So the release: "13" entries weren't actually
testing on macOS 13; they just ran the suite a second time on whatever
agent picked them up. (There are no release=13 arm64 agents in the
fleet anyway.)
Dropping them halves per-PR darwin test jobs (8 → 4) with zero
coverage loss. The fixed-capacity macOS runner fleet has been the
throughput bottleneck (~1.8 PR/hr arm64); this doubles effective
throughput before any new hardware.
New macOS 26 runners pick up the remaining jobs without needing a
separate release: "26" entry, since release isn't part of the agent
query rules.
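Why the release: "13" entries were redundant can be shown with a small model. This is a sketch under assumptions: `Platform`, `darwinAgentQuery`, the queue name, and the release values other than "13" are hypothetical and do not reproduce the real getTestAgent() in .buildkite/ci.mjs.

```typescript
interface Platform {
  os: string;
  arch: string;
  release?: string;
}

// Hypothetical query builder mirroring the rule in the summary:
// only queue, os, and arch are used; release is ignored.
function darwinAgentQuery(platform: Platform, queue: string): string {
  return `queue=${queue} os=${platform.os} arch=${platform.arch}`;
}

// Illustrative platform list: two releases per architecture.
const platforms: Platform[] = [
  { os: "darwin", arch: "aarch64", release: "14" },
  { os: "darwin", arch: "aarch64", release: "13" },
  { os: "darwin", arch: "x64", release: "14" },
  { os: "darwin", arch: "x64", release: "13" },
];

const distinctQueries = new Set(platforms.map(p => darwinAgentQuery(p, "test-darwin")));
console.log(`${platforms.length} entries, ${distinctQueries.size} distinct agent queries`);
```

Entries that differ only in release collapse to the same query, so they land on the same pool of agents and run the suite twice without adding coverage.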
Test plan
node --check .buildkite/ci.mjs
Verify a PR build generates 4 darwin test jobs instead of 8
Updated Packages