450072 install: disable isolated global virtual store by default until no longer experimental (#30473)
03ebdf chore: prevent auto-update actions from running on forks (#30464)
What does this PR do?
Makes these actions only run on upstream Bun. I don't need PRs like
these to be
opened since I only keep my repo up for sending PRs back here, and Bun
will merge those updates itself.
How did you verify your code works?
fe735f http: arm idle timer on open so a stalled TLS handshake times out (#30376)
Fixes the bun install hang reported in #30325 (latest comment — still
reproducing on 1.3.13).
Repro
Point bun install at an HTTPS registry that accepts TCP but never
answers the TLS ClientHello:
// raw TCP server that swallows the ClientHello and never replies
net.createServer(s => s.on("data", () => {})).listen(0);
[install]
registry = "https://127.0.0.1:<port>/"
bun install connects, the socket goes ESTABLISHED, and the process
blocks in epoll_wait forever with no timer armed. This is the state
the reporter captured in their Gitea/Kubernetes CI: three ESTABLISHED
sockets to the npm CDN, zero rx/tx, 14+ minutes and counting.
Root cause
HTTPClient.onOpen() starts the TLS handshake but does not arm the
socket's idle timer — the first setTimeout(socket, 5) call is in onWritable(), which only runs after the handshake completes.
Freshly-connected sockets inherit long_timeout = 255 (disabled) from
the connecting socket, so a stall anywhere between TCP-connect and
handshake-done has no timer at all. The bun install main loop then
waits forever on pendingTaskCount() == 0 because the NetworkTask
callback never fires.
The earlier fixes in #29611 / #29649 covered a different hang (4xx/5xx
tarball responses not releasing the task slot); they didn't touch this
path.
Fix
- Arm the idle timer in onOpen() so it covers the TLS handshake.
- Wire the short-tick onTimeout handler in HTTPContext.Handler alongside the existing onLongTimeout; socket.setTimeout(seconds) picks whichever timer fits the duration, so both must dispatch.
- Read the idle-timeout duration from a new BUN_CONFIG_HTTP_IDLE_TIMEOUT env var (seconds). Default is 300 (the previous hard-coded 5 minutes), so nothing changes for unconfigured environments except that the handshake phase is now covered. 0 disables the timer (same as disable_timeout = true).
- Route the experimental h2 client session's rearmTimeout through the same value for consistency.
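The timeout semantics described above can be sketched as a small resolver. This is an illustration only: the function name `resolveIdleTimeoutSeconds`, the result shape, and the handling of malformed values (falling back to the default) are assumptions, not Bun's actual implementation.

```ts
// Hypothetical helper; models only the semantics stated in the PR description.
const DEFAULT_IDLE_TIMEOUT_SECONDS = 300; // the previous hard-coded 5 minutes

type IdleTimeout =
  | { enabled: false }
  | { enabled: true; seconds: number };

function resolveIdleTimeoutSeconds(
  env: { [key: string]: string | undefined },
): IdleTimeout {
  const raw = env["BUN_CONFIG_HTTP_IDLE_TIMEOUT"];
  if (raw === undefined || raw.trim() === "") {
    return { enabled: true, seconds: DEFAULT_IDLE_TIMEOUT_SECONDS };
  }
  const parsed = Number(raw);
  // Assumption: malformed, fractional, or negative values fall back to the default.
  if (!isFinite(parsed) || Math.floor(parsed) !== parsed || parsed < 0) {
    return { enabled: true, seconds: DEFAULT_IDLE_TIMEOUT_SECONDS };
  }
  // 0 disables the timer, same as disable_timeout = true.
  if (parsed === 0) {
    return { enabled: false };
  }
  return { enabled: true, seconds: parsed };
}
```

With `BUN_CONFIG_HTTP_IDLE_TIMEOUT=3`, as in the new test, this model yields a 3-second idle timer; with the variable unset it yields 300 seconds.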
Verification
New test test/cli/install/bun-install-stalled-tls.test.ts starts a raw
TCP server that accepts connections and never replies, points bun install at it over https://, sets BUN_CONFIG_HTTP_IDLE_TIMEOUT=3 / BUN_CONFIG_HTTP_RETRY_COUNT=0, and asserts the install fails with a
timeout error.
# without this change
(fail) bun install times out when the registry accepts TCP but never completes the TLS handshake [60004.48ms]
^ this test timed out after 60000ms.
# with this change
(pass) bun install times out when the registry accepts TCP but never completes the TLS handshake [4483.87ms]
fetch-http2-client.test.ts (58 tests) and bun-install-retry.test.ts
still pass.
6d0d86 Bun.serve: validate websocket.perMessageDeflate is a boolean or object (#30398)
What does this PR do?
Fixes a debug assertion crash when websocket.perMessageDeflate is set
to a primitive value that isn't a boolean (e.g. a number, string,
bigint, or symbol).
Previously this fell through the undefined / boolean / null checks
in WebSocketServerContext.onCreate and called JSValue.getTruthy("compress") on the primitive, hitting bun.debugAssert(target.isObject()) in JSValue.get.
Now it throws TypeError: websocket expects perMessageDeflate to be a boolean or an object.
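A minimal sketch of the validation order described above, written in TypeScript rather than the Zig of WebSocketServerContext.onCreate. The function name `validatePerMessageDeflate` is hypothetical, and treating functions as objects is an assumption; only the error message is taken from the description.

```ts
// Hypothetical stand-in for the check; not Bun's source.
type PerMessageDeflate = boolean | object | null | undefined;

function validatePerMessageDeflate(value: unknown): PerMessageDeflate {
  // undefined / null / boolean were already handled before this change.
  if (value === undefined || value === null || typeof value === "boolean") {
    return value;
  }
  // Assumption: callable values count as objects here.
  if (typeof value === "object" || typeof value === "function") {
    return value;
  }
  // Numbers, strings, bigints, and symbols end up here instead of
  // reaching a property lookup on a primitive.
  throw new TypeError(
    "websocket expects perMessageDeflate to be a boolean or an object",
  );
}
```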
3457c1 Bun.mmap: throw on non-object options instead of asserting (#30404)
What does this PR do?
Bun.mmap(path, 256) (or any non-object second argument) hit a debug
assertion in JSValue.get() because the options value was passed
straight to getBooleanLoose without first checking that it is an
object.
Now non-object, non-nullish values throw ERR_INVALID_ARG_TYPE, and undefined/null are treated the same as omitting the options
argument.
Added a test to test/js/bun/util/mmap.test.js covering
number/string/boolean (throw) and undefined/null (no throw).
Found by Fuzzilli (fingerprint b1832bde6df73226).
Co-authored-by: Alistair Smith <hi@alistair.sh>
1600ac tls: load intermediates and trusted-people from Windows system stores (#30408)
What does this PR do?
Brings --use-system-ca / NODE_USE_SYSTEM_CA on Windows to parity
with Node.js's ReadWindowsCertificates
(src/crypto/crypto_context.cc).
Before this change, root_certs_windows.cpp only enumerated the ROOT
store for CURRENT_USER and LOCAL_MACHINE (2 CertOpenStore calls).
Node opens 13: ROOT, CA (intermediates), and TrustedPeople across LOCAL_MACHINE, CURRENT_USER, and the group-policy / enterprise
variants — and filters by EKU.
The most user-visible consequence of the old behavior: when a server is
misconfigured and sends only the leaf cert without its intermediates
(very common on intranets, the primary use case for --use-system-ca),
Node can still build the chain from the intermediates Windows keeps in
the CA store; Bun would fail with unable to get local issuer certificate.
Changes, all mirroring Node:
| | before | after |
| --- | --- | --- |
| Store names | ROOT | ROOT, CA, TrustedPeople |
| Locations | LOCAL_MACHINE, CURRENT_USER | + GROUP_POLICY, ENTERPRISE variants |
| CERT_STORE_OPEN_EXISTING_FLAG | no | yes (don't create a missing store) |
| EKU server-auth filter (CertGetEnhancedKeyUsage) | no | yes (skip certs restricted to e.g. code-signing only) |
IsCertTrustedForServerAuth and GatherCertsForLocation are direct
ports of the equivalents in Node's crypto_context.cc, adapted to Bun's
raw-DER-blob layering (this TU is kept OpenSSL-free to avoid wincrypt.h / BoringSSL macro collisions; root_certs.cpp does the d2i_X509 conversion).
Related issues (context, not fixes)
The issue-finder bot flagged #17108, #28612, and #9365. None of them are
closed by this PR because it only changes behavior when --use-system-ca / NODE_USE_SYSTEM_CA=1 is set:
#17108 asked for the base feature, which #22441 already shipped — this
PR refines which Windows stores it reads.
#28612 reports unable to get local issuer certificate with no --use-system-ca set (default bundled store, public CAs, TTY-startup
race) — different layer.
#9365 reproduces on WSL/Linux too and predates --use-system-ca —
likely a server omitting intermediates with no system-store fallback at
all.
d5945c Add bun.sys.sigaction with bionic-correct layout for Android (#30389)
Problem
std.c.Sigaction / std.c.sigset_t for .linux assume the glibc/musl
layout: { handler, mask[128B], flags, restorer }. bionic LP64 is { int sa_flags; sa_handler; sigset_t sa_mask; sa_restorer } where sigset_t is a single unsigned long.
When Bun calls std.posix.sigaction() on Android, bionic reads sa_handler from offset 8 — which is Zig's mask[0]:
Every handler installed with mask = sigemptyset() (SIGPIPE/SIGXFSZ
in main, the crash handler, SIGINT in
repl/filter_run/multi_run/Coordinator) silently becomes SIG_DFL.
Broken-pipe kills the process, no crash trace on SEGV, Ctrl+C isn't
caught.
WaiterThread.reloadHandlers() sets mask[0] = 1<<16 (SIGCHLD), so
bionic installs the handler 0x10000. When SIGCHLD fires, the kernel
jumps there → SEGV_MAPERR, rip=0x10000, rdi=17. Reproduces 100% on
a full Android emulator (not under qemu-user).
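The two symptoms above follow directly from the offset mismatch, which can be reproduced with a byte buffer. The sketch below is a toy model: the helper names are made up, only the low 32 bits of each field are written for brevity, and the handler address `0x401000` is an arbitrary placeholder.

```ts
// Offsets taken from the layouts quoted above (LP64, little-endian).
const GLIBC_HANDLER_OFFSET = 0; // { handler, mask[128B], flags, restorer }
const GLIBC_MASK_OFFSET = 8;
const BIONIC_HANDLER_OFFSET = 8; // { int sa_flags; sa_handler; ... }

// Fill a buffer the way the glibc/musl layout expects it.
function writeGlibcLayout(handler: number, maskWord0: number): DataView {
  // handler + 128-byte mask + room for flags/restorer
  const view = new DataView(new ArrayBuffer(8 + 128 + 16));
  view.setUint32(GLIBC_HANDLER_OFFSET, handler, true);
  view.setUint32(GLIBC_MASK_OFFSET, maskWord0, true);
  return view;
}

// Read the handler the way bionic would: from offset 8.
function readHandlerAsBionic(view: DataView): number {
  return view.getUint32(BIONIC_HANDLER_OFFSET, true);
}

const SIGCHLD = 17;
const realHandler = 0x401000; // placeholder address for the demo

// Empty mask: bionic sees handler 0, i.e. SIG_DFL.
const seenWithEmptyMask = readHandlerAsBionic(writeGlibcLayout(realHandler, 0));

// Mask containing SIGCHLD (bit 16): bionic sees handler 0x10000.
const seenWithSigchldMask = readHandlerAsBionic(
  writeGlibcLayout(realHandler, 1 << (SIGCHLD - 1)),
);
```

In both cases the real handler address at offset 0 is never consulted as a handler, which matches the two failure modes listed above.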
Fix
Add bun.sys.{Sigaction, sigset_t, sigemptyset, sigaddset, sigaction}
in src/sys/sys.zig:
Android: an extern struct matching bionic bits/signal_types.h
(__LP64__) and an @extern to libc sigaction that takes it. sigset_t = c_ulong.
Everywhere else: transparent aliases of std.posix.*, so nothing
changes.
Replace all nine callsites (main.zig, crash_handler.zig, process.zig, repl.zig, filter_run.zig, multi_run.zig, Coordinator.zig, Global.zig).
A comptime tripwire fires if @sizeOf(std.c.Sigaction) ever shrinks to
the bionic 32 bytes, so the workaround can be dropped once the Zig
stdlib bug is fixed upstream.
Also adds zig build check-android[-debug], gated on -Dandroid_ndk_sysroot like check-freebsd, so the Android-only struct
gets compile-checked.
bun run zig:check-all still passes all 16 targets. Host smoke tests
(spawn/exit-code, spawn/spawn-signal, spawn/spawn-kill-signal,
plus BUN_FEATURE_FLAG_FORCE_WAITER_THREAD=1 to hit the modified
SIGCHLD path) pass.
No runtime test is included because the misbehaviour only surfaces on a
real Android kernel, which CI does not run.
9ed6e8 publish: send readme and readmeFilename to the registry (#30257)
What
bun publish packed README.md into the published tarball but never
populated the version-level readme / readmeFilename fields in the
JSON body sent to the npm registry. Packages published with Bun
therefore show an empty readme via the registry API (e.g. npm view <pkg> readme readmeFilename), even though the tarball contains one —
exactly what was reported in #30255.
How
Find the first file whose name matches README or README.*
(case-insensitive); if found, read it and include readme (contents) + readmeFilename in the version metadata. A non-empty readme already
present in package.json wins.
Two publish flows:
Workspace publish (bun publish): findWorkspaceReadme scans the
workspace directory directly.
Tarball publish (bun publish path.tgz): track the first
README-matching entry during the existing libarchive iteration
(one-shot, so bytes get read as encountered).
First match wins — no markdown-preference / tie-breaking / empty-string
sentinel. The reporter wanted npm parity; they pushed back on
competing-readme edge cases, not on scope.
No change to what ends up in the tarball — the README was already being
packed by isUnconditionallyIncludedFile.
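The matching and precedence rules above can be sketched as follows. The helper names are hypothetical, the inputs are assumed to be bare file names rather than paths, and the sketch deliberately says nothing about readmeFilename when the package.json readme wins, since the description does not cover that case.

```ts
// Hypothetical helpers illustrating the rules; not Bun's source.

// Matches README or README.* (case-insensitive).
function isReadmeName(fileName: string): boolean {
  return /^readme(\..*)?$/i.test(fileName);
}

// First match wins: no markdown preference, no tie-breaking.
function findFirstReadme(fileNames: string[]): string | undefined {
  for (let i = 0; i < fileNames.length; i++) {
    if (isReadmeName(fileNames[i])) {
      return fileNames[i];
    }
  }
  return undefined;
}

// A non-empty readme already present in package.json wins over the file.
function chooseReadmeContents(
  packageJsonReadme: string | undefined,
  fileContents: string | undefined,
): string | undefined {
  if (packageJsonReadme !== undefined && packageJsonReadme !== "") {
    return packageJsonReadme;
  }
  return fileContents;
}
```

For example, given `["index.js", "ReadMe.MD", "README.txt"]`, the first-match rule selects `ReadMe.MD` even though a later entry also matches.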
Verification
Two tests in test/cli/install/bun-publish.test.ts under describe("readme") — workspace publish, and tarball publish
(pre-packed .tgz fed to bun publish path.tgz). Both run against a
mock registry, capture the PUT body, and assert body.versions[version].readme / readmeFilename. Fail on main, pass
with the fix.
d352df usockets: don't defer TLS close when close_notify flush fails (#30368)
What
ssl_handle_shutdown in packages/bun-usockets/src/crypto/openssl.c
returned 0 ("wait for the peer") when SSL_shutdown failed with SSL_ERROR_WANT_WRITE / WANT_READ during a graceful close
(force_fast_shutdown == 0). Change it to return 1 so the raw socket
closes (and SSL_free runs) immediately.
Why
SSL_shutdown allocates BoringSSL's ssl->s3->write_buffer to hold the
encoded close_notify alert. If the BIO write fails (kernel buffer
full, peer already gone), SSL_shutdown returns -1 with WANT_WRITE.
The old code returned 0 from ssl_handle_shutdown, which told us_internal_ssl_close to leave the fd open and wait for the peer.
That deferral is correct for the SSL_shutdown() == 0 case (alert
flushed, waiting for the peer's reply — see the comment in us_internal_ssl_close). It's wrong here: the alert never went out, SSL_SENT_SHUTDOWN is already set on the first call, and once us_internal_ssl_is_shut_down is true on_writable/on_data
short-circuit without ever re-dispatching the queued alert. There is no
retry path, so the socket stays in limbo holding s->ssl and the write
buffer until some other event arrives — which may never happen.
This shows up as an intermittent LSan failure on the Debian x64-asan
lane in test/js/node/http/node-https-checkServerIdentity.test.ts: the
spawned child calls server.close() and exits right after the request
emits error, so the lingering accepted socket never gets another event
and SSL_free never runs.
Direct leak of 417 byte(s) in 1 object(s) allocated from:
bssl::SSLBuffer::EnsureCap vendor/boringssl/ssl/ssl_buffer.cc:72
bssl::do_tls_write vendor/boringssl/ssl/s3_pkt.cc:194
bssl::tls_dispatch_alert vendor/boringssl/ssl/s3_pkt.cc:373
SSL_shutdown vendor/boringssl/ssl/ssl_lib.cc:1039
ssl_handle_shutdown packages/bun-usockets/src/crypto/openssl.c:821
us_internal_ssl_close packages/bun-usockets/src/crypto/openssl.c:871
us_internal_ssl_on_data
This is a recurring pre-existing flake on main — see retrigger commits 8722a10109, 13a267eabb, ded11f3fb7.
Behavior
The unsent close_notify is best-effort. Closing without it produces
an abrupt FIN, which is indistinguishable from a half-close to the peer
(TLS 1.3 doesn't require it; TLS 1.2 implementations tolerate it).
The deferred-graceful-close path (SSL_shutdown() == 0 → wait for the
peer's close_notify) is unchanged.
force_fast_shutdown == 1 (forceful close from _destroy() / abort)
already returned 1 here, so no change.
Testing
bun bd test test/js/node/http/node-https-checkServerIdentity.test.ts
— 4 pass, 0 fail.
bun bd test test/js/node/tls/node-tls-connect.test.ts test/js/bun/net/socket.test.ts — 57 pass, 3 skip, 1 fail. The single
failure (setSession() should not leak the SSL_SESSION returned by d2i_SSL_SESSION, RSS-growth threshold 40 MiB exceeded at ~293 MiB)
reproduces identically on unmodified main in a macOS debug build —
pre-existing, unrelated to this change.
The LSan leak itself is Linux-ASAN-only and timing-dependent; not
reproducible on macOS, so there is no deterministic regression test.
tls.getCACertificates('system') on macOS evaluates every keychain
certificate with SecTrustEvaluateWithError under a SecPolicyCreateSSL policy, which makes trustd attempt OCSP/CRL/AIA
fetches per cert. On managed Macs running a NetworkExtension content
filter, every denied flow still pays the filter's per-flow
cryptographic-signing overhead — turning what should be a local lookup
into ~10 s of wall-clock time. The same code path also passes kSecMatchTrustedOnly to SecItemCopyMatching, which forces trustd to
evaluate every keychain cert with the default
(network-revocation-enabled) policy before the per-cert loop even
starts.
A sample of the stall puts 100 % of thread time inside SecTrustEvaluateWithError → securityd_send_sync_and_do, and log stream shows ~650 outbound flows to AIA/cross-cert URLs each being
signed and policy-denied by the NE filter. The Security framework itself
is not the bottleneck — the same SecTrustEvaluateWithError calls under SecPolicyCreateBasicX509 complete in ~50 ms on the same machine.
What this changes
Bring the macOS loader in line with how Node.js
(ReadMacOSKeychainCertificates / IsCertificateTrustedForPolicy in src/crypto/crypto_context.cc) and Chromium
(net/cert/internal/trust_store_mac.cc) enumerate the keychain trust
store, plus skip the network entirely:
Drop kSecMatchTrustedOnly from SecItemCopyMatching. Per-cert
filtering already happens below; the query-level flag only adds
redundant network-revocation-enabled evaluations. Node does not pass it.
Replace the trust-settings stub with a full parser. The previous is_trust_settings_trusted_for_policy was a placeholder that always
returned UNSPECIFIED, so every cert fell through to the slow trust
evaluation. The new parser is a port of Node's IsTrustDictionaryTrustedForPolicy / IsTrustSettingsTrustedForPolicy
and resolves certs with explicit user/admin/system trust settings via
cheap local XPC lookups.
Defer the SecTrustEvaluateWithError fallback until after every
domain is consulted. Only certs with no decisive settings in any
domain reach it. This both avoids redundant XPC round-trips and respects
an explicit Deny in a later domain (the previous structure could call
the fallback during the user-domain iteration and short-circuit before
seeing an admin-domain Deny).
Use SecPolicyCreateBasicX509 + SecTrustSetNetworkFetchAllowed(false) for that fallback so it builds
the chain without touching the network. This is enumeration, not
connection-time validation; EKU/server-auth is enforced by OpenSSL at
handshake time against the resulting trust store. Trust-anchor
enumeration only needs to know whether the cert is a usable anchor.
The kSecTrustSettings{Application,Policy,PolicyString,Result}
dictionary keys are CFSTR() macros in SecTrustSettings.h, not
exported data symbols, so they are constructed at runtime with CFStringCreateWithCString rather than dlsym'd like the rest of the
loader. kSecPolicyOid and kSecPolicyAppleSSL are exported and
continue to be dlsym'd.
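The deferred-fallback ordering can be illustrated with a small model. Everything here is a sketch: the type and function names are made up, and the rule that the first decisive domain wins is an assumption consistent with the description above rather than something it states outright.

```ts
// Illustrative model; result names are hypothetical labels.
type TrustResult = "TRUSTED" | "DISTRUSTED" | "UNSPECIFIED";

function resolveAnchorTrust(
  perDomain: TrustResult[], // trust-settings result per domain, in lookup order
  offlineFallback: () => boolean, // stand-in for the network-free trust evaluation
): boolean {
  // Consult every domain first; any decisive answer ends the search.
  for (let i = 0; i < perDomain.length; i++) {
    if (perDomain[i] === "TRUSTED") return true;
    if (perDomain[i] === "DISTRUSTED") return false;
  }
  // Only certs with no decisive settings in any domain reach the fallback.
  return offlineFallback();
}

// An explicit Deny in a later domain is seen before the fallback can run.
let fallbackCalls = 0;
const deniedLater = resolveAnchorTrust(["UNSPECIFIED", "DISTRUSTED"], () => {
  fallbackCalls++;
  return true;
});
```

Here `deniedLater` is `false` and the fallback is never invoked, which is the case the previous structure could get wrong.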
Behavior
Intended to be a perf-only change. On a clean Mac the result is
unchanged (verified: 0 certs both before and after — the default
keychain search list does not include SystemRootCertificates.keychain). On a managed Mac, certs that
previously fell through to SecTrustEvaluateWithError because the stub
never resolved them now resolve via the trust-settings parser, which is
the more authoritative answer.
Risk / non-goals
SecPolicyCreateBasicX509 is less strict than SecPolicyCreateSSL — it
does not check for server-auth EKU on the chain. That check is
appropriate for connection-time validation of a server's leaf cert, not
for deciding whether a CA cert in the local keychain should be a trust
anchor; OpenSSL applies its own EKU/basic-constraint checks at handshake
time. The change can therefore include a CA cert that the SSL policy
would have rejected, but those certs are already present and trusted in
the user's keychain.
SecTrustSetNetworkFetchAllowed is treated as optional (not in the
required-symbols check) since it's only a perf hint.
Testing
clang++ -fsyntax-only -Wall -Wextra against the same flags as the
build: clean.
Side-by-side test harness linking the old and new translation units
against BoringSSL: us_load_system_certificates_macos returns identical
results (0 certs, ~10 ms) on a clean dev Mac.
The trust-settings parser was exercised against the real System-domain
keychain dictionaries
(SecTrustSettingsCopyCertificates(kSecTrustSettingsDomainSystem), 159
certs) to confirm runtime-built CFString keys match CFSTR()-built
dictionary keys via content equality.
All dlsym symbol names verified to exist in Security.framework
(and the four that don't — the kSecTrustSettings* macros — are
constructed with CFStringCreateWithCString instead).
Existing test/regression/issue/24339.test.ts already skips on macOS
because the result is environment-dependent; no new test is added for
the same reason. A deterministic test would require installing a
temporary trust setting into a keychain, which needs elevated privileges
and is unsuitable for CI.
a17a13 process: surface CTRL_CLOSE_EVENT as SIGHUP / CTRL_BREAK_EVENT as SIGBREAK on Windows (#30321)
Fixes #16899
What
process.on('SIGHUP', …) and process.on('SIGBREAK', …) now receive
Windows console-control events, matching Node.js:
| Console event | Signal |
| --- | --- |
| CTRL_CLOSE_EVENT (closing the console window) | SIGHUP |
| CTRL_BREAK_EVENT (Ctrl+Break) | SIGBREAK |
Why
libuv's uv__signal_control_handler already maps these events to SIGHUP/SIGBREAK, but Bun's Windows signalNameToNumberMap was
missing both entries. So process.on('SIGHUP', …) was treated as a
plain EventEmitter event, no uv_signal_t was created, and libuv's
dispatch had nowhere to deliver the signal — the default handler
terminated the process.
How
Add SIGHUP and SIGBREAK to the Windows branch of signalNameToNumberMap
Add SIGBREAK to signalNames[] and signalNumberToNameMap (#ifdef SIGBREAK, so POSIX builds are unaffected)
Notes on #16899
The repro uses process.on('exit', …) only. That alone does not
fire on console-window close in Node.js either — Windows hard-terminates
the process after the control handler returns. The portable pattern is:
process.on('SIGHUP', () => process.exit()); // 'exit' listeners now run on console close
After this PR, that works in Bun.
- `CTRL_LOGOFF_EVENT` / `CTRL_SHUTDOWN_EVENT` are [deliberately not
handled by
libuv](https://github.com/libuv/libuv/blob/v1.x/src/win/signal.c) (and
therefore Node.js): they are only sent to services, which have their own
SCM notification path. Interactive console processes don't receive them,
so there is nothing to wire up.
## Testing
`test/js/node/process/process-signal-windows.test.ts`:
- **SIGHUP**: child allocs its own console
(`FreeConsole`+`AllocConsole`), posts `WM_CLOSE` to its own
`GetConsoleWindow()` → conhost sends `CTRL_CLOSE_EVENT` → libuv
dispatches `SIGHUP`
- **SIGBREAK**: child in its own process group, parent calls
`GenerateConsoleCtrlEvent(CTRL_BREAK_EVENT, pid)`
The first commit on this branch contains only the tests (expected to
fail on Windows CI = red); the second commit adds the fix (= green).
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
79c3d4 Fix TLA sibling import running before shared dep settles (#30259) (#30262)
Fixes #30259. Bumps WebKit to `88b2f7a2159c913f7dd0d73c0e88d66138cd67ba`
(oven-sh/WebKit#215, merged).
## What
A static import that re-imports a TLA dep already visited by a sibling
earlier in the **same** `Evaluate()` pass was skipping the spec-mandated
wait and running with the dep's post-`await` bindings still in TDZ:
```ts
// root.ts
import { foo } from "./await.ts"; // await.ts → EvaluatingAsync, suspended at `await 0`
import "./child.ts"; // child.ts visits await.ts, skips wait, runs too early

// await.ts
await 0;
export const foo = 123;

// child.ts
import { foo } from "./await.ts";
console.log(foo); // ReferenceError: Cannot access 'foo' before initialization
```
Why
The Bun-specific skip at innerModuleEvaluation 11.c.v (which avoids
self-deadlock when require(esm) / dynamic import re-enters a TLA
module from inside its own suspension) used pendingAsyncDependencies == 0 as the discriminator for "body already entered". A TLA dep with no
async deps of its own also has count 0 after suspending at its first await within the same DFS, so the discriminator matched the sibling
case.
Snapshot vm.moduleAsyncEvaluationCount() at the start of each Evaluate() and thread it through innerModuleEvaluation as asyncOrderWatermark. Only skip the wait when the cycle root's asyncEvaluationOrder() predates the watermark (i.e. it became EvaluatingAsync in a prior Evaluate(), the re-entrant deadlock
case). Siblings that transition during the current DFS get order >= watermark and keep the spec wait.
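The watermark comparison can be modelled in a few lines. This is a toy model, not JavaScriptCore's code: the class and function names are illustrative, and the counter starting at 0 is an assumption.

```ts
// Illustrative model only; names mirror the description, not the engine source.
class ModuleLoaderModel {
  private asyncEvaluationCount = 0;

  // Snapshot taken at the start of each Evaluate().
  beginEvaluate(): number {
    return this.asyncEvaluationCount;
  }

  // A module entering EvaluatingAsync receives the next order number.
  markEvaluatingAsync(): number {
    return this.asyncEvaluationCount++;
  }
}

// Skip the wait only when the cycle root became EvaluatingAsync
// before the current Evaluate() began.
function shouldSkipWait(rootOrder: number, asyncOrderWatermark: number): boolean {
  return rootOrder < asyncOrderWatermark;
}

const loader = new ModuleLoaderModel();

// Evaluate() #1: root.ts imports await.ts, then child.ts re-imports it.
const watermark1 = loader.beginEvaluate();
const awaitOrder = loader.markEvaluatingAsync();
const siblingSkips = shouldSkipWait(awaitOrder, watermark1);

// Evaluate() #2 re-enters while await.ts is still suspended.
const watermark2 = loader.beginEvaluate();
const reentrantSkips = shouldSkipWait(awaitOrder, watermark2);
```

In this model `siblingSkips` is `false` (the sibling keeps the spec wait) and `reentrantSkips` is `true` (the re-entrant case still avoids the self-deadlock).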
Tests
Two regression tests in test/js/bun/resolve/dynamic-import-tla-cycle.test.ts:
direct sibling re-import (the issue repro)
indirectly-shared TLA dep through different parents (guards against
discriminating by "asyncParentModule on stack")
Existing re-entrancy / deadlock-avoidance tests in the same file
continue to pass.
c5a2f8 MessageEvent: name Locker so m_data lock is actually held (#30290)
Found by Fuzzilli (fingerprint 8ffe5d0751a9d31e).
What
MessageEvent::initMessageEvent() and MessageEvent::memoryCost() both
intend to hold m_concurrentDataAccessLock while accessing the m_data
variant, but were written as:
Locker { m_concurrentDataAccessLock };
This constructs an unnamed temporary that destructs (and unlocks) at the
end of the full-expression — the lock is never held over the critical
section. Upstream WebKit names the locker (Locker locker { … }); the Locker<Lock> specialization is not [[nodiscard]], so there was no
compiler diagnostic.
Why it crashes
memoryCost() is called from JSMessageEvent::visitChildrenImpl on the GC marker thread and does std::visit(..., m_data). initMessageEvent() on the mutator thread reassigns m_data = JSValueTag {}. When m_data previously held a Ref<SerializedScriptValue> (MessageEvents delivered through BroadcastChannel / MessagePort), the visitor can observe a torn
variant and call data->memoryCost() on a moved-from Ref:
ASSERTION FAILED: m_ptr
wtf/Ref.h(165) : T *WTF::Ref<WebCore::SerializedScriptValue>::operator->() const
In release builds this is a SIGSEGV with no ASAN report (not a heap UAF
— a torn inline variant read).
Repro
With BUN_JSC_collectContinuously=1: receive ~100 MessageEvents via
BroadcastChannel, then call initMessageEvent() on each in a tight
loop. Before this change it crashed ~20% of the time; after, 0/20 runs
crash. Added as test/js/web/broadcastchannel/message-event-init-gc.test.ts.
The constructor checked for undefined/null but passed any other
value (numbers, strings, booleans, bigints, symbols) straight to Options.parseFromJS, which calls JSValue.get and asserts the target
is an object.
Now the constructor rejects any non-object argument with the existing
"Terminal constructor requires an options object" error.
Fingerprint: 2bd73cb78c0aa42e
611740 tty: update process.stdin.isRaw after setRawMode succeeds on Windows (#30288)
What
process.stdin.isRaw now reflects the current raw mode on Windows after process.stdin.setRawMode(flag) succeeds, matching Node.js and Bun on
POSIX.
In src/js/node/tty.ts, the Windows fd === 0 branch of ReadStream.prototype.setRawMode did:
if (this.fd === 0) {
  const err = ttySetMode(flag);
  if (err) {
    this.emit("error", new Error("setRawMode failed with errno: " + err));
  }
  return this; // ← returns on BOTH error and success
}
// ...
this.isRaw = flag; // ← never reached for Windows stdin
so this.isRaw = flag at the end of the method was unreachable for process.stdin on Windows. The other two paths (Windows fd !== 0 and
POSIX) only return early on error and fall through to the assignment.
Any code that reads isRaw to decide whether to restore cooked mode on
teardown (e.g. readline's _setRawMode, which consults input.isRaw)
would always restore cooked mode on Windows, which can leave a pending
cooked ReadConsole blocking the next raw-mode consumer until the user
presses Enter.
Fix
Restructure so the fd === 0 case only returns early on error and
otherwise falls through to the shared this.isRaw = flag assignment —
same shape as the other branches.
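A simplified model of the restructured control flow, covering only the fd === 0 case. The class `RawModeModel` is hypothetical, `ttySetMode` is injected as a stub, and errors are collected in an array instead of being emitted; the real code lives in src/js/node/tty.ts and may differ in detail.

```ts
// Hypothetical model of the fixed branch; not the code in src/js/node/tty.ts.
class RawModeModel {
  isRaw = false;
  errors: Error[] = [];
  private ttySetMode: (flag: boolean) => number;

  constructor(ttySetMode: (flag: boolean) => number) {
    this.ttySetMode = ttySetMode;
  }

  setRawMode(flag: boolean): this {
    const err = this.ttySetMode(flag);
    if (err) {
      // Return early only on error; isRaw keeps its previous value.
      this.errors.push(new Error("setRawMode failed with errno: " + err));
      return this;
    }
    // Success falls through to the shared assignment.
    this.isRaw = flag;
    return this;
  }
}

const stdinLike = new RawModeModel(() => 0); // stub that always succeeds
stdinLike.setRawMode(true);
const rawAfterEnable = stdinLike.isRaw;
stdinLike.setRawMode(false);
const rawAfterDisable = stdinLike.isRaw;
```

After the two calls, `rawAfterEnable` is `true` and `rawAfterDisable` is `false`, so teardown code that consults isRaw sees the mode that is actually set.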
Verification
Added a test in test/js/node/tty.test.ts that spawns a child in a Bun.Terminal (so stdin is a real TTY on both POSIX and
Windows/ConPTY), calls setRawMode(true) / setRawMode(false), and
asserts isRaw tracks the mode and the method returns this.
bun bd test test/js/node/tty.test.ts
4 pass
0 fail
The affected code path is guarded by process.platform === "win32"
(inlined at bundle time), so the fail-before/pass-after delta is only
observable on Windows CI. On POSIX the test locks in the existing
(already correct) behaviour.
The constant-folded test expression and the pruned dead branch are
correct, but the empty else {} remnant is left behind. The --define
docs guide also
promised full collapse to just console.log("success!"); — which only
ever happened with --minify-syntax.
Cause
visitStmt's s_if handler had an "else trim" gated on minify_syntax:
// Trim unnecessary "else" clauses
if (p.options.features.minify_syntax) {
    if (data.no != null and @as(Stmt.Tag, data.no.?.data) == .s_empty) {
        data.no = null;
    }
}
Two issues:
Gated on minify_syntax, even though the thing it's cleaning up (an else body emptied by DCE statement-pruning) is produced by DCE itself.
Checked for .s_empty, but visitSingleStmtBlock only converts an
empty block to .s_empty when minify_syntax is on — otherwise the
pruned body stays as an .s_block with stmts.len == 0.
The scaffolding collapse (if (true) { A } → A) intentionally stays
behind --minify-syntax since tests (e.g. transpiler.test.js's DCE
block, "if (undefined) { let y = Math.random(); }" → "if (undefined) {}") lock that contract in, matching esbuild. This change is strictly
about cleaning up the ugly empty-else remnant.
Fix
Gate the else-trim on dead_code_elimination and treat an .s_block
with zero stmts as equivalent to .s_empty:
if (p.options.features.dead_code_elimination) {
    if (data.no) |no_stmt| {
        const no_is_empty = switch (no_stmt.data) {
            .s_empty => true,
            .s_block => |block| block.stmts.len == 0,
            else => false,
        };
        if (no_is_empty) {
            data.no = null;
        }
    }
}
Also updates the define-constant docs page so the promised before/after
output matches what the bundler actually emits, and points users to --minify-syntax / --minify for the full scaffolding collapse shown
as the final step.
Verification
test/bundler/bundler_edgecase.test.ts's DCEEmptyElseTrimmed#30271
asserts the output of the repro contains neither fail nor else. It
fails on main (output is if (true) { console.log("success"); } else {}) and passes on this branch (if (true) { console.log("success"); }).
Existing DCE tests (bundler_edgecase.test.ts's DCE*, transpiler.test.js's DCE block, esbuild/dce.test.ts, esbuild/default.test.ts, esbuild/ts.test.ts, bundler_minify.test.ts) all pass — no test depended on an else {}
remnant surviving.
Fixes #30271
ff38b2 process: drop EPOLLONESHOT for Linux pidfd exit poll (#30301)
What
On Linux, a subprocess's pidfd was registered with EPOLLONESHOT. If
the event was dropped in user-space after epoll_wait returned it
(before onUpdate ran), the kernel had already disarmed the fd, so the
process's 'exit' event never fired — it only arrived when the next
unrelated timer happened to wake the loop.
Observed in the wild as a 5s afterAll timeout in anthropic-experimental/sandbox-runtime on GH Actions ubuntu-24.04
x86 runners (~80% hit rate): two socat bridges with stdio: 'ignore'
are SIGTERM'd together; both exit within a few ms, but one's 'exit'
event doesn't fire until the 5000ms setTimeout fallback.
Root cause
epoll_wait returns a batch of ready events into loop->ready_polls
and disarms every EPOLLONESHOT fd in that batch before user-space sees
any of them. us_internal_dispatch_ready_polls then iterates the batch
via loop->current_ready_poll.
A poll callback can re-enter us_loop_run_bun_tick — the tick_depth
guard in loop.c acknowledges this for expect().toThrow() → waitForPromise → autoTick, and the same applies to expect().resolves/.rejects and any other waitForPromise caller.
The inner tick calls epoll_wait again, overwriting loop->ready_polls, num_ready_polls, and current_ready_poll. When
control returns to the outer dispatch loop, its remaining indices now
point at the inner tick's data (or past the new num_ready_polls), so
those events are silently skipped.
For a one-shot pidfd, a skipped event is unrecoverable: the fd is
disarmed in the kernel, onUpdate never ran so .needs_rearm was never
set, and nothing re-arms it. The next epoll_wait can't return it.
Fix
Register the pidfd level-triggered (no EPOLLONESHOT) on Linux. A pidfd
stays readable from process exit until close(), so a dropped ready_polls slot is harmless — the next epoll_wait just returns it
again. After wait4 succeeds we close the pidfd anyway (via Process.close()), which removes it from epoll, so there's no
busy-loop. rewatchPosix still defensively re-registers if wait4(WNOHANG) returns 0; with level-triggered that's a harmless CTL_MOD.
macOS/FreeBSD continue to use EV_ONESHOT with EVFILT_PROC + NOTE_EXIT, which is inherently once-per-process and auto-removed by
the kernel.
This doesn't address the underlying nested-tick ready_polls corruption
(which also affects other one-shot FilePoll owners), but pidfds are
the case where it's both silent and common — pipes/sockets report the
drop via EPOLLHUP on the next wait, whereas a disarmed pidfd has no
second signal.
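The difference between the two registration modes when a ready event is dropped can be shown with a toy readiness model. `FakeEpoll` below is not the kernel or usockets API; it only encodes the two behaviours described above (one-shot registrations are disarmed when reported, and a pidfd stays readable until it is closed).

```ts
// Toy model of epoll readiness; all names are illustrative.
interface Registration {
  fd: number;
  oneShot: boolean;
  armed: boolean;
  readable: boolean;
}

class FakeEpoll {
  private regs: Registration[] = [];

  add(fd: number, oneShot: boolean): void {
    this.regs.push({ fd: fd, oneShot: oneShot, armed: true, readable: false });
  }

  // A pidfd becomes readable at process exit and stays readable until close().
  markReadable(fd: number): void {
    for (let i = 0; i < this.regs.length; i++) {
      if (this.regs[i].fd === fd) this.regs[i].readable = true;
    }
  }

  // Returns ready fds; one-shot registrations are disarmed as they are reported.
  wait(): number[] {
    const ready: number[] = [];
    for (let i = 0; i < this.regs.length; i++) {
      const r = this.regs[i];
      if (r.armed && r.readable) {
        ready.push(r.fd);
        if (r.oneShot) r.armed = false;
      }
    }
    return ready;
  }
}

// Drop the first ready event (as a nested tick would), then check
// whether a later wait() reports the fd again.
function exitStillObservableAfterDrop(oneShot: boolean): boolean {
  const epoll = new FakeEpoll();
  const pidfd = 7; // arbitrary fd number for the demo
  epoll.add(pidfd, oneShot);
  epoll.markReadable(pidfd); // child exited
  epoll.wait(); // event returned, then lost before its callback ran
  return epoll.wait().indexOf(pidfd) !== -1;
}
```

In this model `exitStillObservableAfterDrop(true)` is `false` and `exitStillObservableAfterDrop(false)` is `true`, which is why the level-triggered registration tolerates a dropped slot.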
Verification
test/js/bun/spawn/pidfd-exit-nested-tick.test.ts spawns 20 stdio: 'ignore' children so their pidfd events land in one epoll_wait batch,
then forces a nested autoTick from the first onExit via expect(Bun.sleep(1)).resolves.toBe(undefined).
*_jsc/ extractions: every toJS/fromJS/JSValue/JSGlobalObject-touching fn
moved out of the layer dirs into a sibling *_jsc/ file (or runtime/ or
jsc/), with a 'pub const toJS = @import(...).xToJS;' alias left on the
original struct so call sites are unchanged. Covers http, logger, sys,
string, url, css, semver, patch, sourcemap, dns, cares, js_parser,
install, bundler, sql, valkey, csrf, fmt, crash_handler, s3, resolver,
paths, ini, collections, standalone_graph, uws, boringssl, lolhtml,
options_types/schema.
After this commit: rg 'JSValue|JSGlobalObject|CallFrame' across all 75
non-jsc/-runtime dirs returns only 7 false positives (comments, an
enum-tag string, a README). bun bd + zig:check-all 61/61 green.
c8b4c3 restructure src/: pure git mv into subject-area directories
Relocates ~2050 files into per-area subdirectories. Zero content edits —
every change is a rename so git log --follow and blame survive. Build is
broken until the follow-up commit (bun.zig re-exports + build-script paths +
relative @import fixups).
Top-level src/*.zig is now the 8 entry-point files only. src/bun.js/ is
gone (split into src/jsc/ + src/runtime/ + src/event_loop/). src/deps/
is gone except uucode and the libuwsockets .cpp shims.
c18740 Fix heap overflows in normalizePathWindows (#30135)
Problem
normalizePathWindows in src/sys.zig writes into pooled [32767]u16
buffers at several points without bounds-checking the input length. The
originally reported case was the dirfd-join branch, but the same pattern
exists at every write site in the function:
1. UTF-8→UTF-16 conversion (T == u8): convertUTF8toUTF16InBuffer
   forwards only output.ptr to simdutf, which performs no output bounds
   checking. Upstream path validation caps at MAX_PATH_BYTES (~98302
   bytes on Windows), not PATH_MAX_WIDE (32767), so an ASCII input in
   (32767, 98302] bytes writes past the conversion buffer before any
   downstream check runs.
2. Device-path (\\.\) early-return: copies the path verbatim into buf
   and writes a NUL at buf[path.len] with no length check.
3. Absolute-path normalization: calls normalizeStringGenericTZ
   into buf with .add_nt_prefix=true and .zero_terminate=true; that
   function performs no bounds checking, and the NT prefix adds up to 6 u16
   (\\ → \??\UNC\) plus a NUL.
4. Separator-free early-return: if (buf.len < path.len) permitted
   path.len == buf.len, after which buf[path.len] = 0 wrote one past
   the end.
5. Relative-path dirfd join: concatenates base_path + '\' + path
   into a pooled [32767]u16 buffer with no bounds check, then normalizes
   into buf (same NT-prefix headroom issue as 3).
Repro
On Windows, via fs.readdirSync / fs.writeFileSync (which route
through openDirAtWindowsA / openFileAtWindowsT → normalizePathWindows):
Without this fix: out-of-bounds panic in debug/safe builds, heap
corruption in release.
Fix
Return ENAMETOOLONG before each write site when the input (or input +
NT-prefix headroom + NUL) would not fit in the destination WPathBuffer. The NT-prefix headroom (8 u16) and the error value are
hoisted as function-level constants to avoid duplication.
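The shape of that pre-write check can be sketched as follows. This is a TypeScript model for illustration, not the Zig code from the PR: `checkFits`, `PATH_MAX_WIDE` as a local constant and `NT_PREFIX_HEADROOM` are hypothetical names, the values 32767 and 8 are taken from the description above, and counting the NUL separately from the headroom is an assumption.

```typescript
// Illustrative model only; the actual fix is in Zig (normalizePathWindows).
const PATH_MAX_WIDE = 32767;  // capacity of the destination buffer in u16 units
const NT_PREFIX_HEADROOM = 8; // headroom reserved for the NT prefix, in u16 units

type FitResult = { ok: true } | { ok: false; error: "ENAMETOOLONG" };

// Decide, before any write happens, whether the input plus its overhead fits.
function checkFits(
  inputLen: number,
  addsNtPrefix: boolean,
  capacity: number = PATH_MAX_WIDE,
): FitResult {
  // Assumption: the NUL terminator is counted in addition to the prefix headroom.
  const needed = inputLen + (addsNtPrefix ? NT_PREFIX_HEADROOM : 0) + 1;
  if (needed > capacity) return { ok: false, error: "ENAMETOOLONG" };
  return { ok: true };
}
```

Note that an input whose length equals the capacity is rejected here, which is the off-by-one that the original `buf.len < path.len` comparison let through.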
Test
Added Windows-gated cases in test/js/node/fs/fs-path-length.test.ts
covering each overflow site via fs.readdirSync / fs.writeFileSync.
Verified zig:check-all passes on all targets including both Windows
arches.
f116fb sql(mysql): don't double-ref query string in MySQLQuery.init (#30137)
Fixes #28799
What
MySQLQuery.init() was calling query.ref() on the bun.String it
received, but its only caller (JSMySQLQuery.createInstance) already
passes a +1-ref'd string from JSValue.toBunString(). That left every
query string at refcount 2; cleanup() derefs once, so the underlying WTFStringImpl for every MySQL query string was leaked.
This is why #28799 only reproduced with dynamic interpolation: with a
static template literal the same JS string is reused on every call (its
refcount grows but no new allocation), whereas interpolation constructs
a fresh string each time and every one of them is leaked.
How
Drop the redundant query.ref() so init() takes ownership of the
already-ref'd string. This matches how PostgresSQLQuery handles the
same pattern (direct assignment of try query.toBunString(globalThis)
to the field).
+/// Takes ownership of `query` (caller must have already ref'd it, e.g. via
+/// `JSValue.toBunString`). `cleanup()` will deref it exactly once.
 pub fn init(query: bun.String, bigint: bool, simple: bool) @This() {
-    query.ref();
     return .{
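The ownership mismatch can be reproduced with a toy reference count. This is a minimal sketch under stated assumptions: `ToyString`, `initBuggy`, `initFixed` and `runQuery` are hypothetical names, and the class only imitates the +1-on-creation and single-deref-in-cleanup behaviour described above, not bun.String or WTFStringImpl.

```typescript
// Toy reference-counted string; it stands in for a string handed over at +1.
class ToyString {
  readonly value: string;
  private refs = 1; // creation hands the caller one reference
  freed = false;

  constructor(value: string) {
    this.value = value;
  }

  ref(): void {
    this.refs++;
  }

  deref(): void {
    this.refs--;
    if (this.refs === 0) this.freed = true;
  }
}

// Before the fix: init takes an extra reference on a string it already owns.
function initBuggy(query: ToyString): ToyString {
  query.ref();
  return query;
}

// After the fix: init simply takes ownership of the caller's reference.
function initFixed(query: ToyString): ToyString {
  return query;
}

// Models one create -> cleanup cycle; returns whether the string was released.
function runQuery(init: (q: ToyString) => ToyString): boolean {
  const owned = init(new ToyString("SELECT 1"));
  owned.deref(); // cleanup() derefs exactly once
  return owned.freed;
}
```

With `initBuggy` the count ends at 1 after cleanup, so the string is never released; with `initFixed` the single deref brings it to 0.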
Verification
New test test/js/sql/sql-mysql-query-string-leak.test.ts uses a mock
MySQL server (no Docker required) that OKs every simple query, runs 200
unique ~512 KiB query strings through the full create→run→finalize
lifecycle in a subprocess, and checks RSS growth.
7f2258 Fix crash in process.setgroups/hrtime with accessor-indexed arrays (#30179)
What does this PR do?
Fixes a segfault in process.setgroups() and process.hrtime() when
the array argument has an index backed by an accessor property (via Object.defineProperty with get/set), or when the array is sparse.
getIndexQuickly() assumes the array has contiguous indexed storage and
reads garbage (or crashes outright) for arrays in slow-put / sparse
mode. Switched to getIndex(), which handles all indexing types and
invokes getters, and added RETURN_IF_EXCEPTION to propagate any
exception the getter throws.
After the fix, setgroups throws ERR_INVALID_ARG_TYPE for
non-number/string getter results, and hrtime uses the getter's return
value.
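For reference, the two array shapes named above can be built as follows. This is only a sketch of the inputs, using standard JavaScript semantics; it deliberately does not call process.setgroups or process.hrtime, and the variable names are illustrative.

```typescript
// An array whose index 0 is backed by an accessor property instead of plain storage.
const accessorArray: number[] = [0, 0];
let getterCalls = 0;
Object.defineProperty(accessorArray, 0, {
  get(): number {
    getterCalls++;
    return 42;
  },
  set(_value: number): void {
    // Writes are ignored in this example.
  },
});

// A sparse array: length 3, but index 1 is a hole with no own property.
const sparseArray: number[] = [];
sparseArray[0] = 1;
sparseArray[2] = 3;

// An array whose getter throws; a caller reading index 0 must propagate the error.
const throwingArray: number[] = [0];
Object.defineProperty(throwingArray, 0, {
  get(): number {
    throw new Error("getter threw");
  },
});
```

All three still pass `Array.isArray`, which is why a fast path that assumes contiguous storage cannot be selected on the array check alone.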
Found by Fuzzilli (fingerprint ff0362403bc04fc8).
How did you verify your code works?
Verified the original fuzzer repro no longer segfaults and throws a
proper JS error.
Added test/js/node/process/process-array-accessor-crash.test.ts
covering accessor properties, throwing getters, and sparse arrays for
both setgroups and hrtime.
Existing test/js/node/test/parallel/test-process-setgroups.js, test-process-hrtime.js, and test/regression/issue/isArray-proxy-crash.test.ts still pass.
f58cd4 resolver: allow @​-prefixed subpaths under wildcard exports (#30188)
Fixes #30187.
Repro
A wildcard exports pattern "./*": "./dist/packages/*" failed to
resolve subpaths whose matched substring starts with @​:
$ bun --print 'await Bun.resolve("test-pkg/@scope/sub/index.js", process.cwd())'
error: Cannot find module 'test-pkg/@scope/sub/index.js' from '/tmp/test'
Node resolves it fine. The file exists on disk — it's purely a specifier
parsing bug.
Motivating real-world case: ember-source@​6.12's package.json is { "exports": { "./*": "./dist/packages/*" } } and its internal
subpackages live at dist/packages/@​ember/*, dist/packages/@​glimmer/*, dist/packages/@​simple-dom/*. Every one of those subpaths failed in
Bun.
Cause
ESModule.Package.parse in src/resolver/package_json.zig scanned the entire specifier for @​ to split off pkg@​version:
For test-pkg/@​scope/sub/index.js this found the @​ at index 9 (start
of @​scope) and misparsed it as a version delimiter — the name became test-pkg/, the "version" became scope, and the subpath parse read
from the wrong offset. The resolver then went looking for a package
called test-pkg/ in node_modules, didn't find it, and errored.
Same shape for any @​ past the first /: pkg/with@​sign/sub failed
too.
Fix
parseName already bounds the name to the first / (or the second /
for scoped names). A version delimiter @​ can only appear within
that span — e.g. in test-pkg@​1.0.0/sub, the @​ is at index 8 and package.name.len is 14. Restrict the search to specifier[offset..package.name.len] and any @​ outside it is subpath
content, so the no-version branch fires and parseSubpath gets the
correct slice.
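The bounded search can be sketched in TypeScript. This is an illustration of the rule described above, not a port of ESModule.Package.parse: `parseNameEnd`, `parseSpecifier` and the returned shape are hypothetical, and the handling of the leading `@` of scoped names is an assumption about what `offset` does.

```typescript
interface ParsedSpecifier {
  name: string;
  version: string; // empty when the specifier carries no version
  subpath: string; // always starts with "."
}

// The name ends at the first "/" (or the second "/" for scoped names).
function parseNameEnd(specifier: string): number {
  let slash = specifier.indexOf("/");
  if (specifier.startsWith("@") && slash !== -1) {
    slash = specifier.indexOf("/", slash + 1);
  }
  return slash === -1 ? specifier.length : slash;
}

function parseSpecifier(specifier: string): ParsedSpecifier {
  const nameEnd = parseNameEnd(specifier);
  // Skip the leading "@" of a scoped name so it is not taken for a version delimiter.
  const offset = specifier.startsWith("@") ? 1 : 0;
  // Only look for "@" inside the name span; anything later is subpath content.
  const at = specifier.slice(offset, nameEnd).indexOf("@");
  const subpath = "." + specifier.slice(nameEnd);
  if (at === -1) {
    return { name: specifier.slice(0, nameEnd), version: "", subpath };
  }
  const atIndex = at + offset;
  return {
    name: specifier.slice(0, atIndex),
    version: specifier.slice(atIndex + 1, nameEnd),
    subpath,
  };
}
```

For `test-pkg/@scope/sub/index.js` the name span is the first 8 characters, which contain no `@`, so the no-version branch is taken and the whole remainder becomes the subpath.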
Verification
All three failing forms from the issue now resolve:
$ bun --print 'await Bun.resolve("test-pkg/plain/index.js", process.cwd())'
.../test-pkg/dist/packages/plain/index.js
$ bun --print 'await Bun.resolve("test-pkg/@scope/sub/index.js", process.cwd())'
.../test-pkg/dist/packages/@scope/sub/index.js
$ bun --print 'await Bun.resolve("test-pkg/with@sign/sub/index.js", process.cwd())'
.../test-pkg/dist/packages/with@sign/sub/index.js
ember-source-shaped specifiers (@​ember/*, @​glimmer/*, @​simple-dom/* via a wildcard) all resolve.
New tests in test/js/bun/resolve/resolve.test.ts:
@​-prefixed subpath under a plain package (test-pkg/@​scope/sub)
f8fee8 test --isolate: retarget NapiEnv at new global; --parallel: abort on worker panic, never retry (#30216)
What
bun test --isolate / --parallel crashes when a test file loads a
native addon whose deferred napi finalizers outlive the file. The --parallel coordinator then silently retries the file once, which
masks the panic and lets the run exit 0.
Fixes #30205, #30191. Supersedes #30214 (same NapiEnv fix, but without
the coordinator change, the cleanup_hooks retarget, or a test that
actually reproduces on unpatched main).
Reproduction
git clone https://github.com/workglow-dev/libs && cd libs
bun i && bun run build:packages
bun test --timeout=30000 --parallel=4 packages/test/src/test/{util,task}/*.test.ts
On main (d484fd6e), 3–4 workers crash per run with either
and in release builds the segfaults at 0x68 / 0xD0 reported in
#30205.
Root cause
Frame-pointer walk from the assertion:
#3 Bun::NapiHandleScope::open(Zig::GlobalObject*, bool)
#4 NapiHandleScope__open
#6 napi.Finalizer.run
#7 napi.NapiFinalizerTask.runOnJSThread
#10 event_loop.tick
#11 event_loop.waitForPromise
#13 VirtualMachine.loadEntryPointForTestRunner ← next test file
NapiEnv::m_globalObject is a raw Zig::GlobalObject*. For
non-experimental addons (nm_version != NAPI_VERSION_EXPERIMENTAL,
which is ~every real-world addon — sharp, better-sqlite3, etc.), napi_wrap/napi_create_external finalizers are deferred to the
event loop as NapiFinalizerTask rather than run inside GC sweep.
Objects rooted on the old global (module graph, globalThis.*) only
become collectable when Zig__GlobalObject__createForTestIsolation runs gcUnprotect(oldGlobal). The DeferGC from #29573 ends at that
function's }, so the next GC runs there, collects those objects, and
enqueues their finalizers. Those tasks then run on the very next eventLoop().tick() — inside loadEntryPointForTestRunner's waitForPromise for file N+1. Finalizer.run opens a NapiHandleScope
via env->globalObject(), which reads NapiHandleScopeImplStructure()
off the dead cell and writes m_currentNapiHandleScopeImpl on it →
write barrier on an unmarked cell.
The --parallel coordinator's reapWorker then re-queued the file once
(retries[idx] < 1) into a fresh worker with no stale NapiEnv, which
passed — so the run reported 0 fail despite multiple Bun panics in the
log.
Fix
NapiEnv retarget (ZigGlobalObject.cpp, napi.h): Zig__GlobalObject__createForTestIsolation now calls newGlobal->adoptNapiEnvsForTestIsolation(oldGlobal) before gcUnprotect. Each NapiEnv::m_globalObject is repointed at the new
global and the Ref<NapiEnv>s are moved over, so late finalizers open
handle scopes on a live global and the envs stay owned after the old
global is swept. VirtualMachine.swapGlobalForTestIsolation also
repoints rare_data.cleanup_hooks[*].globalThis so CleanupHook.eql()
stays accurate.
No retry, abort on panic (Coordinator.zig): removed the per-file
retry. A worker that dies mid-file is counted as one failure. If it died
by a fatal signal (SIGILL/SIGTRAP/SIGABRT/SIGBUS/SIGFPE/SIGSEGV/SIGSYS —
Bun's own @​trap(), a JSC/WTF assertion, or native-addon crash), the
whole run aborts with error: a test worker process crashed with <SIG> while running <file>. process.exit() / SIGKILL are still just a
per-file failure and the run continues.
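The resulting policy for a worker that dies mid-file can be sketched as a small classifier. This is a TypeScript illustration, not the Coordinator.zig code: `classifyWorkerDeath` and the `Outcome` type are hypothetical, while the signal list and the message format are taken from the description above.

```typescript
// Signals treated as a crash of the worker itself, per the list above.
const FATAL_SIGNALS: ReadonlySet<string> = new Set([
  "SIGILL", "SIGTRAP", "SIGABRT", "SIGBUS", "SIGFPE", "SIGSEGV", "SIGSYS",
]);

type Outcome =
  | { kind: "file-failed" }                // count one failure, keep running
  | { kind: "abort-run"; message: string }; // stop the whole run

// `signal` is null when the worker exited on its own (e.g. process.exit()).
function classifyWorkerDeath(signal: string | null, file: string): Outcome {
  if (signal !== null && FATAL_SIGNALS.has(signal)) {
    return {
      kind: "abort-run",
      message: `error: a test worker process crashed with ${signal} while running ${file}`,
    };
  }
  // process.exit() and SIGKILL stay a per-file failure; there is no retry.
  return { kind: "file-failed" };
}
```

There is deliberately no "retry" outcome in this model, since the retry is what allowed a run to report 0 fail despite panics in the log.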
Verification
test/regression/issue/30205.test.ts — 4 tests. Adds a tiny
non-experimental addon (isolate_finalizer_addon.c) and a fixture
pattern (Bun.gc(true) + module-scope await 0 + objects rooted on globalThis) that crashes 8/8 on unpatched main and passes 8/8
with this change.
workglow-dev/libs full 201-file unit suite: 3× clean --parallel=4
runs (was 3–4 crashes/run).
Gate: git stash -- src/ && bun bd test test/regression/issue/30205.test.ts → 3/4 fail; with fix → 4/4 pass.
test/cli/test/isolation.test.ts, test/regression/issue/29519.test.ts → pass (one pre-existing unrelated
timeout in isolation.test.ts, same as #29573).
test/cli/test/parallel.test.ts → all tests I touched pass; the 3
timing-sensitive scale-up/work-steal tests that fail in this container
fail identically on unmodified main.
191edc image: preserve ICC profile through WebP decode/encode (#30211)
Closes #30197. Follow-up to #30201, which added ICC carry-through for
JPEG and PNG but left WebP dropping the profile because libwebpmux/libwebpdemux weren't linked.
Repro
// any JPEG/PNG with an embedded ICC profile: P3, Adobe RGB, Jpegli XYB
await Bun.file("p3.png").image().webp().write("out.webp");
// out.webp had no ICCP chunk → viewers reinterpret as sRGB → colours shift
And the reverse direction: a WebP carrying an ICCP chunk lost it on
decode, so webp → png/jpeg also shifted colour.
Cause
WebP stores ICC profiles in an ICCP chunk inside a VP8X RIFF container
that wraps the VP8/VP8L bitstream. WebPDecodeRGBA/WebPEncodeRGBA
only touch the bitstream chunk; reading or writing sibling chunks needs
the separate demux/mux APIs, and Bun only compiled src/{dec,enc,dsp,utils}.
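The container layout can be inspected without libwebp by walking the RIFF chunks directly. This is a minimal sketch, not the code or the test from this PR: `findRiffChunk` and `hasIccFlag` are hypothetical helpers, and the sketch assumes the standard RIFF layout (a 12-byte `RIFF`/size/`WEBP` header, then chunks of fourcc, little-endian 32-bit size and payload padded to an even length) and the ICC bit 0x20 in the first byte of the VP8X payload.

```typescript
// Returns the payload of the first chunk with the given fourcc, or null if absent.
function findRiffChunk(bytes: Uint8Array, fourcc: string): Uint8Array | null {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const tag = (off: number): string =>
    String.fromCharCode(
      view.getUint8(off), view.getUint8(off + 1),
      view.getUint8(off + 2), view.getUint8(off + 3),
    );

  if (bytes.length < 12 || tag(0) !== "RIFF" || tag(8) !== "WEBP") return null;

  let off = 12; // first chunk starts right after the RIFF/WEBP header
  while (off + 8 <= bytes.length) {
    const size = view.getUint32(off + 4, true); // little-endian chunk size
    const start = off + 8;
    if (start + size > bytes.length) return null; // truncated file
    if (tag(off) === fourcc) return bytes.subarray(start, start + size);
    off = start + size + (size & 1); // payloads are padded to an even length
  }
  return null;
}

const ICC_FLAG = 0x20; // ICC bit in the VP8X feature flags

// True when the file has a VP8X chunk whose flags advertise an ICC profile.
function hasIccFlag(bytes: Uint8Array): boolean {
  const vp8x = findRiffChunk(bytes, "VP8X");
  return vp8x !== null && vp8x.length > 0 && (vp8x[0] & ICC_FLAG) !== 0;
}
```

A bare VP8/VP8L file has no VP8X chunk, so `hasIccFlag` returns false and `findRiffChunk(bytes, "ICCP")` returns null for it.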
Fix
Build (scripts/build/deps/libwebp.ts): add src/demux/*.c and src/mux/*.c from the same libwebp checkout. Plain C, no new deps, same
include paths.
Decode (src/image/codec_webp.zig): after WebPDecodeRGBA, run WebPDemux on the original bytes, check WEBP_FF_FORMAT_FLAGS & ICCP_FLAG, and WebPDemuxGetChunk("ICCP") the profile into Decoded.icc_profile (duped into bun.default_allocator to match
JPEG/PNG ownership). A plain VP8/VP8L WebP with no VP8X wrapper falls
through with null.
Encode: webp.encode now takes icc_profile: ?[]const u8. When null/empty, keep the existing one-shot WebPEncodeRGBA fast path
(bare VP8/VP8L, no VP8X). When set, pass the bitstream through WebPMuxSetImage + WebPMuxSetChunk("ICCP") + WebPMuxAssemble to
produce a VP8X-wrapped file and hand the assembled buffer to JS with WebPFree as the finaliser.
codecs.zig / Image.zig / bun.d.ts comments updated to drop the
"WebP loses the profile" caveat.
Verification
New tests in the existing describe("ICC profile") block of test/js/bun/image/image.test.ts walk the output RIFF container to find
the ICCP fourcc and compare the payload byte-for-byte:
PNG iCCP → WebP lossy → ICCP chunk present, VP8X flag bit set
When a specifier contains non-ASCII characters, specifier.toUTF8() in resolveMaybeNeedsTrailingSlash heap-allocates a UTF-8 buffer (because
the underlying WTF string is Latin-1 or UTF-16 and needs converting).
For http://, https://, and // prefixes the resolver marks the
specifier as external and returns a Path.init(import_path) that points
directly into that temporary buffer.
resolveMaybeNeedsTrailingSlash then wrapped that slice in a borrowing bun.String.init(result.path) and freed the buffer via defer specifier_utf8.deinit() before returning. Callers in both Zig
(doResolveWithArgs) and C++ (moduleLoaderResolve, moduleLoaderImportModule) subsequently read poisoned memory when
formatting or converting the result to a JS string.
The query_string out-param had already been fixed to clone in the same
way; result.path needed the same treatment.
How
Clone result.path into an owned bun.String via bun.String.cloneUTF8.
The hardcoded-builtin branch that returned specifier now returns specifier.dupeRef() so all success paths return an owned string.
All callers (doResolveWithArgs, NodeModuleModule.findPath, and the
two C++ Zig__GlobalObject__resolve call sites) now deref() the
successful result after use.
This also fixes a pre-existing leak where onResolveJSC (plugin
onResolve) returned an owned WTFStringImpl that was never deref'd.
Updated Packages