Closed
Changes from 32 commits
33 commits
049909e
Exit unsettled top-level await instead of hanging / busy-looping
robobun Apr 21, 2026
5bc2b07
[autofix.ci] apply automated fixes
autofix-ci[bot] Apr 21, 2026
06cdf76
Narrow the TLA loop-exit check: handle-work, not isEventLoopAlive
robobun Apr 21, 2026
4648123
doc: route-through comment references waitForPromiseOrLoopExit
robobun Apr 21, 2026
d2121d5
ci: retrigger gate (previous run hit network issue fetching parallel …
robobun Apr 21, 2026
fa4ac9e
test: put elapsed check before exitCode so hangs surface the timing diff
robobun Apr 21, 2026
f58a6b9
hasAnyHandleWork: also check concurrent_tasks queue
robobun Apr 21, 2026
baa2629
waitForPromiseOrLoopExit: drain microtasks before the liveness check
robobun Apr 21, 2026
1bcbbf5
loadEntryPoint (watcher): drain microtasks before the liveness check
robobun Apr 21, 2026
4c949c4
Only drain microtasks before the liveness break, not every iteration
robobun Apr 21, 2026
8caa0c8
ci: retrigger (previous runs hit pre-existing bun-install + cron flakes)
robobun Apr 21, 2026
cea4a97
test(29546): ignore stderr on the unhandled-rejection test
robobun May 4, 2026
68ddf5b
reload: don't defer hot reload on a stale-pending TLA promise
robobun May 4, 2026
b91669e
reload: drain microtasks on pending, then proceed if still stuck
robobun May 4, 2026
004d1d6
ci: retrigger (pre-existing test-http-emit-close flake on windows 201…
robobun May 4, 2026
847650d
reload: preserve defer for ref'd-await TLAs, proceed only when abandoned
robobun May 4, 2026
4febd06
test(29546): buffer across chunk boundaries in --hot stdout reader
robobun May 4, 2026
69efd22
test(29546): give the --hot reload test 30s on non-debug / infinity o…
robobun May 4, 2026
f201087
[autofix.ci] apply automated fixes
autofix-ci[bot] May 4, 2026
0ce2096
reload: drain microtasks before the forever-timer liveness check
robobun May 4, 2026
5215594
hot-reload: parity with uv_loop_alive + consume deferred flag on aban…
robobun May 4, 2026
e86f538
reload: drain microtasks after reloadEntryPoint so out-of-tick() call…
robobun May 5, 2026
52a136f
ci: retrigger (flake on test-http-should-emit-close + hot-sourcemap)
robobun May 5, 2026
921ad42
ci: retrigger (hot.test.ts sourcemap is a documented cross-PR flake)
robobun May 5, 2026
f8cf302
ci: retrigger (BuildKite agent creation failures — fleet issue)
robobun May 5, 2026
39f1d1a
ci: retrigger (fleet-wide flakes, see also #30250/30257/30258/30262/3…
robobun May 5, 2026
360dfa4
reportException: tail-recurse after reload() to see the fresh promise
robobun May 5, 2026
b78410c
ci: retrigger (13 buildkite agents expired on build 51653)
robobun May 5, 2026
8f3c598
ci: retrigger (test-http-should-emit-close — ~100% flake on main, see…
robobun May 5, 2026
93e2313
comment: refer to the local `promise` by name, not line number
robobun May 5, 2026
6d0fec0
ci: retrigger (11 cpp-build lanes expired — fleet outage, see also #3…
robobun May 5, 2026
207afb3
onBeforeExit: drain microtasks after dispatch so TLA continuations co…
robobun May 5, 2026
ab4b3b4
Revert onBeforeExit microtask drain — breaks test-event-capture-rejec…
robobun May 5, 2026
120 changes: 116 additions & 4 deletions src/jsc/VirtualMachine.zig
@@ -720,7 +720,33 @@
var promise = this.pending_internal_promise orelse return;

switch (promise.status()) {
- .pending => return,
+ .pending => {
// A prior `reload()` deferred because a ref'd await was in
// flight, and the await has now resolved into another
// pending state that the loop can no longer make progress
// on (e.g. `await Bun.sleep(N).then(() => new Promise(() =>
// {}))`). In that case `loadEntryPoint`'s watcher loop has
// already bailed out with the promise still `.pending` and
// no work to settle it — `hot_reload_deferred` would remain
// stranded and the deferred save silently dropped. Consume
// the flag here and let `reload()` re-evaluate (it'll
// observe the abandoned state and proceed). Guarded on the
// same "nothing ref'd but forever_timer" check `reload()`
// itself uses so we don't interrupt live ref'd work.
//
// Tail-recurse after the reload so the fresh
// `pending_internal_promise` is re-read from the caller
// rather than the stale local `promise` above. Without
// this, a new module body that synchronously rejects
// (top-level `throw`, transpile error) would leave the
// fresh `.rejected` unreported until the next loop wakeup
// — `tickPossiblyForever` blocks in `loop.tick()` first.
if (this.hot_reload_deferred and !this.eventLoop().hasAnyHandleWorkIgnoringForeverTimer()) {
this.reload(null);
return this.reportExceptionInHotReloadedModuleIfNeeded();
}
return;
},
.rejected => if (this.pending_internal_promise_reported_at != this.hot_reload_counter) {
this.pending_internal_promise_reported_at = this.hot_reload_counter;
this.unhandledRejection(this.global, promise.result(this.global.vm()), promise.toJS());
@@ -782,8 +808,49 @@
if (this.pending_internal_promise) |p| {
switch (p.status()) {
.pending => {
- this.hot_reload_deferred = true;
- return;
// Normally defer. A watcher event arrived while either:
// (a) the C++ module loader / transpile chain is still
// draining on the microtask queue,
// (b) a user ref'd await (timer / network / fs) is
// pending and will settle the promise.
// Both cases need the current continuation to run first;
// tearing down the module registry now would let the two
// chains interleave through one registry, or leave a
// zombie continuation firing against the freshly-reset
// global.
//
// The defer is consumed by
// `reportExceptionInHotReloadedModuleIfNeeded` once the
// promise settles. But `reportException…` early-returns
// on `.pending`, so if the promise NEVER settles (an
// abandoned top-level await — `await new Promise(() =>
// {})` or an unref'd `AbortSignal.timeout`) the deferral
// would be permanent and --hot would go silently dead.
//
// Drain microtasks first to collapse case (a): if a
// loader chain microtask flips the status to settled,
// one of the other arms handles it. If status is still
// `.pending` after the drain, check whether any source
// of work could advance it — using the
// `hasAnyHandleWorkIgnoringForeverTimer` variant because
// the --hot main loop holds a ref'd `forever_timer` on
// Windows that would otherwise make plain
// `hasAnyHandleWork`/`isActive` permanently true.
this.eventLoop().drainMicrotasksWithGlobal(this.global, this.jsc_vm) catch {};
switch (p.status()) {
.pending => if (this.eventLoop().hasAnyHandleWorkIgnoringForeverTimer()) {
// case (b): ref'd I/O will settle the promise.
this.hot_reload_deferred = true;
return;
},
// Fell through case (a) during the drain — let the
// matching arm decide.
.rejected => if (this.pending_internal_promise_reported_at != this.hot_reload_counter) {
this.hot_reload_deferred = true;
return;
},
.fulfilled => {},
}
},
.rejected => if (this.pending_internal_promise_reported_at != this.hot_reload_counter) {
this.hot_reload_deferred = true;
@@ -821,6 +888,22 @@
}
// reloadEntryPoint() stores into pending_internal_promise on every return path.
_ = this.reloadEntryPoint(this.main) catch @panic("Failed to reload");

// `reloadEntryPoint` returns while the fetch/link/evaluate chain is
// still queued as JSC microtasks. When this `reload()` is invoked
// from inside `tick()` (the common single-save path), the
// `HotReloadTask` handler returns → `tick()`'s else-branch drains.
// But out-of-`tick()` callers — `reportExceptionInHotReloadedModule
// IfNeeded` (the stranded-flag consumer, see field doc on
// `hot_reload_deferred`) and the pre-existing `bun.js.zig` dispatch
// after an initial-load rejection — would return to the main loop
// with the new module body still unexecuted. `tickPossiblyForever`
// then blocks in `loop.tick()` before `this.tick()` ever drains the
// microtask queue, so the new module doesn't evaluate until the
// next loop wakeup (another save, or `forever_timer` firing in up
// to 4 minutes). Drain here to guarantee the reload takes effect
// before `reload()` returns, covering every call site.
this.eventLoop().drainMicrotasksWithGlobal(this.global, this.jsc_vm) catch {};
}
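
The defer-or-proceed decision that `reload()` makes in its `.pending` arm can be summarized as a small pure function. This is a minimal sketch in TypeScript, not Bun's Zig code: `decideReloadAfterDrain` and its parameter names are hypothetical, and the three inputs stand in for the promise status after the microtask drain, the result of `hasAnyHandleWorkIgnoringForeverTimer()`, and the `pending_internal_promise_reported_at == hot_reload_counter` comparison.

```typescript
type PromiseStatus = "pending" | "fulfilled" | "rejected";
type ReloadDecision = "defer" | "proceed";

// Hypothetical model of the `.pending` arm in `reload()` after the drain.
function decideReloadAfterDrain(
  statusAfterDrain: PromiseStatus,
  hasRealHandleWork: boolean, // models hasAnyHandleWorkIgnoringForeverTimer()
  rejectionAlreadyReported: boolean, // models reported_at == hot_reload_counter
): ReloadDecision {
  switch (statusAfterDrain) {
    case "pending":
      // Case (b): ref'd work will settle the promise, so wait for it.
      // Otherwise the top-level await is abandoned and the reload proceeds.
      return hasRealHandleWork ? "defer" : "proceed";
    case "rejected":
      // Let the rejection be reported before the module registry is reset.
      return rejectionAlreadyReported ? "proceed" : "defer";
    case "fulfilled":
      return "proceed";
  }
}

// An abandoned top-level await (nothing ref'd) no longer blocks the reload.
const abandoned = decideReloadAfterDrain("pending", false, false);
// A ref'd timer or socket in flight still defers it.
const inFlight = decideReloadAfterDrain("pending", true, false);
```

Here `abandoned` is `"proceed"` and `inFlight` is `"defer"`, which is the distinction the comment block in the hunk draws between an abandoned await and case (b).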

pub inline fn nodeFS(this: *VirtualMachine) *Node.fs.NodeFS {
@@ -864,6 +947,16 @@

pub fn onBeforeExit(this: *VirtualMachine) void {
this.exit_handler.dispatchOnBeforeExit();
// A `process.on('beforeExit', …)` handler may queue JSC microtasks
// (e.g. by resolving a TLA that `waitForPromiseOrLoopExit` bailed
// out on). `isEventLoopAlive()` can't see those — it checks handles,
// tasks, and refs, not the microtask queue — so without this drain
// the inner `while` below would skip and the continuation would be
// silently dropped before process exit. The analogous gap for
// `unhandledRejection` handlers is covered by the drain at the tail
// of `waitForPromiseOrLoopExit` (see event_loop.zig); this is the
// same pattern one frame later.
this.eventLoop().drainMicrotasksWithGlobal(this.global, this.jsc_vm) catch {};

Check warning on line 959 in src/jsc/VirtualMachine.zig

Claude / Claude Code Review

Non-watcher path: TLA continuation that throws inside onBeforeExit drain is silently swallowed (exit 0)

nit: sibling to the resolved comment at event_loop.zig:700 (the *resolve* case, fixed by 207afb3) — if a `process.on('beforeExit', …)` handler resolves an abandoned TLA whose continuation then *throws* (e.g. `const r = await later; throw new Error('boom')`), this drain settles the module `JSInternalPromise` to `.rejected` and nothing reports it: the non-watcher path's only reporter (bun.js.zig:402-425) already ran while the promise was `.pending`, `JSInternalPromise` rejections don't flow throug
var dispatch = false;
while (true) {
while (this.isEventLoopAlive()) : (dispatch = true) {
@@ -873,6 +966,7 @@

if (dispatch) {
this.exit_handler.dispatchOnBeforeExit();
this.eventLoop().drainMicrotasksWithGlobal(this.global, this.jsc_vm) catch {};
dispatch = false;

if (this.isEventLoopAlive()) continue;
@@ -2467,6 +2561,24 @@
if (this.pending_internal_promise.?.status() == .pending) {
this.eventLoop().autoTick();
}

// Top-level await with no ref'd handle to resolve it:
// bail so POSIX doesn't burn 100% CPU and Windows doesn't
// hang on `uv_run(NOWAIT)` skipping its loop body. See
// EventLoop.waitForPromiseOrLoopExit for the full rationale
// (and for why this deliberately avoids `isEventLoopAlive`,
// which short-circuits on `unhandled_error_counter != 0`).
if (this.pending_internal_promise.?.status() == .pending and !this.eventLoop().hasAnyHandleWork()) {
// Drain microtasks a user `process.on('unhandledRejection',
// …)` handler may have queued inside `autoTick`'s
// `handleRejectedPromises()` — see waitForPromiseOrLoopExit.
// If the drain runs the continuation that settles the promise,
// re-check and fall through to the loop condition.
this.eventLoop().drainMicrotasksWithGlobal(this.global, this.jsc_vm) catch break;
if (this.pending_internal_promise.?.status() == .pending and !this.eventLoop().hasAnyHandleWork()) {
break;
}
}
}
},
else => {},
@@ -2477,7 +2589,7 @@
}

this.eventLoop().performGC();
- this.waitForPromise(.{ .internal = promise });
+ this.eventLoop().waitForPromiseOrLoopExit(.{ .internal = promise });
}

return this.pending_internal_promise.?;
128 changes: 128 additions & 0 deletions src/jsc/event_loop.zig
@@ -575,6 +575,134 @@
}
}

/// Whether the event loop has any source of work that could still resolve a
/// pending promise — active uv/uws handles, queued tasks (main or
/// concurrent), concurrent refs, or immediates.
///
/// This is intentionally NOT `VirtualMachine.isEventLoopAlive()`:
/// `isEventLoopAlive()` short-circuits on `unhandled_error_counter != 0`
/// (it is the "should the main event loop keep running" predicate, and a
/// fatal error stops it). For the TLA wait path we only care about work
/// that could still wake the loop — a side-path unhandled rejection must
/// NOT abandon an in-flight fetch whose resolution will complete the TLA.
///
/// `concurrent_tasks.isEmpty()` is a single atomic acquire-load; it's safe
/// from the JS thread without locking. Including it closes a narrow race
/// where a thread (e.g. the hot-reloader watcher at `hot_reloader.zig:278`)

Check warning on line 591 in src/jsc/event_loop.zig

Claude / Claude Code Review

Hardcoded cross-file line number 'hot_reloader.zig:278' in doc-comment will drift

nit: the doc-comment on `hasAnyHandleWork()` references "the hot-reloader watcher at `hot_reloader.zig:278`" — a hardcoded cross-file line number that will drift on any edit above that line. This is the same drift hazard you already accepted and fixed in this PR for the "line 720" reference in VirtualMachine.zig (commit 93e2313). For consistency, replace with a stable symbolic reference like "in `HotReloader.Task.enqueue()`".

Extended reasoning...

What the issue is. The new doc-comment on `hasAnyHandleWork()` (`src/jsc/event_loop.zig:591`) reads: "Including it closes a narrow race where a thread (e.g. the hot-reloader watcher at `hot_reloader.zig:278`) pushes to the concurrent queue without bumping `concurrent_ref`…". The `hot_reloader.zig:278` reference is a hardcoded cross-file line number embedded in a doc-comment. Today line 278 is `that.reloader.enqueueTaskConcurrent(&that.concurrent_task);` inside `HotReloader.Task.enqueue()`, so the reference is accurate, but it will silently go stale the moment anything is inserted or removed above that line in `hot_reloader.zig`.

Why this is worth a one-line fix in this PR. This is exactly the same drift hazard that was already flagged and fixed in this very PR. Resolved inline-comment 3187088494 pointed out that "the stale capture at line 720 above" in `VirtualMachine.zig` would rot, and commit 93e2313 ("comment: refer to the local `promise` by name, not line number") changed it to "the stale local `promise` above". The author has demonstrated agreement that hardcoded line numbers in comments are a maintenance hazard worth fixing. Consistency within the same PR argues for the same treatment here, and a cross-file line reference is strictly more brittle than a same-function one, since `hot_reloader.zig` churns independently of `event_loop.zig` and there is no "above" hint to fall back on.

Why existing conventions don't excuse it. A grep of `src/jsc/*.zig` shows this is the only `file.zig:NNN` cross-file doc-comment reference in the directory, so it is not an established convention being followed; it is a one-off introduced by this PR.

Step-by-step proof of the drift.

  1. Today: `hot_reloader.zig:278` is `that.reloader.enqueueTaskConcurrent(&that.concurrent_task);` inside `Task.enqueue()`. The doc-comment in `event_loop.zig:591` says "at `hot_reloader.zig:278`". Correct.
  2. A future PR adds a single field, import, or helper function anywhere in `hot_reloader.zig` lines 1–277. Every line below shifts down by 1.
  3. `hot_reloader.zig:278` is now whatever the old line 277 was, perhaps `that.next = null;` or a closing brace. The doc-comment in `event_loop.zig` still says "at `hot_reloader.zig:278`".
  4. A reader following the comment opens `hot_reloader.zig` to line 278, finds something unrelated to `enqueueTaskConcurrent`, and is left guessing what the "e.g." was supposed to point at.
  5. No compiler, linter, or formatter will ever catch this. The comment is never updated because nothing enforces it.

Impact. Purely cosmetic, with no runtime effect. The reference lives inside an "e.g." parenthetical, so it isn't load-bearing for understanding the function's contract. This should not block the PR.

How to fix. Replace "the hot-reloader watcher at `hot_reloader.zig:278`" with "the hot-reloader watcher in `HotReloader.Task.enqueue()`" (or just "the hot-reloader watcher's `enqueueTaskConcurrent`"). The function name is a stable symbolic anchor that survives line-number churn, and if it's ever renamed, grep/IDE will surface every reference. One-line edit, zero risk.

/// pushes to the concurrent queue without bumping `concurrent_ref` and the
/// other liveness signals happen to be zero between the tick drain and the
/// check.
pub fn hasAnyHandleWork(this: *EventLoop) bool {
    const vm = this.virtual_machine;
    return vm.event_loop_handle.?.isActive() or
        vm.active_tasks > 0 or
        this.tasks.count > 0 or
        this.hasPendingRefs() or
        !this.concurrent_tasks.isEmpty() or
        this.immediate_tasks.items.len > 0 or
        this.next_immediate_tasks.items.len > 0;
}
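
To illustrate why `hasAnyHandleWork` is kept separate from `isEventLoopAlive()`, here is a minimal TypeScript model of the two predicates. It is a sketch under stated assumptions, not Bun's implementation: the `LoopState` field names are hypothetical stand-ins for the Zig fields, and `isEventLoopAliveModel` assumes only what the doc-comment states (a short-circuit on a non-zero `unhandled_error_counter`); the rest of its shape is a guess.

```typescript
// Hypothetical snapshot of the liveness signals named in the doc-comment.
interface LoopState {
  handleActive: boolean; // stands in for event_loop_handle.isActive()
  activeTasks: number;
  queuedTasks: number;
  pendingRefs: boolean;
  concurrentQueueEmpty: boolean;
  immediates: number;
  nextImmediates: number;
  unhandledErrorCounter: number;
}

// "Could anything still wake the loop?" Mirrors the boolean OR in the diff.
function hasAnyHandleWork(s: LoopState): boolean {
  return (
    s.handleActive ||
    s.activeTasks > 0 ||
    s.queuedTasks > 0 ||
    s.pendingRefs ||
    !s.concurrentQueueEmpty ||
    s.immediates > 0 ||
    s.nextImmediates > 0
  );
}

// Assumed shape of the main-loop predicate: stop on any unhandled error.
function isEventLoopAliveModel(s: LoopState): boolean {
  return s.unhandledErrorCounter === 0 && hasAnyHandleWork(s);
}

const idle: LoopState = {
  handleActive: false,
  activeTasks: 0,
  queuedTasks: 0,
  pendingRefs: false,
  concurrentQueueEmpty: true,
  immediates: 0,
  nextImmediates: 0,
  unhandledErrorCounter: 0,
};

// A fetch is still in flight, but an unrelated rejection bumped the counter.
const fetchAfterSideRejection: LoopState = {
  ...idle,
  handleActive: true,
  unhandledErrorCounter: 1,
};

console.log(hasAnyHandleWork(fetchAfterSideRejection)); // true
console.log(isEventLoopAliveModel(fetchAfterSideRejection)); // false
```

In this model, a wait loop keyed on `isEventLoopAliveModel` would abandon the in-flight fetch, while one keyed on `hasAnyHandleWork` keeps waiting for it. That is the behavior the doc-comment argues for.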

/// Like `hasAnyHandleWork`, but ignores the outer `--hot` main-loop's
/// `forever_timer` when counting active uv handles.
///
/// On Windows, `tickPossiblyForever` creates `forever_timer` as a ref'd
/// `uv_timer_t` specifically so `uv_run(UV_RUN_ONCE)` blocks on file-
/// watcher wakeups without immediately returning on a dead loop. Because
/// the handle is ref'd, `uv_loop_alive()` (and therefore
/// `event_loop_handle.isActive()`) reports the loop as "alive" even when
/// there's no real work to settle a pending promise. Callers that need
/// to distinguish "only forever_timer is holding the loop open" from
/// "a user-held ref'd handle is holding the loop open" should use this
/// variant. On POSIX, uws timers don't bump the uws `active` counter, so
/// the forever_timer is invisible to `isActive()` and this is equivalent
/// to `hasAnyHandleWork`.
///
/// Windows parity with `uv_loop_alive()`: also consider
/// `active_reqs.count`, `pending_reqs_tail`, and `endgame_handles`. A
/// native addon submitting a raw uv request via `napi_get_uv_event_loop`
/// bumps `active_reqs.count` without touching `active_handles` or any
/// Bun-side liveness signal, and we must not drop such work on the floor
/// (the false-negative direction is unrecoverable: we'd tear down the
/// module registry while the addon's uv_after_work_cb was still pending).
pub fn hasAnyHandleWorkIgnoringForeverTimer(this: *EventLoop) bool {
    if (comptime Environment.isWindows) {
        const vm = this.virtual_machine;
        // Subtract the forever_timer's contribution from
        // `active_handles`: it's always 1 ref'd handle when present.
        // Note this is a heuristic: another handle closing as we sample
        // would underreport, but the fall-back on a false positive is
        // simply "defer the reload once more", which is recoverable.
        const uv_loop = vm.event_loop_handle.?;
        const active = uv_loop.active_handles;
        const forever_timer_contribution: u32 = if (this.forever_timer != null) 1 else 0;
        const effective_active = if (active > forever_timer_contribution) active - forever_timer_contribution else 0;
        return effective_active > 0 or
            uv_loop.active_reqs.count > 0 or
            uv_loop.pending_reqs_tail != null or
            uv_loop.endgame_handles != null or
            vm.active_tasks > 0 or
            this.tasks.count > 0 or
            this.hasPendingRefs() or
            !this.concurrent_tasks.isEmpty() or
            this.immediate_tasks.items.len > 0 or
            this.next_immediate_tasks.items.len > 0;
    }
    return this.hasAnyHandleWork();
}
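
The handle-count adjustment in the Windows branch is a saturating subtraction. A minimal TypeScript sketch of just that arithmetic follows; `effectiveActiveHandles` is a hypothetical helper name, and the inputs stand in for `uv_loop.active_handles` and `this.forever_timer != null`.

```typescript
// Hypothetical helper modelling the `effective_active` computation: remove
// the forever_timer's single ref'd handle from the count, never going below 0.
function effectiveActiveHandles(
  activeHandles: number,
  hasForeverTimer: boolean,
): number {
  const contribution = hasForeverTimer ? 1 : 0;
  return activeHandles > contribution ? activeHandles - contribution : 0;
}

// Only the forever_timer is holding the loop open: no real work remains.
const onlyForeverTimer = effectiveActiveHandles(1, true);
// A user-held ref'd handle exists alongside the forever_timer.
const userHandleToo = effectiveActiveHandles(2, true);
// The timer exists but the sampled count is already 0: clamp instead of wrapping.
const sampledZero = effectiveActiveHandles(0, true);
```

The explicit comparison before subtracting matters in the original because the contribution is typed `u32` in the diff, so an unguarded subtraction from a count of 0 could underflow rather than produce a negative number.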

/// Like `waitForPromise`, but returns early when the event loop has nothing
/// left that could resolve the promise (see `hasAnyHandleWork`). Used by the
/// top-level-await entry points where the promise may be "unsettled" — e.g.
/// awaiting an abort event whose only source is an unref'd
/// `AbortSignal.timeout()` timer.
///
/// Without this, POSIX busy-loops at 100% CPU until the unref'd timer fires
/// and Windows hangs forever (`uv_run(UV_RUN_NOWAIT)` early-returns when
/// `uv__loop_alive()` is false, so unref'd Bun timers never fire via the uv
/// scheduler). Matches Node.js, which also exits an unsettled top-level
/// await without waiting on unref'd handles.
///
/// Callers that require a resolved promise on return should keep using
/// `waitForPromise` — this variant is specifically for the top-level-entry
/// path, which is prepared to observe a still-pending promise.
pub fn waitForPromiseOrLoopExit(this: *EventLoop, promise: jsc.AnyPromise) void {
    const jsc_vm = this.virtual_machine.jsc_vm;
    switch (promise.status()) {
        .pending => {
            while (promise.status() == .pending) {
                if (jsc_vm.executionForbidden()) {
                    break;
                }
                this.tick();

                if (promise.status() == .pending) {
                    this.autoTick();
                }

                if (promise.status() == .pending and !this.hasAnyHandleWork()) {
                    // About to break because nothing holds the loop alive.
                    // `autoTick` just called `handleRejectedPromises()`, which
                    // may have run a user `process.on('unhandledRejection',
                    // …)` handler that resolved the awaited promise via a
                    // JSC microtask. In `.bun` mode with a registered
                    // handler, that path returns WITHOUT a
                    // `defer drainMicrotasks`, so the continuation hasn't
                    // run yet and `hasAnyHandleWork` can't see it. Drain
                    // once and re-check the status: if the microtask
                    // settled the promise, fall through to the loop
                    // condition and exit normally instead of breaking.
                    this.drainMicrotasksWithGlobal(this.global, jsc_vm) catch return;
                    if (promise.status() == .pending and !this.hasAnyHandleWork()) {
                        break;
                    }
                }
            }
        },
        else => {},
    }
}
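
The control flow of `waitForPromiseOrLoopExit` can be exercised against a toy loop. The sketch below is a TypeScript model, not Bun's event loop: `FakeLoop`, `refdTicksLeft`, and the single `tick()` method (which collapses `tick()` and `autoTick()`) are all hypothetical, and the `maxIterations` guard exists only to keep the model from spinning.

```typescript
type Status = "pending" | "fulfilled";

// Toy loop: a ref'd handle, if any, settles the promise after N turns.
class FakeLoop {
  status: Status = "pending";
  microtasks: Array<() => void> = [];
  iterations = 0;

  // refdTicksLeft === 0 means no ref'd handle can ever settle the promise.
  constructor(public refdTicksLeft: number) {}

  hasAnyHandleWork(): boolean {
    return this.refdTicksLeft > 0;
  }

  // Stands in for tick() + autoTick() in the Zig original.
  tick(): void {
    this.iterations++;
    if (this.refdTicksLeft > 0 && --this.refdTicksLeft === 0) {
      this.status = "fulfilled";
    }
  }

  drainMicrotasks(): void {
    while (this.microtasks.length > 0) {
      this.microtasks.shift()!();
    }
  }
}

// Returns the final status; "pending" means the loop ran out of work first.
function waitForPromiseOrLoopExit(loop: FakeLoop, maxIterations = 1000): Status {
  while (loop.status === "pending" && loop.iterations < maxIterations) {
    loop.tick();
    if (loop.status === "pending" && !loop.hasAnyHandleWork()) {
      // Last chance: a queued continuation may settle the promise.
      loop.drainMicrotasks();
      if (loop.status === "pending" && !loop.hasAnyHandleWork()) break;
    }
  }
  return loop.status;
}

// Abandoned await: exits after one turn instead of spinning.
const abandonedLoop = new FakeLoop(0);
const abandonedResult = waitForPromiseOrLoopExit(abandonedLoop);

// A queued microtask settles the promise during the pre-break drain.
const rescuedLoop = new FakeLoop(0);
rescuedLoop.microtasks.push(() => {
  rescuedLoop.status = "fulfilled";
});
const rescuedResult = waitForPromiseOrLoopExit(rescuedLoop);
```

In this model `abandonedResult` is `"pending"` after a single iteration, and `rescuedResult` is `"fulfilled"` because the drain runs before the second liveness check. Those are the two cases the comment inside the loop distinguishes.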

pub fn waitForPromiseWithTermination(this: *EventLoop, promise: jsc.AnyPromise) void {
const worker = this.virtual_machine.worker orelse @panic("EventLoop.waitForPromiseWithTermination: worker is not initialized");
switch (promise.status()) {