Error instead of crashing on deeply nested expressions in the transpiler by robobun · Pull Request #31242 · oven-sh/bun

robobun · 2026-05-23T00:09:28Z

Problem

Parser fuzzing found that long chains of prefix unary operators kill the process with a hard SIGSEGV instead of a JavaScript error:

new Bun.Transpiler({ loader: "js" }).transformSync("- ".repeat(20000) + "1");

40 KB of - is a trivial DoS payload against anything that transpiles untrusted code (bun run, bun build, Bun.Transpiler). Other deep shapes hit the same thing: f(f(f(…))) nested ~6–12K deep and [[[…]]]/ternary chains at lower depths on smaller thread stacks. In contrast, unclosed [/{/( nesting already fails cleanly with Maximum call stack size exceeded in a few milliseconds.

Cause

parse_expr_common bounds the parser's own recursion with StackCheck::is_safe_to_recurse(), but that check is calibrated for the parser's stack frames. The visit pass (visit_expr_in_out → e_unary/e_call/…) and the printer (print_expr) use noticeably more stack per AST level and had no check at all, so an AST that parses under the limit still blows the stack one pass later. (This is the expression-side twin of the statement-side issue documented on P::parse_stmt_depth.)

The ">10 s parse" reading from the fuzzer is the crash path, not the parser: below the overflow threshold parse time is linear (20K chained minuses ≈ 7 ms in a release build).

Fix

Same dynamic guard, applied to the passes that were missing it:

visit_expr_in_out now checks is_safe_to_recurse() and, when the stack is nearly exhausted, logs the existing Maximum call stack size exceeded error (once) and stops descending; _parse already halts on logged errors right after the visit pass.
print_expr does the same check and bails out; the print entry points (print_ast, print_json, print_with_writer_and_platform, print_common_js) turn that into an error instead of returning truncated output.
The recursive side-effect analysis helpers that can walk the full-depth AST (expr_can_be_removed_if_unused, simplify_unused_expr, SideEffects::to_boolean, SideEffects::is_primitive_with_side_effects) report the same stack-overflow error (via P::report_stack_overflow, once per parse) and stop recursing, so the transform fails with Maximum call stack size exceeded instead of degrading silently. ExprData::known_primitive (in bun_ast, no logger available) gets its own stack check and answers Unknown, i.e. skips the optimization.
Once the error is reported the visit pass skips the remaining expressions, push_scope_for_visit_pass re-syncs the parse-recorded scope order past subtrees the bail skipped (so the "Scope mismatch while visiting" sanity panic can't fire), and the DCE analyses skip entirely so they never read identifier refs the visit pass didn't resolve.
Expr::join_with_left_associative_op (used when simplifying unused ternaries) threads a stack check through its re-association and falls back to the unflattened node near the limit.
The binding path gets the same treatment as the expression path (parse_binding, visit_binding, print_binding for nested destructuring patterns), and --minify's substitute_single_use_symbol_in_expr reports the overflow instead of walking iteratively-built chains off the stack.

No hard depth constant: per-level frame sizes differ ~4× between release and sanitizer builds and thread stacks range from 4 MB workers to the main thread, so a fixed cap would either reject inputs that work today or still crash under ASAN. The dynamic check keeps today's capacity and turns every overflow into a clean error.

Verification

Before: transformSync("- ".repeat(20000) + "1") → SIGSEGV (release and ASAN debug builds); nested-call and nested-array inputs crash at various depths, including on 4 MB worker stacks.
After: every shape × depth × stack-size combination tested (unary/!/void chains, nested calls, nested arrays, ternary chains at 1.5K–100K depth, 4 MB and 8 MB stacks, ASAN debug build) either transpiles or throws Maximum call stack size exceeded / a print error — no signals, and errors return in milliseconds.
New tests in test/bundler/transpiler/transpiler.test.js next to the existing stack-overflow tests spawn a subprocess and assert it exits cleanly (no SIGSEGV) across shapes and depths; they fail on the previous build (child dies with SIGSEGV) and pass with this change.
test/bundler/transpiler/, bundler_minify.test.ts, and bundler_edgecase.test.ts pass with the debug build.

#7717 reports illegal hardware instruction while bundling vite-prebundled swapkit/libsodium code, and was diagnosed by a maintainer as a stack overflow from expression-parse/visit recursion on parser threads — the same failure class this PR guards against. The original reproduction can't be re-run today (the @swapkit/* RC packages it pins were unpublished), so this PR doesn't claim to close it: with this change that kind of overflow surfaces as a Maximum call stack size exceeded build error instead of killing the process.

The expression parser bounds its own recursion with a stack check, but its frames are smaller than the visit pass's and the printer's, so ASTs like 20,000 chained unary minuses or thousands of nested call arguments parsed fine and then overflowed the native stack (SIGSEGV) in the passes that follow. Add the same is_safe_to_recurse() guard to visit_expr_in_out (reporting the existing "Maximum call stack size exceeded" error) and to print_expr (the print entry points now return an error instead of truncated output), and give the recursive side-effect analysis helpers conservative early-outs.

coderabbitai · 2026-05-23T00:09:35Z

Warning

Review limit reached

@robobun, we couldn't start this review because you've used your available PR reviews for now.

Your plan currently allows 2 reviews/hour. Refill in 5 minutes and 50 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c3358545-9bc5-4e2d-94a5-8c75c5e86aa9

📥 Commits

Reviewing files that changed from the base of the PR and between 73a07bd and 5f169d0.

📒 Files selected for processing (8)

src/ast/expr.rs
src/js_parser/p.rs
src/js_parser/parse/mod.rs
src/js_parser/scan/scan_side_effects.rs
src/js_parser/visit/mod.rs
src/js_parser/visit/visit_expr.rs
src/js_printer/lib.rs
test/bundler/transpiler/transpiler.test.js

Walkthrough

This PR adds comprehensive stack-overflow protection throughout Bun's JavaScript compiler pipeline. It introduces recursion-safe checks in AST primitive inference, the parser visitor, side-effect analysis, and the JS printer, ensuring deeply nested expressions fail with clean diagnostics instead of crashes.

Changes

Stack-overflow protection across compiler pipeline

Layer / File(s)	Summary
Parser stack-overflow reporting infrastructure `src/js_parser/p.rs`	New `reported_stack_overflow` field and `report_stack_overflow(loc)` method ensure one-time logging of "Maximum call stack size exceeded" errors during parsing; `P::init` initializes the flag.
Parser visitor and DCE recursion guards `src/js_parser/visit/...`, `src/js_parser/p.rs`	Visit pass adds an early-return recursion guard; `push_scope_for_visit_pass` skips unconsumed scope-order entries after bail-outs; DCE helper `expr_can_be_removed_if_unused_without_dce_check` guards recursion and reports overflow.
AST expression primitive inference with stack guards `src/ast/expr.rs`	Introduces stack-checked helpers (`known_primitive_with_check`, `merge_known_primitive_with_check`) and uses them across re-association, conditionals, binaries (`
Side-effects analysis with stack-overflow guards `src/js_parser/scan/scan_side_effects.rs`	`simplify_unused_expr`, `is_primitive_with_side_effects`, and `to_boolean` now accept `P` and `Loc` where needed, check `stack_check` before recursing, report overflow when unsafe, and avoid unsafe simplifications.
JS printer stack-overflow detection and error handling `src/js_printer/lib.rs`	`Printer` now tracks `bun_core::StackCheck` and `stack_overflowed`; `print_expr` guards recursion and sets the flag when unsafe; entrypoints (`print_ast`, `print_json`, `print_with_writer_and_platform`, `print_common_js`) convert the flag into a `StackOverflow` error.
Regression tests for deep expression handling `test/bundler/transpiler/transpiler.test.js`	Two tests spawn Bun on deeply nested expressions and long unary chains to assert processes exit without signal crashes and allow expected stack-overflow diagnostics or clean exit codes.

Suggested reviewers

Jarred-Sumner
dylan-conway
RiskyMH

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding error handling instead of crashes for deeply nested expressions in the transpiler.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The pull request description thoroughly addresses both required sections: it clearly explains the problem (deeply nested expressions causing SIGSEGV), the solution (dynamic stack checks via StackCheck::is_safe_to_recurse()), and detailed verification across multiple test scenarios and build configurations.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

robobun · 2026-05-23T00:09:40Z

^{Updated 11:00 AM PT - May 23rd, 2026}

✅ @robobun, your commit 5f169d0351892104accdb2b8f045ccc011b750df passed in Build #57313! 🎉

🧪 To try this PR locally:

bunx bun-pr 31242

That installs a local version of the PR into your bun-31242 executable, so you can run:

bun-31242 --bun

github-actions · 2026-05-23T00:12:31Z

Found 1 issue this PR may fix:

illegal hardware instruction on M1 max cpu #7717 - Reports an illegal hardware instruction crash caused by deeply nested subexpressions (100+ deep) in ASMJS-generated code (libsodium). A maintainer confirmed it's a stack overflow due to recursion in expression parsing/visiting — exactly the passes this PR adds stack guards to.

If this is helpful, copy the block below into the PR description to auto-close this issue on merge.

Fixes #7717

🤖 Generated with Claude Code

…tly falling back Per review: when expr_can_be_removed_if_unused, simplify_unused_expr, or SideEffects::to_boolean hit the stack limit they now log the same "Maximum call stack size exceeded" error (once per parse, via the new P::report_stack_overflow helper) so the parse fails, instead of silently returning a conservative answer.

…primitive and is_primitive_with_side_effects - push_scope_for_visit_pass: when the visit pass bailed out on a stack overflow, skip ahead to the entry the push is asking for instead of panicking with "Scope mismatch while visiting" on the already-failing parse; visit_expr_in_out also stops visiting once the error is reported. - ExprData::known_primitive recurses through ??/&&/||/+ operands and ternaries that are built iteratively, so it gets its own stack check and answers Unknown near the limit (bun_ast has no logger to report through). - SideEffects::is_primitive_with_side_effects now takes the parser and reports the stack overflow like the other analysis helpers. - Tests: add the scope-desync and iterative-chain shapes to the crash matrix, and bound the exit code in the bun -e test so a crash on Windows (no signal codes) can't pass.

…urate error locations, accept the printer error in the test - known_primitive initializes one StackCheck and threads it through the recursion (and merge_known_primitive) instead of re-initializing per level. - is_primitive_with_side_effects takes the expression's Loc so the reported error points at the operand instead of the lexer's EOF position; to_boolean reports with Loc::EMPTY since ExprData carries no location. - The crash-matrix test also accepts the printer guard's error string so it doesn't depend on which pass hits its limit first on a given build.

…it bail-out - Expr::join_with_left_associative_op re-associates one frame per level of same-op nesting; thread a StackCheck through the recursion and fall back to the unflattened shape when the stack is nearly exhausted. - Once the visit pass has bailed out, identifier refs in unvisited subtrees are still parse-time slices, so expr_can_be_removed_if_unused and simplify_unused_expr now skip their analysis entirely instead of reading them. - Add both reproduction shapes to the crash-matrix test.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/ast/expr.rs`:
- Around line 800-809: The block that matches Data::EBinary and mutates
comma.right must return the modified expression immediately; after updating
comma.right via Self::join_with_left_associative_op_with_check(op, comma.right,
b, stack_check) you should return the updated `a` so the function does not fall
through and create a new binary `a op b`. Locate the match on `Data::EBinary(mut
comma)` and, when `comma.op == crate::OpCode::BinComma` and you assign to
`comma.right`, add a `return a;` (or otherwise return the mutated `a`) so the
AST remains `(a, b op c)` instead of constructing ` (a, b op c) op c`.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f351de95-f5c5-40f9-84f0-17d83c2c9551

📥 Commits

Reviewing files that changed from the base of the PR and between d569ad0 and 73a07bd.

📒 Files selected for processing (4)

src/ast/expr.rs
src/js_parser/p.rs
src/js_parser/scan/scan_side_effects.rs
test/bundler/transpiler/transpiler.test.js

…single-use substitution Per review: deeply nested destructuring patterns recurse through parse_binding/visit_binding/print_binding without touching the guarded expression paths, and --minify's substitute_single_use_symbol_in_expr walks iteratively-built binary chains. Give all four the same stack check as their expression-side counterparts, and add the destructuring and minify substitution shapes to the crash-matrix test.

With eleven shapes at three depths the subprocess takes just over the 5s default under the ASAN debug build.

The comma branch updates the comma's right side in place but then fell through to build another "a op b" node on top, duplicating b's side effects ("(f(), g()) ? 1 : h();" simplified to code that can call h() twice). Return the mutated expression instead, matching the documented "(a, b) op c => a, b op c" transform.

Jarred-Sumner

Remove all the comments you added.

robobun · 2026-05-23T08:28:53Z

Done — removed in 3a14d17.

substitute_single_use_symbol_in_expr gets the same reported_stack_overflow check as the other analysis helpers so it never reads identifier refs the visit pass didn't resolve.

robobun · 2026-05-23T09:38:20Z

CI note: the only failed job on the current build (57283, still running) is Ubuntu 25.04 x64 test-bun, where test/js/bun/util/v8-heap-snapshot.test.ts was killed by SIGKILL (out-of-memory on the runner) — unrelated to this change; every parser/transpiler lane that has finished is green. A single job retry on that lane should clear it.

…p-expr-stack-overflow

robobun · 2026-05-23T11:20:44Z

Merged current main into the branch (5f169d0) so the binary-size comparison is against the same base as the canary — the "+11 MB on Linux" in the last build was the branch's older base missing main's compressed-libicudata change (#31200), not this diff (all non-Linux targets were slightly smaller). Guards and tests re-validated on the merged base; transpiler suite green.

…er (#31333) ### Problem Fuzzing found a second transpiler stack overflow (`sig:SIGSEGV:nostack`): ~600 nested `{` blocks crash the process. ```js new Bun.Transpiler({ loader: "tsx", target: "bun", minifyWhitespace: true, deadCodeElimination: true }) .transformSync("{".repeat(600) + 'class Test1 { static "prop1" = 0; }' + "}".repeat(600)); ``` #31242 guarded the **expression** recursion (`visit_expr_in_out`, `print_expr`, DCE helpers), but the **statement** recursion was left unguarded. Nested blocks stay under `MAX_STMT_DEPTH` (1000) in `parse_stmt`, then the visit pass recurses through `visit_stmts → visit_and_append_stmt → s_block → visit_stmts` with no stack check — each level stacks several multi-KB frames, so a few hundred levels exhaust the thread's stack (reproduces at depth 800 on a debug build's 8 MB main stack; smaller stacks crash at 600): ``` #5 visit_stmts src/js_parser/visit/mod.rs:1280 #6 s_block src/js_parser/visit/visit_stmt.rs:1627 #7 visit_and_append_stmt src/js_parser/visit/visit_stmt.rs:108 #8 visit_stmts src/js_parser/visit/mod.rs:1336 ... (repeats until SIGSEGV) ``` ### Fix Guard the statement recursion the same way the expression recursion already is: - `visit_and_append_stmt` now checks `stack_check.is_safe_to_recurse()` (plus the `reported_stack_overflow` fast-path) and reports "Maximum call stack size exceeded" instead of descending, mirroring `visit_expr_in_out`. - `print_stmt` and `print_if` (which self-recurses for `else if` chains without passing through `print_stmt`) get the same guard `print_expr`/`print_binding` already have, so a deep AST printed on a thread with less stack headroom errors instead of overflowing. - Removed the `MAX_STMT_DEPTH`/`parse_stmt_depth` hard cap from `parse_stmt` (review feedback): recursion depth in every phase is now governed by `StackCheck` alone, matching the Zig parser. - Guarded `hoist_symbols` the same way: it walks the scope tree before the visit pass at the full depth the parser allowed, and was only kept safe previously by the now-removed cap (the 15k-deep `lots-of-for-loop.js` fixture overflowed it in release builds otherwise). With this, every arbitrarily-nestable AST recursion (statements, expressions, bindings) is stack-checked in all three phases (parse, visit, print); deep inputs throw a catchable `Maximum call stack size exceeded` error. ### Verification New test `deeply nested statement blocks error instead of crashing the process` in `test/bundler/transpiler/transpiler.test.js` transpiles nested-block and `else if`-chain shapes at depths 600/800/990 (below the parse-time cap, deep enough to overflow an unguarded visitor) in a subprocess and asserts it exits cleanly. - Without the fix: the subprocess dies with SIGSEGV at depth 800+ (debug build), so the test fails. - With the fix: `bun bd test test/bundler/transpiler/transpiler.test.js` → 147 pass, 0 fail; the repro above now throws `Maximum call stack size exceeded`.

github-actions Bot added the claude label May 23, 2026

Jarred-Sumner reviewed May 23, 2026

View reviewed changes

Comment thread src/js_parser/scan/scan_side_effects.rs

Jarred-Sumner reviewed May 23, 2026

View reviewed changes

Comment thread src/js_parser/scan/scan_side_effects.rs