Error instead of crashing on deeply nested expressions in the transpiler#31242

Merged
Jarred-Sumner merged 12 commits into main from farm/499eacc8/fix-deep-expr-stack-overflow
May 24, 2026

Conversation

@robobun

@robobun robobun commented May 23, 2026

Collaborator

Problem

Parser fuzzing found that long chains of prefix unary operators kill the process with a hard SIGSEGV instead of a JavaScript error:

new Bun.Transpiler({ loader: "js" }).transformSync("- ".repeat(20000) + "1");

40 KB of - is a trivial DoS payload against anything that transpiles untrusted code (bun run, bun build, Bun.Transpiler). Other deep shapes hit the same thing: f(f(f(…))) nested ~6–12K deep and [[[…]]]/ternary chains at lower depths on smaller thread stacks. In contrast, unclosed [/{/( nesting already fails cleanly with Maximum call stack size exceeded in a few milliseconds.

Cause

parse_expr_common bounds the parser's own recursion with StackCheck::is_safe_to_recurse(), but that check is calibrated for the parser's stack frames. The visit pass (visit_expr_in_out → e_unary/e_call/…) and the printer (print_expr) use noticeably more stack per AST level and had no check at all, so an AST that parses under the limit still blows the stack one pass later. (This is the expression-side twin of the statement-side issue documented on P::parse_stmt_depth.)

The ">10 s parse" reading from the fuzzer is the crash path, not the parser: below the overflow threshold parse time is linear (20K chained minuses ≈ 7 ms in a release build).
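The frame-size mismatch described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: `StackBudget` is a hypothetical stand-in that measures stack use by comparing local-variable addresses (it is not Bun's `StackCheck`), and the two functions only mimic a pass with small frames and a pass with large frames. Under the same byte budget, the large-frame pass can descend far fewer levels, which is why a depth the parser accepts can still be too deep for the passes that follow.

```rust
use std::hint::black_box;

/// Hypothetical stand-in for a stack guard (not Bun's StackCheck): it records a
/// stack address at creation and later measures how far the stack has moved.
pub struct StackBudget {
    start: usize,
    budget_bytes: usize,
}

impl StackBudget {
    #[inline(never)]
    pub fn new(budget_bytes: usize) -> Self {
        let marker = 0u8;
        StackBudget { start: &marker as *const u8 as usize, budget_bytes }
    }

    #[inline(never)]
    pub fn is_safe_to_recurse(&self) -> bool {
        let marker = 0u8;
        // abs_diff avoids assuming which direction the stack grows.
        (&marker as *const u8 as usize).abs_diff(self.start) < self.budget_bytes
    }
}

/// Recurses with a small frame until the guard says stop; returns the depth reached.
pub fn depth_with_small_frames(guard: &StackBudget, depth: usize) -> usize {
    if !guard.is_safe_to_recurse() {
        return depth;
    }
    // black_box keeps the call out of tail position so each level uses a real frame.
    black_box(depth_with_small_frames(guard, depth + 1))
}

/// Same recursion, but every level also keeps 1 KiB of locals alive.
pub fn depth_with_large_frames(guard: &StackBudget, depth: usize) -> usize {
    let scratch = [0u8; 1024];
    black_box(&scratch);
    if !guard.is_safe_to_recurse() {
        return depth;
    }
    let reached = depth_with_large_frames(guard, depth + 1);
    black_box(&scratch);
    reached
}
```

The exact depths depend on the build and platform, so the sketch only claims the ordering: the small-frame recursion reaches a greater depth than the large-frame one under the same budget.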

Fix

Same dynamic guard, applied to the passes that were missing it:

  • visit_expr_in_out now checks is_safe_to_recurse() and, when the stack is nearly exhausted, logs the existing Maximum call stack size exceeded error (once) and stops descending; _parse already halts on logged errors right after the visit pass.
  • print_expr does the same check and bails out; the print entry points (print_ast, print_json, print_with_writer_and_platform, print_common_js) turn that into an error instead of returning truncated output.
  • The recursive side-effect analysis helpers that can walk the full-depth AST (expr_can_be_removed_if_unused, simplify_unused_expr, SideEffects::to_boolean, SideEffects::is_primitive_with_side_effects) report the same stack-overflow error (via P::report_stack_overflow, once per parse) and stop recursing, so the transform fails with Maximum call stack size exceeded instead of degrading silently. ExprData::known_primitive (in bun_ast, no logger available) gets its own stack check and answers Unknown, i.e. skips the optimization.
  • Once the error is reported the visit pass skips the remaining expressions, push_scope_for_visit_pass re-syncs the parse-recorded scope order past subtrees the bail skipped (so the "Scope mismatch while visiting" sanity panic can't fire), and the DCE analyses skip entirely so they never read identifier refs the visit pass didn't resolve.
  • Expr::join_with_left_associative_op (used when simplifying unused ternaries) threads a stack check through its re-association and falls back to the unflattened node near the limit.
  • The binding path gets the same treatment as the expression path (parse_binding, visit_binding, print_binding for nested destructuring patterns), and --minify's substitute_single_use_symbol_in_expr reports the overflow instead of walking iteratively-built chains off the stack.
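As a rough illustration of the first bullet (check, report once, stop descending, let the caller halt on the logged error), here is a minimal sketch over a toy AST. It is not the code from this PR: `StackBudget`, `Node`, `Visitor` and `fold_neg_chain` are hypothetical names, the guard is an address-based approximation, and the 64 KiB budget is arbitrary. Only the flag name `reported_stack_overflow` and the error string are taken from the description above.

```rust
/// Hypothetical stand-in for a stack guard (not Bun's StackCheck).
pub struct StackBudget {
    start: usize,
    budget_bytes: usize,
}

impl StackBudget {
    #[inline(never)]
    pub fn new(budget_bytes: usize) -> Self {
        let marker = 0u8;
        StackBudget { start: &marker as *const u8 as usize, budget_bytes }
    }

    #[inline(never)]
    pub fn is_safe_to_recurse(&self) -> bool {
        let marker = 0u8;
        (&marker as *const u8 as usize).abs_diff(self.start) < self.budget_bytes
    }
}

/// Toy AST stored in an arena: children are indices, so dropping a deep tree never recurses.
#[derive(Clone, Copy)]
pub enum Node {
    Num(i64),
    Neg(usize),
}

pub struct Visitor<'a> {
    nodes: &'a [Node],
    guard: StackBudget,
    reported_stack_overflow: bool,
    errors: Vec<String>,
}

impl<'a> Visitor<'a> {
    /// Logs the error at most once per run.
    fn report_stack_overflow(&mut self) {
        if !self.reported_stack_overflow {
            self.reported_stack_overflow = true;
            self.errors.push("Maximum call stack size exceeded".to_string());
        }
    }

    /// Folds a chain of negations, refusing to descend once the guard trips.
    fn visit(&mut self, id: usize) -> i64 {
        if self.reported_stack_overflow {
            return 0; // already failing: skip the remaining work
        }
        if !self.guard.is_safe_to_recurse() {
            self.report_stack_overflow();
            return 0;
        }
        let node = self.nodes[id];
        match node {
            Node::Num(n) => n,
            Node::Neg(child) => -self.visit(child),
        }
    }
}

/// Builds `depth` nested negations around `1` and visits them with a 64 KiB budget.
pub fn fold_neg_chain(depth: usize) -> Result<i64, String> {
    let mut nodes = vec![Node::Num(1)];
    for i in 0..depth {
        nodes.push(Node::Neg(i)); // node i + 1 negates node i
    }
    let mut visitor = Visitor {
        nodes: &nodes,
        guard: StackBudget::new(64 * 1024),
        reported_stack_overflow: false,
        errors: Vec::new(),
    };
    let value = visitor.visit(depth);
    // The caller halts on a logged error instead of using a partial result.
    match visitor.errors.pop() {
        Some(message) => Err(message),
        None => Ok(value),
    }
}
```

A shallow chain folds normally, while a chain of a few hundred thousand negations returns the error instead of exhausting the native stack.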

No hard depth constant: per-level frame sizes differ ~4× between release and sanitizer builds and thread stacks range from 4 MB workers to the main thread, so a fixed cap would either reject inputs that work today or still crash under ASAN. The dynamic check keeps today's capacity and turns every overflow into a clean error.
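The trade-off can be made concrete with some back-of-the-envelope arithmetic. The frame sizes below are invented for illustration; only the roughly 4× release-to-sanitizer ratio and the 4 MB and 8 MB stack sizes come from this description, and `max_depth` ignores everything else that lives on the stack.

```rust
pub const MIB: usize = 1024 * 1024;

// Hypothetical per-level frame sizes; only the ~4x ratio comes from the PR text.
pub const RELEASE_FRAME_BYTES: usize = 512;
pub const SANITIZER_FRAME_BYTES: usize = 4 * RELEASE_FRAME_BYTES;

/// Deepest nesting a pass could handle if nothing else used the stack.
pub fn max_depth(stack_bytes: usize, frame_bytes: usize) -> usize {
    stack_bytes / frame_bytes
}

#[derive(Debug, PartialEq)]
pub enum CapOutcome {
    /// The cap allows depths that do not fit on this stack.
    StillOverflows,
    /// The cap rejects depths that would have fit on this stack.
    RejectsValidInput,
    /// The cap happens to match this stack exactly.
    Exact,
}

/// Classifies what a fixed depth cap would do for one stack size and frame size.
pub fn fixed_cap_outcome(cap: usize, stack_bytes: usize, frame_bytes: usize) -> CapOutcome {
    let capacity = max_depth(stack_bytes, frame_bytes);
    if cap > capacity {
        CapOutcome::StillOverflows
    } else if cap < capacity {
        CapOutcome::RejectsValidInput
    } else {
        CapOutcome::Exact
    }
}
```

With these made-up numbers, a cap of 8192 levels exactly fits a 4 MiB stack in a release build, yet it is four times too generous for the same stack under a sanitizer build and only half of what an 8 MiB stack could handle in release. No single constant fits all three configurations, which is the argument for a dynamic check.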

Verification

  • Before: transformSync("- ".repeat(20000) + "1") → SIGSEGV (release and ASAN debug builds); nested-call and nested-array inputs crash at various depths, including on 4 MB worker stacks.
  • After: every shape × depth × stack-size combination tested (unary/!/void chains, nested calls, nested arrays, ternary chains at 1.5K–100K depth, 4 MB and 8 MB stacks, ASAN debug build) either transpiles or throws Maximum call stack size exceeded / a print error — no signals, and errors return in milliseconds.
  • New tests in test/bundler/transpiler/transpiler.test.js next to the existing stack-overflow tests spawn a subprocess and assert it exits cleanly (no SIGSEGV) across shapes and depths; they fail on the previous build (child dies with SIGSEGV) and pass with this change.
  • test/bundler/transpiler/, bundler_minify.test.ts, and bundler_edgecase.test.ts pass with the debug build.

Related

#7717 reports illegal hardware instruction while bundling vite-prebundled swapkit/libsodium code, and was diagnosed by a maintainer as a stack overflow from expression-parse/visit recursion on parser threads — the same failure class this PR guards against. The original reproduction can't be re-run today (the @swapkit/* RC packages it pins were unpublished), so this PR doesn't claim to close it: with this change that kind of overflow surfaces as a Maximum call stack size exceeded build error instead of killing the process.

The expression parser bounds its own recursion with a stack check, but its
frames are smaller than the visit pass's and the printer's, so ASTs like
20,000 chained unary minuses or thousands of nested call arguments parsed
fine and then overflowed the native stack (SIGSEGV) in the passes that
follow.

Add the same is_safe_to_recurse() guard to visit_expr_in_out (reporting the
existing "Maximum call stack size exceeded" error) and to print_expr (the
print entry points now return an error instead of truncated output), and
give the recursive side-effect analysis helpers conservative early-outs.
@coderabbitai

coderabbitai Bot commented May 23, 2026

Copy link
Copy Markdown
Contributor


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c3358545-9bc5-4e2d-94a5-8c75c5e86aa9

📥 Commits

Reviewing files that changed from the base of the PR and between 73a07bd and 5f169d0.

📒 Files selected for processing (8)
  • src/ast/expr.rs
  • src/js_parser/p.rs
  • src/js_parser/parse/mod.rs
  • src/js_parser/scan/scan_side_effects.rs
  • src/js_parser/visit/mod.rs
  • src/js_parser/visit/visit_expr.rs
  • src/js_printer/lib.rs
  • test/bundler/transpiler/transpiler.test.js

Walkthrough

This PR adds comprehensive stack-overflow protection throughout Bun's JavaScript compiler pipeline. It introduces recursion-safe checks in AST primitive inference, the parser visitor, side-effect analysis, and the JS printer, ensuring deeply nested expressions fail with clean diagnostics instead of crashes.

Changes

Stack-overflow protection across compiler pipeline

| Layer / File(s) | Summary |
| --- | --- |
| Parser stack-overflow reporting infrastructure (src/js_parser/p.rs) | New reported_stack_overflow field and report_stack_overflow(loc) method ensure one-time logging of "Maximum call stack size exceeded" errors during parsing; P::init initializes the flag. |
| Parser visitor and DCE recursion guards (src/js_parser/visit/..., src/js_parser/p.rs) | Visit pass adds an early-return recursion guard; push_scope_for_visit_pass skips unconsumed scope-order entries after bail-outs; DCE helper expr_can_be_removed_if_unused_without_dce_check guards recursion and reports overflow. |
| AST expression primitive inference with stack guards (src/ast/expr.rs) | Introduces stack-checked helpers (known_primitive_with_check, merge_known_primitive_with_check) and uses them across re-association, conditionals, binaries (` |
| Side-effects analysis with stack-overflow guards (src/js_parser/scan/scan_side_effects.rs) | simplify_unused_expr, is_primitive_with_side_effects, and to_boolean now accept P and Loc where needed, check stack_check before recursing, report overflow when unsafe, and avoid unsafe simplifications. |
| JS printer stack-overflow detection and error handling (src/js_printer/lib.rs) | Printer now tracks bun_core::StackCheck and stack_overflowed; print_expr guards recursion and sets the flag when unsafe; entrypoints (print_ast, print_json, print_with_writer_and_platform, print_common_js) convert the flag into a StackOverflow error. |
| Regression tests for deep expression handling (test/bundler/transpiler/transpiler.test.js) | Two tests spawn Bun on deeply nested expressions and long unary chains to assert processes exit without signal crashes and allow expected stack-overflow diagnostics or clean exit codes. |
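The printer row describes a flag-then-convert pattern: the recursive printer only sets a flag and unwinds, and the entry point turns that flag into an error so callers never receive truncated output. A minimal sketch of that shape follows, with hypothetical types throughout (`StackBudget`, `Node`, `Printer`, `PrintError`, `neg_chain`); only the names `print_expr`, `print_ast`, `stack_overflowed` and `StackOverflow` echo the summary, and the bodies are assumptions.

```rust
/// Hypothetical stand-in for a stack guard (not bun_core::StackCheck).
pub struct StackBudget {
    start: usize,
    budget_bytes: usize,
}

impl StackBudget {
    #[inline(never)]
    pub fn new(budget_bytes: usize) -> Self {
        let marker = 0u8;
        StackBudget { start: &marker as *const u8 as usize, budget_bytes }
    }

    #[inline(never)]
    pub fn is_safe_to_recurse(&self) -> bool {
        let marker = 0u8;
        (&marker as *const u8 as usize).abs_diff(self.start) < self.budget_bytes
    }
}

/// Toy arena AST: `Neg` holds the index of its operand.
#[derive(Clone, Copy)]
pub enum Node {
    Num(i64),
    Neg(usize),
}

#[derive(Debug, PartialEq)]
pub enum PrintError {
    StackOverflow,
}

struct Printer<'a> {
    nodes: &'a [Node],
    guard: StackBudget,
    out: String,
    stack_overflowed: bool,
}

impl<'a> Printer<'a> {
    fn print_expr(&mut self, id: usize) {
        if self.stack_overflowed {
            return;
        }
        if !self.guard.is_safe_to_recurse() {
            self.stack_overflowed = true; // remember the bail-out; the output is now incomplete
            return;
        }
        let node = self.nodes[id];
        match node {
            Node::Num(n) => self.out.push_str(&n.to_string()),
            Node::Neg(child) => {
                self.out.push_str("(-");
                self.print_expr(child);
                self.out.push(')');
            }
        }
    }
}

/// Entry point: converts the flag into an error instead of returning truncated text.
pub fn print_ast(nodes: &[Node], root: usize) -> Result<String, PrintError> {
    let mut printer = Printer {
        nodes,
        guard: StackBudget::new(64 * 1024),
        out: String::new(),
        stack_overflowed: false,
    };
    printer.print_expr(root);
    if printer.stack_overflowed {
        Err(PrintError::StackOverflow)
    } else {
        Ok(printer.out)
    }
}

/// Builds `depth` nested negations around `1`; the root is the last node.
pub fn neg_chain(depth: usize) -> Vec<Node> {
    let mut nodes = vec![Node::Num(1)];
    for i in 0..depth {
        nodes.push(Node::Neg(i));
    }
    nodes
}
```

Keeping the recursive function infallible and checking the flag once at the entry point avoids threading a `Result` through every printing helper, at the cost of having to remember the check in each entry point.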

Suggested reviewers

  • Jarred-Sumner
  • dylan-conway
  • RiskyMH
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding error handling instead of crashes for deeply nested expressions in the transpiler. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The pull request description thoroughly addresses both required sections: it clearly explains the problem (deeply nested expressions causing SIGSEGV), the solution (dynamic stack checks via StackCheck::is_safe_to_recurse()), and detailed verification across multiple test scenarios and build configurations. |


@robobun

robobun commented May 23, 2026

Collaborator Author
Updated 11:00 AM PT - May 23rd, 2026

@robobun, your commit 5f169d0351892104accdb2b8f045ccc011b750df passed in Build #57313! 🎉


🧪   To try this PR locally:

bunx bun-pr 31242

That installs a local version of the PR into your bun-31242 executable, so you can run:

bun-31242 --bun

@github-actions

Contributor

Found 1 issue this PR may fix:

  1. illegal hardware instruction on M1 max cpu #7717 - Reports an illegal hardware instruction crash caused by deeply nested subexpressions (100+ deep) in ASMJS-generated code (libsodium). A maintainer confirmed it's a stack overflow due to recursion in expression parsing/visiting — exactly the passes this PR adds stack guards to.

If this is helpful, copy the block below into the PR description to auto-close this issue on merge.

Fixes #7717

🤖 Generated with Claude Code

Comment thread src/js_parser/scan/scan_side_effects.rs
Comment thread src/js_parser/scan/scan_side_effects.rs
Comment thread src/js_parser/visit/visit_expr.rs Outdated
Comment thread src/js_parser/p.rs Outdated
Comment thread src/js_parser/scan/scan_side_effects.rs Outdated
Comment thread test/bundler/transpiler/transpiler.test.js
…tly falling back

Per review: when expr_can_be_removed_if_unused, simplify_unused_expr, or
SideEffects::to_boolean hit the stack limit they now log the same
"Maximum call stack size exceeded" error (once per parse, via the new
P::report_stack_overflow helper) so the parse fails, instead of silently
returning a conservative answer.
Comment thread test/bundler/transpiler/transpiler.test.js Outdated
Comment thread src/js_parser/scan/scan_side_effects.rs
…primitive and is_primitive_with_side_effects

- push_scope_for_visit_pass: when the visit pass bailed out on a stack
  overflow, skip ahead to the entry the push is asking for instead of
  panicking with "Scope mismatch while visiting" on the already-failing
  parse; visit_expr_in_out also stops visiting once the error is reported.
- ExprData::known_primitive recurses through ??/&&/||/+ operands and
  ternaries that are built iteratively, so it gets its own stack check and
  answers Unknown near the limit (bun_ast has no logger to report through).
- SideEffects::is_primitive_with_side_effects now takes the parser and
  reports the stack overflow like the other analysis helpers.
- Tests: add the scope-desync and iterative-chain shapes to the crash
  matrix, and bound the exit code in the bun -e test so a crash on Windows
  (no signal codes) can't pass.
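The scope re-sync in the first bullet can be pictured as a cursor over the scope order recorded at parse time. The sketch below is an assumption about the general shape, not the PR's code: `ScopeCursor`, `ScopeKind` and the `(kind, loc)` tuple layout are hypothetical, and only the method name, the flag name and the panic message come from the text above.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ScopeKind {
    Block,
    Function,
}

/// Hypothetical cursor over the scopes recorded, in order, while parsing.
pub struct ScopeCursor {
    order: Vec<(ScopeKind, u32)>,
    next: usize,
    pub reported_stack_overflow: bool,
}

impl ScopeCursor {
    pub fn new(order: Vec<(ScopeKind, u32)>) -> Self {
        ScopeCursor { order, next: 0, reported_stack_overflow: false }
    }

    /// Consumes the next recorded scope and returns its index in the recorded order.
    pub fn push_scope_for_visit_pass(&mut self, kind: ScopeKind, loc: u32) -> usize {
        if self.reported_stack_overflow {
            // The visit pass bailed out, so entries recorded for subtrees it never
            // entered are still pending: skip ahead to the scope being requested.
            while self.next < self.order.len() && self.order[self.next] != (kind, loc) {
                self.next += 1;
            }
        }
        let entry = self.order.get(self.next).copied();
        // Without a bail-out, any mismatch is still treated as a parser bug.
        assert_eq!(entry, Some((kind, loc)), "Scope mismatch while visiting");
        let index = self.next;
        self.next += 1;
        index
    }
}
```

For example, if the recorded order is block, function, block and the visit pass bails out before entering the function, the next push for the second block skips the function's entry instead of tripping the sanity check.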
Comment thread src/js_parser/scan/scan_side_effects.rs
Comment thread src/ast/expr.rs Outdated
robobun and others added 2 commits May 23, 2026 01:51
…urate error locations, accept the printer error in the test

- known_primitive initializes one StackCheck and threads it through the
  recursion (and merge_known_primitive) instead of re-initializing per level.
- is_primitive_with_side_effects takes the expression's Loc so the reported
  error points at the operand instead of the lexer's EOF position; to_boolean
  reports with Loc::EMPTY since ExprData carries no location.
- The crash-matrix test also accepts the printer guard's error string so it
  doesn't depend on which pass hits its limit first on a given build.
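A sketch of the "initialize once, thread through" shape from the first bullet, combined with the conservative `Unknown` answer near the limit. This is a toy model under stated assumptions: `StackBudget` is a hypothetical address-based guard, the `Node` and `Primitive` enums are invented, and the real helpers' signatures are not shown in this thread. Only the function names echo the commit message.

```rust
/// Hypothetical stand-in for a stack guard (not Bun's StackCheck).
pub struct StackBudget {
    start: usize,
    budget_bytes: usize,
}

impl StackBudget {
    #[inline(never)]
    pub fn new(budget_bytes: usize) -> Self {
        let marker = 0u8;
        StackBudget { start: &marker as *const u8 as usize, budget_bytes }
    }

    #[inline(never)]
    pub fn is_safe_to_recurse(&self) -> bool {
        let marker = 0u8;
        (&marker as *const u8 as usize).abs_diff(self.start) < self.budget_bytes
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Primitive {
    Number,
    String,
    Unknown,
}

/// Toy arena AST: a ternary only records the indices of its two branches.
#[derive(Clone, Copy)]
pub enum Node {
    Number,
    Text,
    Ident,
    Cond { yes: usize, no: usize },
}

/// Two branches agree only if both yield the same known primitive.
fn merge_known_primitive(a: Primitive, b: Primitive) -> Primitive {
    if a == b { a } else { Primitive::Unknown }
}

fn known_primitive_with_check(nodes: &[Node], id: usize, guard: &StackBudget) -> Primitive {
    if !guard.is_safe_to_recurse() {
        return Primitive::Unknown; // conservative answer: skip the optimization
    }
    match nodes[id] {
        Node::Number => Primitive::Number,
        Node::Text => Primitive::String,
        Node::Ident => Primitive::Unknown,
        Node::Cond { yes, no } => merge_known_primitive(
            known_primitive_with_check(nodes, yes, guard),
            known_primitive_with_check(nodes, no, guard),
        ),
    }
}

/// Public entry point: creates the guard once and threads it through the recursion.
pub fn known_primitive(nodes: &[Node], id: usize) -> Primitive {
    let guard = StackBudget::new(64 * 1024);
    known_primitive_with_check(nodes, id, &guard)
}

/// Builds `depth` nested ternaries whose branches are all numbers; the root is the last node.
pub fn cond_chain(depth: usize) -> Vec<Node> {
    let mut nodes = vec![Node::Number];
    for i in 0..depth {
        nodes.push(Node::Cond { yes: 0, no: i });
    }
    nodes
}
```

Passing one guard by reference means every level compares against the same starting point; re-creating the guard per level would reset the measurement each time and never trip.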
Comment thread src/js_parser/visit/visit_expr.rs
Comment thread src/js_parser/scan/scan_side_effects.rs Outdated
Comment thread src/js_parser/p.rs Outdated
…it bail-out

- Expr::join_with_left_associative_op re-associates one frame per level of
  same-op nesting; thread a StackCheck through the recursion and fall back
  to the unflattened shape when the stack is nearly exhausted.
- Once the visit pass has bailed out, identifier refs in unvisited subtrees
  are still parse-time slices, so expr_can_be_removed_if_unused and
  simplify_unused_expr now skip their analysis entirely instead of reading
  them.
- Add both reproduction shapes to the crash-matrix test.
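The fallback in the first bullet can be sketched on a toy arena AST. The re-association rule used here, "a op (b op c)" becoming "(a op b) op c", is an assumption about what the helper flattens, and `StackBudget`, `Arena`, `join_with_check` and `leaves` are hypothetical names. The point of the sketch is that the unflattened node is still a correct expression with the same operand order, so falling back loses only the normalization.

```rust
/// Hypothetical stand-in for a stack guard (not Bun's StackCheck).
pub struct StackBudget {
    start: usize,
    budget_bytes: usize,
}

impl StackBudget {
    #[inline(never)]
    pub fn new(budget_bytes: usize) -> Self {
        let marker = 0u8;
        StackBudget { start: &marker as *const u8 as usize, budget_bytes }
    }

    #[inline(never)]
    pub fn is_safe_to_recurse(&self) -> bool {
        let marker = 0u8;
        (&marker as *const u8 as usize).abs_diff(self.start) < self.budget_bytes
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Op {
    And,
    Or,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Node {
    Ident(u32),
    Binary { op: Op, left: usize, right: usize },
}

pub struct Arena {
    pub nodes: Vec<Node>,
}

impl Arena {
    pub fn add(&mut self, node: Node) -> usize {
        self.nodes.push(node);
        self.nodes.len() - 1
    }
}

/// Joins `a op b`, flattening same-op nesting in `b` while the stack allows it.
pub fn join_with_check(arena: &mut Arena, op: Op, a: usize, b: usize, guard: &StackBudget) -> usize {
    if guard.is_safe_to_recurse() {
        let b_node = arena.nodes[b];
        if let Node::Binary { op: b_op, left, right } = b_node {
            if b_op == op {
                // "a op (l op r)" => "(a op l) op r"
                let joined_left = join_with_check(arena, op, a, left, guard);
                return join_with_check(arena, op, joined_left, right, guard);
            }
        }
    }
    // Base case, and also the fallback near the stack limit: keep "a op b" as is.
    arena.add(Node::Binary { op, left: a, right: b })
}

/// Lists the identifiers left to right without recursing (safe on deep trees).
pub fn leaves(arena: &Arena, root: usize) -> Vec<u32> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(id) = stack.pop() {
        match arena.nodes[id] {
            Node::Ident(n) => out.push(n),
            Node::Binary { left, right, .. } => {
                stack.push(right);
                stack.push(left);
            }
        }
    }
    out
}
```

Whether or not the fallback triggers, the left-to-right order of operands is unchanged, which is what matters for evaluation order and side effects.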

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/ast/expr.rs`:
- Around line 800-809: The block that matches Data::EBinary and mutates
comma.right must return the modified expression immediately; after updating
comma.right via Self::join_with_left_associative_op_with_check(op, comma.right,
b, stack_check) you should return the updated `a` so the function does not fall
through and create a new binary `a op b`. Locate the match on `Data::EBinary(mut
comma)` and, when `comma.op == crate::OpCode::BinComma` and you assign to
`comma.right`, add a `return a;` (or otherwise return the mutated `a`) so the
AST remains `(a, b op c)` instead of constructing ` (a, b op c) op c`.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f351de95-f5c5-40f9-84f0-17d83c2c9551

📥 Commits

Reviewing files that changed from the base of the PR and between d569ad0 and 73a07bd.

📒 Files selected for processing (4)
  • src/ast/expr.rs
  • src/js_parser/p.rs
  • src/js_parser/scan/scan_side_effects.rs
  • test/bundler/transpiler/transpiler.test.js

Comment thread src/ast/expr.rs
robobun added 2 commits May 23, 2026 05:57
…single-use substitution

Per review: deeply nested destructuring patterns recurse through
parse_binding/visit_binding/print_binding without touching the guarded
expression paths, and --minify's substitute_single_use_symbol_in_expr walks
iteratively-built binary chains. Give all four the same stack check as their
expression-side counterparts, and add the destructuring and minify
substitution shapes to the crash-matrix test.
With eleven shapes at three depths the subprocess takes just over the 5s
default under the ASAN debug build.
Comment thread src/ast/expr.rs
The comma branch updates the comma's right side in place but then fell
through to build another "a op b" node on top, duplicating b's side
effects ("(f(), g()) ? 1 : h();" simplified to code that can call h()
twice). Return the mutated expression instead, matching the documented
"(a, b) op c => a, b op c" transform.
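The fall-through described in this commit message can be reproduced on a toy arena AST. This is a simplified sketch, not the code in src/ast/expr.rs: the `Node` and `Op` types are invented, the stack check is omitted, and `return_early` is a hypothetical toggle that exists only so both behaviours can be compared side by side.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Op {
    Comma,
    Or,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Node {
    /// A call with a side effect, named by a single letter.
    Call(char),
    Binary { op: Op, left: usize, right: usize },
}

/// Joins `a op b`, applying "(a, b) op c => a, b op c" when `a` is a comma expression.
/// `return_early = false` reproduces the fall-through this commit fixes.
pub fn join_with_left_associative_op(
    nodes: &mut Vec<Node>,
    op: Op,
    a: usize,
    b: usize,
    return_early: bool,
) -> usize {
    let a_node = nodes[a];
    if let Node::Binary { op: Op::Comma, left, right } = a_node {
        // Push the operator into the comma's right-hand side, in place.
        let new_right = join_with_left_associative_op(nodes, op, right, b, return_early);
        nodes[a] = Node::Binary { op: Op::Comma, left, right: new_right };
        if return_early {
            return a; // the fix: the mutated comma expression is the whole result
        }
        // Bug: falling through wraps the result in a second "a op b", duplicating `b`.
    }
    nodes.push(Node::Binary { op, left: a, right: b });
    nodes.len() - 1
}

/// Lists the calls in evaluation order, left to right.
pub fn calls_in_order(nodes: &[Node], root: usize) -> String {
    let mut out = String::new();
    let mut stack = vec![root];
    while let Some(id) = stack.pop() {
        match nodes[id] {
            Node::Call(name) => out.push(name),
            Node::Binary { left, right, .. } => {
                stack.push(right);
                stack.push(left);
            }
        }
    }
    out
}
```

Joining the comma expression (f, g) with h under `||` yields the call sequence f, g, h with the early return, and f, g, h, h without it, which is the duplicated side effect the message describes.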

@Jarred-Sumner Jarred-Sumner left a comment

Collaborator

Remove all the comments you added.

@robobun

robobun commented May 23, 2026

Collaborator Author

Done — removed in 3a14d17.

Comment thread src/js_parser/p.rs Outdated
substitute_single_use_symbol_in_expr gets the same reported_stack_overflow
check as the other analysis helpers so it never reads identifier refs the
visit pass didn't resolve.
@robobun

robobun commented May 23, 2026

Collaborator Author

CI note: the only failed job on the current build (57283, still running) is Ubuntu 25.04 x64 test-bun, where test/js/bun/util/v8-heap-snapshot.test.ts was killed by SIGKILL (out-of-memory on the runner) — unrelated to this change; every parser/transpiler lane that has finished is green. A single job retry on that lane should clear it.

@robobun

robobun commented May 23, 2026

Collaborator Author

Merged current main into the branch (5f169d0) so the binary-size comparison is against the same base as the canary — the "+11 MB on Linux" in the last build was the branch's older base missing main's compressed-libicudata change (#31200), not this diff (all non-Linux targets were slightly smaller). Guards and tests re-validated on the merged base; transpiler suite green.

Comment thread test/bundler/transpiler/transpiler.test.js
@Jarred-Sumner Jarred-Sumner merged commit 41aa672 into main May 24, 2026
79 checks passed
@Jarred-Sumner Jarred-Sumner deleted the farm/499eacc8/fix-deep-expr-stack-overflow branch May 24, 2026 02:25
Jarred-Sumner pushed a commit that referenced this pull request May 24, 2026
…er (#31333)

### Problem

Fuzzing found a second transpiler stack overflow
(`sig:SIGSEGV:nostack`): ~600 nested `{` blocks crash the process.

```js
new Bun.Transpiler({ loader: "tsx", target: "bun", minifyWhitespace: true, deadCodeElimination: true })
  .transformSync("{".repeat(600) + 'class Test1 { static "prop1" = 0; }' + "}".repeat(600));
```

#31242 guarded the **expression** recursion (`visit_expr_in_out`,
`print_expr`, DCE helpers), but the **statement** recursion was left
unguarded. Nested blocks stay under `MAX_STMT_DEPTH` (1000) in
`parse_stmt`, then the visit pass recurses through `visit_stmts →
visit_and_append_stmt → s_block → visit_stmts` with no stack check —
each level stacks several multi-KB frames, so a few hundred levels
exhaust the thread's stack (reproduces at depth 800 on a debug build's 8
MB main stack; smaller stacks crash at 600):

```
#5  visit_stmts                 src/js_parser/visit/mod.rs:1280
#6  s_block                     src/js_parser/visit/visit_stmt.rs:1627
#7  visit_and_append_stmt       src/js_parser/visit/visit_stmt.rs:108
#8  visit_stmts                 src/js_parser/visit/mod.rs:1336
... (repeats until SIGSEGV)
```

### Fix

Guard the statement recursion the same way the expression recursion
already is:

- `visit_and_append_stmt` now checks `stack_check.is_safe_to_recurse()`
(plus the `reported_stack_overflow` fast-path) and reports "Maximum call
stack size exceeded" instead of descending, mirroring
`visit_expr_in_out`.
- `print_stmt` and `print_if` (which self-recurses for `else if` chains
without passing through `print_stmt`) get the same guard
`print_expr`/`print_binding` already have, so a deep AST printed on a
thread with less stack headroom errors instead of overflowing.
- Removed the `MAX_STMT_DEPTH`/`parse_stmt_depth` hard cap from
`parse_stmt` (review feedback): recursion depth in every phase is now
governed by `StackCheck` alone, matching the Zig parser.
- Guarded `hoist_symbols` the same way: it walks the scope tree before
the visit pass at the full depth the parser allowed, and was only kept
safe previously by the now-removed cap (the 15k-deep
`lots-of-for-loop.js` fixture overflowed it in release builds
otherwise).

With this, every arbitrarily-nestable AST recursion (statements,
expressions, bindings) is stack-checked in all three phases (parse,
visit, print); deep inputs throw a catchable `Maximum call stack size
exceeded` error.

### Verification

New test `deeply nested statement blocks error instead of crashing the
process` in `test/bundler/transpiler/transpiler.test.js` transpiles
nested-block and `else if`-chain shapes at depths 600/800/990 (below the
parse-time cap, deep enough to overflow an unguarded visitor) in a
subprocess and asserts it exits cleanly.

- Without the fix: the subprocess dies with SIGSEGV at depth 800+ (debug
build), so the test fails.
- With the fix: `bun bd test test/bundler/transpiler/transpiler.test.js`
→ 147 pass, 0 fail; the repro above now throws `Maximum call stack size
exceeded`.