Skip to content

Fix YARR JIT SEGV on regex with aliased parenContextHead slots#29550

Closed
robobun wants to merge 3 commits into
mainfrom
farm/dcb838f5/yarr-jit-paren-context-alias-segv
Closed

Fix YARR JIT SEGV on regex with aliased parenContextHead slots#29550
robobun wants to merge 3 commits into
mainfrom
farm/dcb838f5/yarr-jit-paren-context-alias-segv

Conversation

@robobun

@robobun robobun commented Apr 21, 2026

Copy link
Copy Markdown
Collaborator

Fixes #29547.

Real-world trigger: bunx --bun jscpd *.tsx (and anything else that runs Prism's JSX tag regex over TSX source) crashes with panic(main thread): Segmentation fault at address 0x0 on 1.3.13. One-line reproduction:

/(?:[abc]+(?:.|b)|(?:a)*)*>/.exec(' x="c" ');   // RIP=0 → SIGSEGV

Root cause

The WebKit bump in #29294 (c2010c44d5e75e) pulled in 37465a721d "[YARR] Fix JIT crash on non-greedy ParenthesesSubpattern backtracking", which added clearInnerParenContextHeadSlots as a UAF guard: after restoreParenContext it walks the disjunction and nulls every Greedy/NonGreedy subpattern's parenContextHead frame slot so a later BEGIN.bt sees null instead of a possibly-freed ParenContext pointer. (Upstream 2d16551f69 later widened it to clearParenContextHeadSlotsInRange walking the full pattern tree.)

Both versions ignore that sibling alternatives of the same disjunction share a base frame offset, so a Greedy subpattern's parenContextHead slot in one alternative can coincide with a Once subpattern's returnAddress slot in a sibling. In the minimal pattern:

outer  (?:...)*                frame 0 (slots 0–3)
  alt #0:
    [abc]+                     frame 5 (slots 5–6)
    (?:.|b)  Once              frame 7 (slots 7–8)
  alt #1:
    (?:a)*   Greedy            frame 5 (slots 5–8)

Slot 8 is (?:.|b)'s returnAddress in alt #0 and (?:a)*'s parenContextHead in alt #1. The unconditional null destroys alt #0's valid returnAddress; the next NestedAlternativeEnd backtrack does loadFromFrameAndJump on slot 8 and jumps to 0 → RIP=0 → segfault.

Fix

The actual code lives in WebKit: oven-sh/WebKit#189.

Before nulling a parenContextHead slot, check whether any sibling alternative in the parent disjunction has a term whose frame range covers that slot. If so, skip the null — the sibling alt may have written a valid non-pointer value there. The UAF guard stays active for unambiguous slots, which covers every test case added by 37465a721d and 2d16551f69 (all still pass).

This PR adds the regression test. The runtime fix ships to users via a WEBKIT_VERSION bump once the WebKit PR is merged and autobuilt.

Verification

  • Minimal repro exits cleanly with the fix, SIGSEGVs without.
  • bunx --bun jscpd ComponentOne.tsx ComponentTwo.tsx from the issue report completes normally.
  • All 5 JSTests/stress/yarr-jit-*.js tests (including yarr-jit-paren-context-head-uaf.js and yarr-jit-non-greedy-parentheses-backtrack.js) exit 0.

Gate note

The fix is entirely in vendor/WebKit/Source/JavaScriptCore/yarr/YarrJIT.cpp, not in src/ or packages/. The automatic test-proof gate git stash push -- src/ packages/ won't revert the WebKit change, so the stashed-build test will pass rather than fail. The before/after split is verifiable directly: run the test with /tmp/bun-1.3.13/bun (crashes) vs bun bd built against the WebKit PR (passes).

In 1.3.13, `bunx --bun jscpd` and similar tools that run Prism's JSX
tag regex over TSX source crash with

    panic(main thread): Segmentation fault at address 0x0

Minimal reproduction (one `exec` call, no loop):

    /(?:[abc]+(?:.|b)|(?:a)*)*>/.exec(' x="c" ')   // RIP=0 → SIGSEGV

## Root cause

The WebKit bump in #29294 (`c2010c4` → `4d5e75e`) pulled in
`37465a721d` "[YARR] Fix JIT crash on non-greedy ParenthesesSubpattern
backtracking", which added `clearInnerParenContextHeadSlots` as a UAF
guard: after `restoreParenContext` it walks the disjunction and nulls
every Greedy/NonGreedy subpattern's `parenContextHead` frame slot so a
later `BEGIN.bt` sees null instead of a possibly-freed ParenContext
pointer.

Upstream `2d16551f69` widened this to
`clearParenContextHeadSlotsInRange`, walking the whole pattern tree.

Both versions ignore that sibling alternatives of the same disjunction
share a base frame offset, so a Greedy subpattern's parenContextHead
slot in one alternative can coincide with a Once subpattern's
returnAddress slot in a sibling alternative. In the minimal pattern:

    outer  (?:...)*                frame 0 (slots 0-3)
      alt #0:
        [abc]+                     frame 5 (slots 5-6)
        (?:.|b)  Once              frame 7 (slots 7-8)
      alt #1:
        (?:a)*   Greedy            frame 5 (slots 5-8)

Slot 8 is `(?:.|b)`'s returnAddress in alt #0 and `(?:a)*`'s
parenContextHead in alt #1. The unconditional null destroys alt #0's
valid returnAddress; the next `NestedAlternativeEnd` backtrack does
`loadFromFrameAndJump` on slot 8 → RIP=0.

## Fix

Upstream WebKit PR: oven-sh/WebKit#189.

Before nulling a parenContextHead slot, check whether any sibling
alternative in the parent disjunction has a term whose frame range
covers that slot. If so, skip — the sibling alt may have written a
valid non-pointer value there. The UAF guard stays active for
unambiguous slots, which covers every test case added by `37465a721d`
and `2d16551f69` (all still pass).

## Verification

- Minimal repro exits cleanly.
- `bunx --bun jscpd ComponentOne.tsx ComponentTwo.tsx` from the issue
  completes normally.
- `JSTests/stress/yarr-jit-*.js` (5 files, including the UAF test):
  all exit 0.

## Gate note

The fix itself is in `vendor/WebKit/Source/JavaScriptCore/yarr/YarrJIT.cpp`
and ships to users via a WEBKIT_VERSION bump once the WebKit PR is
merged and autobuilt. The test in this PR crashes on 1.3.13 prebuilt
bun and passes once the new WebKit tarball is installed.

Closes #29547
@robobun

robobun commented Apr 21, 2026

Copy link
Copy Markdown
Collaborator Author
Updated 12:22 PM PT - Apr 21st, 2026

@robobun, your commit 244d62a has 3 failures in Build #46857 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 29550

That installs a local version of the PR into your bun-29550 executable, so you can run:

bun-29550 --bun

@coderabbitai

coderabbitai Bot commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: db09d94b-43ca-4d74-92b3-66b61cc7c66d

📥 Commits

Reviewing files that changed from the base of the PR and between 1d819c8 and 244d62a.

📒 Files selected for processing (1)
  • test/regression/issue/29547.test.ts

Walkthrough

Adds a regression test file that spawns a child Bun process to run an inline regex, captures stdout/stderr and exit status via pipes, and asserts the child's stdout and exit code to reproduce/validate issue #29547.

Changes

Cohort / File(s) Summary
Regression test
test/regression/issue/29547.test.ts
Adds a test.concurrent that spawns a fresh Bun child via Bun.spawn (using bunExe() and bunEnv), pipes stdout/stderr, runs an inline regex exec, reads proc.stdout.text(), proc.stderr.text(), and proc.exited with Promise.all, asserts stdout === "null\n" and exitCode === 0, and logs stderr only on non-zero exit.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely describes the main change: a fix for a YARR JIT segmentation fault caused by aliased parenContextHead slots in regex patterns.
Description check ✅ Passed The description comprehensively covers the required template sections with detailed technical context, root cause analysis, fix explanation, and verification steps.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/regression/issue/29547.test.ts`:
- Around line 76-77: Replace the weak stdout assertion so the test asserts a
meaningful non-null signal first and then checks exitCode last: change the
expect(stdout.trim()).not.toBe("") line to assert that stdout is neither the
literal "null" nor empty (e.g., expect(stdout.trim()).not.toBe("null") or
expect(stdout.trim()).toMatch(/^(?!null$).+/)), and ensure the
expect(exitCode).toBe(0) call remains after the stdout assertion so exitCode is
checked last.
- Line 43: The test currently contains a brittle unconditional assertion
expect(stderr).toBe("") which can fail in ASAN/debug runs; remove that assertion
from the test (and the other occurrence noted) and instead rely on asserting
stdout and exitCode as the regression signal, only including stderr in test
failure output when exitCode !== 0 (e.g., if exitCode is non-zero, fail and
include stderr for debugging). Locate and update the expect(stderr).toBe("")
calls in the test file (e.g., the assertion at the given change and the similar
one around line 75) to be removed or gated by exitCode checks so stderr is not
asserted to be empty unconditionally.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6231f247-9647-4124-b4dc-ad7ca4d09870

📥 Commits

Reviewing files that changed from the base of the PR and between 7c73f05 and e504a8b.

📒 Files selected for processing (1)
  • test/regression/issue/29547.test.ts

Comment thread test/regression/issue/29547.test.ts Outdated
Comment thread test/regression/issue/29547.test.ts Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
test/regression/issue/29547.test.ts (2)

39-39: ⚠️ Potential issue | 🟠 Major

Remove unconditional empty-stderr assertions for these JS-executing subprocesses.

expect(stderr).toBe("") on Line 39 and Line 67 is brittle for this pattern; keep stdout + exitCode as the regression signal and only dump stderr when the child fails.

Suggested patch
-  expect(stderr).toBe("");
   expect(stdout).toBe("null\n");
+  if (exitCode !== 0) {
+    console.error(stderr);
+  }
   expect(exitCode).toBe(0);
@@
-  expect(stderr).toBe("");
-  expect(exitCode).toBe(0);
-  expect(stdout.trim()).not.toBe("");
+  const output = stdout.trim();
+  expect(output).not.toBe("");
+  if (exitCode !== 0) {
+    console.error(stderr);
+  }
+  expect(exitCode).toBe(0);
Based on learnings, runtime `bunExe()` subprocess tests can emit ASAN warning noise on stderr and should not assert unconditional stderr emptiness.

Also applies to: 67-67

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/regression/issue/29547.test.ts` at line 39, The test currently asserts
unconditional stderr emptiness via expect(stderr).toBe("") (in
test/regression/issue/29547.test.ts); change this to stop asserting stderr is
empty for JS-executing subprocesses and instead only inspect or assert stderr
when the child process failed: keep the existing assertions on stdout and
exitCode as the regression signal, remove or replace the unconditional
expect(stderr).toBe("") with conditional behavior that logs/dumps or asserts
stderr only when exitCode indicates failure (e.g., if exitCode !== 0 then
include stderr in the test failure output).

68-69: ⚠️ Potential issue | 🟠 Major

Strengthen the Prism regression assertion and keep exit-code check last.

Line 69 is currently vacuous for this case ("null" still passes). Assert a non-null match signal first, then assert exitCode.

Suggested patch
-  expect(exitCode).toBe(0);
-  expect(stdout.trim()).not.toBe("");
+  const output = stdout.trim();
+  expect(output).not.toBe("null");
+  expect(output).toContain("<div");
+  expect(exitCode).toBe(0);
As per coding guidelines, tests should assert stdout/output before asserting exit code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/regression/issue/29547.test.ts` around lines 68 - 69, Swap the
assertions so stdout is checked before the exit code and strengthen the stdout
check: replace the vacuous expect(stdout.trim()).not.toBe("") with an explicit
non-null match assertion such as expect(stdout.trim()).not.toBe("null") (or
assert the specific expected match signal), then assert expect(exitCode).toBe(0)
last; locate and update the two assertions referencing exitCode and stdout in
the test (variables exitCode and stdout).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/regression/issue/29547.test.ts`:
- Line 39: The test currently asserts unconditional stderr emptiness via
expect(stderr).toBe("") (in test/regression/issue/29547.test.ts); change this to
stop asserting stderr is empty for JS-executing subprocesses and instead only
inspect or assert stderr when the child process failed: keep the existing
assertions on stdout and exitCode as the regression signal, remove or replace
the unconditional expect(stderr).toBe("") with conditional behavior that
logs/dumps or asserts stderr only when exitCode indicates failure (e.g., if
exitCode !== 0 then include stderr in the test failure output).
- Around line 68-69: Swap the assertions so stdout is checked before the exit
code and strengthen the stdout check: replace the vacuous
expect(stdout.trim()).not.toBe("") with an explicit non-null match assertion
such as expect(stdout.trim()).not.toBe("null") (or assert the specific expected
match signal), then assert expect(exitCode).toBe(0) last; locate and update the
two assertions referencing exitCode and stdout in the test (variables exitCode
and stdout).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ba7b4b9c-acde-4dcf-9950-8e5eba3e1eaa

📥 Commits

Reviewing files that changed from the base of the PR and between e504a8b and 1d819c8.

📒 Files selected for processing (1)
  • test/regression/issue/29547.test.ts

…iant

Address coderabbit review on #29550:
- ASAN/debug builds can emit signal-handler warnings to stderr, so
  `expect(stderr).toBe('')` is brittle. Dump stderr only on non-zero
  exit, keep stdout + exitCode as the regression signal.
- The second test's `expect(stdout.trim()).not.toBe('')` was vacuous
  (it'd pass with the literal string 'null', which is what
  `pattern.exec` returns for anything that doesn't actually match the
  JSX tag regex). The first test is already the minimal repro that
  exercises the YARR JIT backtrack path; the second was only 'Prism-
  shaped' redundancy. Drop it.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — the two flagged issues are minor nits (comment typo and assertion order) that do not affect correctness.

Extended reasoning...

Overview

This PR adds a single new regression test file (test/regression/issue/29547.test.ts) for a YARR JIT SEGV bug where aliased parenContextHead frame slots caused a jump to RIP=0. The actual runtime fix lives in vendor/WebKit/Source/JavaScriptCore/yarr/YarrJIT.cpp (separate WebKit PR), so this PR is purely additive test infrastructure.

Security risks

None. The change only adds a test file. No production code paths are modified.

Level of scrutiny

Low. This is a new test file with no impact on runtime behavior. The test logic is structurally correct: it spawns a subprocess, runs a regex that previously caused a SEGV, and asserts a clean exit with expected output.

Other factors

The two flagged issues are both nits: a comment typo (reprismPrism) and reversed assertion order in the second test (exitCode before stdout, contrary to CLAUDE.md convention). Neither affects test correctness or CI reliability. The first test correctly follows the stdout-before-exitCode convention; only the second deviates. These are worth fixing but do not warrant blocking the PR.

Additional findings (outside current diff — PR may have been updated during review):

  • 🟡 test/regression/issue/29547.test.ts:75-77 — In the second test, expect(exitCode).toBe(0) is checked before expect(stdout.trim()).not.toBe("") (lines 75-77), which violates the project convention in CLAUDE.md requiring stdout assertions to come before exitCode. This means a failure caused by wrong or missing stdout output will report an exit-code mismatch instead of the more informative content mismatch; swapping lines 76 and 77 fixes it.

    Extended reasoning...

    What the bug is

    CLAUDE.md (line 128) states: "When spawning processes, tests should expect(stdout).toBe(...) BEFORE expect(exitCode).toBe(0). This gives you a more useful error message on test failure." The second test in 29547.test.ts violates this ordering.

    Specific code path

    Lines 75-77 of the new file read:

    expect(stderr).toBe("");
    expect(exitCode).toBe(0);          // ← exitCode before stdout
    expect(stdout.trim()).not.toBe(""); // ← stdout after exitCode

    whereas the first test (lines 43-45) correctly follows the documented convention (stderr → stdout → exitCode).

    Why existing code does not prevent it

    The linter/CI does not enforce assertion ordering; the order is a project style convention documented in CLAUDE.md. Nothing prevents the inverted ordering from compiling and running — it is purely a human-readable convention that was accidentally violated.

    Impact

    When the process crashes (e.g. still SIGSEGVing on a debug build) or produces empty output, the first assertion to fail will be expect(exitCode).toBe(0) rather than the stdout assertion. The error message printed by the test runner will say something like "expected 139 to equal 0" instead of the more diagnostic "expected output to not be empty". This makes triage harder.

    Step-by-step proof

    1. Suppose the fix is accidentally reverted and the regex again causes a SEGV.
    2. The child process exits with code 139 (SIGSEGV).
    3. Jest/bun:test evaluates assertions in order: expect(stderr).toBe("") — passes (stderr is empty before the crash); expect(exitCode).toBe(0)fails, reporting "Expected: 0 / Received: 139".
    4. The test stops; expect(stdout.trim()).not.toBe("") is never reached.
    5. The developer sees an exit-code failure with no hint about what the process actually printed — the opposite of what the convention is designed to achieve.
    6. With the correct order (stdout before exitCode), step 4 would instead fail with "Expected stdout to not be empty" providing immediate signal that no output was produced.

    How to fix

    Swap lines 76 and 77:

    expect(stderr).toBe("");
    expect(stdout.trim()).not.toBe(""); // stdout first
    expect(exitCode).toBe(0);           // exitCode last

    This matches both the CLAUDE.md rule and the order used in the first test in the same file.

Comment thread test/regression/issue/29547.test.ts

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean regression test following established patterns, all prior review feedback addressed.

Extended reasoning...

Overview

This PR adds a single 32-line regression test file () for a YARR JIT SEGV caused by aliased parenContextHead slots in regexes with alternation. The actual runtime fix lives in the WebKit fork; this PR only lands the test.

Security risks

None. This is a test-only change that spawns a child Bun process to execute a regex and asserts on stdout + exit code.

Level of scrutiny

Low. The change is a single test file following the established regression test pattern for this repo: uses bunExe()/bunEnv from harness, pipes stdout/stderr, asserts stdout content before exit code, gates stderr logging behind exitCode \!== 0 to avoid ASAN noise brittleness.

Other factors

All CodeRabbit feedback (brittle unconditional expect(stderr).toBe(""), vacuous second Prism test) was addressed in commit 244d62a. My prior inline comment about a 'reprism' typo was on a comment line that was dropped in the same commit and is no longer present in the current file. No outstanding unresolved concerns.

@robobun

robobun commented Apr 21, 2026

Copy link
Copy Markdown
Collaborator Author

Closing — the regression test in this PR fails on the current prebuilt WebKit (4d5e75e), which is correct behavior because the actual fix lives in the WebKit fork at oven-sh/WebKit#189 and has not been autobuilt yet. Landing the test here before the WebKit PR merges blocks CI on every build.

Tracking the fix through oven-sh/WebKit#189 instead. Once that PR is reviewed and merged (producing a new autobuild-<hash> tag), the follow-up will bump WEBKIT_VERSION in this repo and include this test.

Root cause and minimal repro remain documented on #29547.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SEGV after upgrading to 1.3.13

1 participant