Fix YARR JIT SEGV on regex with aliased parenContextHead slots#29550
Fix YARR JIT SEGV on regex with aliased parenContextHead slots#29550robobun wants to merge 3 commits into
Conversation
In 1.3.13, `bunx --bun jscpd` and similar tools that run Prism's JSX
tag regex over TSX source crash with
panic(main thread): Segmentation fault at address 0x0
Minimal reproduction (one `exec` call, no loop):
/(?:[abc]+(?:.|b)|(?:a)*)*>/.exec(' x="c" ') // RIP=0 → SIGSEGV
## Root cause
The WebKit bump in #29294 (`c2010c4` → `4d5e75e`) pulled in
`37465a721d` "[YARR] Fix JIT crash on non-greedy ParenthesesSubpattern
backtracking", which added `clearInnerParenContextHeadSlots` as a UAF
guard: after `restoreParenContext` it walks the disjunction and nulls
every Greedy/NonGreedy subpattern's `parenContextHead` frame slot so a
later `BEGIN.bt` sees null instead of a possibly-freed ParenContext
pointer.
Upstream `2d16551f69` widened this to
`clearParenContextHeadSlotsInRange`, walking the whole pattern tree.
Both versions ignore that sibling alternatives of the same disjunction
share a base frame offset, so a Greedy subpattern's parenContextHead
slot in one alternative can coincide with a Once subpattern's
returnAddress slot in a sibling alternative. In the minimal pattern:
outer (?:...)* frame 0 (slots 0-3)
alt #0:
[abc]+ frame 5 (slots 5-6)
(?:.|b) Once frame 7 (slots 7-8)
alt #1:
(?:a)* Greedy frame 5 (slots 5-8)
Slot 8 is `(?:.|b)`'s returnAddress in alt #0 and `(?:a)*`'s
parenContextHead in alt #1. The unconditional null destroys alt #0's
valid returnAddress; the next `NestedAlternativeEnd` backtrack does
`loadFromFrameAndJump` on slot 8 → RIP=0.
## Fix
Upstream WebKit PR: oven-sh/WebKit#189.
Before nulling a parenContextHead slot, check whether any sibling
alternative in the parent disjunction has a term whose frame range
covers that slot. If so, skip — the sibling alt may have written a
valid non-pointer value there. The UAF guard stays active for
unambiguous slots, which covers every test case added by `37465a721d`
and `2d16551f69` (all still pass).
## Verification
- Minimal repro exits cleanly.
- `bunx --bun jscpd ComponentOne.tsx ComponentTwo.tsx` from the issue
completes normally.
- `JSTests/stress/yarr-jit-*.js` (5 files, including the UAF test):
all exit 0.
## Gate note
The fix itself is in `vendor/WebKit/Source/JavaScriptCore/yarr/YarrJIT.cpp`
and ships to users via a WEBKIT_VERSION bump once the WebKit PR is
merged and autobuilt. The test in this PR crashes on 1.3.13 prebuilt
bun and passes once the new WebKit tarball is installed.
Closes #29547
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdds a regression test file that spawns a child Bun process to run an inline regex, captures stdout/stderr and exit status via pipes, and asserts the child's stdout and exit code to reproduce/validate issue Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/regression/issue/29547.test.ts`:
- Around line 76-77: Replace the weak stdout assertion so the test asserts a
meaningful non-null signal first and then checks exitCode last: change the
expect(stdout.trim()).not.toBe("") line to assert that stdout is neither the
literal "null" nor empty (e.g., expect(stdout.trim()).not.toBe("null") or
expect(stdout.trim()).toMatch(/^(?!null$).+/)), and ensure the
expect(exitCode).toBe(0) call remains after the stdout assertion so exitCode is
checked last.
- Line 43: The test currently contains a brittle unconditional assertion
expect(stderr).toBe("") which can fail in ASAN/debug runs; remove that assertion
from the test (and the other occurrence noted) and instead rely on asserting
stdout and exitCode as the regression signal, only including stderr in test
failure output when exitCode !== 0 (e.g., if exitCode is non-zero, fail and
include stderr for debugging). Locate and update the expect(stderr).toBe("")
calls in the test file (e.g., the assertion at the given change and the similar
one around line 75) to be removed or gated by exitCode checks so stderr is not
asserted to be empty unconditionally.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6231f247-9647-4124-b4dc-ad7ca4d09870
📒 Files selected for processing (1)
test/regression/issue/29547.test.ts
There was a problem hiding this comment.
♻️ Duplicate comments (2)
test/regression/issue/29547.test.ts (2)
39-39:⚠️ Potential issue | 🟠 MajorRemove unconditional empty-
stderrassertions for these JS-executing subprocesses.
expect(stderr).toBe("")on Line 39 and Line 67 is brittle for this pattern; keepstdout+exitCodeas the regression signal and only dumpstderrwhen the child fails.Based on learnings, runtime `bunExe()` subprocess tests can emit ASAN warning noise on stderr and should not assert unconditional stderr emptiness.Suggested patch
- expect(stderr).toBe(""); expect(stdout).toBe("null\n"); + if (exitCode !== 0) { + console.error(stderr); + } expect(exitCode).toBe(0); @@ - expect(stderr).toBe(""); - expect(exitCode).toBe(0); - expect(stdout.trim()).not.toBe(""); + const output = stdout.trim(); + expect(output).not.toBe(""); + if (exitCode !== 0) { + console.error(stderr); + } + expect(exitCode).toBe(0);Also applies to: 67-67
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/regression/issue/29547.test.ts` at line 39, The test currently asserts unconditional stderr emptiness via expect(stderr).toBe("") (in test/regression/issue/29547.test.ts); change this to stop asserting stderr is empty for JS-executing subprocesses and instead only inspect or assert stderr when the child process failed: keep the existing assertions on stdout and exitCode as the regression signal, remove or replace the unconditional expect(stderr).toBe("") with conditional behavior that logs/dumps or asserts stderr only when exitCode indicates failure (e.g., if exitCode !== 0 then include stderr in the test failure output).
68-69:⚠️ Potential issue | 🟠 MajorStrengthen the Prism regression assertion and keep exit-code check last.
Line 69 is currently vacuous for this case (
"null"still passes). Assert a non-null match signal first, then assertexitCode.As per coding guidelines, tests should assert stdout/output before asserting exit code.Suggested patch
- expect(exitCode).toBe(0); - expect(stdout.trim()).not.toBe(""); + const output = stdout.trim(); + expect(output).not.toBe("null"); + expect(output).toContain("<div"); + expect(exitCode).toBe(0);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/regression/issue/29547.test.ts` around lines 68 - 69, Swap the assertions so stdout is checked before the exit code and strengthen the stdout check: replace the vacuous expect(stdout.trim()).not.toBe("") with an explicit non-null match assertion such as expect(stdout.trim()).not.toBe("null") (or assert the specific expected match signal), then assert expect(exitCode).toBe(0) last; locate and update the two assertions referencing exitCode and stdout in the test (variables exitCode and stdout).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@test/regression/issue/29547.test.ts`:
- Line 39: The test currently asserts unconditional stderr emptiness via
expect(stderr).toBe("") (in test/regression/issue/29547.test.ts); change this to
stop asserting stderr is empty for JS-executing subprocesses and instead only
inspect or assert stderr when the child process failed: keep the existing
assertions on stdout and exitCode as the regression signal, remove or replace
the unconditional expect(stderr).toBe("") with conditional behavior that
logs/dumps or asserts stderr only when exitCode indicates failure (e.g., if
exitCode !== 0 then include stderr in the test failure output).
- Around line 68-69: Swap the assertions so stdout is checked before the exit
code and strengthen the stdout check: replace the vacuous
expect(stdout.trim()).not.toBe("") with an explicit non-null match assertion
such as expect(stdout.trim()).not.toBe("null") (or assert the specific expected
match signal), then assert expect(exitCode).toBe(0) last; locate and update the
two assertions referencing exitCode and stdout in the test (variables exitCode
and stdout).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ba7b4b9c-acde-4dcf-9950-8e5eba3e1eaa
📒 Files selected for processing (1)
test/regression/issue/29547.test.ts
…iant Address coderabbit review on #29550: - ASAN/debug builds can emit signal-handler warnings to stderr, so `expect(stderr).toBe('')` is brittle. Dump stderr only on non-zero exit, keep stdout + exitCode as the regression signal. - The second test's `expect(stdout.trim()).not.toBe('')` was vacuous (it'd pass with the literal string 'null', which is what `pattern.exec` returns for anything that doesn't actually match the JSX tag regex). The first test is already the minimal repro that exercises the YARR JIT backtrack path; the second was only 'Prism- shaped' redundancy. Drop it.
There was a problem hiding this comment.
LGTM — the two flagged issues are minor nits (comment typo and assertion order) that do not affect correctness.
Extended reasoning...
Overview
This PR adds a single new regression test file (test/regression/issue/29547.test.ts) for a YARR JIT SEGV bug where aliased parenContextHead frame slots caused a jump to RIP=0. The actual runtime fix lives in vendor/WebKit/Source/JavaScriptCore/yarr/YarrJIT.cpp (separate WebKit PR), so this PR is purely additive test infrastructure.
Security risks
None. The change only adds a test file. No production code paths are modified.
Level of scrutiny
Low. This is a new test file with no impact on runtime behavior. The test logic is structurally correct: it spawns a subprocess, runs a regex that previously caused a SEGV, and asserts a clean exit with expected output.
Other factors
The two flagged issues are both nits: a comment typo (reprism → Prism) and reversed assertion order in the second test (exitCode before stdout, contrary to CLAUDE.md convention). Neither affects test correctness or CI reliability. The first test correctly follows the stdout-before-exitCode convention; only the second deviates. These are worth fixing but do not warrant blocking the PR.
Additional findings (outside current diff — PR may have been updated during review):
-
🟡
test/regression/issue/29547.test.ts:75-77— In the second test,expect(exitCode).toBe(0)is checked beforeexpect(stdout.trim()).not.toBe("")(lines 75-77), which violates the project convention in CLAUDE.md requiring stdout assertions to come before exitCode. This means a failure caused by wrong or missing stdout output will report an exit-code mismatch instead of the more informative content mismatch; swapping lines 76 and 77 fixes it.Extended reasoning...
What the bug is
CLAUDE.md (line 128) states: "When spawning processes, tests should
expect(stdout).toBe(...)BEFOREexpect(exitCode).toBe(0). This gives you a more useful error message on test failure." The second test in29547.test.tsviolates this ordering.Specific code path
Lines 75-77 of the new file read:
expect(stderr).toBe(""); expect(exitCode).toBe(0); // ← exitCode before stdout expect(stdout.trim()).not.toBe(""); // ← stdout after exitCode
whereas the first test (lines 43-45) correctly follows the documented convention (stderr → stdout → exitCode).
Why existing code does not prevent it
The linter/CI does not enforce assertion ordering; the order is a project style convention documented in CLAUDE.md. Nothing prevents the inverted ordering from compiling and running — it is purely a human-readable convention that was accidentally violated.
Impact
When the process crashes (e.g. still SIGSEGVing on a debug build) or produces empty output, the first assertion to fail will be
expect(exitCode).toBe(0)rather than the stdout assertion. The error message printed by the test runner will say something like "expected 139 to equal 0" instead of the more diagnostic "expected output to not be empty". This makes triage harder.Step-by-step proof
- Suppose the fix is accidentally reverted and the regex again causes a SEGV.
- The child process exits with code 139 (SIGSEGV).
- Jest/bun:test evaluates assertions in order:
expect(stderr).toBe("")— passes (stderr is empty before the crash);expect(exitCode).toBe(0)— fails, reporting "Expected: 0 / Received: 139". - The test stops;
expect(stdout.trim()).not.toBe("")is never reached. - The developer sees an exit-code failure with no hint about what the process actually printed — the opposite of what the convention is designed to achieve.
- With the correct order (stdout before exitCode), step 4 would instead fail with "Expected stdout to not be empty" providing immediate signal that no output was produced.
How to fix
Swap lines 76 and 77:
expect(stderr).toBe(""); expect(stdout.trim()).not.toBe(""); // stdout first expect(exitCode).toBe(0); // exitCode last
This matches both the CLAUDE.md rule and the order used in the first test in the same file.
There was a problem hiding this comment.
LGTM — clean regression test following established patterns, all prior review feedback addressed.
Extended reasoning...
Overview
This PR adds a single 32-line regression test file () for a YARR JIT SEGV caused by aliased parenContextHead slots in regexes with alternation. The actual runtime fix lives in the WebKit fork; this PR only lands the test.
Security risks
None. This is a test-only change that spawns a child Bun process to execute a regex and asserts on stdout + exit code.
Level of scrutiny
Low. The change is a single test file following the established regression test pattern for this repo: uses bunExe()/bunEnv from harness, pipes stdout/stderr, asserts stdout content before exit code, gates stderr logging behind exitCode \!== 0 to avoid ASAN noise brittleness.
Other factors
All CodeRabbit feedback (brittle unconditional expect(stderr).toBe(""), vacuous second Prism test) was addressed in commit 244d62a. My prior inline comment about a 'reprism' typo was on a comment line that was dropped in the same commit and is no longer present in the current file. No outstanding unresolved concerns.
|
Closing — the regression test in this PR fails on the current prebuilt WebKit ( Tracking the fix through oven-sh/WebKit#189 instead. Once that PR is reviewed and merged (producing a new Root cause and minimal repro remain documented on #29547. |
Fixes #29547.
Real-world trigger:
bunx --bun jscpd *.tsx(and anything else that runs Prism's JSX tag regex over TSX source) crashes withpanic(main thread): Segmentation fault at address 0x0on 1.3.13. One-line reproduction:Root cause
The WebKit bump in #29294 (
c2010c4→4d5e75e) pulled in37465a721d"[YARR] Fix JIT crash on non-greedy ParenthesesSubpattern backtracking", which addedclearInnerParenContextHeadSlotsas a UAF guard: afterrestoreParenContextit walks the disjunction and nulls every Greedy/NonGreedy subpattern'sparenContextHeadframe slot so a laterBEGIN.btsees null instead of a possibly-freed ParenContext pointer. (Upstream2d16551f69later widened it toclearParenContextHeadSlotsInRangewalking the full pattern tree.)Both versions ignore that sibling alternatives of the same disjunction share a base frame offset, so a Greedy subpattern's parenContextHead slot in one alternative can coincide with a Once subpattern's returnAddress slot in a sibling. In the minimal pattern:
Slot 8 is
(?:.|b)'sreturnAddressin alt #0 and(?:a)*'sparenContextHeadin alt #1. The unconditional null destroys alt #0's valid returnAddress; the nextNestedAlternativeEndbacktrack doesloadFromFrameAndJumpon slot 8 and jumps to 0 →RIP=0→ segfault.Fix
The actual code lives in WebKit: oven-sh/WebKit#189.
Before nulling a
parenContextHeadslot, check whether any sibling alternative in the parent disjunction has a term whose frame range covers that slot. If so, skip the null — the sibling alt may have written a valid non-pointer value there. The UAF guard stays active for unambiguous slots, which covers every test case added by37465a721dand2d16551f69(all still pass).This PR adds the regression test. The runtime fix ships to users via a
WEBKIT_VERSIONbump once the WebKit PR is merged and autobuilt.Verification
bunx --bun jscpd ComponentOne.tsx ComponentTwo.tsxfrom the issue report completes normally.JSTests/stress/yarr-jit-*.jstests (includingyarr-jit-paren-context-head-uaf.jsandyarr-jit-non-greedy-parentheses-backtrack.js) exit 0.Gate note
The fix is entirely in
vendor/WebKit/Source/JavaScriptCore/yarr/YarrJIT.cpp, not insrc/orpackages/. The automatic test-proof gategit stash push -- src/ packages/won't revert the WebKit change, so the stashed-build test will pass rather than fail. The before/after split is verifiable directly: run the test with/tmp/bun-1.3.13/bun(crashes) vsbun bdbuilt against the WebKit PR (passes).