Fix use-after-free in Bun.Transpiler async transform error messages#30229
Fix use-after-free in Bun.Transpiler async transform error messages#30229robobun wants to merge 2 commits into
Conversation
When transform() fails with parse errors, the error message text is allocated in a per-task arena that is destroyed before then() runs on the JS thread. Reading the log messages in then() accessed freed memory, producing corrupted error text in release builds and an ASAN use-after-poison crash in debug builds. Deep-clone the log messages into the default allocator before the arena is torn down so the error text survives until the promise is rejected.
There was a problem hiding this comment.
I didn't find any bugs and the defer ordering looks correct, but this is a memory-lifetime fix in Zig touching arena/allocator interaction, so it's worth a human sanity-check (in particular whether the cloned msg contents are eventually freed, since Log.deinit only does clearAndFree on the list).
Extended reasoning...
Overview
This PR fixes a use-after-free in Bun.Transpiler's async transform() error path. TransformTask.run() parses source using a per-task MimallocArena; log messages produced by the parser hold string slices pointing into that arena. The arena is torn down when run() returns, but then() later reads this.log on the JS thread to build the rejection error, which accessed freed memory. The fix adds a defer that deep-clones every Msg in this.log.msgs into bun.default_allocator before the arena is destroyed. A new regression test exercises 100 concurrent failing transforms and asserts the error text is intact.
Security risks
None. This is an internal memory-ownership correction on an error path; no new inputs, parsing, auth, or network surface.
Level of scrutiny
Moderate. The change is only 4 lines of production code, but it's lifetime-sensitive Zig where defer ordering matters. I verified that the new defer is registered after defer arena.deinit() and therefore executes before the arena is freed, so the clone reads valid memory. Msg.clone deep-clones data.text, location (file/line_text), and recursively clones notes via bun.clone, so no arena pointers should survive.
Other factors
One thing worth a maintainer's eye: Log.deinit (called from TransformTask.deinit) only does msgs.clearAndFree() and does not free individual Msg contents. Previously those contents lived in the arena and were reclaimed by arena.deinit(); now they live in bun.default_allocator. If log.toJS / BuildMessage.create doesn't take ownership of those strings, this trades a UAF for a small per-error leak. That's strictly an improvement over corrupted output / ASAN crashes, but a reviewer familiar with logger.Msg ownership conventions should confirm whether that's acceptable or already the norm elsewhere. Given this is non-trivial memory management rather than a mechanical change, I'm deferring rather than auto-approving.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughDefer-clone transpiler log messages to the default allocator and explicitly deinitialize per-message state on task teardown. Added two regression tests that run 100 concurrent async transforms to assert parse and redeclaration error messages remain correct. ChangesTranspiler log memory-safety and tests
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Review rate limit: 2/5 reviews remaining, refill in 28 minutes and 53 seconds. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/js/bun/transpiler/transpiler-transform-error-uaf.test.ts`:
- Around line 6-13: The test currently uses .catch(e => e) which converts
rejections into fulfilled values and can mask unexpected successes; change the
logic to use Promise.allSettled on the array of transpiler.transform(...) calls
(or replace .catch with no handler and await each with try/catch), then for each
result assert result.status === "rejected" and that String(result.reason)
contains "Unexpected". Update references to the promises array and
transpiler.transform calls accordingly so each invocation's rejection status is
asserted before checking the error message.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3a87ce2f-30c5-46bc-b092-c86f89f101d2
📒 Files selected for processing (2)
src/bun.js/api/JSTranspiler.zigtest/js/bun/transpiler/transpiler-transform-error-uaf.test.ts
| const promises: Promise<unknown>[] = []; | ||
| for (let i = 0; i < 100; i++) { | ||
| promises.push(transpiler.transform("const x = ;", "js").catch(e => e)); | ||
| } | ||
|
|
||
| const errors = await Promise.all(promises); | ||
| for (const err of errors) { | ||
| expect(String(err)).toContain("Unexpected"); |
There was a problem hiding this comment.
Assert rejection status explicitly to prevent false positives.
On Line 8 and Line 22, .catch(e => e) turns failures and successes into fulfilled values. The "x" assertion can pass even if transform() unexpectedly resolves. Verify each call is actually rejected before checking message text.
Suggested fix
- const promises: Promise<unknown>[] = [];
+ const promises: Promise<unknown>[] = [];
for (let i = 0; i < 100; i++) {
- promises.push(transpiler.transform("const x = ;", "js").catch(e => e));
+ promises.push(transpiler.transform("const x = ;", "js"));
}
- const errors = await Promise.all(promises);
- for (const err of errors) {
- expect(String(err)).toContain("Unexpected");
+ const results = await Promise.allSettled(promises);
+ for (const result of results) {
+ expect(result.status).toBe("rejected");
+ if (result.status === "rejected") {
+ expect(String(result.reason)).toContain("Unexpected");
+ }
}
@@
- const promises: Promise<unknown>[] = [];
+ const promises: Promise<unknown>[] = [];
for (let i = 0; i < 100; i++) {
- promises.push(transpiler.transform("const x = 1; const x = 2;", "js").catch(e => e));
+ promises.push(transpiler.transform("const x = 1; const x = 2;", "js"));
}
- const errors = await Promise.all(promises);
- for (const err of errors) {
- expect(String(err)).toContain(`"x"`);
+ const results = await Promise.allSettled(promises);
+ for (const result of results) {
+ expect(result.status).toBe("rejected");
+ if (result.status === "rejected") {
+ expect(String(result.reason)).toContain(`"x"`);
+ }
}Also applies to: 20-27
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/js/bun/transpiler/transpiler-transform-error-uaf.test.ts` around lines 6
- 13, The test currently uses .catch(e => e) which converts rejections into
fulfilled values and can mask unexpected successes; change the logic to use
Promise.allSettled on the array of transpiler.transform(...) calls (or replace
.catch with no handler and await each with try/catch), then for each result
assert result.status === "rejected" and that String(result.reason) contains
"Unexpected". Update references to the promises array and transpiler.transform
calls accordingly so each invocation's rejection status is asserted before
checking the error message.
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
|
Duplicate of #29958, which already has maintainer feedback incorporated and green CI. Closing in favor of that one. |
| defer for (this.log.msgs.items) |*msg| { | ||
| msg.* = bun.handleOom(msg.clone(bun.default_allocator)); | ||
| }; |
There was a problem hiding this comment.
🔴 This introduces a memory leak: msg.clone(bun.default_allocator) deep-clones Location.file and Location.line_text (via Location.clone, src/logger.zig:113-123), but Location.deinit is a no-op (src/logger.zig:149), so the new msg.deinit(bun.default_allocator) in TransformTask.deinit never frees them. Every rejected async transform() now leaks at least two bun.default_allocator allocations per error message (and per note); previously these strings lived in the per-task arena and were reclaimed by arena.deinit(). Either make Location.deinit free file/line_text, or free them explicitly in the TransformTask.deinit loop.
Extended reasoning...
What the bug is
The PR fixes a use-after-free by deep-cloning each log message out of the per-task MimallocArena into bun.default_allocator before the arena is torn down, and then freeing those clones in TransformTask.deinit. However, the clone path duplicates more than the deinit path frees: Msg.clone → Data.clone → Location.clone calls allocator.dupe(u8, this.file) and allocator.dupe(u8, this.line_text.?) (src/logger.zig:113-123), but Location.deinit is literally pub fn deinit(_: *Location, _: std.mem.Allocator) void {} with the comment "don't really know what's safe to deinit here!" (src/logger.zig:149). Data.deinit (src/logger.zig:209-215) calls loc.deinit() (no-op) and then frees only data.text. So the cloned location.file and location.line_text strings are leaked.
Code path that triggers it
TransformTask.run()parses invalid source; the lexer/parser appends an errorMsgwhosedata.locationis populated byLocation.initOrNull(src/logger.zig:163-193) —file = source.path.text(e.g."input.js") andline_text = <slice of source.contents>.- The new
defer for (this.log.msgs.items) |*msg| msg.* = bun.handleOom(msg.clone(bun.default_allocator));runs, which callsData.clone→Location.clone, duping bothfileandline_textintobun.default_allocator. The same happens for every note (Msg.cloneclones each note'sDatatoo). TransformTask.deinit()runsfor (this.log.msgs.items) |*msg| msg.deinit(bun.default_allocator);→Data.deinitfreesdata.textand calls the no-opLocation.deinit; thenMsg.deinitfrees the notes slice.fileandline_textare never freed.
Why existing code doesn't prevent it
Location.deinit has always been a no-op because in every other caller the Location strings are borrowed slices (into source.path.text / source.contents or into an arena). Before this PR, TransformTask was in the same boat: the strings lived in the per-task MimallocArena and were reclaimed wholesale by arena.deinit(). This PR is the first place that puts Location.clone's heap-owned copies into bun.default_allocator and expects msg.deinit to clean them up — but msg.deinit was never designed to do that.
Impact
Every rejected async transform() leaks at least two small heap allocations (the "input.<ext>" filename and the offending source line) per error message, plus two more per note (e.g. the redeclaration error in the new test has a note with its own location). For the new regression test alone that's ~400+ leaked allocations per run. It's a small per-call leak, but it is a new leak introduced by a PR whose purpose is memory correctness, and it accumulates unbounded in long-running processes that repeatedly transpile invalid input.
Step-by-step proof
Take the second test case, transpiler.transform("const x = 1; const x = 2;", "js"):
- Parser emits a redeclaration error
Msgwithdata.location = { file: "input.js", line_text: "const x = 1; const x = 2;", ... }and one noteDatawith its own location pointing at the original declaration. - The new
deferclones the message:Location.clonedupes"input.js"(8 bytes) and"const x = 1; const x = 2;"(25 bytes) fordata.location, and again for the note's location — fourbun.default_allocator.dupecalls. then()rejects the promise, thendeinit()runs:msg.deinit→data.deinitfreesdata.text, callsLocation.deinit(no-op); iterates notes, frees each note'stext, callsLocation.deinit(no-op); frees the notes slice.- The four duped location strings are never passed to
allocator.free. With 100 iterations × 4 allocations ≈ 400 leaked allocations for this one test.
How to fix
Either:
- Free the location fields explicitly in the new
TransformTask.deinitloop (and for each note), since this is the only place that owns heap-allocatedLocationstrings inbun.default_allocator; or - Use the
StringBuilder-basedMsg.cloneWithBuilderpath so all cloned strings share a single allocation that can be freed in one shot; or - Make
Location.deinitactually freefileandline_text(riskier — other callers pass borrowed slices and would double-free/invalid-free).
What
Bun.Transpiler.prototype.transform()rejects with corrupted error messages (or crashes under ASAN) when parsing fails and multiple transforms run concurrently.Why
TransformTask.run()creates a per-taskMimallocArenaand sets it as the transpiler's allocator. When parsing fails, the parser and lexer allocate error message text (and notes) in that arena viaallocPrint(p.allocator, ...). The arena is destroyed viadefer arena.deinit()whenrun()returns from the worker thread.Later,
then()runs on the JS thread and callsthis.log.toJS(), which clones each message by readingmsg.data.text. By then the text points into the destroyed arena, so the read is a use-after-free. In release builds this surfaces as garbage bytes in the rejection message; in ASAN debug builds it aborts withuse-after-poison.How
Before the arena is torn down, deep-clone each log message into
bun.default_allocatorso the text survives until the promise is rejected.Msg.clone()already recursively copiesdata.text,location, andnotes.Test
Added
test/js/bun/transpiler/transpiler-transform-error-uaf.test.tswhich queues 100 concurrenttransform()calls with invalid source and asserts every rejection contains the expected parser error text. Before this change the test fails with garbage bytes in the message (and segfaults) on the stock release build, and aborts under ASAN.