Fix use-after-free in Bun.Transpiler async transform() parse errors#30020
Fix use-after-free in Bun.Transpiler async transform() parse errors#30020robobun wants to merge 2 commits into
Bun.Transpiler async transform() parse errors#30020Conversation
TransformTask.run() allocates a MimallocArena for parsing on the worker thread and frees it when run() returns. Parse/lexer errors allocate their message text via that arena, but the log outlives run() and is read by then() on the JS thread when rejecting the promise. This caused the rejection reason to contain freed memory (garbage text in release builds, ASAN use-after-poison in debug builds). Deep-clone log messages into default_allocator before the arena is freed.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughDeferred cleanup was added to TransformTask.run to clone parser/lexer error messages into the longer-lived allocator when any log messages exist. A concurrent test was added to ensure transform() parse errors retain stable message content and positions across many simultaneous invocations. Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Review rate limit: 4/5 reviews remaining, refill in 12 minutes. Comment |
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
|
Duplicate of #29958, which already addresses this same UAF and has been reworked per review to avoid the StringBuilder leak that the |
| // Error messages added by the parser/lexer may contain text allocated | ||
| // in the arena. Deep-clone them into default_allocator before the arena | ||
| // is freed so they remain valid when then() reads them on the JS thread. | ||
| defer if (this.log.msgs.items.len > 0) { | ||
| var new_log = logger.Log.init(bun.default_allocator); | ||
| new_log.level = this.log.level; | ||
| bun.handleOom(this.log.appendToWithRecycled(&new_log, true)); | ||
| this.log = new_log; | ||
| }; |
There was a problem hiding this comment.
🟡 Minor: appendToWithRecycled allocates a StringBuilder buffer and a notes_buf in bun.default_allocator to hold the cloned message text, but Log.deinit() only does msgs.clearAndFree() and never frees those buffers — so each async transform() parse error leaks the cloned error text. This is the same pre-existing pattern as every other appendToWithRecycled caller and trading a UAF for a small error-path leak is clearly the right call here; just flagging it as a candidate for a follow-up that gives Log proper ownership of cloned string buffers.
Extended reasoning...
What the bug is
The new defer block in TransformTask.run() calls this.log.appendToWithRecycled(&new_log, true) to deep-clone any parser/lexer error messages out of the per-task arena into bun.default_allocator before the arena is freed. This correctly fixes the use-after-free. However, the buffers that appendToWithRecycled allocates to hold the cloned text are never freed, so they leak.
Code path
In src/logger.zig, cloneToWithRecycled() (called by appendToWithRecycled) does two allocations against other.msgs.allocator — which here is bun.default_allocator because new_log was created with logger.Log.init(bun.default_allocator):
- line 818:
try string_builder.allocate(other.msgs.allocator);— one contiguous buffer holding all clonedtext,location.file,location.line_text, etc. - line 819:
var notes_buf = try other.msgs.allocator.alloc(Data, notes_count);— backing storage for cloned notes.
These buffers are only referenced via slices stored inside the cloned Msg entries; Log itself keeps no separate handle to them.
TransformTask.deinit() calls this.log.deinit(), and Log.deinit() (logger.zig:840) is just log.msgs.clearAndFree(). That frees the ArrayList(Msg) backing array but does not call Msg.deinit() on the entries, and does not free the StringBuilder buffer or notes_buf. There is even an explicit TODO at logger.zig:846 noting that deinit "does not de-initialize the log".
Why nothing else frees it
After run() returns, then() runs on the JS thread and calls this.log.toJS() → Msg.clone(), which allocator.dupes the text a second time into default_allocator for the BuildMessage. So the first clone made by appendToWithRecycled is consumed only by being copied again, and is then orphaned when this.log.deinit() drops the msgs array.
Step-by-step example
transpiler.transform("const x = ;;;")schedules aTransformTask.- On the worker thread,
run()createsarena, sets it as the transpiler allocator, parses, and the lexer pushes aMsgwhosedata.text = "Unexpected ;"andlocation.line_text = "const x = ;;;"are allocated inarena. - The new
deferfires:new_log = Log.init(default_allocator);appendToWithRecycled(&new_log, true)allocates a ~few-dozen-byteStringBuilderbuffer indefault_allocator, copies "Unexpected ;", the file path, and the line text into it, and rewritesnew_log.msgs.items[0]to point at those slices.this.log = new_log. arena.deinit()runs — fine, nothing points into it anymore (UAF fixed).- On the JS thread,
then()→log.toJS()→Msg.clone()dupes the text again intodefault_allocatorfor theBuildMessage, thendeinit()runs. this.log.deinit()→msgs.clearAndFree()frees the 1-elementMsgarray. TheStringBuilderbuffer and (empty here, but in general)notes_buffrom step 3 are never freed.
Repeat 64 times (as the regression test does) → 64 small leaks.
Impact
Small and bounded: it only triggers on the error path of async transform(), and each leak is roughly the size of the error message text + line text + file path (tens of bytes). For typical usage this is negligible. It could add up for long-running tooling that repeatedly calls transform() on broken input (e.g. a watch loop over a file with a syntax error), but it will not affect correctness.
Crucially, this PR replaces a use-after-free (a real memory-safety bug producing garbage error messages / ASAN crashes) with a small error-path leak — strictly an improvement.
Why this is a nit, not a blocker
Log has no ownership model for message string content. appendToWithRecycled is the canonical pattern in this codebase for moving log messages out of a dying arena, and every existing caller (e.g. BundleThread.zig) has the identical leak. This PR is following established convention; fixing it properly requires Log to track owned string buffers (or for Log.deinit() to walk msgs and free data/notes), which is out of scope for a targeted UAF fix.
Possible follow-up fix
Either (a) have cloneToWithRecycled stash the StringBuilder.ptr[0..cap] and notes_buf on Log so deinit() can free them, or (b) make Log.deinit() iterate msgs.items and call msg.deinit(log.msgs.allocator) before clearAndFree(). (b) is risky because many call sites push arena-backed or static text into logs, so (a) — explicit ownership of the clone buffers — is probably safer.
What does this PR do?
Fixes a use-after-free when
new Bun.Transpiler().transform(code)(the async variant) is called with input that produces parse errors.Root cause
TransformTask.run()executes on a worker thread and creates aMimallocArenafor parsing, which is freed viadefer arena.deinit()whenrun()returns. The transpiler allocator is set to this arena, so when the lexer/parser add error messages (e.g. viaaddRangeError→std.fmt.allocPrint(self.allocator, ...)), the message text lives in the arena.The
Log.msgsArrayList itself is backed bybun.default_allocator, so it survives — but its entries point into the now-freed arena. Whenthen()runs on the JS thread and callsthis.log.toJS()→Msg.clone()→allocator.dupe(u8, text), it reads freed memory.In release builds this manifests as garbage error messages (e.g.
""or arbitrary bytes instead of"Unexpected ;"). In ASAN builds it's ause-after-poison.Fix
Before the arena is freed, deep-clone any log messages into
bun.default_allocatorviaappendToWithRecycledso the text remains valid forthen().How did you verify your code works?
main(error message comes back as garbage) and passes with this fix.transpiler.test.jsandtranspiler-tsconfig-uaf.test.tspass.