Skip to content

fmt: route pure-{s} allocPrint through shared concat (-287 KB .text)#30358

Closed
robobun wants to merge 2 commits into
mainfrom
farm/e6717373/allocprint-concat
Closed

fmt: route pure-{s} allocPrint through shared concat (-287 KB .text)#30358
robobun wants to merge 2 commits into
mainfrom
farm/e6717373/allocprint-concat

Conversation

@robobun

@robobun robobun commented May 7, 2026

Copy link
Copy Markdown
Collaborator

What

std.fmt.allocPrint / std.fmt.allocPrintSentinel generate a fresh Writer.print monomorphization per (fmt, args-tuple) pair. Of the 406 + 22 call sites in src/, ~35% use format strings composed solely of bare {s} placeholders and literal text — "{s}@{s}", "_{s}", "workspace:{s}", "{s}.exe", "X-Amz-Date={s}", "./{s}", "{s}/{s}", and so on — which are plain string concatenation in disguise.

How

Adds bun.fmt.allocPrint / bun.fmt.allocPrintSentinel / bun.fmt.allocPrintZ. At comptime they parse the format string; when it's purely {s} + literals (no width/precision/positional/escapes) they lower the call to a flat [N][]const u8 array fed to a single shared bun.strings.concat (or a sentinel-terminated equivalent). Any other format string forwards unchanged to std.fmt.allocPrint*.

All src/ call sites are mechanically swapped to the bun.fmt entry points, and test/internal/ban-words.test.ts now bans direct std.fmt.allocPrint( / std.fmt.allocPrintSentinel( so the convention sticks.

Binary size (linux-x64 release, stripped)

before after Δ
bun 91,569,096 B 91,257,800 B −304 KB
.text 53,962,252 B 53,675,020 B −287 KB
symbol group before after
Io.Writer.print__anon_* 5031 4364
std.fmt.allocPrint__anon_* 1111 494

Verification

  • zig:check-all passes (all targets, debug + release).
  • No behavior change: bun.fmt.allocPrint output is byte-identical to std.fmt.allocPrint (Zig test block in fmt.zig asserts parity for the fast-path patterns and verifies non-{s} formats fall through).
  • Integration tests for touched call sites pass with the release build: yarn-lock migration (16/16), bun-add (same pass/fail as main — only pre-existing git-network failures), bun-pm (12/12), bun-audit + bun-pm-why (44/44), bun-pack (70/70), es-decorators (147/147), bundler edgecase (103/103), s3-signature-order (1/1), bunshell (354/355 — same pre-existing EACCES-as-root failure as main).
  • Spot-checked bun install / bun outdated / bun build error output manually.

std.fmt.allocPrint/allocPrintSentinel generate a fresh Writer.print
monomorphization per (fmt, args-tuple) pair. ~35% of the 406+22 call
sites in src/ use format strings composed solely of bare {s}
placeholders and literal text — "{s}@{s}", "_{s}", "workspace:{s}",
"{s}.exe", "X-Amz-Date={s}", and so on — which are plain string
concatenation in disguise.

Add bun.fmt.allocPrint / allocPrintSentinel / allocPrintZ. At comptime
they detect the pure-{s} shape and lower it to a flat
[N][]const u8 array fed to a single shared bun.strings.concat (or a
sentinel-terminated equivalent). Everything else forwards unchanged to
std.fmt. Then mechanically swap all src/ call sites to the bun.fmt
entry points and ban direct std.fmt.allocPrint* use going forward.

linux-x64 release, stripped:
  before  91,569,096 B
  after   91,257,800 B   (-304 KB; .text -287 KB)

Io.Writer.print__anon_* instantiations: 5031 -> 4364
std.fmt.allocPrint__anon_* instantiations: 1111 -> 494
@coderabbitai

coderabbitai Bot commented May 7, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Adds bun.fmt allocPrint/allocPrintZ/allocPrintSentinel with a comptime fast path and replaces std.fmt allocPrint usages across the repository. Documentation and tests ban std.fmt allocPrint variants. One behavioral tweak: stricter permessage-deflate negotiation during WebSocket upgrade.

Changes

Formatter migration and enforcement

Layer / File(s) Summary
Core: bun.fmt allocPrint family implementation
src/bun_core/fmt.zig
Introduces allocPrint(+Z/+Sentinel) with a comptime pure-{s} concat fast path and fallbacks.
Policy/tests: ban std.fmt allocPrint variants
src/CLAUDE.md, test/internal/ban-words.test.ts, test/internal/ban-limits.json
Docs reflect bun.* usage; tests add bans and limits for std.fmt allocPrint variants.
Repo-wide migration to bun.fmt
src/**, test/**
Replaces std.fmt allocPrint/allocPrintSentinel/bufPrint with bun.fmt counterparts across bake, bundler, CLI, install, runtime, parser, shell, JSC, networking, S3, etc.; WebSocket extension negotiation tightened.

@github-actions github-actions Bot added the claude label May 7, 2026
@robobun

robobun commented May 7, 2026

Copy link
Copy Markdown
Collaborator Author
Updated 1:20 AM PT - May 7th, 2026

@autofix-ci[bot], your commit 2b55f9b has 2 failures in Build #52451 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 30358

That installs a local version of the PR into your bun-30358 executable, so you can run:

bun-30358 --bun

@robobun

robobun commented May 7, 2026

Copy link
Copy Markdown
Collaborator Author

CI status (build 52451, all lanes complete):

CI's own binary-size check confirms the savings on every target — −249 KB to −368 KB vs main #52300 (darwin-aarch64 −290.7 KB, linux-x64 −336 KB, freebsd-x64 −368 KB, windows-x64 −292.5 KB, etc.).

The two red annotations are pre-existing flakes unrelated to this PR:

  • test/js/bun/test/parallel/test-http-should-emit-close-when-connection-is-aborted.ts — Windows-only timeout also present on concurrent builds 52441 / 52446 / 52447 / 52448 / 52450 from unrelated branches. Exercises node:http req.on('close') after AbortController.abort(); nothing here touches that path.
  • test/js/node/async_hooks/AsyncLocalStorage-tracking.test.tsasync-context-fs-watch.js segfault at 0xC on darwin-aarch64. Known fs.watch / FSEvents race with open fixes in fs_events(darwin): fix broken double-checked locking on fsevents_default_loop #30111 and fs.watch: fully initialize ConcurrentTask in FSWatchTask.enqueue #29950. The only node_fs.zig edit in this PR is a std.fmtbun.fmt swap on a "Failed to watch file {f}" error string (uses {f}, so it falls straight through to std.fmt.allocPrint — zero behavior change).

Remaining flaky-annotation entries (jsc-stress mprotect on Windows, hot.test.ts reload counter, fetch-http2 AtomStringImpl assertion) passed on retry.

All coderabbit inline findings are pre-existing on main and out of scope for this mechanical refactor (replied + resolved).

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find correctness issues, but this introduces a new comptime format-string parser plus a codebase-wide convention (enforced via ban-words) across 125 files in core paths — worth a human sign-off, and CI build-zig is currently red on most targets.

Extended reasoning...

Overview

This PR adds bun.fmt.allocPrint / allocPrintSentinel / allocPrintZ in src/bun_core/fmt.zig — wrappers that comptime-parse the format string and, when it consists solely of bare {s} placeholders and literals, lower the call to a shared bun.strings.concat to reduce per-call-site monomorphization. All 400+ std.fmt.allocPrint* call sites across 120+ source files (bundler, js_parser, install, resolver, runtime/server, s3 signing, shell, sql, etc.) are mechanically swapped to the new entry points, test/internal/ban-words.test.ts bans the old spellings, and src/CLAUDE.md documents the convention. One file (s3/client.zig) also drops its now-unused std import.

Security risks

None identified. The fast path is a pure string concatenation that is byte-identical to std.fmt's output for the patterns it accepts; everything else (positional args, named args like {[key]s}, width/precision, escapes, non-{s} specifiers, by-value arrays) falls through to std.fmt unchanged. No new untrusted-input parsing surface is introduced.

Level of scrutiny

High. While the per-call-site edits are mechanical s/std.fmt/bun.fmt/, the new parseConcatPlan comptime parser and concatArgSlice coercion are load-bearing for string formatting throughout the codebase, and the change establishes a new enforced convention. The diff also touches hot/critical paths (bundler linker, JS parser, package installer, S3 request signing, HTTP server). A maintainer should validate that the comptime classifier's accepted-type set exactly matches std.fmt's {s} semantics (e.g. sentinel-terminated pointers, optionals) and that the ban-words limits won't cause friction.

Other factors

CI (robobun) reports build-zig failures on essentially every target for commit 0aec1db, which contradicts the PR description's "zig:check-all passes" — that needs to be reconciled before merge regardless of review outcome. There is a follow-up autofix.ci commit but no green build reported yet. The PR includes a Zig test block asserting fast-path/std.fmt parity and fall-through classification, which is good, but the integration test claims in the description can't be verified from here. Given the breadth, the new convention being locked in via ban-words, and the red CI, this should not be auto-approved.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cli/update_interactive_command.zig (1)

450-460: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid leaking the temporary catalog_key buffer.

When pkg.catalog_name is present, Line 451 allocates catalog_key, but Line 459 inserts a duplicated copy into the map and the original allocation is never freed. Each selected catalog update leaks one buffer.

Proposed fix
-                const catalog_key = if (pkg.catalog_name) |catalog_name|
-                    try bun.fmt.allocPrint(bun.default_allocator, "{s}:{s}", .{ pkg.name, catalog_name })
-                else
-                    pkg.name;
+                const catalog_key = if (pkg.catalog_name) |catalog_name|
+                    try bun.fmt.allocPrint(bun.default_allocator, "{s}:{s}", .{ pkg.name, catalog_name })
+                else
+                    try bun.default_allocator.dupe(u8, pkg.name);
+                errdefer bun.default_allocator.free(catalog_key);
@@
-                try catalog_updates.put(try bun.default_allocator.dupe(u8, catalog_key), .{
+                try catalog_updates.put(catalog_key, .{
                     .version = try bun.default_allocator.dupe(u8, target_version),
                     .workspace_path = catalog_workspace_path,
                 });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cli/update_interactive_command.zig` around lines 450 - 460, The temporary
catalog_key buffer allocated when pkg.catalog_name is present is leaked because
you duplicate it again into catalog_updates.put; update the code to avoid
leaking by either (a) not allocating a separate temporary (compute the final key
directly for insertion), or (b) after calling catalog_updates.put, free the
original catalog_key allocation; locate the allocation and use of catalog_key in
the block around catalog_key, catalog_workspace_path and the catalog_updates.put
call and ensure the original allocation is deallocated (or eliminated) so only
the stored copy remains.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/bundler_jsc/JSBundleCompletionTask.zig`:
- Around line 375-383: The current code calls bun.fmt.allocPrint("{s}.map", ...)
and immediately passes it into bun.path.basename which returns a slice into that
heap buffer, then loses the original allocation causing a leak; fix by assigning
the allocPrint result to a temporary (e.g., full_map_path_alloc), call const
map_basename = bun.path.basename(full_map_path_alloc), and immediately defer
freeing full_map_path_alloc via bun.default_allocator.free(full_map_path_alloc)
(use bun.handleOom where appropriate), ensuring map_basename remains valid
because it is a subslice of the still-live buffer until you free at the end of
the function or after use; apply the same pattern for the other
bun.fmt.allocPrint call used to build sourcemap_full_path so all allocPrint
allocations are freed instead of leaked.

In `@src/semver/SemverQuery.zig`:
- Line 176: Group.fmt is being called without the required buf: string argument
in the line creating temp; update the call to pass the Group's input field so
fmt receives the buffer. Locate the Group.fmt method and the call site where
temp is assigned (const temp = try bun.fmt.allocPrint(... "{f}", .{this.fmt()}))
and change the invocation to this.fmt(this.input) so the existing input string
is used as the buf parameter.

In `@src/shell/builtin/cp.zig`:
- Around line 478-484: The error message uses this.src twice; change the second
format argument in the allocPrint call inside the branch that checks
bun.strings.eql(src, tgt) so the format parameters are .{ this.src, this.tgt }
(update the allocPrint invocation that currently passes .{ this.src, this.src
}), ensuring the error string reports both the original src and the
user-supplied tgt; keep usage of bun.handleOom and bun.fmt.allocPrint unchanged.

---

Outside diff comments:
In `@src/cli/update_interactive_command.zig`:
- Around line 450-460: The temporary catalog_key buffer allocated when
pkg.catalog_name is present is leaked because you duplicate it again into
catalog_updates.put; update the code to avoid leaking by either (a) not
allocating a separate temporary (compute the final key directly for insertion),
or (b) after calling catalog_updates.put, free the original catalog_key
allocation; locate the allocation and use of catalog_key in the block around
catalog_key, catalog_workspace_path and the catalog_updates.put call and ensure
the original allocation is deallocated (or eliminated) so only the stored copy
remains.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 08b840a6-4d64-4c24-b1b0-239ff37c2058

📥 Commits

Reviewing files that changed from the base of the PR and between c5a2f8f and 2b55f9b.

📒 Files selected for processing (125)
  • src/CLAUDE.md
  • src/bake/DevServer/DirectoryWatchStore.zig
  • src/bake/production.zig
  • src/bun.js.zig
  • src/bun_core/fmt.zig
  • src/bun_core/output.zig
  • src/bundler/Chunk.zig
  • src/bundler/LinkerContext.zig
  • src/bundler/ParseTask.zig
  • src/bundler/ServerComponentParseTask.zig
  • src/bundler/bundle_v2.zig
  • src/bundler/entry_points.zig
  • src/bundler/linker.zig
  • src/bundler/linker_context/MetafileBuilder.zig
  • src/bundler/linker_context/computeChunks.zig
  • src/bundler/linker_context/generateChunksInParallel.zig
  • src/bundler/linker_context/generateCodeForLazyExport.zig
  • src/bundler/linker_context/generateCompileResultForHtmlChunk.zig
  • src/bundler/linker_context/scanImportsAndExports.zig
  • src/bundler/linker_context/writeOutputFilesToDisk.zig
  • src/bundler/options.zig
  • src/bundler_jsc/JSBundleCompletionTask.zig
  • src/bundler_jsc/PluginRunner.zig
  • src/cli/Arguments.zig
  • src/cli/audit_command.zig
  • src/cli/build_command.zig
  • src/cli/bunfig.zig
  • src/cli/bunx_command.zig
  • src/cli/create_command.zig
  • src/cli/multi_run.zig
  • src/cli/pack_command.zig
  • src/cli/package_manager_command.zig
  • src/cli/pm_pkg_command.zig
  • src/cli/pm_version_command.zig
  • src/cli/publish_command.zig
  • src/cli/repl.zig
  • src/cli/run_command.zig
  • src/cli/test/ChangedFilesFilter.zig
  • src/cli/test/parallel/runner.zig
  • src/cli/test_command.zig
  • src/cli/update_interactive_command.zig
  • src/cli/upgrade_command.zig
  • src/cli/why_command.zig
  • src/crash_handler/crash_handler.zig
  • src/css/css_modules.zig
  • src/css/error.zig
  • src/css/media_query.zig
  • src/dns/dns.zig
  • src/http/AsyncHTTP.zig
  • src/http_jsc/websocket_client.zig
  • src/http_jsc/websocket_client/WebSocketUpgradeClient.zig
  • src/ini/ini.zig
  • src/install/PackageInstaller.zig
  • src/install/PackageManager/PackageJSONEditor.zig
  • src/install/PackageManager/PackageManagerLifecycle.zig
  • src/install/PackageManager/UpdateRequest.zig
  • src/install/PackageManager/patchPackage.zig
  • src/install/PackageManager/runTasks.zig
  • src/install/hosted_git_info.zig
  • src/install/lockfile/Package.zig
  • src/install/lockfile/bun.lock.zig
  • src/install/npm.zig
  • src/install/patch_install.zig
  • src/install/yarn.zig
  • src/install_jsc/install_binding.zig
  • src/interchange/toml/lexer.zig
  • src/js_parser/ast/E.zig
  • src/js_parser/ast/P.zig
  • src/js_parser/ast/Parser.zig
  • src/js_parser/ast/lowerDecorators.zig
  • src/js_parser/ast/maybe.zig
  • src/js_parser/ast/parseStmt.zig
  • src/js_parser/ast/visitExpr.zig
  • src/js_parser/lexer.zig
  • src/jsc/AsyncModule.zig
  • src/jsc/BuildMessage.zig
  • src/jsc/JSGlobalObject.zig
  • src/jsc/ResolveMessage.zig
  • src/jsc/VirtualMachine.zig
  • src/jsc/ZigStackFrame.zig
  • src/jsc/web_worker.zig
  • src/logger/logger.zig
  • src/options_types/schema.zig
  • src/patch_jsc/testing.zig
  • src/resolver/fs.zig
  • src/resolver/resolver.zig
  • src/runtime/api/BunObject.zig
  • src/runtime/api/JSTranspiler.zig
  • src/runtime/api/bun/js_bun_spawn_bindings.zig
  • src/runtime/api/cron.zig
  • src/runtime/api/html_rewriter.zig
  • src/runtime/crypto/PasswordObject.zig
  • src/runtime/dns_jsc/dns.zig
  • src/runtime/ffi/ffi.zig
  • src/runtime/node/node_fs.zig
  • src/runtime/server/NodeHTTPResponse.zig
  • src/runtime/server/RequestContext.zig
  • src/runtime/server/ServerConfig.zig
  • src/runtime/server/server.zig
  • src/runtime/socket/tls_socket_functions.zig
  • src/runtime/webcore/Blob.zig
  • src/runtime/webcore/Request.zig
  • src/runtime/webcore/fetch.zig
  • src/runtime/webcore/fetch/FetchTasklet.zig
  • src/runtime/webcore/s3/client.zig
  • src/runtime/webview/ChromeProcess.zig
  • src/s3_signing/credentials.zig
  • src/semver/SemverQuery.zig
  • src/shell/Builtin.zig
  • src/shell/builtin/cp.zig
  • src/shell/builtin/seq.zig
  • src/shell/interpreter.zig
  • src/shell/shell.zig
  • src/shell/states/Expansion.zig
  • src/shell/subproc.zig
  • src/shell_parser/braces.zig
  • src/sql_jsc/mysql/JSMySQLConnection.zig
  • src/sql_jsc/postgres/PostgresSQLConnection.zig
  • src/sql_jsc/postgres/Signature.zig
  • src/standalone_graph/StandaloneModuleGraph.zig
  • src/string/string.zig
  • src/test_runner/jest.zig
  • src/url/url.zig
  • test/internal/ban-limits.json
  • test/internal/ban-words.test.ts

Comment thread src/bundler_jsc/JSBundleCompletionTask.zig
Comment thread src/semver/SemverQuery.zig
Comment thread src/shell/builtin/cp.zig
@robobun

robobun commented May 15, 2026

Copy link
Copy Markdown
Collaborator Author

Closing: this PR is obsolete after #30412 (Rewrite Bun in Rust), which landed on main on 2026-05-14.

The .zig files this PR modifies are now porting-reference material only — per the new src/CLAUDE.md: "You will see .zig siblings next to many .rs files — those are the original implementation kept as a porting reference for behavior; they are not compiled and are not where new code goes." build.zig no longer exists on main.

The std.fmt.allocPrint monomorphization this PR consolidated (−287 KB .text / −249 to −368 KB across targets, measured against c5a2f8f) no longer contributes to the binary. The Rust equivalent would be format!() bloat, which has very different characteristics and would need a fresh analysis against the Rust build.

@robobun robobun closed this May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant