
node-fallbacks: embed browser polyfills zstd-compressed #30347

Draft: sosukesuzuki wants to merge 2 commits into main from claude/node-fallbacks-zstd


@sosukesuzuki


Summary

The bundled node-fallbacks/*.js polyfills are embedded as plain-text
strings in __TEXT,__const (~1 MB, 477 KB of which is crypto.js alone).
They are only ever read by bun build --target=browser, so everyone who
never bundles for the browser pays the 1 MB on disk for nothing.

This PR compresses each polyfill with zstd-19 at codegen time (~85 %
ratio) and @embedFiles the .js.zst in release builds. The first
access to a given module decompresses it into the heap and caches it for
the process lifetime. Debug builds keep reading the uncompressed .js
at runtime so JS-only edits still don't trigger a Zig rebuild.

Size

| | before | after |
|---|---|---|
| release binary (arm64-darwin) | 61,878,048 B | 61,068,976 B (−809 KB, −1.3 %) |
| `__TEXT,__const` | 7,003,492 B | 6,200,164 B (−803 KB) |
| crypto.js polyfill in binary | 477,363 B | 71,233 B |

Performance

bun build --target=browser, hyperfine, warmup=3, runs=30, Apple M4 Max:

worst case — entry imports the 5 largest polyfills (crypto, stream,
assert, zlib, http; all decompressed on the path):

| | mean ± σ |
|---|---|
| before | 44.6 ms ± 2.8 ms |
| after | 45.8 ms ± 2.2 ms |
| ratio | 1.03 ± 0.08 (within noise) |

no polyfills (console.log("hello") only): no measurable
difference. Decompression is lazy; it never runs unless a polyfill is
actually requested.

Bundler output is byte-identical before and after.

Implementation

  • src/node-fallbacks/build-fallbacks.ts — write a .js.zst next to
    each bundled .js (Bun.zstdCompressSync(..., { level: 19 })).
  • scripts/build/codegen.ts — declare both .js and .js.zst as
    ninja outputs of the codegen step.
  • build.zig — register node-fallbacks/*.js.zst as anonymous imports
    so @embedFile can resolve them (replaces the .js registrations;
    react-refresh.js is unchanged since it's referenced from
    bake.zig, not the fallback resolver).
  • src/resolver/node_fallbacks.zig: createSourceCodeGetter lazily
    decompresses on first call and caches the result. Same pattern as
    add_completions.zig's zstd-compressed shell-completion data.

Test plan

  • bun bd test test/bundler/bundler_browser.test.ts — 12 pass / 0 fail
  • release build links and --revision works
  • bun build --target=browser output diff'd byte-identical against
    a pre-change build
  • CI

The bundled `node-fallbacks/*.js` polyfills are embedded as plain-text
strings in `__TEXT,__const` (~1 MB, 477 KB of which is `crypto.js` alone).
They are only ever read by `bun build --target=browser`, so everyone else
pays the 1 MB on disk for nothing.

Compress each polyfill with zstd-19 at codegen time (~85% ratio) and
`@embedFile` the `.js.zst` in release builds. The first access to a given
module decompresses it into the heap and caches it for the process
lifetime. Debug builds keep reading the uncompressed `.js` at runtime so
JS-only edits don't trigger a Zig rebuild.

Binary size: 61,878,048 → 61,068,976 B (−809 KB, −1.3%).
Performance (`bun build --target=browser`, hyperfine, 30 runs):
- worst case (5 largest polyfills imported): 44.6 ms ± 2.8 vs 45.8 ms ±
  2.2 — within noise (1.03 ± 0.08x).
- no polyfills imported: no difference (decompression is lazy).
Bundler output is byte-identical before and after.
@robobun

robobun commented May 7, 2026

Updated 6:36 PM PT - May 7th, 2026

@sosukesuzuki, your commit 4f95d57 has 2 failures in Build #52701 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 30347

That installs a local version of the PR into your bun-30347 executable, so you can run:

bun-30347 --bun

Jarred-Sumner pushed a commit that referenced this pull request May 28, 2026
## Summary

Rust counterpart of #30347 (which targeted the old Zig implementation
and is now stale).

The bundled `node-fallbacks/*.js` browser polyfills are embedded as
plain-text strings in `.rodata` (~1 MB, 477 KB of which is `crypto.js`
alone). They are only ever read when bundling with `--target=browser`,
so everyone who never bundles for the browser pays the ~1 MB on disk for
nothing.

This PR compresses each polyfill with zstd-19 at codegen time (~85%
ratio) and embeds the `.js.zst` in release builds (`bun_codegen_embed`).
The first access to a given module decompresses it into the heap and
caches it for the process lifetime. Debug builds keep reading the
uncompressed `.js` from `BUN_CODEGEN_DIR` at runtime, so JS-only edits
still don't trigger a native rebuild.

## Size

Measured with the `btg` profile (Release + LTO, linux x64), both builds
at the same base commit, same build directory:

| | before | after | Δ |
|---|---|---|---|
| stripped binary | 74,713,200 B | 73,910,384 B | **−802,816 B (−1.07%)** |
| `.rodata` | 22,036,416 B | 21,232,448 B | −803,968 B |
| `.text` | 52,332,682 B | 52,338,218 B | +5,536 B (lazy-decompress getters) |
| `crypto.js` polyfill in binary | 477,051 B | 71,164 B | |
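
The deltas in the table can be re-derived from the before/after columns. The following is a minimal sketch for checking that arithmetic; the `delta` helper is a hypothetical name introduced here and is not part of the PR.

```rust
/// Byte delta (after - before) and the same change as a percentage of `before`.
/// Hypothetical helper for re-checking the table, not code from the PR.
fn delta(before: u64, after: u64) -> (i64, f64) {
    let d = after as i64 - before as i64;
    (d, d as f64 * 100.0 / before as f64)
}

fn main() {
    // Figures copied from the size table.
    let rows = [
        ("stripped binary", 74_713_200u64, 73_910_384u64),
        (".rodata", 22_036_416, 21_232_448),
        (".text", 52_332_682, 52_338_218),
        ("crypto.js polyfill", 477_051, 71_164),
    ];
    for (name, before, after) in rows {
        let (d, pct) = delta(before, after);
        // Signed byte delta and signed percentage, e.g. "-802816 B (-1.07%)".
        println!("{name:<20} {d:>+10} B ({pct:+.2}%)");
    }
}
```

The last row is where the "~85% ratio" figure shows up: the compressed `crypto.js` is roughly 15% of its original size.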

## Performance

`bun build --target=browser`, hyperfine, warmup=3, runs=30, linux x64
(Xeon Platinum 8488C), Release+LTO builds:

**worst case** — entry imports the 5 largest polyfills (crypto, stream,
assert, zlib, http; all decompressed on the path):

| | mean ± σ |
|---|---|
| before | 33.9 ms ± 0.6 ms |
| after | 34.5 ms ± 0.5 ms |
| ratio | 1.02 ± 0.02 (≈ +0.6 ms once per build for ~700 KB of one-time zstd decompression) |
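
The PR does not say how the ± on the ratio was obtained. One plausible reading, assumed here, is first-order error propagation for a quotient of two independent means. Under that assumption the reported value can be reproduced as follows (`ratio_with_sigma` is a hypothetical helper name):

```rust
/// Ratio of two means with first-order error propagation for a quotient:
/// sigma_r = r * sqrt((sigma_a / a)^2 + (sigma_b / b)^2).
/// Assumes the two measurements are independent.
fn ratio_with_sigma(after: (f64, f64), before: (f64, f64)) -> (f64, f64) {
    let (a, sa) = after;
    let (b, sb) = before;
    let r = a / b;
    let sigma = r * ((sa / a).powi(2) + (sb / b).powi(2)).sqrt();
    (r, sigma)
}

fn main() {
    // (mean, sigma) in milliseconds, copied from the worst-case table.
    let (r, sigma) = ratio_with_sigma((34.5, 0.5), (33.9, 0.6));
    println!("ratio {r:.2} ± {sigma:.2}"); // prints "ratio 1.02 ± 0.02"
}
```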

**no polyfills** — `console.log("hello")` entry: 5.1 ms vs 5.2 ms (1.01
± 0.05, within noise).

**runtime startup** — `bun -e 'console.log(1)'`: 10.1 ms vs 10.1 ms
(1.00 ± 0.04). Decompression is lazy; it never runs unless a polyfill is
actually requested.

Bundler output is byte-identical before and after (verified for an entry
importing all 23 polyfills and for the worst-case entry above).

## Implementation

- `src/node-fallbacks/build-fallbacks.ts` — write a `.js.zst` next to
each bundled `.js` (`Bun.zstdCompressSync(..., { level: 19 })`).
- `scripts/build/codegen.ts` — declare both `.js` and `.js.zst` as ninja
outputs of the codegen step (both feed the cargo edge's implicit
inputs).
- `src/resolver/node_fallbacks.rs` — `create_source_code_getter!` embeds
the `.zst` via `include_bytes!` under `cfg(bun_codegen_embed)` and
lazily decompresses into a per-module `bun_core::Once<String>` on first
call; the `cfg(not(bun_codegen_embed))` (debug) path is unchanged.
- `src/resolver/Cargo.toml` — add `bun_zstd` dependency.

`react-refresh.js` is unchanged since it's referenced from the bake dev
server, not the fallback resolver.
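
The lazy-decompress-and-cache pattern described for `create_source_code_getter!` can be illustrated with a standard-library-only sketch. Everything named below is hypothetical: `lazy_source!` stands in for the real macro, `std::sync::OnceLock` stands in for `bun_core::Once`, `decode` is a placeholder for the zstd decode call (it only validates UTF-8 so no external crate is needed), and inline byte literals replace `include_bytes!`. The `cfg(not(bun_codegen_embed))` debug path is omitted.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the stand-in decoder runs, to make the caching visible.
static DECODE_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Placeholder for a real zstd decode. It only checks that the bytes are
/// UTF-8 and copies them onto the heap.
fn decode(compressed: &'static [u8]) -> String {
    DECODE_CALLS.fetch_add(1, Ordering::Relaxed);
    String::from_utf8(compressed.to_vec()).expect("polyfill source is UTF-8")
}

/// Hypothetical macro: defines `fn $name() -> &'static str` backed by its own
/// `OnceLock`, so each module is decoded at most once per process.
macro_rules! lazy_source {
    ($name:ident, $bytes:expr) => {
        fn $name() -> &'static str {
            // One static per expansion, i.e. one cache per module.
            static CACHE: OnceLock<String> = OnceLock::new();
            CACHE.get_or_init(|| decode($bytes)).as_str()
        }
    };
}

lazy_source!(crypto_source, b"/* crypto polyfill */");
lazy_source!(stream_source, b"/* stream polyfill */");

fn main() {
    let first = crypto_source(); // first access: decodes and caches
    let second = crypto_source(); // later access: served from the cache
    println!("same buffer: {}", std::ptr::eq(first, second));
    let _ = stream_source(); // a different module has its own cache
    println!("decode calls: {}", DECODE_CALLS.load(Ordering::Relaxed));
}
```

Because the cache is a function-local `static`, a module that is never requested never pays for decoding, which matches the "no polyfills" and runtime-startup measurements above.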

## Test plan

- [x] `bun bd test test/bundler/bundler_browser.test.ts` — 12 pass / 0
fail (debug build, runtime-load path)
- [x] same test file run with the Release+LTO binary (exercises the
embed + decompress path) — 12 pass / 0 fail
- [x] `bun build --target=browser` output diff'd byte-identical against
a pre-change build (all 23 polyfills)
- [ ] CI