🤖 Optimize bundle size - reduce packaged app by 56% #354
Conversation
Reduced main bundle from 2.68 MB to 2.14 MB (19% reduction, 682 KB → 535 KB gzipped).

**Shiki syntax highlighting:**
- Switch from full Shiki bundle to shiki/core
- Use on-demand language loading via dynamic imports
- Theme loaded dynamically (min-dark.mjs)
- Eliminated 638 language grammars from main bundle

**Tokenizer encodings:**
- Made o200k_base (6.3 MB) and claude (1.9 MB) lazy-load on-demand
- Total 8.2 MB now loaded only when specific model is used
- Added encoding cache to avoid redundant loads

**Mermaid diagrams:**
- Lazy-load Mermaid component with React.lazy()
- Moved 497 KB to separate chunk
- Added Suspense boundary with loading fallback

**Build config:**
- Added rollup-plugin-visualizer for bundle analysis
- Configured to generate stats.html in production builds

Remaining bundle size (2.14 MB) includes React, Emotion, core UI components, and the Vercel AI SDK. All heavy dependencies are now code-split appropriately.

_Generated with `cmux`_
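As a rough sketch of the `shiki/core` pattern this commit describes, assuming Shiki's v1+ fine-grained APIs; the helper names (`getHighlighter`, `ensureLang`) and caching shape are illustrative, not the PR's actual code:

```ts
import { createHighlighterCore, type HighlighterCore } from "shiki/core";
import { createOnigurumaEngine } from "shiki/engine/oniguruma";

let highlighterPromise: Promise<HighlighterCore> | null = null;

// Create a single highlighter with no grammars preloaded; the theme and
// WASM engine are dynamic imports, so Vite splits them into lazy chunks.
function getHighlighter(): Promise<HighlighterCore> {
  highlighterPromise ??= createHighlighterCore({
    themes: [import("shiki/themes/min-dark.mjs")],
    langs: [], // grammars are registered on demand in ensureLang()
    engine: createOnigurumaEngine(import("shiki/wasm")),
  });
  return highlighterPromise;
}

const loadedLangs = new Set<string>();

// Load a grammar the first time it is requested; each language becomes
// its own chunk instead of shipping all 638 grammars up front.
export async function ensureLang(lang: string): Promise<void> {
  if (loadedLangs.has(lang)) return;
  const highlighter = await getHighlighter();
  const grammar = await import(`shiki/langs/${lang}.mjs`);
  await highlighter.loadLanguage(grammar.default);
  loadedLangs.add(lang);
}
```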
Reduced packaged app size by moving frontend dependencies to devDependencies since they're bundled by Vite and don't need to be in production node_modules.

**Before:**
- AppImage: 192 MB
- app.asar: 234 MB
- Packaged node_modules: 394 packages

**After:**
- AppImage: 157 MB (18% reduction)
- app.asar: 102 MB (56% reduction!)
- Packaged node_modules: 140 packages (64% fewer)

**Moved to devDependencies (renderer-only):**
- React ecosystem: react, react-dom, react-compiler-runtime, react-dnd, react-dnd-html5-backend
- Markdown: react-markdown, rehype-*, remark-*
- UI libraries: @emotion/*, cmdk
- Syntax highlighting: shiki (bundled into dist/)
- Diagrams: mermaid (lazy-loaded chunk)

**Kept in dependencies (main process needs):**
- AI SDK: ai, @ai-sdk/anthropic, @ai-sdk/openai, ai-tokenizer
- Core utils: chalk, crc-32, diff, disposablestack, escape-html, jsonc-parser, lru-cache, markdown-it, minimist, posthog-js, source-map-support, undici, write-file-atomic, zod, zod-to-json-schema
- Electron: electron-updater

_Generated with `cmux`_
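In package.json terms, the split looks roughly like this (abridged illustration; version ranges omitted). This works because the packager (the AppImage/app.asar output suggests electron-builder) ships only production `dependencies` into the asar, while everything Vite already bundles into `dist/` can live in `devDependencies`:

```json
{
  "dependencies": {
    "ai": "...",
    "@ai-sdk/anthropic": "...",
    "@ai-sdk/openai": "...",
    "ai-tokenizer": "...",
    "electron-updater": "..."
  },
  "devDependencies": {
    "react": "...",
    "react-dom": "...",
    "@emotion/react": "...",
    "shiki": "...",
    "mermaid": "..."
  }
}
```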
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```ts
rollupOptions: {
  external: [
    // Externalize tokenizer encodings - these are large and should be lazy-loaded
    "ai-tokenizer/encoding/o200k_base",
    "ai-tokenizer/encoding/claude",
  ],
```
**Do not externalize tokenizer encoding modules**

The new Rollup config marks `ai-tokenizer/encoding/o200k_base` and `ai-tokenizer/encoding/claude` as external. At runtime `loadEncoding()` uses `import("ai-tokenizer/encoding/...")` from the renderer (see `src/utils/main/tokenizer.ts`). Because these modules are excluded from the bundle and devDependencies are not shipped with the packaged app, the browser will attempt to fetch `/ai-tokenizer/encoding/…`, which does not exist, leading to "Failed to fetch dynamically imported module" and falling back to approximations. Leave these modules bundled so that Vite emits lazy chunks instead of removing them entirely.
Useful? React with 👍 / 👎.
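A minimal sketch of the bundled alternative the reviewer suggests, with the `external` entries removed so Vite can see the dynamic imports; `loadEncoding()` is named in the review, but this body and cache shape are illustrative:

```ts
// Stand-in for ai-tokenizer's encoding module type.
type EncodingModule = unknown;

const encodingCache = new Map<string, Promise<EncodingModule>>();

// With no `external` entry for these specifiers, Vite turns each static
// string inside import() into its own lazy chunk, so the renderer fetches
// a real emitted file instead of a bare module path that 404s.
export function loadEncoding(name: "o200k_base" | "claude"): Promise<EncodingModule> {
  let cached = encodingCache.get(name);
  if (!cached) {
    cached =
      name === "o200k_base"
        ? import("ai-tokenizer/encoding/o200k_base")
        : import("ai-tokenizer/encoding/claude");
    encodingCache.set(name, cached);
  }
  return cached;
}
```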
```diff
   countTokens: (text: string) => {
-    // If tokenizer already loaded, use synchronous path for accurate counts
-    if (tokenizerModules) {
-      return countTokensCached(text, () => {
-        try {
-          return countTokensWithLoadedModules(text, modelString, tokenizerModules!);
-        } catch (error) {
-          // Unexpected error during tokenization, fallback to approximation
-          console.error("Failed to tokenize, falling back to approximation:", error);
-          return Math.ceil(text.length / 4);
-        }
-      });
-    }
-
-    // Tokenizer not yet loaded - use async path (returns approximation immediately)
+    // Always use async path since encodings are loaded on-demand
     return countTokensCached(text, async () => {
       await loadTokenizerModules();
       try {
-        return countTokensWithLoadedModules(text, modelString, tokenizerModules!);
+        return await countTokensWithLoadedModules(text, modelString, tokenizerModules!);
```
**Restore synchronous token counts after modules load**

After refactoring `countTokensWithLoadedModules` to be async, `getTokenizerForModel().countTokens` always invokes `countTokensCached` with an async function (lines 209–214). As a result `countTokensCached` never receives a synchronous value even once the tokenizer and encodings are in memory, so every new text returns only the length/4 approximation, and the accurate count is never surfaced to callers unless the exact same string is queried again. Token budgets in the UI and truncation logic will therefore consistently undercount and can exceed model limits. Keep the synchronous path once modules and encodings are loaded, or return a Promise so callers can await the real count.
Useful? React with 👍 / 👎.
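A sketch of the shape the reviewer is asking for; `encodingsReady`, `countTokensSync`, and `countTokensAsync` are hypothetical stand-ins for the file's real helpers, declared here only so the sketch type-checks:

```ts
declare function countTokensCached(
  text: string,
  compute: () => number | Promise<number>
): number;
declare function encodingsReady(model: string): boolean; // hypothetical
declare function countTokensSync(text: string, model: string): number; // hypothetical
declare function countTokensAsync(text: string, model: string): Promise<number>; // hypothetical

export function countTokens(text: string, model: string): number {
  // Once the tokenizer modules and encodings are in memory, compute
  // synchronously so the cache stores an accurate count on the first call.
  if (encodingsReady(model)) {
    return countTokensCached(text, () => countTokensSync(text, model));
  }
  // Otherwise return the ~length/4 approximation immediately and let the
  // async computation backfill the cache for subsequent calls.
  return countTokensCached(text, () => countTokensAsync(text, model));
}
```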
Add eslint-disable comments for intentional dynamic imports used for code-splitting:
- Mermaid component lazy loading
- Shiki language grammars on-demand
- Shiki WASM engine and theme

These dynamic imports are necessary for bundle optimization and are not hiding circular dependencies - they're explicit performance optimizations.
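A small illustration of what such a disable comment looks like at one of these import sites; the rule name below is assumed for illustration and the repo's actual comments may target a different rule:

```ts
// Lazy entry point for Mermaid: the disable comment marks the dynamic
// import as an intentional code-split point, not a hidden circular import.
export async function renderMermaid(source: string): Promise<string> {
  // eslint-disable-next-line no-restricted-syntax -- intentional code-splitting
  const { default: mermaid } = await import("mermaid");
  const { svg } = await mermaid.render("mermaid-diagram", source);
  return svg;
}
```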
The React.lazy approach was causing E2E tests to time out. While it reduced the main bundle slightly, the benefit doesn't outweigh the test failures. Keeping the Shiki and tokenizer optimizations, which provide the bulk of the bundle size reduction without breaking tests.
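For reference, the reverted pattern looked roughly like this (the `./MermaidDiagram` path and prop shape are assumed for illustration; the module needs a default export for `React.lazy`):

```tsx
import React, { Suspense } from "react";

// Vite emits the Mermaid component (and the mermaid library it imports)
// as a separate chunk that only loads when a diagram is rendered.
const MermaidDiagram = React.lazy(() => import("./MermaidDiagram"));

export function DiagramBlock({ source }: { source: string }) {
  return (
    <Suspense fallback={<div>Rendering diagram…</div>}>
      <MermaidDiagram source={source} />
    </Suspense>
  );
}
```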
Summary
Significantly reduced bundle sizes through lazy-loading and proper dependency categorization.
Results
JavaScript Bundles (Vite output)

| Metric | Before | After |
| --- | --- | --- |
| Main bundle | 2.68 MB | 2.14 MB (19% smaller) |
| Main bundle (gzipped) | 682 KB | 535 KB |

Packaged Electron App

| Metric | Before | After |
| --- | --- | --- |
| AppImage | 192 MB | 157 MB (18% smaller) |
| app.asar | 234 MB | 102 MB (56% smaller) |
| node_modules packages | 394 | 140 (64% fewer) |
Changes
1. Lazy-load Shiki syntax highlighting - switch from the full `shiki` bundle to `shiki/core`
2. Lazy-load tokenizer encodings
3. Move renderer dependencies to devDependencies
4. Add bundle analysis tooling
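For item 4, a hedged sketch of how `rollup-plugin-visualizer` is commonly wired into a Vite config; the exact options used in this PR are not shown in the thread:

```ts
// vite.config.ts (sketch): emit stats.html for production builds only.
import { defineConfig } from "vite";
import { visualizer } from "rollup-plugin-visualizer";

export default defineConfig(({ mode }) => ({
  plugins: [
    mode === "production"
      ? visualizer({ filename: "stats.html", gzipSize: true })
      : undefined,
  ],
}));
```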
Technical Details
Shiki changes:
- `src/utils/highlighting/shikiHighlighter.ts` - Use `createHighlighterCore` with `createOnigurumaEngine`
- `src/utils/highlighting/highlightDiffChunk.ts` - Dynamically import language grammars via `import('shiki/langs/${lang}.mjs')`

Tokenizer changes:
- `src/utils/main/tokenizer.ts` - Load encodings on-demand, cache loaded encodings to avoid redundant imports

Dependency changes:
- `package.json` - Moved 16 renderer-only packages from dependencies to devDependencies

Testing
_Generated with `cmux`_