🤖 Optimize bundle size - reduce packaged app by 56% #354

kylecarbs · 2025-10-20T22:18:58Z

Summary

Significantly reduced bundle sizes through lazy-loading and proper dependency categorization.

Results

JavaScript Bundles (Vite output)

Main bundle: 2.68 MB → 2.14 MB (19% reduction, 682 KB → 535 KB gzipped)
Tokenizers (8.2 MB) lazy-loaded on-demand
Shiki language grammars lazy-loaded on-demand

Packaged Electron App

AppImage: 192 MB → 157 MB (18% reduction)
app.asar: 234 MB → 102 MB (56% reduction!)
node_modules packages: 394 → 140 (64% fewer)

Changes

1. Lazy-load Shiki syntax highlighting

Switched from full shiki bundle to shiki/core
Load language grammars on-demand via dynamic imports
Theme loaded dynamically (min-dark.mjs)
Eliminated 638 language grammars from main bundle

2. Lazy-load tokenizer encodings

Made o200k_base (6.3 MB) and claude (1.9 MB) load on-demand
8.2 MB now only loaded when specific model is used
Added encoding cache to avoid redundant loads

3. Move renderer dependencies to devDependencies

React, Emotion, Mermaid, Shiki, and other frontend libs moved to devDependencies
These are bundled by Vite, so they don't need to be in production node_modules
Only main process dependencies (AI SDK, utils) remain in dependencies

4. Add bundle analysis tooling

Added rollup-plugin-visualizer for production builds
Generates stats.html for bundle composition analysis

Technical Details

Shiki changes:

src/utils/highlighting/shikiHighlighter.ts - Use createHighlighterCore with createOnigurumaEngine
src/utils/highlighting/highlightDiffChunk.ts - Dynamic import language grammars via import('shiki/langs/${lang}.mjs')

Tokenizer changes:

src/utils/main/tokenizer.ts - Load encodings on-demand, cache loaded encodings to avoid redundant imports

Dependency changes:

package.json - Moved 16 renderer-only packages from dependencies to devDependencies

Testing

✅ All existing tests pass (3 pre-existing failures unrelated to changes)
✅ Bundle builds successfully
✅ Packaged AppImage created and tested
Note: Initially tried lazy-loading Mermaid with React.lazy but reverted due to E2E test timeouts

Generated with cmux

Reduced main bundle from 2.68 MB to 2.14 MB (19% reduction, 682 KB → 535 KB gzipped). **Shiki syntax highlighting:** - Switch from full Shiki bundle to shiki/core - Use on-demand language loading via dynamic imports - Theme loaded dynamically (min-dark.mjs) - Eliminated 638 language grammars from main bundle **Tokenizer encodings:** - Made o200k_base (6.3 MB) and claude (1.9 MB) lazy-load on-demand - Total 8.2 MB now loaded only when specific model is used - Added encoding cache to avoid redundant loads **Mermaid diagrams:** - Lazy-load Mermaid component with React.lazy() - Moved 497 KB to separate chunk - Added Suspense boundary with loading fallback **Build config:** - Added rollup-plugin-visualizer for bundle analysis - Configured to generate stats.html in production builds Remaining bundle size (2.14 MB) includes React, Emotion, core UI components, and the Vercel AI SDK. All heavy dependencies are now code-split appropriately. _Generated with `cmux`_

Reduced packaged app size by moving frontend dependencies to devDependencies since they're bundled by Vite and don't need to be in production node_modules. **Before:** - AppImage: 192 MB - app.asar: 234 MB - Packaged node_modules: 394 packages **After:** - AppImage: 157 MB (18% reduction) - app.asar: 102 MB (56% reduction!) - Packaged node_modules: 140 packages (64% fewer) **Moved to devDependencies (renderer-only):** - React ecosystem: react, react-dom, react-compiler-runtime, react-dnd, react-dnd-html5-backend - Markdown: react-markdown, rehype-*, remark-* - UI libraries: @emotion/*, cmdk - Syntax highlighting: shiki (bundled into dist/) - Diagrams: mermaid (lazy-loaded chunk) **Kept in dependencies (main process needs):** - AI SDK: ai, @ai-sdk/anthropic, @ai-sdk/openai, ai-tokenizer - Core utils: chalk, crc-32, diff, disposablestack, escape-html, jsonc-parser, lru-cache, markdown-it, minimist, posthog-js, source-map-support, undici, write-file-atomic, zod, zod-to-json-schema - Electron: electron-updater _Generated with `cmux`_

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-10-20T22:23:13Z

vite.config.ts

    rollupOptions: {
+      external: [
+        // Externalize tokenizer encodings - these are large and should be lazy-loaded
+        "ai-tokenizer/encoding/o200k_base",
+        "ai-tokenizer/encoding/claude",
+      ],


Do not externalize tokenizer encoding modules

The new Rollup config marks ai-tokenizer/encoding/o200k_base and ai-tokenizer/encoding/claude as external. At runtime loadEncoding() uses import("ai-tokenizer/encoding/...") from the renderer (see src/utils/main/tokenizer.ts). Because these modules are excluded from the bundle and devDependencies are not shipped with the packaged app, the browser will attempt to fetch /ai-tokenizer/encoding/… which does not exist, leading to "Failed to fetch dynamically imported module" and falling back to approximations. Leave these modules bundled so that Vite emits lazy chunks instead of removing them entirely.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2025-10-20T22:23:13Z

src/utils/main/tokenizer.ts

    countTokens: (text: string) => {
-      // If tokenizer already loaded, use synchronous path for accurate counts
-      if (tokenizerModules) {
-        return countTokensCached(text, () => {
-          try {
-            return countTokensWithLoadedModules(text, modelString, tokenizerModules!);
-          } catch (error) {
-            // Unexpected error during tokenization, fallback to approximation
-            console.error("Failed to tokenize, falling back to approximation:", error);
-            return Math.ceil(text.length / 4);
-          }
-        });
-      }
-
-      // Tokenizer not yet loaded - use async path (returns approximation immediately)
+      // Always use async path since encodings are loaded on-demand
      return countTokensCached(text, async () => {
        await loadTokenizerModules();
        try {
-          return countTokensWithLoadedModules(text, modelString, tokenizerModules!);
+          return await countTokensWithLoadedModules(text, modelString, tokenizerModules!);


Restore synchronous token counts after modules load

After refactoring countTokensWithLoadedModules to be async, getTokenizerForModel().countTokens always invokes countTokensCached with an async function (lines 209‑214). As a result countTokensCached never receives a synchronous value even once the tokenizer and encodings are in memory, so every new text returns only the length/4 approximation and the accurate count is never surfaced to callers unless the exact same string is queried again. Token budgets in the UI and truncation logic will therefore consistently undercount and can exceed model limits. Keep the synchronous path once modules and encodings are loaded, or return a Promise so callers can await the real count.

Useful? React with 👍 / 👎.

Add eslint-disable comments for intentional dynamic imports used for code-splitting: - Mermaid component lazy loading - Shiki language grammars on-demand - Shiki WASM engine and theme These dynamic imports are necessary for bundle optimization and are not hiding circular dependencies - they're explicit performance optimizations.

The React.lazy approach was causing E2E tests to timeout. While it reduced the main bundle slightly, the benefits don't outweigh the test failures. Keeping the Shiki and tokenizer optimizations which provide the bulk of the bundle size reduction without breaking tests.

kylecarbs added 2 commits October 20, 2025 18:19

kylecarbs force-pushed the bundle-size branch from 3c0582b to 9489f9c Compare October 20, 2025 22:19

chatgpt-codex-connector bot reviewed Oct 20, 2025

View reviewed changes

kylecarbs added 2 commits October 20, 2025 18:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

🤖 Optimize bundle size - reduce packaged app by 56% #354

🤖 Optimize bundle size - reduce packaged app by 56% #354

Uh oh!

kylecarbs commented Oct 20, 2025 •

edited

Loading

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Oct 20, 2025

Uh oh!

chatgpt-codex-connector bot Oct 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

🤖 Optimize bundle size - reduce packaged app by 56% #354

Are you sure you want to change the base?

🤖 Optimize bundle size - reduce packaged app by 56% #354

Uh oh!

Conversation

kylecarbs commented Oct 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Results

JavaScript Bundles (Vite output)

Packaged Electron App

Changes

1. Lazy-load Shiki syntax highlighting

2. Lazy-load tokenizer encodings

3. Move renderer dependencies to devDependencies

4. Add bundle analysis tooling

Technical Details

Testing

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Oct 20, 2025

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Oct 20, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

kylecarbs commented Oct 20, 2025 •

edited

Loading