fix(website-server): Add tiktoken as external dependency in esbuild bundle#1065
Use find command to locate tree-sitter WASM files regardless of node_modules hoisting. This fixes Cloud Run deployment failures where WASM files were not found at the expected path.
…undle
When bundling the server with esbuild, tiktoken was being inlined, which caused the WASM file path resolution to fail. tiktoken loads its WASM files from __dirname at runtime, so it must be kept as an external dependency to preserve correct path resolution.
This fixes the Cloud Run deployment error: "Missing tiktoken_bg.wasm"
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
📝 Walkthrough
The Dockerfile is modified to consolidate WASM file handling. Tiktoken is treated as an external dependency in esbuild bundling, a new build step collects WASM files into a centralized directory (dist-bundled/wasm), and the runtime image configuration is updated to reference this new location instead of directly copying tree-sitter WASM files.
Summary of Changes
This pull request resolves a critical Cloud Run deployment issue where the …
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #1065 +/- ##
=======================================
Coverage 90.24% 90.24%
=======================================
Files 124 124
Lines 9605 9605
Branches 1754 1754
=======================================
Hits 8668 8668
Misses 937 937
Code Review
This pull request correctly addresses the runtime error with tiktoken by marking it as an external dependency for esbuild and copying the module into the final Docker image. This ensures that tiktoken can correctly resolve the path to its WASM file at runtime. The changes also improve the handling of tree-sitter WASM files by using a more robust find command to locate them within node_modules, which is a good enhancement. I have one suggestion to improve the clarity and precision of the new command for collecting WASM files.
# Collect WASM files from wherever they are in node_modules
RUN mkdir -p dist-bundled/wasm && \
    find node_modules -name "*.wasm" -path "*tree-sitter-wasms/out/*" -exec cp {} dist-bundled/wasm/ \;
This comment is slightly misleading as this command only collects tree-sitter WASM files, while tiktoken's WASM file is handled by copying its entire module. For clarity and robustness, the comment should be more specific. Additionally, the find command can be made more precise by using a single, more specific -path pattern that directly references the @repomix/tree-sitter-wasms package. Quoting the pattern is also important to prevent unexpected shell globbing.
# Collect tree-sitter WASM files from node_modules.
RUN mkdir -p dist-bundled/wasm && \
find node_modules -path '*/@repomix/tree-sitter-wasms/out/*.wasm' -exec cp {} dist-bundled/wasm/ \;
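The effect of node_modules hoisting on this pattern can be checked outside Docker. The sketch below is only an illustration: it builds two throwaway layouts (one hoisted, one nested under a hypothetical parent package called some-parent) and runs the suggested find against each. The file name tree-sitter-python.wasm, the other-pkg package, and the collect_wasm helper are assumptions made for the demo, not names taken from the PR.

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"

# Layout 1: the package is hoisted to the top-level node_modules.
mkdir -p "$work/hoisted/node_modules/@repomix/tree-sitter-wasms/out"
: > "$work/hoisted/node_modules/@repomix/tree-sitter-wasms/out/tree-sitter-python.wasm"

# Layout 2: the package is nested under a (hypothetical) parent package.
nested_pkg="$work/nested/node_modules/some-parent/node_modules/@repomix/tree-sitter-wasms"
mkdir -p "$nested_pkg/out"
: > "$nested_pkg/out/tree-sitter-python.wasm"

# An unrelated WASM file that the pattern should not pick up.
mkdir -p "$work/nested/node_modules/other-pkg"
: > "$work/nested/node_modules/other-pkg/unrelated.wasm"

# collect_wasm ROOT: run the suggested command from inside ROOT.
collect_wasm() {
  (
    cd "$1"
    mkdir -p dist-bundled/wasm
    find node_modules -path '*/@repomix/tree-sitter-wasms/out/*.wasm' \
      -exec cp {} dist-bundled/wasm/ \;
  )
}

collect_wasm "$work/hoisted"
collect_wasm "$work/nested"

# List what was collected in each layout.
ls "$work/hoisted/dist-bundled/wasm" "$work/nested/dist-bundled/wasm"
```

Because -path matches against the whole path and * also matches slashes, the same pattern finds the package at either depth, while files outside @repomix/tree-sitter-wasms/out are left alone.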
Code Review: PR #1065
Summary
This PR fixes a Cloud Run deployment error caused by tiktoken's WASM file (…
✅ What looks good
| Potential Issue | Mitigation |
|---|---|
| tiktoken updates breaking WASM paths | If tiktoken changes how it resolves WASM files in future versions, this fix would need updating. Consider pinning the tiktoken version or adding a health check. |
| Image size increase | Copying the entire node_modules/tiktoken directory includes dev files. Could use npm prune --production before copying, but this is low risk. |
| Missing tiktoken dependencies | If tiktoken has peer/optional dependencies that are also externalized by esbuild, they would need similar treatment. Current testing suggests this is not an issue. |
✅ Verdict
Approve - This is a well-targeted fix for a specific deployment issue. The approach is sound and follows established patterns in the codebase.
🤖 Review by Claude
Actionable comments posted: 0
🧹 Nitpick comments (2)
website/server/Dockerfile (2)
25-27: Consider verifying that WASM files are found.
The find command silently succeeds even if no files match the pattern. If tree-sitter WASM files are missing at runtime, it would be better to catch this during the build phase.
🔎 Suggested enhancement to verify files are found
 # Collect WASM files from wherever they are in node_modules
 RUN mkdir -p dist-bundled/wasm && \
-    find node_modules -name "*.wasm" -path "*tree-sitter-wasms/out/*" -exec cp {} dist-bundled/wasm/ \;
+    find node_modules -name "*.wasm" -path "*tree-sitter-wasms/out/*" -exec cp {} dist-bundled/wasm/ \; && \
+    [ "$(ls -A dist-bundled/wasm)" ] || { echo "ERROR: No WASM files found"; exit 1; }
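The fail-fast idea in this suggestion can be tried on its own, outside the Dockerfile. The following is a minimal sketch, not the reviewer's exact command: require_wasm is a hypothetical helper name, and it checks specifically for *.wasm files rather than for any directory entry, which is a slightly stricter assumption than the ls -A test above.

```shell
#!/bin/sh

# require_wasm DIR: return non-zero (with a message on stderr) if DIR
# does not directly contain at least one *.wasm file.
require_wasm() {
  dir="$1"
  first_match="$(find "$dir" -maxdepth 1 -type f -name '*.wasm' 2>/dev/null | head -n 1)"
  if [ -z "$first_match" ]; then
    echo "ERROR: No WASM files found in $dir" >&2
    return 1
  fi
}

# Demo against two throwaway directories.
demo="$(mktemp -d)"
mkdir -p "$demo/empty" "$demo/filled"
: > "$demo/filled/tree-sitter-python.wasm"   # illustrative file name

require_wasm "$demo/filled" && echo "filled: ok"
require_wasm "$demo/empty" || echo "empty: a build step would stop here"
```

In a RUN instruction, a non-zero exit from a check like this stops the image build, so a missing file surfaces at build time instead of at deployment.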
28-29: Minor: Extra blank line.
There's an unnecessary consecutive blank line here.
🔎 Formatting cleanup
     find node_modules -name "*.wasm" -path "*tree-sitter-wasms/out/*" -exec cp {} dist-bundled/wasm/ \;
-
 # ==============================================================================
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
website/server/Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
- GitHub Check: Build and run (macos-latest, 24.x)
- GitHub Check: Build and run (windows-latest, 20.x)
- GitHub Check: Build and run (macos-latest, 20.x)
- GitHub Check: Build and run (windows-latest, 22.x)
- GitHub Check: Build and run (windows-latest, 24.x)
- GitHub Check: Build and run (windows-latest, 25.x)
- GitHub Check: Build and run (ubuntu-latest, 20.x)
- GitHub Check: Build and run (ubuntu-latest, 24.x)
- GitHub Check: Test (macos-latest, 25.x)
- GitHub Check: Test (windows-latest, 25.x)
- GitHub Check: Test (windows-latest, 22.x)
- GitHub Check: Test (windows-latest, 20.x)
- GitHub Check: Build and run with Bun (windows-latest, latest)
- GitHub Check: Test with Bun (windows-latest, latest)
- GitHub Check: claude-review
- GitHub Check: Cloudflare Pages
🔇 Additional comments (4)
website/server/Dockerfile (4)
20-23: LGTM! External tiktoken configuration is correct.
The addition of --external:tiktoken properly prevents bundling, preserving the __dirname-based WASM file resolution that tiktoken requires. The explanatory comment is clear and helpful.
40-41: LGTM! Comment accurately reflects the bundled artifacts.
The updated comment clearly indicates that both the bundled server and WASM files are being copied.
51-51: LGTM! Environment variable correctly points to the new WASM directory.
The REPOMIX_WASM_DIR update aligns with the new centralized WASM file location established in the build stage.
43-45: The current implementation correctly preserves tiktoken's WASM files.
Since the COPY directive copies the entire tiktoken module directory, all internal files, including tiktoken_bg.wasm and related tokenizer data, are preserved in the runtime image. The Dockerfile already demonstrates awareness of tiktoken's WASM loading requirements (line 20 comment), and the strategy of marking tiktoken as external in esbuild ensures the module remains intact for runtime loading via __dirname.
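One way to confirm this in a built image is a small presence check for the WASM file inside the copied module. This is a sketch under assumptions: check_tiktoken_wasm is a hypothetical helper, and the module is assumed to live at node_modules/tiktoken under the application root; the PR itself only names the file tiktoken_bg.wasm.

```shell
#!/bin/sh

# check_tiktoken_wasm APP_ROOT: print the path of tiktoken_bg.wasm found
# under APP_ROOT/node_modules/tiktoken, or fail with an error message.
check_tiktoken_wasm() {
  app_root="$1"
  match="$(find "$app_root/node_modules/tiktoken" -type f -name 'tiktoken_bg.wasm' 2>/dev/null | head -n 1)"
  if [ -z "$match" ]; then
    echo "ERROR: Missing tiktoken_bg.wasm under $app_root/node_modules/tiktoken" >&2
    return 1
  fi
  echo "$match"
}

# Demo against a throwaway directory that mimics the assumed layout.
root="$(mktemp -d)"
mkdir -p "$root/node_modules/tiktoken"
: > "$root/node_modules/tiktoken/tiktoken_bg.wasm"

check_tiktoken_wasm "$root"
```

Run against the real application root, a failure here would point to the same condition as the "Missing tiktoken_bg.wasm" deployment error, but before the server starts.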
Summary
Fixes Cloud Run deployment error caused by missing tiktoken_bg.wasm file.
When bundling the server with esbuild, tiktoken was being inlined, which caused the WASM file path resolution to fail at runtime. tiktoken loads its WASM files from __dirname, so it must be kept as an external dependency to preserve correct path resolution.
Changes
Add --external:tiktoken to esbuild command
This is a follow-up fix for #1056.
Checklist
npm run test
npm run lint
🤖 Generated with Claude Code