fix(metrics): Use resolveEncodingAsync for bundler-compatible token counting #1417
⚡ Performance Benchmark
Details
History
37d6572 refactor(metrics): Delegate encoding resolution to gpt-tokenizer's resolveEncodingAsync
09b8398 refactor(metrics): Delegate encoding resolution to gpt-tokenizer's resolveEncodingAsync
ac29951 refactor(metrics): Use static import map for gpt-tokenizer encodings
12b409f fix(website): Add gpt-tokenizer to external deps for server bundle
📝 Walkthrough
Refactored the encoding module loading mechanism in TokenCounter by replacing dynamic template-literal imports with a static
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Pre-merge checks: ✅ 3 passed
Deploying repomix
| Latest commit: | 37d6572 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://909ccc58.repomix.pages.dev |
| Branch Preview URL: | https://fix-website-gpt-tokenizer-bu.repomix.pages.dev |
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1417   +/-   ##
=======================================
  Coverage   87.26%   87.26%
=======================================
  Files         117      117
  Lines        4420     4421    +1
  Branches     1021     1021
=======================================
+ Hits         3857     3858    +1
  Misses        563      563
Code Review: Claude
Verdict: Approve ✅
This is a clean, well-evolved fix. The final approach, using
LGTM, ship it! 🚢
Reviewed with Claude Code
09b8398 to 37d6572
gpt-tokenizer uses dynamic imports with template literals
(gpt-tokenizer/encoding/${encodingName}) which cannot be resolved at
bundle time by rolldown. The package was missing from both the external
list in the bundle config and the Dockerfile runtime image, causing
"Cannot find package 'gpt-tokenizer'" errors during remote repository
processing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace dynamic template literal import (`gpt-tokenizer/encoding/${name}`)
with an explicit static import map so that rolldown can resolve and bundle
gpt-tokenizer directly. This eliminates the need to treat gpt-tokenizer as
an external dependency in the server bundle and Dockerfile.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
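The static import map described in the commit above might look roughly like the sketch below. It is not the code from the commit: the module shape, the `fakeModule` helper, and the token arithmetic are hypothetical stand-ins so the example runs without gpt-tokenizer installed, and only two encoding names are shown for brevity. In a real map, each loader would be a dynamic import whose specifier is a plain string literal, which is what lets a bundler follow it.

```typescript
// Hypothetical module shape, for illustration only.
type EncodingName = 'o200k_base' | 'cl100k_base';

interface EncodingModule {
  countTokens(text: string): number;
}

// Stand-in for a real encoding module. In real code each loader below would be
// something like `() => import('gpt-tokenizer/encoding/o200k_base')`, with the
// specifier written out as a literal rather than built from a template string.
const fakeModule = (charsPerToken: number): EncodingModule => ({
  countTokens: (text) => Math.ceil(text.length / charsPerToken),
});

// One explicit entry per supported encoding: nothing is computed at runtime,
// so every possible import target is visible at bundle time.
const encodingLoaders = new Map<EncodingName, () => Promise<EncodingModule>>([
  ['o200k_base', async () => fakeModule(4)],
  ['cl100k_base', async () => fakeModule(3)],
]);

async function loadEncoding(name: string): Promise<EncodingModule> {
  const loader = encodingLoaders.get(name as EncodingName);
  if (!loader) {
    throw new Error(`Unsupported encoding: ${name}`);
  }
  return loader();
}
```

The trade-off of this pattern is that the map has to be updated by hand whenever a new encoding is added upstream.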
…solveEncodingAsync
Instead of maintaining a static import map of all 5 encodings in Repomix,
use gpt-tokenizer's resolveEncodingAsync which already handles the
encoding-to-BPE-data mapping with static import paths internally. This
removes the need to update Repomix when gpt-tokenizer adds encodings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
37d6572 to 8bcd93a
Code Review: Claude
Verdict: Approve ✅
Clean, minimal fix that solves the rolldown bundler incompatibility at its root. The final approach, using
No issues found. Ship it!
Reviewed with Claude Code
Fix remote repository processing failure on the website caused by gpt-tokenizer's dynamic import not being resolved by the bundler (rolldown).

The original code used a template-literal dynamic import (`gpt-tokenizer/encoding/${encodingName}`), which rolldown cannot statically analyze. This caused a runtime error.

Solution
Replace the dynamic template-literal import with gpt-tokenizer's own `resolveEncodingAsync` + `GptEncoding.getEncodingApi()`. Since `resolveEncodingAsync` uses static import paths internally, rolldown can resolve and bundle the BPE data correctly.

Changes
- `src/core/metrics/TokenCounter.ts`: Use `resolveEncodingAsync` to load BPE rank data and `GptEncoding.getEncodingApi()` to create encoder instances, instead of dynamic template-literal imports

Checklist
- Run `npm run test`
- Run `npm run lint`
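
The two-step loading described under Changes (resolve the BPE rank data asynchronously, then build an encoder from it) can be sketched as below. This is a minimal illustration, not the code in `TokenCounter.ts`: `resolveRanksAsync` and `createEncoder` are hypothetical stand-ins for `resolveEncodingAsync` and `GptEncoding.getEncodingApi()`, the factory taking a synchronous resolver callback is an assumption about the API's shape, and the whitespace-based count is a toy placeholder for real BPE tokenization.

```typescript
// Hypothetical stand-in types; the real ones live in gpt-tokenizer.
type BpeRanks = ReadonlyMap<string, number>;

interface Encoder {
  countTokens(text: string): number;
}

// Stand-in for resolveEncodingAsync: asynchronously yields rank data for an
// encoding. The real function would load the data via static import paths.
async function resolveRanksAsync(encodingName: string): Promise<BpeRanks> {
  return new Map([[encodingName, 0]]);
}

// Stand-in for GptEncoding.getEncodingApi (assumed shape): builds an encoder
// from a synchronous resolver that hands back rank data.
function createEncoder(
  encodingName: string,
  resolve: (name: string) => BpeRanks,
): Encoder {
  const ranks = resolve(encodingName);
  if (ranks.size === 0) {
    throw new Error(`No rank data for encoding: ${encodingName}`);
  }
  // Toy tokenizer: counts whitespace-separated words, not real BPE tokens.
  return {
    countTokens: (text) => text.split(/\s+/).filter(Boolean).length,
  };
}

// The asynchronous step finishes first, so the synchronous factory only
// ever sees data that has already been loaded.
async function loadEncoder(encodingName: string): Promise<Encoder> {
  const ranks = await resolveRanksAsync(encodingName);
  return createEncoder(encodingName, () => ranks);
}
```

Splitting the work this way keeps the only bundler-sensitive part, the data import, inside the library, so the calling code never constructs an import path itself.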