perf(core): Share pre-compiled tiktoken WASM module across worker threads#1243
perf(core): Share pre-compiled tiktoken WASM module across worker threads#1243
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR refactors tiktoken imports to use Changes
Sequence Diagram(s)sequenceDiagram
participant Main as Main Thread
participant Cache as WASM Cache
participant Worker as Worker Thread
participant Tiktoken as Tiktoken
Main->>Cache: getCompiledTiktokenWasmModule()
activate Cache
Cache->>Cache: Read tiktoken_bg.wasm binary
Cache->>Cache: WebAssembly.compile(binary)
Cache-->>Main: Compiled Module (cached)
deactivate Cache
Main->>Main: createWorkerPool(extraWorkerData: {tiktokenWasmModule})
Main->>Worker: Initialize with workerData
activate Worker
Worker->>Worker: initTiktokenWasm(workerData.tiktokenWasmModule)
Worker->>Tiktoken: init(precompiledModule)
Tiktoken-->>Worker: WASM initialized
deactivate Worker
Main->>Worker: countTokens(text, encoding)
activate Worker
Worker->>Worker: Await wasmInitPromise
Worker->>Tiktoken: get_encoding(encoding)
Tiktoken-->>Worker: Encoder instance
Worker-->>Main: Token count
deactivate Worker
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant performance enhancement by optimizing how the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a significant performance improvement by pre-compiling the tiktoken WASM module in the main thread and sharing it with worker threads. The implementation is solid, leveraging the tiktoken/init API and a new extraWorkerData option to pass the compiled module. The inclusion of a fallback mechanism and the clear benchmark results are commendable. My review includes one suggestion to enhance type safety in the worker thread by validating data received via workerData instead of using a type assertion.
| // Initialize tiktoken WASM with the pre-compiled module from the main thread. | ||
| // If workerData.tiktokenWasmModule is present, this avoids re-compiling the ~5.3MB | ||
| // WASM binary in each worker thread (~6ms instantiation vs ~250ms compile+instantiate). | ||
| const wasmInitPromise = initTiktokenWasm(workerData?.tiktokenWasmModule as WebAssembly.Module | undefined); |
There was a problem hiding this comment.
For improved type safety and clarity, it's better to validate the type of workerData.tiktokenWasmModule before using it, instead of relying on a type assertion. workerData can contain arbitrary data, so runtime checks are crucial. An instanceof check ensures the object is a WebAssembly.Module. This change also introduces a temporary variable for readability.
// Initialize tiktoken WASM. If a pre-compiled module is passed from the main thread,
// use it to avoid re-compiling. This is much faster (~6ms vs ~250ms).
const wasmModule = workerData?.tiktokenWasmModule;
const wasmInitPromise = initTiktokenWasm(
wasmModule instanceof WebAssembly.Module ? wasmModule : undefined,
);
Code Review —
|
| Scenario | Risk | Mitigation |
|---|---|---|
tiktoken/init API changes in a future version |
Medium | The tiktoken/init export is an official API; pin tiktoken version in lockfile |
WebAssembly.Module structured clone fails in edge Node.js versions |
Low | Supported since Node 12; the fallback path handles this gracefully |
child_process runtime used with extraWorkerData containing non-serializable data |
Low | child_process doesn't use workerData at all in this codebase — but extraWorkerData silently does nothing for child_process runtime. Consider documenting this limitation. |
| WASM compilation fails (corrupted binary, OOM) | Low | Error propagates to calculateMetrics caller — appropriate behavior, no silent failure |
Verdict: This is a solid, well-thought-out optimization. The suggestions above are minor improvements — none are blockers.
Reviewed by Claude
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1243 +/- ##
==========================================
- Coverage 87.18% 86.89% -0.30%
==========================================
Files 115 116 +1
Lines 4324 4358 +34
Branches 1002 1009 +7
==========================================
+ Hits 3770 3787 +17
- Misses 554 571 +17 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Deploying repomix with
|
| Latest commit: |
6a7a4a4
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://8249ea27.repomix.pages.dev |
| Branch Preview URL: | https://perf-tiktoken-wasm-module-sh.repomix.pages.dev |
Code Review (Updated) —
|
| Scenario | Risk | Status |
|---|---|---|
| Concurrent WASM compilation race | Low | Not fixed, but safe with current single call site |
child_process runtime ignores extraWorkerData silently |
Low | calculateMetrics hardcodes worker_threads — no issue in practice |
tiktoken/init API breakage |
Medium | Standard tiktoken export; mitigated by lockfile |
| Worker WASM init failure | Low | Graceful — falls back to per-worker compilation |
Verdict: The fix commit properly addressed the critical issues. The remaining suggestions are minor hardening. This PR is ready to merge.
Reviewed by Claude
…eads Each worker thread was independently compiling the ~5.3MB tiktoken WASM binary, adding ~250ms overhead per worker. This change compiles the WASM once in the main thread and passes the WebAssembly.Module to workers via workerData (structured clone), so workers only need to instantiate it (~6ms each). Benchmark (800 tasks, 3 workers): - Before: avg 1,429ms - After: avg 493ms (~65% faster, ~2.9x speedup) Key changes: - Use tiktoken/init API for explicit WASM initialization control - Add wasmModuleCache.ts to compile and cache the WASM module - Add extraWorkerData option to WorkerOptions for passing data to workers - Workers fall back to disk-based compilation if no pre-compiled module provided Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address PR review feedback: - Fix Tinypool workerData wrapping: access workerData[1] instead of workerData directly, as Tinypool wraps it in [privateData, userData] (pointed out by Devin — without this fix the optimization was inactive) - Add instanceof WebAssembly.Module runtime check for type safety (suggested by Gemini) - Only precompile WASM when deps.taskRunner is not provided, and wrap in try/catch to fall back to per-worker compilation on failure (suggested by CodeRabbit) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
c8dc1e6 to
6a7a4a4
Compare
Code Review (Third Pass) —
|
| # | Suggestion | Priority | Status |
|---|---|---|---|
| 1 | Cache the promise in wasmModuleCache.ts instead of the result to prevent theoretical race condition |
Low | Not addressed — safe with single call site |
| 2 | Spread extraWorkerData before reserved keys in processConcurrency.ts:85-88 to prevent accidental override |
Low | Not addressed — safe with current callers |
| 3 | Extract shared getTiktokenWasmPath() to reduce duplication between wasmModuleCache.ts and TokenCounter.ts |
Low | Intentional — avoids cross-module coupling in worker |
| 4 | Add unit tests for initTiktokenWasm (both paths) and wasmModuleCache — patch coverage is 51% |
Recommended | Not addressed |
Verdict
No new issues beyond what was previously flagged. The PR is well-structured with proper fallbacks, defensive instanceof checks, and correct Tinypool workerData handling. Ready to merge — the outstanding items can be addressed in follow-up if desired.
Reviewed by Claude
⚡ Performance BenchmarkPacking the repomix repository with
Details
|
Closing this PRCI ベンチマーク結果で改善が確認できませんでした:
WASM モジュール共有による Worker 初期化の高速化は理論的には有効ですが、E2E では改善が見られず、むしろ微劣化している環境もあります。現状の tiktoken の動作で問題がないため、この最適化は見送ります。 |
|
(Updated comment in English) Closing this PRCI benchmark results showed no improvement:
While sharing the pre-compiled WASM module across workers is theoretically sound for reducing initialization time, the E2E benchmarks show no measurable improvement — and slight regressions on macOS/Windows. Since the current tiktoken setup works well without issues, this optimization is not worth the added complexity. |
Each worker thread was independently loading the ~5.3MB tiktoken WASM binary via synchronous
readFileSync+new WebAssembly.Module()+new WebAssembly.Instance()at import time. This change compiles the WASM once in the main thread and passes theWebAssembly.Moduleto workers viaworkerData(structured clone), so workers only need to instantiate it.Approach
Uses the
tiktoken/initAPI (an official export in the tiktoken package) for explicit WASM initialization control:WebAssembly.compile(), cached inwasmModuleCache.tsWebAssembly.Moduleis passed to workers throughextraWorkerData(new option onWorkerOptions)WebAssembly.instantiate(module, imports)— instantiation only, no recompilationFallback: If
workerData.tiktokenWasmModuleis absent or precompilation fails, workers fall back to reading and compiling from disk (backwards compatible).Benchmark Results
Worker initialization (3 workers concurrent init, 10 runs, 2 warmup):
Changed Files
src/core/metrics/wasmModuleCache.tssrc/types/webassembly.d.tssrc/core/metrics/TokenCounter.tstiktoken/init, addinitTiktokenWasm()src/core/metrics/workers/calculateMetricsWorker.tsworkerData[1]src/core/metrics/calculateMetrics.tsextraWorkerDatasrc/shared/processConcurrency.tsextraWorkerDataoption toWorkerOptionstiktokentotiktoken/inittype import changetests/core/metrics/TokenCounter.test.tstiktoken/initChecklist
npm run testnpm run lint