perf(core): Pre-load BPE data on main thread to eliminate redundant per-worker file I/O #1434
```diff
@@ -1,6 +1,6 @@
 import { logger, setLogLevelByWorkerData } from '../../../shared/logger.js';
 import type { TokenEncoding } from '../TokenCounter.js';
-import { freeTokenCounters, getTokenCounter } from '../tokenCounterFactory.js';
+import { freeTokenCounters, getTokenCounter, initTokenCounterFromBpeRanks } from '../tokenCounterFactory.js';
```
Contributor:

> Import the new guard function to check for initialization status.
```diff
 /**
  * Token counting worker for metrics calculation.
```
```diff
@@ -18,6 +18,10 @@ export interface TokenCountTask {
   content: string;
   encoding: TokenEncoding;
   path?: string;
+  /** Pre-serialized BPE rank data (JSON string) for fast worker initialization.
+   * When provided (typically in warmup tasks), the worker skips the expensive
+   * per-worker BPE file I/O (~105ms) and initializes from the pre-loaded data. */
+  bpeRanksJson?: string;
 }

 export interface TokenCountBatchItem {
```
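The `bpeRanksJson` field implies a producer on the main thread that loads and serializes the BPE ranks once, then attaches the same payload to one warmup task per worker. The PR does not show that producer here; the following is a minimal sketch under assumed names (`loadBpeRanks` and `makeWarmupTasks` are hypothetical) illustrating the serialize-once, fan-out pattern:

```typescript
// Hypothetical sketch: serialize BPE ranks once on the main thread and attach
// the JSON payload to one warmup task per worker, so each worker can skip its
// own disk read. Names below are illustrative, not the PR's actual API.

type TokenEncoding = 'o200k_base' | 'cl100k_base';

interface TokenCountTask {
  content: string;
  encoding: TokenEncoding;
  path?: string;
  bpeRanksJson?: string;
}

// Stand-in for reading the ~3.6MB BPE rank file; real code would do file I/O.
const loadBpeRanks = (): Record<string, number> => ({ 'aGVsbG8=': 0, 'd29ybGQ=': 1 });

// Build one warmup task per worker; all tasks share the same serialized string.
const makeWarmupTasks = (encoding: TokenEncoding, workerCount: number): TokenCountTask[] => {
  const bpeRanksJson = JSON.stringify(loadBpeRanks()); // serialize once, not per worker
  return Array.from({ length: workerCount }, () => ({
    content: '', // warmup tasks carry no real content to count
    encoding,
    bpeRanksJson,
  }));
};

const tasks = makeWarmupTasks('o200k_base', 4);
```

Because the string is built once and reused, serialization cost stays constant as the worker count grows; only the per-worker IPC transfer scales.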
@@ -37,6 +41,19 @@ export const countTokens = async (task: TokenCountTask): Promise<number> => { | |||||||||||||||||||||||||||||||||
| const processStartAt = process.hrtime.bigint(); | ||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
| try { | ||||||||||||||||||||||||||||||||||
| // Initialize from pre-loaded BPE data if provided (warmup path). | ||||||||||||||||||||||||||||||||||
| // This avoids each worker independently loading the ~3.6MB BPE file from disk, | ||||||||||||||||||||||||||||||||||
| // saving ~105ms per worker by receiving the data via IPC instead. | ||||||||||||||||||||||||||||||||||
| // If parsing fails, getTokenCounter below falls back to disk loading. | ||||||||||||||||||||||||||||||||||
| if (task.bpeRanksJson) { | ||||||||||||||||||||||||||||||||||
| try { | ||||||||||||||||||||||||||||||||||
| const bpeRanks = JSON.parse(task.bpeRanksJson); | ||||||||||||||||||||||||||||||||||
| initTokenCounterFromBpeRanks(task.encoding, bpeRanks); | ||||||||||||||||||||||||||||||||||
| } catch { | ||||||||||||||||||||||||||||||||||
| // Fall through to getTokenCounter which loads from disk | ||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||
|
Contributor (comment on lines +48 to +55):

> The worker currently parses the
```diff
   const counter = await getTokenCounter(task.encoding);
   const tokenCount = counter.countTokens(task.content, task.path);
```
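The diff relies on `initTokenCounterFromBpeRanks` pre-populating whatever cache `getTokenCounter` consults, so the later call becomes a cheap lookup. The factory itself is not shown in this hunk; here is a minimal sketch of that guard pattern, assuming a module-level cache keyed by encoding (the toy tokenizer and `diskLoads` counter are illustrative only):

```typescript
// Hypothetical sketch of the worker-side guard: a module-level cache lets
// initTokenCounterFromBpeRanks pre-populate the counter so a subsequent
// getTokenCounter call returns it without touching the disk.

type BpeRanks = Record<string, number>;

interface TokenCounter {
  countTokens(content: string): number;
}

const counters = new Map<string, TokenCounter>(); // one counter per encoding

const buildCounter = (_ranks: BpeRanks): TokenCounter => ({
  // Toy tokenizer: whitespace split stands in for real BPE merging.
  countTokens: (content) => (content.length === 0 ? 0 : content.split(/\s+/).length),
});

// Warmup path: initialize from pre-loaded ranks received over IPC.
const initTokenCounterFromBpeRanks = (encoding: string, ranks: BpeRanks): void => {
  if (!counters.has(encoding)) counters.set(encoding, buildCounter(ranks));
};

// Normal path: reuse the cached counter, or fall back to (simulated) disk load.
let diskLoads = 0;
const getTokenCounter = (encoding: string): TokenCounter => {
  const cached = counters.get(encoding);
  if (cached) return cached;
  diskLoads++; // real code would read and parse the BPE file here (~105ms)
  const counter = buildCounter({});
  counters.set(encoding, counter);
  return counter;
};

// A warmup task initializes the cache; the real task then hits the cache.
initTokenCounterFromBpeRanks('o200k_base', { a: 0 });
const n = getTokenCounter('o200k_base').countTokens('hello world');
```

The `has` check in the init function makes the warmup idempotent: a second warmup task (or a parse retry) cannot clobber an already-built counter.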
There is a potential race condition in how the warmup tasks are queued.

`createMetricsTaskRunner` returns the `taskRunner` immediately, but the warmup tasks are only queued after the BPE data is loaded and serialized (which is an async operation). If the caller submits real tasks to the runner immediately after calling `createMetricsTaskRunner`, those tasks will likely be queued before the warmup tasks, causing the workers to perform the expensive disk I/O anyway.

To ensure the optimization is effective, the caller must await the `warmupPromise` before submitting real tasks, or the initialization logic should be adjusted to ensure warmup tasks are prioritized at the head of the queue.
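The first remedy the reviewer suggests can be sketched as follows. This is a hypothetical shape, not the PR's actual implementation: the factory returns both the runner and a `warmupPromise` that resolves once warmup tasks are enqueued, and the caller awaits it before submitting real work (the string-based queue below is a stand-in for the real task types):

```typescript
// Hypothetical sketch of the reviewer's suggested fix: expose the warmup
// promise from the runner factory so callers can await it before submitting
// real tasks, guaranteeing the warmup tasks reach the queue first.

interface TaskRunner {
  queue: string[];
  run(task: string): void;
}

const createMetricsTaskRunner = (): { taskRunner: TaskRunner; warmupPromise: Promise<void> } => {
  const taskRunner: TaskRunner = {
    queue: [],
    run(task) { this.queue.push(task); },
  };
  // Simulates the async BPE load + serialization that precedes warmup queuing.
  const warmupPromise = Promise.resolve().then(() => {
    taskRunner.run('warmup');
  });
  return { taskRunner, warmupPromise };
};

const main = async (): Promise<string[]> => {
  const { taskRunner, warmupPromise } = createMetricsTaskRunner();
  await warmupPromise;   // without this await, 'real' would be queued first
  taskRunner.run('real');
  return taskRunner.queue;
};
```

Dropping the `await` reproduces the bug the reviewer describes: the real task lands in the queue before the warmup task, and workers pay the disk I/O cost anyway.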