perf(core): Reduce output token counting IPC overhead #1438
```diff
@@ -7,7 +7,7 @@ import {
   type TokenCountTask,
 } from '../../../src/core/metrics/workers/calculateMetricsWorker.js';
 import { logger } from '../../../src/shared/logger.js';
-import type { WorkerOptions } from '../../../src/shared/processConcurrency.js';
+import { getProcessConcurrency, type WorkerOptions } from '../../../src/shared/processConcurrency.js';

 vi.mock('../../../src/shared/logger');
```
```diff
@@ -118,8 +118,9 @@ describe('calculateOutputMetrics', () => {
       taskRunner: mockParallelTaskRunner({ numOfTasks: 1, workerType: 'calculateMetrics', runtime: 'worker_threads' }),
     });

-    expect(chunksProcessed).toBeGreaterThan(1); // Should have processed multiple chunks
-    expect(result).toBe(chunksProcessed * 100); // chunks * 100 tokens per chunk
+    const expectedChunks = getProcessConcurrency();
+    expect(chunksProcessed).toBe(expectedChunks); // Should match number of CPU cores
+    expect(result).toBe(expectedChunks * 100); // numChunks * 100 tokens per chunk
   });

   it('should handle errors in parallel processing', async () => {
```

**Contributor** commented on lines +121 to +122: If the chunking logic in …
```diff
@@ -173,14 +174,14 @@ describe('calculateOutputMetrics', () => {
       }),
     });

-    // With TARGET_CHARS_PER_CHUNK=200_000, 1.1M character content should produce 6 chunks
     // Check that chunks are roughly equal in size
+    const expectedChunks = getProcessConcurrency();
     const chunkSizes = processedChunks.map((chunk) => chunk.length);

-    expect(processedChunks.length).toBe(6);
-    // All chunks except the last should be exactly TARGET_CHARS_PER_CHUNK
-    for (let i = 0; i < chunkSizes.length - 1; i++) {
-      expect(chunkSizes[i]).toBe(200_000);
-    }
+    expect(processedChunks.length).toBe(expectedChunks); // Should match number of CPU cores
+    // Last chunk may be smaller due to Math.ceil rounding
+    const maxDiff = Math.max(...chunkSizes) - Math.min(...chunkSizes);
+    expect(maxDiff).toBeLessThan(Math.ceil(content.length / expectedChunks)); // Chunks should be roughly equal
     expect(processedChunks.join('')).toBe(content); // All content should be processed
   });
 });
```
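The per-core chunking behavior these tests exercise can be sketched as follows. This is a minimal illustration of the assumed strategy, not the PR's actual implementation; `splitIntoChunks` is a hypothetical helper name:

```typescript
import * as os from 'node:os';

// Hypothetical sketch: split content into roughly equal chunks, one per
// CPU core, each Math.ceil(content.length / numChunks) characters long.
// The last chunk may be shorter due to the ceiling rounding.
function splitIntoChunks(content: string, numChunks: number): string[] {
  const chunkSize = Math.ceil(content.length / numChunks);
  const chunks: string[] = [];
  for (let i = 0; i < content.length; i += chunkSize) {
    chunks.push(content.slice(i, i + chunkSize));
  }
  return chunks;
}

const content = 'x'.repeat(1_100_000);
const chunks = splitIntoChunks(content, os.cpus().length);
console.log(chunks.join('') === content); // → true: no content is lost
```

This matches what the updated assertions check: the chunk-size spread stays below one full chunk, and the concatenated chunks reproduce the input exactly.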
**Contributor** commented: The current implementation may lead to excessive IPC overhead on systems with many CPU cores. For example, on a 64-core machine, a 1.1MB file would be split into 64 small chunks (~17KB each), resulting in 64 `postMessage` calls. To minimize IPC overhead while still saturating available cores, it is better to ensure chunks don't fall below a reasonable size. Using the previously established 200KB "sweet spot" as a minimum chunk size (by capping the number of chunks) ensures that parallelization benefits aren't negated by message serialization costs.
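The capping scheme the reviewer proposes could be sketched like this. `MIN_CHUNK_SIZE` and `calculateChunkCount` are illustrative names under the comment's stated assumptions, not code from the PR:

```typescript
// Assumed minimum chunk size: the 200KB "sweet spot" from the review comment.
const MIN_CHUNK_SIZE = 200_000;

// Cap the chunk count so no chunk falls below MIN_CHUNK_SIZE,
// while still using up to one chunk per available core.
function calculateChunkCount(contentLength: number, cpuCount: number): number {
  const maxChunksBySize = Math.max(1, Math.floor(contentLength / MIN_CHUNK_SIZE));
  return Math.min(cpuCount, maxChunksBySize);
}

// On a 64-core machine, a 1.1MB file yields 5 chunks (~220KB each)
// instead of 64 chunks of ~17KB, cutting postMessage calls accordingly.
console.log(calculateChunkCount(1_100_000, 64)); // → 5
```

On a typical 4- or 8-core machine the cap has no effect for this file size, so parallelism is preserved; the cap only kicks in where extra chunks would be too small to pay for their serialization cost.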