perf(core): Reduce output token counting IPC overhead#1438

Closed
yamadashy wants to merge 1 commit into main from perf/output-token-ipc-optimization

Conversation

@yamadashy (Owner) commented Apr 9, 2026

Summary

  • Replace hardcoded TARGET_CHARS_PER_CHUNK=200K with CPU-core-based chunking via getProcessConcurrency()
  • Skip expensive process.memoryUsage() calls when log level is below DEBUG

Cherry-picked from 5de897c (PR #1428)
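The chunking change described above can be sketched in isolation. This is an illustrative stand-in, not the project's code: getProcessConcurrency() here is modeled on Node's CPU count, and splitIntoChunks is a hypothetical helper name.

```typescript
import * as os from 'node:os';

// Assumed stand-in for the project's getProcessConcurrency() helper.
const getProcessConcurrency = (): number => os.cpus().length;

// CPU-core-based chunking: one chunk per available core instead of a
// fixed 200K-character target.
const splitIntoChunks = (content: string): string[] => {
  const numChunks = Math.max(1, getProcessConcurrency());
  const chunkSize = Math.ceil(content.length / numChunks);
  const chunks: string[] = [];
  for (let i = 0; i < content.length; i += chunkSize) {
    chunks.push(content.slice(i, i + chunkSize));
  }
  return chunks;
};
```

Because chunkSize is rounded up, the loop produces at most numChunks slices and the slices always reassemble to the original string.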

Test plan

  • All tests passing
  • Build clean


Commit message:

Replace hardcoded TARGET_CHARS_PER_CHUNK=200K with CPU-core-based chunking via getProcessConcurrency(). Skip expensive process.memoryUsage() calls when log level is below DEBUG.

Cherry-picked from 5de897c (PR #1428)

Co-Authored-By: Claude <noreply@anthropic.com>
coderabbitai bot (Contributor) commented Apr 9, 2026

📝 Walkthrough

The changes introduce dynamic parallel chunking based on process concurrency in token counting metrics, replace fixed-size chunk calculation with CPU-aware sizing, and add log-level gating to reduce overhead in memory utilities when detailed logging is disabled.

Changes

  • Dynamic Parallel Chunking (src/core/metrics/calculateOutputMetrics.ts): Replaced the fixed target size (200,000 characters) with a dynamic chunk calculation derived from getProcessConcurrency(). The chunk count is max(1, getProcessConcurrency()) and the chunk size is ceil(content.length / numChunks), preserving the existing parallel processing flow.
  • Memory Logging Optimization (src/shared/memoryUtils.ts): Added log-level gating to logMemoryUsage and withMemoryLogging that early-exits unless the log level is at least DEBUG, avoiding unnecessary memory capture and logging overhead when detailed logging is disabled.
  • Test Updates (tests/core/metrics/calculateOutputMetrics.test.ts): Updated test assertions to derive the expected chunk count from getProcessConcurrency() instead of hardcoded values, relaxed chunk size assertions to check for a roughly equal distribution rather than exact per-chunk sizes, and tightened the parallel execution verification.
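The log-level gating summarized above can be sketched as follows. The enum values and function body here are illustrative assumptions, not the project's actual memoryUtils API; the function returns a boolean only so the gating is observable.

```typescript
// Illustrative log levels; higher values mean more verbose output.
enum LogLevel { Error = 0, Warn = 1, Info = 2, Debug = 3 }

let currentLogLevel: LogLevel = LogLevel.Info;

// Returns true only when it actually captured and logged memory stats.
function logMemoryUsage(label: string): boolean {
  // Early exit before the process.memoryUsage() syscall when DEBUG is off.
  if (currentLogLevel < LogLevel.Debug) return false;
  const { heapUsed, rss } = process.memoryUsage();
  console.debug(`${label}: heapUsed=${heapUsed}B rss=${rss}B`);
  return true;
}
```

At the default INFO level the function returns before touching process.memoryUsage(), so the syscall and string-formatting cost disappear entirely.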

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


🚥 Pre-merge checks | ✅ 3 passed
  • Title check: ✅ Passed. The PR title 'perf(core): Reduce output token counting IPC overhead' directly and accurately summarizes the main performance objective: replacing hardcoded chunking with CPU-aware concurrency and skipping expensive memory operations.
  • Description check: ✅ Passed. The description includes a clear summary of the two main changes and a test plan with verification checkmarks, and follows the repository's template structure with summary and test plan sections.
  • Docstring coverage: ✅ Passed. Docstring coverage is 100.00%, above the required 80.00% threshold.


@cloudflare-workers-and-pages

Deploying repomix with Cloudflare Pages

Latest commit: 7400698
Status: ✅  Deploy successful!
Preview URL: https://74ed3363.repomix.pages.dev
Branch Preview URL: https://perf-output-token-ipc-optimi.repomix.pages.dev


gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request optimizes parallel metric calculation by scaling the number of chunks with available CPU cores and introduces log-level checks to memory utility functions to reduce overhead. Review feedback suggests refining the chunking logic to maintain a minimum chunk size of 200KB, preventing excessive IPC overhead on high-core systems, and updating the associated tests to reflect this change.

Comment on lines +26 to +27:

```typescript
const numChunks = Math.max(1, getProcessConcurrency());
const chunkSize = Math.ceil(content.length / numChunks);
```


Severity: medium

The current implementation may lead to excessive IPC overhead on systems with many CPU cores. For example, on a 64-core machine, a 1.1MB file would be split into 64 small chunks (~17KB each), resulting in 64 postMessage calls.

To minimize IPC overhead while still saturating available cores, it is better to ensure chunks don't fall below a reasonable size. Using the previously established 200KB "sweet spot" as a minimum chunk size (by capping the number of chunks) ensures that parallelization benefits aren't negated by message serialization costs.

Suggested change:

```diff
-const numChunks = Math.max(1, getProcessConcurrency());
-const chunkSize = Math.ceil(content.length / numChunks);
+const numChunks = Math.min(getProcessConcurrency(), Math.ceil(content.length / 200_000));
+const chunkSize = Math.ceil(content.length / numChunks);
```
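The arithmetic behind this cap can be checked in isolation. A standalone sketch, not the project's code; cappedChunkCount is a hypothetical helper name:

```typescript
const TARGET_CHARS_PER_CHUNK = 200_000;

// Chunk count capped by both the core count and the 200K minimum
// chunk size, never dropping below one chunk.
const cappedChunkCount = (contentLength: number, concurrency: number): number =>
  Math.max(1, Math.min(concurrency, Math.ceil(contentLength / TARGET_CHARS_PER_CHUNK)));
```

On a 64-core machine a 1.1MB input yields 6 chunks of roughly 183KB rather than 64 chunks of roughly 17KB, while an 8-core machine packing 2MB still uses all 8 cores.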

Comment on lines +121 to +122:

```typescript
const expectedChunks = getProcessConcurrency();
expect(chunksProcessed).toBe(expectedChunks); // Should match number of CPU cores
```


Severity: medium

If the chunking logic in calculateOutputMetrics.ts is updated to cap the number of chunks based on a minimum size, this test expectation should be updated accordingly to reflect the actual number of chunks produced.

Suggested change:

```diff
-const expectedChunks = getProcessConcurrency();
-expect(chunksProcessed).toBe(expectedChunks); // Should match number of CPU cores
+const expectedChunks = Math.min(getProcessConcurrency(), Math.ceil(content.length / 200_000));
+expect(chunksProcessed).toBe(expectedChunks);
```


Comment on the following lines:

```typescript
// With TARGET_CHARS_PER_CHUNK=200_000, 1.1M character content should produce 6 chunks
// Check that chunks are roughly equal in size
const expectedChunks = getProcessConcurrency();
```


Severity: medium

This expectation should also be updated to match the improved chunking logic that accounts for a minimum chunk size.

Suggested change:

```diff
-const expectedChunks = getProcessConcurrency();
+const expectedChunks = Math.min(getProcessConcurrency(), Math.ceil(content.length / 200_000));
```

coderabbitai bot (Contributor) left a comment


🧹 Nitpick comments (1)
tests/core/metrics/calculateOutputMetrics.test.ts (1)

10-10: Stabilize these tests by mocking process concurrency.

These assertions currently depend on host CPU concurrency, which can vary across CI environments. Consider pinning getProcessConcurrency() to a fixed test value for deterministic behavior.

Proposed refactor:

```diff
-import { getProcessConcurrency, type WorkerOptions } from '../../../src/shared/processConcurrency.js';
+import * as processConcurrency from '../../../src/shared/processConcurrency.js';
+import type { WorkerOptions } from '../../../src/shared/processConcurrency.js';

 vi.mock('../../../src/shared/logger');
+const MOCK_PROCESS_CONCURRENCY = 4;
+vi.spyOn(processConcurrency, 'getProcessConcurrency').mockReturnValue(MOCK_PROCESS_CONCURRENCY);

 ...
-    const expectedChunks = getProcessConcurrency();
+    const expectedChunks = MOCK_PROCESS_CONCURRENCY;
 ...
-    const expectedChunks = getProcessConcurrency();
+    const expectedChunks = MOCK_PROCESS_CONCURRENCY;
```

Also applies to: 121-124, 177-184
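Outside Vitest, the same determinism can be sketched with a hand-rolled swap-and-restore around a mutable indirection. Names here are illustrative; this mirrors what vi.spyOn(...).mockReturnValue(4) achieves, including restoring the original to avoid cross-test pollution.

```typescript
// Pretend this is the real, host-dependent concurrency source.
let getProcessConcurrency: () => number = () => 8;

// Install a fixed value, run the test body, then restore the original
// implementation even if the body throws.
function withMockedConcurrency<T>(value: number, body: () => T): T {
  const original = getProcessConcurrency;
  getProcessConcurrency = () => value;
  try {
    return body();
  } finally {
    getProcessConcurrency = original;
  }
}
```

With this seam, an expected chunk count derived inside withMockedConcurrency(4, ...) is stable on any CI host, whether it has 1 core or 64.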


ℹ️ Review info
⚙️ Run configuration
  • Configuration used: .coderabbit.yaml
  • Review profile: CHILL
  • Plan: Pro
  • Run ID: f27f6d0d-c3aa-45bb-9796-7699cebf2a4f

📥 Commits

Reviewing files that changed from the base of the PR and between eafa70a and 7400698.

📒 Files selected for processing (3)
  • src/core/metrics/calculateOutputMetrics.ts
  • src/shared/memoryUtils.ts
  • tests/core/metrics/calculateOutputMetrics.test.ts

devin-ai-integration bot (Contributor) left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.


github-actions bot (Contributor) commented Apr 9, 2026

⚡ Performance Benchmark

Latest commit: 7400698 perf(core): Reduce output token counting IPC overhead
Status: ✅ Benchmark complete!
Ubuntu: 1.40s (±0.02s) → 1.40s (±0.03s) · +0.00s (+0.0%)
macOS: 1.17s (±0.24s) → 1.34s (±0.25s) · +0.17s (+14.7%)
Windows: 1.91s (±0.40s) → 1.93s (±0.14s) · +0.02s (+1.1%)
Details
  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded), interleaved execution
  • Measurement: 20 runs (30 on macOS), median ± IQR
  • Workflow run

codecov bot commented Apr 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.74%. Comparing base (eafa70a) to head (7400698).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #1438      +/-   ##
==========================================
- Coverage   87.32%   86.74%   -0.58%
==========================================
  Files         117      117
  Lines        4426     4430       +4
  Branches     1022     1024       +2
==========================================
- Hits         3865     3843      -22
- Misses        561      587      +26
```

☔ View full report in Codecov by Sentry.

claude bot (Contributor) commented Apr 9, 2026

Code Review — perf(core): Reduce output token counting IPC overhead

Good optimization direction overall. The memory logging guard is clean and correct. The chunking strategy change has a design issue worth discussing before merge.

High-core regression inverts the stated goal

The PR aims to reduce IPC overhead by creating fewer chunks, but on high-core machines (32–64+ cores, common in CI/cloud), the new approach creates more chunks than the old 200K strategy for content near the 1MB threshold:

Content size | Old (200K chunks) | New (64-core machine)
2MB          | 10 chunks         | 64 chunks
4MB          | 20 chunks         | 64 chunks
8MB          | 40 chunks         | 64 chunks
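The rows above follow directly from the two formulas. A quick standalone check, assuming a 64-core host for the new strategy:

```typescript
// Old strategy: fixed 200K-character target per chunk.
const oldChunks = (size: number): number => Math.ceil(size / 200_000);

// New strategy: one chunk per core, regardless of content size.
const newChunks = (cores: number): number => Math.max(1, cores);
```

For any content at or above the 1MB parallelization threshold on a 64-core host, newChunks stays pinned at 64 while oldChunks grows with size, so the crossover where the new strategy produces more chunks sits at 12.8MB.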

On a typical 8-core laptop, 2MB → 8 chunks vs old 10 — slight improvement. But the regression on high-core environments is significant and contradicts the PR title.

Suggested fix: Use core count as a cap rather than a mandate, preserving the empirically-benchmarked 200K minimum chunk size:

```typescript
const TARGET_CHARS_PER_CHUNK = 200_000;
const numChunks = Math.min(
  Math.max(1, getProcessConcurrency()),
  Math.ceil(content.length / TARGET_CHARS_PER_CHUNK)
);
```

This gives fewer IPC round-trips on typical hardware while preventing over-chunking on high-core machines.

Tests depend on runtime CPU count

getProcessConcurrency() is called live (not mocked) in tests, making assertions non-deterministic across environments. On a 1-core CI container, expectedChunks = 1 collapses the parallel test into a degenerate single-chunk case that doesn't exercise chunking logic at all. CodeRabbit flagged this too — mocking to a fixed value (e.g., 4) would make tests reliable.

Minor observations
  • logMemoryDifference has no guard: safe today because it's only called from within the guarded withMemoryLogging, but inconsistent with its sibling functions. Consider adding a guard for consistency.
  • Chunk equality assertion is very loose: expect(maxDiff).toBeLessThan(Math.ceil(content.length / expectedChunks)) will always pass by construction. A tighter bound such as toBeLessThanOrEqual(1) would be more meaningful.
  • Memory logging guard is correct: process.memoryUsage() is a syscall, so skipping it at INFO level is a clean win.
  • Pre-existing: BPE tokens spanning chunk boundaries. Splitting a string and tokenizing the chunks independently can inflate the total token count, since BPE tokens can span chunk boundaries. Not introduced by this PR, but worth noting as a known limitation.
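The "always passes" point about the loose assertion can be verified directly: with ceil-based splitting, every chunk except possibly the last has exactly chunkSize characters, so the max/min length difference is always strictly below chunkSize. A standalone sketch (split is a hypothetical helper mirroring the chunking logic):

```typescript
const split = (content: string, numChunks: number): string[] => {
  const chunkSize = Math.ceil(content.length / numChunks);
  const chunks: string[] = [];
  for (let i = 0; i < content.length; i += chunkSize) {
    chunks.push(content.slice(i, i + chunkSize));
  }
  return chunks;
};

const lengths = split('x'.repeat(10), 4).map((c) => c.length); // 3, 3, 3, 1
const maxDiff = Math.max(...lengths) - Math.min(...lengths);   // 2
// maxDiff (2) < chunkSize (3) holds by construction: only the final
// chunk can be short, and never by chunkSize or more. The loose
// assertion therefore cannot fail and does not test equal distribution.
```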

🤖 Generated with Claude Code

@yamadashy yamadashy closed this Apr 11, 2026