perf(metrics): Batch token counting IPC to reduce worker round-trip overhead#1411

Merged
yamadashy merged 2 commits into main from perf/batch-metrics-token-counting
Apr 6, 2026
@yamadashy
Owner

@yamadashy yamadashy commented Apr 5, 2026

Batch token counting items into groups of 50 per worker task instead of sending individual files. This reduces worker thread IPC round-trips from ~991 to ~20 for a typical repository, significantly cutting thread creation and message-passing overhead.

Changes

  • Add TokenCountBatchTask / MetricsWorkerTask / MetricsWorkerResult types and countTokensBatch handler to calculateMetricsWorker
  • Introduce metricsWorkerRunner.ts with a MetricsTaskRunner type alias and type-safe runTokenCount / runBatchTokenCount helpers, which centralize the `as` casts in one place instead of scattering them across callers
  • Update calculateSelectiveFileMetrics with METRICS_BATCH_SIZE=50 batching
  • Update calculateOutputMetrics, calculateGitDiffMetrics, calculateGitLogMetrics to use runTokenCount helper
  • Update unifiedWorker task inference to recognize batch metrics tasks (items + encoding)
  • Update all related tests for new types and batch mode
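
The batching described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `TokenCountItem` shape, the `BatchRunner` signature, the `chunk` helper, and the word-counting stand-in runner are all assumptions made for the example. Only `METRICS_BATCH_SIZE = 50`, the `items` + `encoding` task shape, and the use of `batchResults.flat()` come from the PR text.

```typescript
// Hypothetical shapes for illustration; the real types in repomix may differ.
interface TokenCountItem {
  path: string;
  content: string;
}

interface TokenCountBatchTask {
  items: TokenCountItem[];
  encoding: string;
}

// Assumed runner signature: one call equals one worker round-trip.
type BatchRunner = (task: TokenCountBatchTask) => Promise<number[]>;

const METRICS_BATCH_SIZE = 50;

// Split a list into consecutive groups of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Send one task per batch instead of one task per file, then flatten
// the per-batch results back into a single per-file array (input order kept).
async function countTokensInBatches(
  files: readonly TokenCountItem[],
  encoding: string,
  runBatch: BatchRunner,
  batchSize: number = METRICS_BATCH_SIZE,
): Promise<number[]> {
  const batches = chunk(files, batchSize);
  const batchResults = await Promise.all(
    batches.map((items) => runBatch({ items, encoding })),
  );
  return batchResults.flat();
}

// Stand-in runner that counts whitespace-separated words, NOT real tokens.
const wordCountRunner: BatchRunner = async (task) =>
  task.items.map((item) => item.content.split(/\s+/).filter(Boolean).length);
```

With 991 input files and the default batch size, `countTokensInBatches` calls the runner 20 times rather than 991, which matches the round-trip reduction quoted in the description.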

Benchmark (from prior testing, repomix on itself, ~991 files)

  • Before: ~2147ms
  • After: ~1544ms
  • Improvement: ~28%
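
For reference, the improvement figure follows directly from the two timings above. The helper name below is hypothetical and exists only to show the arithmetic.

```typescript
// Relative improvement of `afterMs` over `beforeMs`, as a percentage.
function improvementPercent(beforeMs: number, afterMs: number): number {
  return ((beforeMs - afterMs) / beforeMs) * 100;
}

// (2147 - 1544) / 2147 = 603 / 2147, roughly 0.2809
const pct = improvementPercent(2147, 1544);
console.log(`${pct.toFixed(1)}%`); // prints "28.1%"
```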

Checklist

  • Run npm run test
  • Run npm run lint

@github-actions
Contributor

github-actions bot commented Apr 5, 2026

⚡ Performance Benchmark

Latest commit: 7f401d9 refactor(metrics): Use batchResults.flat() instead of manual loop
Status: ✅ Benchmark complete!
Ubuntu: 1.50s (±0.05s) → 1.49s (±0.02s) · -0.01s (-0.7%)
macOS: 0.91s (±0.07s) → 0.89s (±0.05s) · -0.02s (-2.2%)
Windows: 2.23s (±0.35s) → 2.05s (±0.30s) · -0.18s (-8.0%)
Details
  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded), interleaved execution
  • Measurement: 20 runs / 30 on macOS (median ± IQR)
  • Workflow run
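
As a reading aid for the "median ± IQR" figures, here is one way such a summary could be computed from a list of run times. The benchmark workflow's actual implementation is not shown in this thread, so the linear-interpolation quantile method and both function names are assumptions.

```typescript
// Quantile by linear interpolation between the two nearest ranks.
// Assumption: the real benchmark script may use a different quantile method.
function quantile(sorted: readonly number[], p: number): number {
  if (sorted.length === 0) {
    throw new RangeError("cannot take a quantile of an empty sample");
  }
  const pos = (sorted.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Median and interquartile range (Q3 - Q1) of a set of timings.
function summarize(samples: readonly number[]): { median: number; iqr: number } {
  const sorted = [...samples].sort((a, b) => a - b);
  return {
    median: quantile(sorted, 0.5),
    iqr: quantile(sorted, 0.75) - quantile(sorted, 0.25),
  };
}
```

Both statistics are less sensitive to a single slow outlier run than a mean and standard deviation would be, which is a common reason to prefer them for wall-clock benchmarks.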
History

411ce28 refactor(metrics): Use batchResults.flat() instead of manual loop

Ubuntu: 1.48s (±0.03s) → 1.47s (±0.02s) · -0.01s (-0.8%)
macOS: 1.19s (±0.14s) → 1.19s (±0.16s) · -0.00s (-0.2%)
Windows: 1.84s (±0.06s) → 1.83s (±0.06s) · -0.01s (-0.7%)

1c6b5a8 perf(metrics): Batch token counting IPC to reduce worker round-trip overhead

Ubuntu: 1.48s (±0.02s) → 1.49s (±0.03s) · +0.01s (+0.6%)
macOS: 0.88s (±0.08s) → 0.87s (±0.03s) · -0.01s (-1.1%)
Windows: 2.36s (±0.07s) → 2.36s (±0.07s) · +0.00s (+0.0%)

@coderabbitai
Contributor

coderabbitai bot commented Apr 5, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1c8cd792-5fdc-4709-a1d8-b58082ad97b8

Walkthrough

This PR introduces a new MetricsTaskRunner type abstraction and helper functions to standardize metrics worker task execution. The metrics worker now supports batch token counting alongside single-item tasks. All calculate*Metrics functions are refactored to use the new abstraction, and calculateSelectiveFileMetrics now batches files in groups of 50.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Metrics Worker Type Abstraction | src/core/metrics/metricsWorkerRunner.ts | New file introducing MetricsTaskRunner type, runTokenCount, and runBatchTokenCount helper functions for standardized metrics task execution. |
| Metrics Worker Batch Support | src/core/metrics/workers/calculateMetricsWorker.ts | Worker now handles both TokenCountTask and TokenCountBatchTask, adding batch-mode processing with new countTokensBatch helper; branching logic routes to single or batch execution based on task shape. |
| Metrics Calculation Functions | src/core/metrics/calculateGitDiffMetrics.ts, src/core/metrics/calculateGitLogMetrics.ts, src/core/metrics/calculateOutputMetrics.ts | Updated to depend on MetricsTaskRunner instead of generic TaskRunner<TokenCountTask, number>; token execution delegated to runTokenCount helper instead of direct runner invocation. |
| Selective File Metrics Batching | src/core/metrics/calculateSelectiveFileMetrics.ts | Refactored to batch file processing in chunks of 50 using runBatchTokenCount; replaced per-file task runner calls with per-batch submission; progress reporting now tracks batch completion. |
| Metrics Orchestrator | src/core/metrics/calculateMetrics.ts | Updated MetricsTaskRunnerWithWarmup interface and createMetricsTaskRunner to use explicit MetricsTaskRunner, MetricsWorkerTask, MetricsWorkerResult types; adjusted fallback initialization accordingly. |
| Worker Type Inference | src/shared/unifiedWorker.ts | Expanded inferWorkerTypeFromTask for calculateMetrics to detect both single-mode (content + encoding) and batch-mode (items + encoding) task shapes. |
| Metrics Tests | tests/core/metrics/calculateGitDiffMetrics.test.ts, tests/core/metrics/calculateGitLogMetrics.test.ts, tests/core/metrics/calculateOutputMetrics.test.ts, tests/core/metrics/calculateSelectiveFileMetrics.test.ts | Updated mock task runners to use MetricsTaskRunner and accept MetricsWorkerTask inputs; batch test mocks added branching logic to handle items property; type imports adjusted to match new abstraction. |
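
The shape-based routing summarized above (single mode is content + encoding, batch mode is items + encoding) could look something like the sketch below. The field names follow the summary, but the exact type definitions, the `isBatchTask` and `looksLikeMetricsTask` helpers, and the whitespace tokenizer are assumptions for illustration and are not taken from the repomix source.

```typescript
// Hypothetical task shapes based on the summary; real definitions may differ.
interface TokenCountTask {
  content: string;
  encoding: string;
}

interface TokenCountBatchTask {
  items: { content: string }[];
  encoding: string;
}

type MetricsWorkerTask = TokenCountTask | TokenCountBatchTask;
type MetricsWorkerResult = number | number[];

// Type guard: a batch task is recognized by its `items` array.
function isBatchTask(task: MetricsWorkerTask): task is TokenCountBatchTask {
  return "items" in task && Array.isArray(task.items);
}

// Loose structural check on an unknown payload, in the spirit of the
// worker type inference described above.
function looksLikeMetricsTask(task: unknown): boolean {
  if (typeof task !== "object" || task === null) return false;
  const t = task as Record<string, unknown>;
  if (typeof t.encoding !== "string") return false;
  return typeof t.content === "string" || Array.isArray(t.items);
}

// Placeholder tokenizer: counts whitespace-separated words, NOT real tokens.
const countTokens = (content: string): number =>
  content.split(/\s+/).filter(Boolean).length;

// Route to single or batch execution based on the task shape.
function handleMetricsTask(task: MetricsWorkerTask): MetricsWorkerResult {
  if (isBatchTask(task)) {
    return task.items.map((item) => countTokens(item.content));
  }
  return countTokens(task.content);
}
```

One consequence of a union result type such as `number | number[]` is that callers cannot know statically which variant they received, which is the gap the runTokenCount / runBatchTokenCount helpers are described as closing.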

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • #1302 — Modifies metrics worker/task-runner surface in calculateMetrics and packager wiring, directly overlapping with this refactoring.
  • #1380 — Converts per-file worker tasks to batched items-based tasks and updates worker inference logic, mirroring the batching pattern introduced here.
  • #1374 — Modifies createMetricsTaskRunner and MetricsTaskRunnerWithWarmup types in the same metrics orchestrator file.
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main performance optimization: batching token counting to reduce worker round-trip overhead, which is the primary focus of this changeset. |
| Description check | ✅ Passed | The pull request description comprehensively covers the changes, provides a detailed Changes section with file-level modifications, includes performance benchmarks, and confirms both test and lint checklist items are complete. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@codecov

codecov bot commented Apr 5, 2026

Codecov Report

❌ Patch coverage is 71.79487% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.26%. Comparing base (9d5b928) to head (7f401d9).
⚠️ Report is 10 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/core/metrics/workers/calculateMetricsWorker.ts | 8.33% | 11 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1411      +/-   ##
==========================================
- Coverage   87.42%   87.26%   -0.17%     
==========================================
  Files         116      117       +1     
  Lines        4397     4420      +23     
  Branches     1020     1021       +1     
==========================================
+ Hits         3844     3857      +13     
- Misses        553      563      +10     

gemini-code-assist[bot]

This comment was marked as resolved.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Apr 5, 2026

Deploying repomix with Cloudflare Pages

Latest commit: 7f401d9
Status: ✅  Deploy successful!
Preview URL: https://a66133f0.repomix.pages.dev
Branch Preview URL: https://perf-batch-metrics-token-cou.repomix.pages.dev

coderabbitai[bot]

This comment was marked as resolved.

Contributor

@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

yamadashy and others added 2 commits April 6, 2026 03:11
…verhead

Selective file metrics previously sent one IPC round-trip per file to
worker threads for token counting. With ~991 files and ~0.5ms overhead
per round-trip, this added ~495ms of pure IPC waste.

This change introduces batch mode for the metrics worker, grouping files
into batches of 50 before sending to workers. This reduces round-trips
from 991 to 20.
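
The numbers in the two paragraphs above can be reproduced with a short calculation. The function names are hypothetical, and the batched estimate assumes the per-round-trip cost stays at about 0.5ms regardless of payload size, which the commit message does not claim.

```typescript
// Number of worker tasks needed when files are grouped into batches.
function batchCount(fileCount: number, batchSize: number): number {
  return Math.ceil(fileCount / batchSize);
}

// Estimated fixed IPC cost: one round-trip per task.
function estimateIpcOverheadMs(taskCount: number, perRoundTripMs: number): number {
  return taskCount * perRoundTripMs;
}

const fileCount = 991;
const perRoundTripMs = 0.5; // figure quoted in the commit message

// One task per file: 991 * 0.5ms = 495.5ms (the "~495ms" above).
const unbatchedMs = estimateIpcOverheadMs(fileCount, perRoundTripMs);

// Batches of 50: ceil(991 / 50) = 20 tasks, so about 10ms under the
// constant-cost assumption stated in the lead-in.
const batches = batchCount(fileCount, 50);
const batchedMs = estimateIpcOverheadMs(batches, perRoundTripMs);

console.log({ unbatchedMs, batches, batchedMs });
```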

Type safety improvement over the original approach: instead of scattering
`as number` casts across all callers, a new metricsWorkerRunner module
centralizes the type narrowing in two helper functions (runTokenCount and
runBatchTokenCount), keeping all other modules fully type-safe.
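
A rough sketch of what centralizing the casts might look like is shown below. The helper names runTokenCount and runBatchTokenCount come from the commit message, but their signatures, the `MetricsTaskRunner` shape, and the task types here are assumptions, not the actual metricsWorkerRunner module.

```typescript
// Hypothetical types; the real definitions in repomix may differ.
interface TokenCountTask {
  content: string;
  encoding: string;
}

interface TokenCountBatchTask {
  items: { content: string }[];
  encoding: string;
}

type MetricsWorkerTask = TokenCountTask | TokenCountBatchTask;
type MetricsWorkerResult = number | number[];

// Assumed runner shape: a single `run` method that accepts either task kind.
interface MetricsTaskRunner {
  run(task: MetricsWorkerTask): Promise<MetricsWorkerResult>;
}

// The only two places where the union result is narrowed with a cast.
// Callers get a precise return type and never write `as` themselves.
async function runTokenCount(
  runner: MetricsTaskRunner,
  task: TokenCountTask,
): Promise<number> {
  return (await runner.run(task)) as number;
}

async function runBatchTokenCount(
  runner: MetricsTaskRunner,
  task: TokenCountBatchTask,
): Promise<number[]> {
  return (await runner.run(task)) as number[];
}
```

The casts are still unchecked at runtime, so this design relies on the worker returning a number for single tasks and an array for batch tasks. What it buys is that the assumption lives in one module instead of at every call site.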

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yamadashy yamadashy force-pushed the perf/batch-metrics-token-counting branch from 411ce28 to 7f401d9 Compare April 5, 2026 18:12
@yamadashy yamadashy merged commit ffe6770 into main Apr 6, 2026
59 checks passed
@yamadashy yamadashy deleted the perf/batch-metrics-token-counting branch April 6, 2026 05:37