
perf(core): Reduce worker thread contention for faster pipeline execution#1409

Merged
yamadashy merged 5 commits into main from perf/reduce-worker-thread-contention
Apr 5, 2026
Conversation

@yamadashy
Owner

@yamadashy yamadashy commented Apr 5, 2026

Add maxWorkerThreads option to WorkerOptions for explicit thread count capping, then use it to reduce CPU contention when metrics and security worker pools run concurrently during the pipeline overlap phase.

Changes

  • processConcurrency.ts: Add optional maxWorkerThreads to WorkerOptions. getWorkerThreadCount and createWorkerPool respect this cap when provided.
  • calculateMetrics.ts: Cap metrics workers at processConcurrency - 1 in createMetricsTaskRunner. Warmup thread count now matches the actual pool size.
  • securityCheck.ts: Cap security workers at floor(processConcurrency / 2). Security check uses coarse-grained batches (50 files), so fewer workers still provide sufficient parallelism.
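The capping behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual repomix implementation: the `WorkerOptions` shape follows the PR description, but the task-derived pool limit is a placeholder heuristic.

```typescript
import os from 'node:os';

// Hypothetical shape of WorkerOptions after this PR; the real interface
// lives in src/shared/processConcurrency.ts.
interface WorkerOptions {
  numOfTasks: number;
  workerType?: string;
  maxWorkerThreads?: number; // new optional cap introduced by this PR
}

const getProcessConcurrency = (): number =>
  os.availableParallelism?.() ?? os.cpus().length;

// Clamp detected parallelism by the optional cap, then size the pool so it
// never exceeds a task-derived limit either (heuristic here: 1 thread per
// ~100 tasks; the real formula differs).
const getWorkerThreadCount = (
  numOfTasks: number,
  maxWorkerThreads?: number,
): { minThreads: number; maxThreads: number } => {
  const detected = getProcessConcurrency();
  const effectiveConcurrency =
    maxWorkerThreads != null ? Math.min(detected, maxWorkerThreads) : detected;
  const minThreads = 1;
  const taskDerivedLimit = Math.max(minThreads, Math.ceil(numOfTasks / 100));
  const maxThreads = Math.max(
    minThreads,
    Math.min(effectiveConcurrency, taskDerivedLimit),
  );
  return { minThreads, maxThreads };
};

// createWorkerPool would destructure the option and pass it through:
const createWorkerPool = (options: WorkerOptions) =>
  getWorkerThreadCount(options.numOfTasks, options.maxWorkerThreads);
```

When `maxWorkerThreads` is undefined, behavior is unchanged; when provided, it caps the effective concurrency before the task-derived limit is applied.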

Rationale

During the pipeline overlap phase, the metrics pool (gpt-tokenizer warmup) and security pool run concurrently. On a 4-core machine, this previously created 8 threads (4+4) competing for 4 cores, causing significant context-switching overhead. With the caps, total concurrent threads drop to 5 (3+2), reducing contention while maintaining throughput.

The implementation uses an explicit maxWorkerThreads parameter rather than manipulating numOfTasks, keeping the API clean and ensuring warmup thread count stays in sync with the actual pool size.
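The cap arithmetic in the rationale can be checked with a small sketch. The helper names here are illustrative; the real code computes these expressions inline in calculateMetrics.ts and securityCheck.ts.

```typescript
// Caps as described in this PR's first revision (later commits adjusted them).
const metricsCap = (cores: number): number => Math.max(1, cores - 1);
const securityCap = (cores: number): number =>
  Math.max(1, Math.floor(cores / 2));

// Total threads during the overlap phase, when both pools run concurrently.
// On a 4-core machine: metrics 3 + security 2 = 5, instead of 4 + 4 = 8.
const totalThreads = (cores: number): number =>
  metricsCap(cores) + securityCap(cores);
```

Both caps floor at 1, so even a single-core machine still gets one worker per pool.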

Checklist

  • Run npm run test
  • Run npm run lint


…tion

Add maxWorkerThreads option to WorkerOptions for explicit thread count
capping, then use it to reduce CPU contention when metrics and security
worker pools run concurrently during the pipeline overlap phase.

- Metrics pool: capped at (processConcurrency - 1)
- Security pool: capped at floor(processConcurrency / 2)

On a 4-core machine this reduces concurrent threads from 8 (4+4) to 5
(3+2), avoiding context-switching overhead during gpt-tokenizer warmup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

github-actions bot commented Apr 5, 2026

⚡ Performance Benchmark

Latest commit: 7de6923 refactor(core): Inject getProcessConcurrency via deps in securityCheck
Status: ✅ Benchmark complete!
Ubuntu: 1.52s (±0.03s) → 1.49s (±0.02s) · -0.04s (-2.4%)
macOS: 0.90s (±0.06s) → 0.90s (±0.06s) · +0.00s (+0.3%)
Windows: 1.93s (±0.03s) → 1.88s (±0.06s) · -0.05s (-2.4%)
Details
  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded), interleaved execution
  • Measurement: 20 runs / 30 on macOS (median ± IQR)
  • Workflow run
History

f49a5d0 fix(core): Add early return for empty input and restore numOfTasks comment

Ubuntu: 1.53s (±0.04s) → 1.50s (±0.02s) · -0.03s (-2.0%)
macOS: 0.86s (±0.04s) → 0.86s (±0.05s) · +0.00s (+0.0%)
Windows: 1.90s (±0.03s) → 1.85s (±0.05s) · -0.06s (-2.9%)

46a0a7b perf(core): Remove metrics worker cap to avoid regression on low-core machines

Ubuntu: 1.57s (±0.02s) → 1.52s (±0.03s) · -0.05s (-3.1%)
macOS: 0.89s (±0.06s) → 0.87s (±0.10s) · -0.02s (-2.1%)
Windows: 1.84s (±0.03s) → 1.80s (±0.02s) · -0.04s (-2.4%)

aff7aa9 refactor(core): Cap security workers at 2 instead of floor(cores/2)

Ubuntu: 1.54s (±0.02s) → 1.40s (±0.02s) · -0.14s (-9.1%)
macOS: 0.92s (±0.07s) → 0.97s (±0.06s) · +0.06s (+6.4%)
Windows: 1.86s (±0.05s) → 1.74s (±0.03s) · -0.12s (-6.7%)

2a87942 perf(core): Reduce worker thread contention for faster pipeline execution

Ubuntu: 1.51s (±0.01s) → 1.36s (±0.02s) · -0.14s (-9.6%)
macOS: 0.87s (±0.05s) → 0.94s (±0.08s) · +0.07s (+7.7%)
Windows: 1.86s (±0.03s) → 1.72s (±0.03s) · -0.13s (-7.2%)

@coderabbitai
Contributor

coderabbitai bot commented Apr 5, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 77164e78-eb2d-485e-a47b-69b2fec1baaf


📝 Walkthrough

The PR introduces a worker thread concurrency capping mechanism by adding an optional maxWorkerThreads parameter to the worker initialization chain. The metrics and security check modules now compute process-aware concurrency caps and pass them through initTaskRunner to limit actual worker thread allocation in their respective task pools.

Changes

  • Concurrency Infrastructure — src/shared/processConcurrency.ts: Added optional maxWorkerThreads property to the WorkerOptions interface. Updated the getWorkerThreadCount function signature to accept an optional maxWorkerThreads parameter and compute effectiveConcurrency as the minimum of detected parallelism and the cap. Modified createWorkerPool to destructure and propagate maxWorkerThreads to getWorkerThreadCount.
  • Metrics & Security Modules — src/core/metrics/calculateMetrics.ts, src/core/security/securityCheck.ts: Metrics: compute maxMetricsWorkers = max(1, getProcessConcurrency() - 1) and pass it via the maxWorkerThreads option to initTaskRunner; the warmup thread calculation was updated to use the capped limit. Security: compute maxSecurityWorkers = max(1, floor(getProcessConcurrency()/2)) and pass it via the maxWorkerThreads option to initTaskRunner.
  • Test Updates — tests/core/security/securityCheck.test.ts, tests/shared/processConcurrency.test.ts: Security test: added a mock for getProcessConcurrency returning 4. Concurrency tests: added three test cases validating that getWorkerThreadCount respects the maxWorkerThreads cap when provided and preserves existing behavior when it is undefined.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Metrics/<br/>Security Module
    participant PC as ProcessConcurrency<br/>Module
    participant TR as TaskRunner/<br/>initTaskRunner
    participant WP as createWorkerPool
    participant WT as getWorkerThreadCount

    Caller->>PC: getProcessConcurrency()
    PC-->>Caller: CPU count
    Caller->>Caller: compute max cap<br/>(e.g., CPUs - 1)
    Caller->>TR: initTaskRunner({<br/>maxWorkerThreads: cap<br/>})
    TR->>WP: pass maxWorkerThreads
    WP->>WT: getWorkerThreadCount(<br/>numOfTasks,<br/>maxWorkerThreads)
    WT->>WT: effectiveConcurrency =<br/>min(detected, cap)
    WT->>WT: maxThreads =<br/>min(effectiveConcurrency,<br/>task-derived limit)
    WT-->>WP: { minThreads, maxThreads }
    WP->>WP: configure tinypool<br/>with capped maxThreads
    WP-->>TR: WorkerPool instance
    TR-->>Caller: TaskRunner ready
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check — ✅ Passed: The title clearly and concisely summarizes the main change: reducing worker thread contention through explicit thread capping for improved pipeline performance.
  • Description check — ✅ Passed: The description covers the required sections with changes, rationale, and a completed checklist, though it includes non-required elements (Devin badge) and goes beyond the minimal template.
  • Docstring coverage — ✅ Passed: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


@codecov

codecov bot commented Apr 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.42%. Comparing base (5599736) to head (7de6923).
⚠️ Report is 13 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1409      +/-   ##
==========================================
+ Coverage   87.40%   87.42%   +0.01%     
==========================================
  Files         116      116              
  Lines        4392     4397       +5     
  Branches     1018     1020       +2     
==========================================
+ Hits         3839     3844       +5     
  Misses        553      553              

☔ View full report in Codecov by Sentry.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Apr 5, 2026

Deploying repomix with Cloudflare Pages

Latest commit: 7de6923
Status: ✅  Deploy successful!
Preview URL: https://3af6c4d1.repomix.pages.dev
Branch Preview URL: https://perf-reduce-worker-thread-co.repomix.pages.dev


Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/core/security/securityCheck.ts (1)

74-87: Skip worker pool initialization when there are no items.

A fast return when totalItems === 0 avoids unnecessary worker setup/teardown on empty inputs.

♻️ Suggested change
   const allItems = [...fileItems, ...gitDiffItems, ...gitLogItems];
   const totalItems = allItems.length;
+
+  if (totalItems === 0) {
+    return [];
+  }
 
   // Cap security workers at half the available CPU cores to reduce contention with the
   // metrics worker pool that runs concurrently. The security check uses coarse-grained
   // batches (BATCH_SIZE=50), so fewer workers still provide sufficient parallelism.
   const maxSecurityWorkers = Math.max(1, Math.floor(getProcessConcurrency() / 2));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/security/securityCheck.ts` around lines 74 - 87, Check totalItems
(computed from allItems) before initializing the worker pool and return early
when totalItems === 0 to avoid unnecessary init/teardown; specifically, add a
fast-path that when totalItems is 0 skips calling deps.initTaskRunner (the
current code that creates taskRunner using maxSecurityWorkers and workerType
'securityCheck') and returns the appropriate empty result (or resolves the
surrounding function) instead of initializing the worker pool and continuing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 26dd26d4-3910-464e-b0e7-d226ad223484

📥 Commits

Reviewing files that changed from the base of the PR and between 4a0eb98 and 2a87942.

📒 Files selected for processing (5)
  • src/core/metrics/calculateMetrics.ts
  • src/core/security/securityCheck.ts
  • src/shared/processConcurrency.ts
  • tests/core/security/securityCheck.test.ts
  • tests/shared/processConcurrency.test.ts

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a mechanism to limit the number of worker threads for metrics calculation and security checks to reduce CPU contention during concurrent execution. It updates the getWorkerThreadCount utility to support an optional thread cap and adjusts the task runners in calculateMetrics.ts and securityCheck.ts accordingly. Unit tests have been added to verify the capping logic. I have no feedback to provide.

Contributor

@devin-ai-integration devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.


Security check uses coarse-grained batches (50 files/batch), so 2
workers provide sufficient parallelism even for large repos. Fixing
the cap at 2 avoids over-allocation on high-core machines (e.g., 8
threads on 16 cores) while keeping the formula simple and predictable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude
Contributor

claude bot commented Apr 5, 2026

Claude Code Review — PR #1409

Overall: Clean, well-motivated PR with solid benchmarks on Ubuntu/Windows. The maxWorkerThreads API design is the right approach — explicit and composable. A few items worth discussing before merge.

Noteworthy Findings

1. macOS benchmark regression (+7.7%) warrants investigation

The benchmark shows Ubuntu -9.6% and Windows -7.2% improvements, but macOS regressed from 0.87s to 0.94s. On a 2-core macOS CI runner, maxMetricsWorkers = Math.max(1, 2-1) = 1 (single-threaded metrics) while maxSecurityWorkers = Math.min(2, 2) = 2. This means metrics parallelism is halved without meaningful contention reduction. It would be worth understanding the macOS runner's core count and whether the caps are too aggressive for low-core machines.

2. Removed comment about numOfTasks vs batches.length in securityCheck.ts

The original code had an important comment explaining why numOfTasks: totalItems is passed instead of batches.length. The new comment explains the cap logic but drops this clarification. Since totalItems is still passed and this remains non-obvious, consider preserving that context alongside the new comment — even a brief note like:

// numOfTasks uses totalItems (not batches.length) to avoid under-sizing the pool.

3. getProcessConcurrency not injected via deps in securityCheck.ts

securityCheck.ts already has a deps parameter for initTaskRunner, but getProcessConcurrency is imported directly. This makes it harder to test with different concurrency values without mocking the module. The security test already needed to add getProcessConcurrency: vi.fn(() => 4) to the module mock. Consider adding it to the deps object for consistency.
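A sketch of the deps-injection shape suggested here. The function and type signatures are assumed for illustration — they are not copied from the repo — but the pattern matches how initTaskRunner is already injected.

```typescript
// Minimal stand-in for the real task runner type.
type TaskRunner = { cleanup: () => Promise<void> };

// Hypothetical deps shape: both the task-runner factory and the
// concurrency source are injectable, so tests can pass plain stubs
// instead of mocking the processConcurrency module.
type SecurityCheckDeps = {
  initTaskRunner: (opts: {
    numOfTasks: number;
    workerType: string;
    maxWorkerThreads?: number;
  }) => TaskRunner;
  getProcessConcurrency: () => number;
};

const createSecurityTaskRunner = (
  totalItems: number,
  deps: SecurityCheckDeps,
): TaskRunner => {
  // Same cap formula as the PR's first revision: half the cores, minimum 1.
  const cap = Math.max(1, Math.floor(deps.getProcessConcurrency() / 2));
  return deps.initTaskRunner({
    numOfTasks: totalItems,
    workerType: 'securityCheck',
    maxWorkerThreads: cap,
  });
};
```

With this shape, a test can assert the cap directly by passing `getProcessConcurrency: () => 4` and capturing the options given to `initTaskRunner`.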

Additional observations (non-blocking)
  • Test coverage is adequate — 3 new tests for getWorkerThreadCount cover the cap, task-limit precedence, and undefined cases. The maxWorkerThreads=0 edge case is implicitly handled by Math.max(minThreads, ...) ensuring at least 1 thread, but a dedicated test would document this guarantee.
  • Warmup thread count stays in sync — The warmup in createMetricsTaskRunner correctly uses getWorkerThreadCount(numOfTasks, maxMetricsWorkers) to match the actual pool size. Good attention to detail.
  • calculateMetrics fallback path — When deps.taskRunner is not provided in calculateMetrics() (line 85-89), the fallback initTaskRunner call does NOT use maxWorkerThreads. This is fine since it's only used when createMetricsTaskRunner is not called, but worth noting as a divergence.
  • Commit messages follow conventions — Both commits use proper type(scope): Description format.
  • All files remain under the 250-line limit.

Bot Comment Evaluation

No AI bot inline comments required priority judgment — CodeRabbit's review was skipped (auto-incremental disabled), and Gemini/Devin found no issues. CodeRabbit did leave a nitpick suggesting an early return for totalItems === 0 in securityCheck.ts — this is reasonable but low priority since empty input just produces an empty batch array and exits cleanly.


Reviewed with Claude Code

yamadashy and others added 2 commits April 6, 2026 01:16
… machines

CI benchmarks showed macOS (3 vCPU) regressed by ~6-7% with the
processConcurrency-1 cap, while Ubuntu/Windows (4 vCPU) improved.

On 3 cores, capping metrics from 3→2 workers is a 33% parallelism
reduction that outweighs contention savings. The main improvement
comes from capping security at 2 workers, which is sufficient since
the security pool uses coarse batches (50 files) and finishes quickly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mment

Address PR review feedback:
- Add early return when totalItems === 0 to skip worker pool init
- Restore comment explaining why numOfTasks uses totalItems not batches.length

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yamadashy
Owner Author

Thanks for the thorough review!

1. macOS regression — Addressed in 46a0a7b by removing the metrics worker cap (P-1). The security cap alone (min(2, P)) now shows consistent improvements across all platforms without macOS regression.

2. Removed numOfTasks comment — Restored in f49a5d0 as a brief inline note.

3. getProcessConcurrency not in deps — Noted. The module mock approach works for now, but could be revisited if testability becomes an issue.

Also added early return for totalItems === 0 (CodeRabbit nitpick) in the same commit. 🤖
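For reference, the final security cap mentioned above (min(2, P)) reduces to a one-liner. This is an illustrative sketch; the variable name is an assumption.

```typescript
// Final form after 46a0a7b/aff7aa9: the metrics pool is uncapped again,
// and the security pool is fixed at 2 workers, never exceeding core count.
const securityWorkerCap = (cores: number): number => Math.min(2, cores);
```

This avoids over-allocation on high-core machines (e.g. 8 security threads on 16 cores under the old floor(cores/2) formula) while keeping single-core machines at 1 worker.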

Move getProcessConcurrency from a direct module import to the deps
parameter for consistency with initTaskRunner. This makes it easier
to test with different concurrency values without module-level mocking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yamadashy
Owner Author

Addressed the remaining feedback:

getProcessConcurrency injected via deps — Moved from direct module import to the deps parameter in runSecurityCheck, consistent with how initTaskRunner is already injected. Tests updated to pass it explicitly. (7de6923) 🤖

@yamadashy yamadashy merged commit 9d5b928 into main Apr 5, 2026
59 checks passed
@yamadashy yamadashy deleted the perf/reduce-worker-thread-contention branch April 5, 2026 16:36