
perf(security): Batch security check tasks to reduce IPC overhead #1380

Merged
yamadashy merged 3 commits into main from perf/batch-security-check-tasks
Apr 3, 2026

@yamadashy
Owner

@yamadashy yamadashy commented Apr 3, 2026

Batch security check items into groups of 500 per worker task instead of sending individual files. This reduces worker thread IPC round-trips from ~990 to ~2 for a typical repository, significantly cutting thread creation and message-passing overhead.

Changes

  • Introduce SecurityCheckItem and batched SecurityCheckTask interfaces
  • Worker processes arrays of items per task instead of single files
  • Update unified worker task inference for new batched task structure
  • Update tests to reflect batched progress callback behavior
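
The batching described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: only the `SecurityCheckItem` and `SecurityCheckTask` names and the `items`, `filePath`, `content`, and `type` fields appear in this conversation, while `chunk`, `buildTasks`, the field types, and the sample data are assumptions made for the example.

```typescript
// Shapes named in the PR; the exact field types are assumptions.
interface SecurityCheckItem {
  filePath: string;
  content: string;
  type: string; // the real type union is not shown in this conversation
}

interface SecurityCheckTask {
  items: SecurityCheckItem[];
}

// Hypothetical helper: split a list into groups of at most `size` elements.
function chunk<T>(list: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const groups: T[][] = [];
  for (let start = 0; start < list.length; start += size) {
    groups.push(list.slice(start, start + size));
  }
  return groups;
}

// Build one worker task per batch instead of one task per file.
function buildTasks(items: readonly SecurityCheckItem[], batchSize: number): SecurityCheckTask[] {
  return chunk(items, batchSize).map((batch) => ({ items: batch }));
}

// Sample data: 990 placeholder items, matching the file count in the benchmark.
const items: SecurityCheckItem[] = Array.from({ length: 990 }, (_, i) => ({
  filePath: `src/file${i}.ts`,
  content: '',
  type: 'file',
}));

const tasks = buildTasks(items, 500);
console.log(tasks.map((task) => task.items.length)); // [ 500, 490 ]
```

With 990 items and a batch size of 500, the sketch produces two tasks, which is where the "~2 round-trips" figure in the description comes from.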

Benchmark (repomix on itself, 990 files, 4 CPU cores, 10 runs trimmed avg)

  • Before: 1507ms
  • After: 1373ms
  • Improvement: 134ms (8.9%)

Security check stage specifically:

  • Before (individual tasks): ~640ms for 990 IPC round-trips
  • After (batched): ~170ms for 2 batched round-trips

Checklist

  • Run npm run test
  • Run npm run lint


@github-actions
Contributor

github-actions bot commented Apr 3, 2026

⚡ Performance Benchmark

Latest commit: d30ad69 perf(security): Reduce batch size from 500 to 50 for better parallelism
Status: ✅ Benchmark complete!
Ubuntu: 1.57s (±0.02s) → 1.53s (±0.02s) · -0.04s (-2.8%)
macOS: 1.71s (±0.25s) → 1.71s (±0.28s) · +0.00s (+0.0%)
Windows: 1.88s (±0.03s) → 1.85s (±0.03s) · -0.04s (-1.9%)

Details

  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded), interleaved execution
  • Measurement: 20 runs / 30 on macOS (median ± IQR)

History

fbb925b fix(security): Add numOfTasks comment and fix test batch size references

Ubuntu: 1.49s (±0.02s) → 1.42s (±0.02s) · -0.07s (-4.8%)
macOS: 0.86s (±0.03s) → 0.84s (±0.03s) · -0.02s (-2.6%)
Windows: 1.92s (±0.04s) → 1.84s (±0.05s) · -0.07s (-3.9%)

1c99f96 perf(security): Batch security check tasks to reduce IPC overhead

Ubuntu: 1.53s (±0.02s) → 1.45s (±0.02s) · -0.08s (-5.2%)
macOS: 1.29s (±0.29s) → 1.28s (±0.25s) · -0.01s (-0.5%)
Windows: 1.93s (±0.03s) → 1.85s (±0.05s) · -0.08s (-4.0%)

@coderabbitai
Contributor

coderabbitai bot commented Apr 3, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 720dbb8c-0508-4f4a-b518-0ce0223302bd

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The security check system transitions from processing individual items to batch processing. SecurityCheckTask restructures to hold an items array instead of a single item, and the worker processes multiple items per invocation. Task detection logic updates to identify batched tasks by the items field presence, and progress reporting shifts to batch-level granularity.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Security check batching: src/core/security/securityCheck.ts, src/core/security/workers/securityCheckWorker.ts | Refactored to build SecurityCheckItem arrays and process them in batches via worker calls. SecurityCheckTask now wraps items array. Worker processes all items per batch and returns aggregated results. Added BATCH_SIZE constant and SecurityCheckType re-export. |
| Worker type detection: src/shared/unifiedWorker.ts | Updated securityCheck task detection to identify tasks by items field presence while excluding objects with encoding field, replacing prior detection based on filePath, content, and type fields. |
| Test updates: tests/core/security/securityCheck.test.ts, tests/shared/unifiedWorker.test.ts | Modified test expectations to reflect batch-level progress reporting (one callback per batch) and updated task shape assertions to match new SecurityCheckItem array structure in SecurityCheckTask. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • PR #309: Introduces the Piscina-based worker/task-runner model for security checks that this PR adapts for batch processing.
  • PR #307: Introduced worker-based security checking with single-file tasks; this PR directly modifies the SecurityCheckTask API to support batched items.
  • PR #235: Modifies runSecurityCheck logic in the same file, adding per-file logging and timing that interacts with the batching refactor.
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: batching security check tasks to reduce IPC overhead, which aligns with the core optimization described throughout the changeset. |
| Description check | ✅ Passed | The description includes all required template sections, provides detailed context about changes and benchmarks, and both checklist items are marked as completed. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@codecov

codecov bot commented Apr 3, 2026

Codecov Report

❌ Patch coverage is 93.93939% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.39%. Comparing base (a579381) to head (d30ad69).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/core/security/securityCheck.ts | 95.83% | 1 Missing ⚠️ |
| src/core/security/workers/securityCheckWorker.ts | 87.50% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1380      +/-   ##
==========================================
+ Coverage   87.37%   87.39%   +0.02%     
==========================================
  Files         115      115              
  Lines        4371     4378       +7     
  Branches     1015     1015              
==========================================
+ Hits         3819     3826       +7     
  Misses        552      552              


@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Apr 3, 2026

Deploying repomix with Cloudflare Pages

Latest commit: d30ad69
Status: ✅  Deploy successful!
Preview URL: https://b39f4d82.repomix.pages.dev
Branch Preview URL: https://perf-batch-security-check-ta.repomix.pages.dev

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces batching for security check tasks to reduce IPC overhead by grouping files, git diffs, and git logs into batches before processing them in worker threads. The implementation includes updates to the worker task structure, inference logic, and associated tests. A review comment suggests reducing the BATCH_SIZE from 500 to 100 to improve parallelism on multi-core systems and provide more granular progress updates for a better user experience.

@claude

This comment has been minimized.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
tests/shared/unifiedWorker.test.ts (1)

9-12: Consider updating mock return type to match batched response.

The mock returns null, but the actual worker now returns (SuspiciousFileResult | null)[]. While this doesn't break the current tests (which only verify the handler is called with correct args), it could cause issues if future tests rely on the return value.

💡 Suggested fix
 vi.mock('../../src/core/security/workers/securityCheckWorker.js', () => ({
-  default: vi.fn().mockResolvedValue(null),
+  default: vi.fn().mockResolvedValue([null]),
   onWorkerTermination: vi.fn(),
 }));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/shared/unifiedWorker.test.ts` around lines 9 - 12, The mock for the
securityCheckWorker's default export returns null but the real worker returns a
batched response type (array of SuspiciousFileResult | null); update the vi.mock
so the default mockResolvedValue returns an array (e.g., an empty array)
matching (SuspiciousFileResult | null)[] and, if needed in TypeScript, add the
appropriate type assertion/cast to (SuspiciousFileResult | null)[] to satisfy
typings; target the default export in securityCheckWorker.js in the existing
vi.mock call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/core/security/securityCheck.ts`:
- Around line 95-106: The progress can jump because completedItems is
incremented per-batch as they finish out-of-order; fix by tracking per-batch
completions and reporting the cumulative sum of finished batches so progress is
monotonic: in the batches.map callback (the block using batches.map,
taskRunner.run, completedItems, progressCallback, logger.trace) capture the
batch index, maintain an array like finishedCounts initialized to zeros, and
when a batch resolves set finishedCounts[index] = batch.length then compute
cumulativeCompleted = finishedCounts.reduce((s,n)=>s+n,0); use that
cumulativeCompleted for progressCallback and logger.trace (and assign
completedItems = cumulativeCompleted) so progress only increases as more batches
finish.
- Around line 77-81: The task runner is being initialized with numOfTasks:
totalItems which overstates concurrent work and oversizes the worker pool;
change the init call in the security check where
deps.initTaskRunner<SecurityCheckTask, (SuspiciousFileResult | null)[]> is
invoked to pass the actual number of batches (batches.length) instead of
totalItems so pool sizing (Math.ceil(numOfTasks / 100)) reflects real
concurrency; update the parameter near the taskRunner variable initialization to
use batches.length.

In `@tests/core/security/securityCheck.test.ts`:
- Around line 71-72: The test comments incorrectly state "batch size 100" while
the actual BATCH_SIZE constant in securityCheck.ts is 500; update the comment in
securityCheck.test.ts to say "batch size 500" and similarly correct the other
misleading comments in this test file (the ones describing batch behavior) so
they reference BATCH_SIZE = 500; look for references to BATCH_SIZE and the tests
around the progress callback assertions (in the same test suite) to find and
update each comment.
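
The cumulative-progress idea in the first inline comment could look roughly like the sketch below. It is an illustration under stated assumptions rather than the project's code: `runWithProgress`, `runBatch`, and the callback signature are hypothetical, and only the `finishedCounts` array and the reduce-based sum come from the comment itself.

```typescript
type ProgressCallback = (completed: number, total: number) => void;

// Run every batch concurrently and report cumulative progress as each one settles.
async function runWithProgress<T, R>(
  batches: readonly (readonly T[])[],
  runBatch: (batch: readonly T[], index: number) => Promise<R[]>,
  onProgress: ProgressCallback,
): Promise<R[]> {
  const total = batches.reduce((sum, batch) => sum + batch.length, 0);
  // finishedCounts[i] stays 0 until batch i resolves, then holds that batch's size.
  const finishedCounts: number[] = batches.map(() => 0);

  const perBatch = await Promise.all(
    batches.map(async (batch, index) => {
      const results = await runBatch(batch, index);
      finishedCounts[index] = batch.length;
      const completed = finishedCounts.reduce((sum, n) => sum + n, 0);
      onProgress(completed, total);
      return results;
    }),
  );

  // Promise.all keeps input order, so this is equivalent to perBatch.flat().
  const flattened: R[] = [];
  for (const results of perBatch) {
    flattened.push(...results);
  }
  return flattened;
}

// Usage: the second batch is released before the first to simulate out-of-order completion.
const seen: number[] = [];
const release: Array<() => void> = [];
const done = runWithProgress(
  [['a', 'b'], ['c']],
  (batch, index) =>
    new Promise<string[]>((resolve) => {
      release[index] = () => resolve(batch.map((s) => s.toUpperCase()));
    }),
  (completed) => {
    seen.push(completed);
  },
);
release[1]?.();
release[0]?.();
done.then((results) => console.log(seen, results)); // [ 1, 3 ] [ 'A', 'B', 'C' ]
```

Each batch contributes its size exactly once, so the reported total only grows regardless of the order in which batches finish, and the final results still follow the input order.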

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 08c885ad-e38b-4102-97a1-ada5f146e5b3

📥 Commits

Reviewing files that changed from the base of the PR and between a579381 and 1c99f96.

📒 Files selected for processing (5)
  • src/core/security/securityCheck.ts
  • src/core/security/workers/securityCheckWorker.ts
  • src/shared/unifiedWorker.ts
  • tests/core/security/securityCheck.test.ts
  • tests/shared/unifiedWorker.test.ts


@devin-ai-integration devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.


yamadashy and others added 2 commits April 4, 2026 00:23
fix(security): Add numOfTasks comment and fix test batch size references

- Add comment explaining why numOfTasks uses totalItems instead of
  batches.length (passing batches.length would yield maxThreads=1,
  forcing sequential execution)
- Fix test comments that incorrectly referenced batch size 100
  when actual BATCH_SIZE is 500

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
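
To see why passing `batches.length` would collapse the pool to one thread, here is a rough sketch of the sizing rule. The divisor of 100 comes from the review comment (`Math.ceil(numOfTasks / 100)`); the function name `estimateMaxThreads` and the cap at the CPU count are assumptions for illustration, not the project's actual implementation.

```typescript
// Hypothetical model of the pool sizing rule discussed in the review.
// TASKS_PER_THREAD = 100 is taken from the review text; capping at cpuCount is an assumption.
function estimateMaxThreads(numOfTasks: number, cpuCount: number): number {
  const TASKS_PER_THREAD = 100;
  const wanted = Math.ceil(numOfTasks / TASKS_PER_THREAD);
  return Math.max(1, Math.min(cpuCount, wanted));
}

// 990 files at batch size 500 gives 2 batches.
console.log(estimateMaxThreads(2, 4)); // 1 (batches.length: a single thread runs both batches)
console.log(estimateMaxThreads(990, 4)); // 4 (totalItems: all four cores can be used)
```

Under this model, sizing by item count keeps the pool as large as it was before batching, while sizing by batch count would serialize the work.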

perf(security): Reduce batch size from 500 to 50 for better parallelism

A batch size of 50 still reduces IPC round-trips by ~98% (990 → 20)
while producing enough batches to utilize all available CPU cores
on multi-core systems.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
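
The round-trip figures quoted in this thread follow from simple ceiling division. The helper names below are made up for the example; the 990-file count and the batch sizes of 500 and 50 are the ones discussed above.

```typescript
// Number of worker round-trips needed for `itemCount` items at a given batch size.
function roundTrips(itemCount: number, batchSize: number): number {
  return Math.ceil(itemCount / batchSize);
}

// Percentage reduction relative to the unbatched count.
function reductionPercent(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

const fileCount = 990;
for (const batchSize of [1, 50, 500]) {
  const trips = roundTrips(fileCount, batchSize);
  console.log(`batch=${batchSize} trips=${trips} reduction=${reductionPercent(fileCount, trips).toFixed(1)}%`);
}
// batch=1 trips=990 reduction=0.0%
// batch=50 trips=20 reduction=98.0%
// batch=500 trips=2 reduction=99.8%
```

Dropping from 500 to 50 gives up under two percentage points of round-trip reduction in exchange for ten times as many batches to spread across cores.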
@claude
Contributor

claude bot commented Apr 3, 2026

Code Review (Update)

Previous review feedback on numOfTasks and test comment batch sizes has been addressed — thank you! The explanatory comment on pool sizing is clear and the BATCH_SIZE=50 is a good choice for parallelism.

One new finding worth addressing before merge:

Partial batch failure silently skips remaining items (correctness)

In securityCheckWorker.ts, the for...of loop processes items sequentially inside a single try/catch. If runSecretLint throws on item N, items N+1 through the end of that batch are never security-checked — and the error propagates up, potentially aborting the entire security scan.

Previously each file was its own task, so a failure affected only one file. Batching increases the blast radius to up to 50 files per failure. For a security-critical code path, per-item error isolation would be safer:

for (const item of task.items) {
  try {
    results.push(await runSecretLint(item.filePath, item.content, item.type, config));
  } catch (itemError) {
    logger.error(`Error checking security on ${item.filePath}:`, itemError);
    results.push(null); // or flag as suspicious
  }
}

Additional observations

Task inference — prefer positive check: The 'items' in taskObj && !('encoding' in taskObj) heuristic in unifiedWorker.ts:89 uses a negative discriminant. Adding Array.isArray(taskObj.items) would make the check positive-space and more self-documenting, reducing misroute risk if future task types also use an items field.
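
A type guard along the lines suggested here might look like the following sketch. The guard name `isSecurityCheckTask` and the item field types are assumptions; the `items` and `encoding` checks mirror the heuristic quoted above, with `Array.isArray` added as the positive condition.

```typescript
// Shapes named in the PR; the exact field types are assumptions.
interface SecurityCheckItem {
  filePath: string;
  content: string;
  type: string;
}

interface SecurityCheckTask {
  items: SecurityCheckItem[];
}

// Hypothetical guard: require `items` to be an array, and keep the existing
// exclusion of objects that carry an `encoding` field.
function isSecurityCheckTask(task: unknown): task is SecurityCheckTask {
  if (typeof task !== 'object' || task === null) {
    return false;
  }
  const taskObj = task as Record<string, unknown>;
  return Array.isArray(taskObj['items']) && !('encoding' in taskObj);
}

console.log(isSecurityCheckTask({ items: [] })); // true
console.log(isSecurityCheckTask({ items: 'not-an-array' })); // false
console.log(isSecurityCheckTask({ items: [], encoding: 'x' })); // false
console.log(isSecurityCheckTask(null)); // false
```

The guard does not validate the individual items; checking each element's fields would be a stricter variant at the cost of an extra pass per task.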

Missing test coverage for multi-batch path: All tests use ≤4 items with BATCH_SIZE=50, so the batching loop, cross-batch Promise.all, progressive counter, and .flat() concatenation remain untested. Consider either injecting BATCH_SIZE via the deps parameter or adding a test with >50 mock items.
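
One way to exercise the multi-batch path without 50+ fixtures is to make the batch size injectable, as suggested. The sketch below is self-contained and uses hypothetical names throughout (`Deps`, `checkAll`, `runBatch`); it does not reflect the real `runSecurityCheck` signature, which is not shown in this conversation.

```typescript
// Hypothetical dependency bag: the batch size and the worker call are both injectable.
interface Deps {
  batchSize: number;
  runBatch: (items: readonly string[]) => Promise<(string | null)[]>;
}

async function checkAll(items: readonly string[], deps: Deps): Promise<(string | null)[]> {
  const batches: string[][] = [];
  for (let start = 0; start < items.length; start += deps.batchSize) {
    batches.push(items.slice(start, start + deps.batchSize));
  }
  const perBatch = await Promise.all(batches.map((batch) => deps.runBatch(batch)));
  // Concatenate per-batch results in input order (equivalent to perBatch.flat()).
  const flattened: (string | null)[] = [];
  for (const results of perBatch) {
    flattened.push(...results);
  }
  return flattened;
}

// Test-style usage: a batch size of 2 forces three batches from five items.
const calls: number[] = [];
const fakeDeps: Deps = {
  batchSize: 2,
  runBatch: async (batch) => {
    calls.push(batch.length);
    // Fake check: flag any item whose name contains "secret".
    return batch.map((item) => (item.includes('secret') ? item : null));
  },
};

const outcome = checkAll(['a', 'secret-b', 'c', 'd', 'e'], fakeDeps);
outcome.then((results) => console.log(calls, results));
// [ 2, 2, 1 ] [ null, 'secret-b', null, null, null ]
```

With a small injected batch size, the chunking loop, the cross-batch `Promise.all`, and the result concatenation are all covered by a handful of items.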

gitLogResult path untested: Every test passes undefined for gitLogResult. There's no test covering a populated git log being included in the security scan.

unifiedWorker.test.ts mock return type mismatch: Line 10 mocks securityCheckWorker returning null, but the real worker now returns (SuspiciousFileResult | null)[]. Should be mockResolvedValue([null]) to match the new contract.


Overall, the architecture is sound and the benchmark results are compelling. The per-item error isolation in the worker is the main item I'd recommend addressing before merge — everything else is minor.

🤖 Generated with Claude Code

@yamadashy yamadashy merged commit 2a16ede into main Apr 3, 2026
83 checks passed
@yamadashy yamadashy deleted the perf/batch-security-check-tasks branch April 3, 2026 15:56
yamadashy pushed a commit that referenced this pull request Apr 4, 2026
Batch file token counting tasks into groups of 50 before dispatching
to worker threads, reducing IPC round-trips by ~95% (e.g. 990 → 20
for a typical repo). This follows the same batching pattern already
applied to security checks in #1380.

Changes:
- Redesign metrics worker to accept batch tasks (TokenCountBatchTask)
  instead of individual items, processing multiple files per IPC
  round-trip
- Update calculateSelectiveFileMetrics to group files into batches
  of 50 before dispatching to worker pool
- Adapt all metrics callers (output, git diff, git log) to use the
  new batch interface
- Update unified worker task inference to distinguish batched metrics
  tasks (items with encoding) from security check tasks (items with
  type)

Benchmark (repomix on its own repo, 990 files, 4 CPU cores):
  Before: 1246ms avg pack time
  After:  1092ms avg pack time
  Improvement: ~155ms savings (12.4%)

The improvement comes from eliminating per-file IPC message passing
overhead. Each round-trip involves structured clone serialization of
file content, message dispatch, and result deserialization. Batching
amortizes this cost across 50 files per message.

https://claude.ai/code/session_019tAfah6yMKnauVTNgK3wyQ