perf(security): Batch security check tasks to reduce IPC overhead#1380
perf(security): Batch security check tasks to reduce IPC overhead#1380
Conversation
⚡ Performance Benchmark
Details
Historyfbb925b fix(security): Add numOfTasks comment and fix test batch size references
1c99f96 perf(security): Batch security check tasks to reduce IPC overhead
|
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe security check system transitions from processing individual items to batch processing. SecurityCheckTask restructures to hold an items array instead of a single item, and the worker processes multiple items per invocation. Task detection logic updates to identify batched tasks by the items field presence, and progress reporting shifts to batch-level granularity. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1380 +/- ##
==========================================
+ Coverage 87.37% 87.39% +0.02%
==========================================
Files 115 115
Lines 4371 4378 +7
Branches 1015 1015
==========================================
+ Hits 3819 3826 +7
Misses 552 552 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Deploying repomix with
|
| Latest commit: |
d30ad69
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://b39f4d82.repomix.pages.dev |
| Branch Preview URL: | https://perf-batch-security-check-ta.repomix.pages.dev |
There was a problem hiding this comment.
Code Review
This pull request introduces batching for security check tasks to reduce IPC overhead by grouping files, git diffs, and git logs into batches before processing them in worker threads. The implementation includes updates to the worker task structure, inference logic, and associated tests. A review comment suggests reducing the BATCH_SIZE from 500 to 100 to improve parallelism on multi-core systems and provide more granular progress updates for a better user experience.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
tests/shared/unifiedWorker.test.ts (1)
9-12: Consider updating mock return type to match batched response.The mock returns
null, but the actual worker now returns(SuspiciousFileResult | null)[]. While this doesn't break the current tests (which only verify the handler is called with correct args), it could cause issues if future tests rely on the return value.💡 Suggested fix
vi.mock('../../src/core/security/workers/securityCheckWorker.js', () => ({ - default: vi.fn().mockResolvedValue(null), + default: vi.fn().mockResolvedValue([null]), onWorkerTermination: vi.fn(), }));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/shared/unifiedWorker.test.ts` around lines 9 - 12, The mock for the securityCheckWorker's default export returns null but the real worker returns a batched response type (array of SuspiciousFileResult | null); update the vi.mock so the default mockResolvedValue returns an array (e.g., an empty array) matching (SuspiciousFileResult | null)[] and, if needed in TypeScript, add the appropriate type assertion/cast to (SuspiciousFileResult | null)[] to satisfy typings; target the default export in securityCheckWorker.js in the existing vi.mock call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/security/securityCheck.ts`:
- Around line 95-106: The progress can jump because completedItems is
incremented per-batch as they finish out-of-order; fix by tracking per-batch
completions and reporting the cumulative sum of finished batches so progress is
monotonic: in the batches.map callback (the block using batches.map,
taskRunner.run, completedItems, progressCallback, logger.trace) capture the
batch index, maintain an array like finishedCounts initialized to zeros, and
when a batch resolves set finishedCounts[index] = batch.length then compute
cumulativeCompleted = finishedCounts.reduce((s,n)=>s+n,0); use that
cumulativeCompleted for progressCallback and logger.trace (and assign
completedItems = cumulativeCompleted) so progress only increases as more batches
finish.
- Around line 77-81: The task runner is being initialized with numOfTasks:
totalItems which overstates concurrent work and oversizes the worker pool;
change the init call in the security check where
deps.initTaskRunner<SecurityCheckTask, (SuspiciousFileResult | null)[]> is
invoked to pass the actual number of batches (batches.length) instead of
totalItems so pool sizing (Math.ceil(numOfTasks / 100)) reflects real
concurrency; update the parameter near the taskRunner variable initialization to
use batches.length.
In `@tests/core/security/securityCheck.test.ts`:
- Around line 71-72: The test comments incorrectly state "batch size 100" while
the actual BATCH_SIZE constant in securityCheck.ts is 500; update the comment in
securityCheck.test.ts to say "batch size 500" and similarly correct the other
misleading comments in this test file (the ones describing batch behavior) so
they reference BATCH_SIZE = 500; look for references to BATCH_SIZE and the tests
around the progress callback assertions (in the same test suite) to find and
update each comment.
---
Nitpick comments:
In `@tests/shared/unifiedWorker.test.ts`:
- Around line 9-12: The mock for the securityCheckWorker's default export
returns null but the real worker returns a batched response type (array of
SuspiciousFileResult | null); update the vi.mock so the default
mockResolvedValue returns an array (e.g., an empty array) matching
(SuspiciousFileResult | null)[] and, if needed in TypeScript, add the
appropriate type assertion/cast to (SuspiciousFileResult | null)[] to satisfy
typings; target the default export in securityCheckWorker.js in the existing
vi.mock call.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 08c885ad-e38b-4102-97a1-ada5f146e5b3
📒 Files selected for processing (5)
src/core/security/securityCheck.tssrc/core/security/workers/securityCheckWorker.tssrc/shared/unifiedWorker.tstests/core/security/securityCheck.test.tstests/shared/unifiedWorker.test.ts
- Add comment explaining why numOfTasks uses totalItems instead of batches.length (passing batches.length would yield maxThreads=1, forcing sequential execution) - Fix test comments that incorrectly referenced batch size 100 when actual BATCH_SIZE is 500 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A batch size of 50 still reduces IPC round-trips by ~98% (990 → 20) while producing enough batches to utilize all available CPU cores on multi-core systems. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code Review (Update)Previous review feedback on One new finding worth addressing before merge: Partial batch failure silently skips remaining items (correctness)In Previously each file was its own task, so a failure affected only one file. Batching increases the blast radius to up to 50 files per failure. For a security-critical code path, per-item error isolation would be safer: for (const item of task.items) {
try {
results.push(await runSecretLint(item.filePath, item.content, item.type, config));
} catch (itemError) {
logger.error(`Error checking security on ${item.filePath}:`, itemError);
results.push(null); // or flag as suspicious
}
}Additional observationsTask inference — prefer positive check: The Missing test coverage for multi-batch path: All tests use ≤4 items with
Overall, the architecture is sound and the benchmark results are compelling. The per-item error isolation in the worker is the main item I'd recommend addressing before merge — everything else is minor. 🤖 Generated with Claude Code |
Batch file token counting tasks into groups of 50 before dispatching to worker threads, reducing IPC round-trips by ~95% (e.g. 990 → 20 for a typical repo). This follows the same batching pattern already applied to security checks in #1380. Changes: - Redesign metrics worker to accept batch tasks (TokenCountBatchTask) instead of individual items, processing multiple files per IPC round-trip - Update calculateSelectiveFileMetrics to group files into batches of 50 before dispatching to worker pool - Adapt all metrics callers (output, git diff, git log) to use the new batch interface - Update unified worker task inference to distinguish batched metrics tasks (items with encoding) from security check tasks (items with type) Benchmark (repomix on its own repo, 990 files, 4 CPU cores): Before: 1246ms avg pack time After: 1092ms avg pack time Improvement: ~155ms savings (12.4%) The improvement comes from eliminating per-file IPC message passing overhead. Each round-trip involves structured clone serialization of file content, message dispatch, and result deserialization. Batching amortizes this cost across 50 files per message. https://claude.ai/code/session_019tAfah6yMKnauVTNgK3wyQ
Batch security check items into groups of 500 per worker task instead of sending individual files. This reduces worker thread IPC round-trips from ~990 to ~2 for a typical repository, significantly cutting thread creation and message-passing overhead.
Changes
SecurityCheckItemand batchedSecurityCheckTaskinterfacesBenchmark (repomix on itself, 990 files, 4 CPU cores, 10 runs trimmed avg)
Security check stage specifically:
Checklist
npm run testnpm run lint