perf(core): Cache empty directory paths to avoid redundant file search#1356
perf(core): Cache empty directory paths to avoid redundant file search#1356
Conversation
⚡ Performance Benchmark
Details
History76b73c6 fix(core): Add PermissionError handling to directory globby and test emptyDirPaths cache
1766de9 perf(core): Cache empty directory paths and parallelize directory globby
bb73580 perf(core): Cache empty directory paths and parallelize directory globby
|
📝 WalkthroughWalkthroughThis PR refactors Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1356 +/- ##
==========================================
+ Coverage 86.96% 87.05% +0.09%
==========================================
Files 116 116
Lines 4425 4433 +8
Branches 1025 1029 +4
==========================================
+ Hits 3848 3859 +11
+ Misses 577 574 -3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Deploying repomix with
|
| Latest commit: |
96a6a7c
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://138288ed.repomix.pages.dev |
| Branch Preview URL: | https://perf-cache-empty-dir-paths.repomix.pages.dev |
There was a problem hiding this comment.
Code Review
This pull request optimizes the file and directory search process by executing globbing operations in parallel and caching empty directory paths for reuse during output generation. By passing pre-computed directory paths through the packaging pipeline, the implementation avoids redundant file system scans when building the directory structure. I have no feedback to provide.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/core/packager/produceOutput.test.ts (1)
125-140: Please exercise a realemptyDirPathsvalue in the split-output test.Line 134 still passes
undefined, and this test never inspectsmockDeps.generateOutput.mock.calls, so it would miss a regression in the new empty-directory threading. Seed a non-emptyemptyDirPathsarray here and assert every splitgenerateOutputcall receives it unchanged.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/core/packager/produceOutput.test.ts` around lines 125 - 140, The test calls produceOutput with undefined for the emptyDirPaths parameter, so it won't catch regressions in passing empty-directory info to split outputs; update the produceOutput call in produceOutput.test.ts to pass a non-empty array (e.g., ['path/to/emptyDir']) as the emptyDirPaths argument, then add assertions that mockDeps.generateOutput was called for each split and that each call received the same emptyDirPaths array (inspect mockDeps.generateOutput.mock.calls and verify the argument matching emptyDirPaths) to ensure empty-directory threading is preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/file/fileSearch.ts`:
- Around line 209-213: The directory scan promise (dirPromise) doesn't convert
EACCES/EPERM into the PermissionError like filePromise does, so
enableEmptyDirectories permission failures are treated as generic errors; update
the dirPromise creation in fileSearch.ts to mirror the filePromise error
mapping: wrap the globby call used for dirPromise with the same catch/transform
logic that checks for error.code === 'EACCES' || 'EPERM' and throws
PermissionError (same class/constructor used by filePromise), then await
Promise.all([filePromise, dirPromise]) as before so both paths yield consistent
PermissionError behavior.
---
Nitpick comments:
In `@tests/core/packager/produceOutput.test.ts`:
- Around line 125-140: The test calls produceOutput with undefined for the
emptyDirPaths parameter, so it won't catch regressions in passing
empty-directory info to split outputs; update the produceOutput call in
produceOutput.test.ts to pass a non-empty array (e.g., ['path/to/emptyDir']) as
the emptyDirPaths argument, then add assertions that mockDeps.generateOutput was
called for each split and that each call received the same emptyDirPaths array
(inspect mockDeps.generateOutput.mock.calls and verify the argument matching
emptyDirPaths) to ensure empty-directory threading is preserved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 840f9d8a-54dc-4cf7-826c-0a0c2a6258f6
📒 Files selected for processing (12)
src/core/file/fileSearch.tssrc/core/output/outputGenerate.tssrc/core/output/outputSplit.tssrc/core/packager.tssrc/core/packager/produceOutput.tstests/core/output/diffsInOutput.test.tstests/core/output/flagFullDirectoryStructure.test.tstests/core/output/outputGenerate.test.tstests/core/output/outputGenerateDiffs.test.tstests/core/packager.test.tstests/core/packager/produceOutput.test.tstests/core/packager/splitOutput.test.ts
bb73580 to
1766de9
Compare
Code ReviewThe optimization is well-motivated and correctly implemented — eliminating the redundant 1. Function signature bloat (positional parameters)
interface OutputGenerationHints {
filePathsByRoot?: FilesByRoot[];
emptyDirPaths?: string[];
}This would reduce arity, improve call-site readability, and provide a natural home for future pre-computed data without further signature growth. Could be a follow-up. 2. Test coverage for the new behaviorDetailsAll test changes are mechanical
Suggested additions:
3.
|
When includeEmptyDirectories is enabled, buildOutputGeneratorContext called searchFiles a second time just to obtain emptyDirPaths, despite these already being computed during the initial file search in packager. Changes: - Capture emptyDirPaths from the initial searchFiles result in packager and thread them through the pipeline (packager → produceOutput → generateOutput/outputSplit → buildOutputGeneratorContext) - Guard emptyDirPaths processing with includeEmptyDirectories check to skip unnecessary work when the feature is disabled - Fix split output path which was not receiving emptyDirPaths despite the parameter being declared in produceOutput's signature - Add tests for cache hit (searchFiles not called) and fallback paths Local benchmark (repomix on itself, includeEmptyDirectories: true): main: 696.6ms ± 4.2ms branch: 637.1ms ± 2.6ms Improvement: ~60ms (~8.5%) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
76b73c6 to
96a6a7c
Compare
When
includeEmptyDirectoriesis enabled,buildOutputGeneratorContextcalledsearchFilesa second time just to obtainemptyDirPaths— despite these already being computed during the initial file search inpackager. This PR eliminates the redundant search by threading the pre-computed paths through the output pipeline.Changes
emptyDirPathsfrom initial search: Thread pre-computed paths through the pipeline (packager→produceOutput→generateOutput/outputSplit→buildOutputGeneratorContext), skipping the redundantsearchFilescallincludeEmptyDirectoriescheck: SkipemptyDirPathsdedup/sort when the feature is disabled (the common case), avoiding unnecessary allocationsemptyDirPathswas not being passed through togenerateSplitOutputParts— now it isPipeline flow
Benchmark
Local benchmark (repomix on itself,
includeEmptyDirectories: true, 10 runs):Note: CI benchmark shows no change because the improvement only applies when
includeEmptyDirectoriesis enabled.Checklist
npm run test(1100 tests pass)npm run lint(clean)