perf(core): Reuse emptyDirPaths from initial searchFiles to eliminate redundant filesystem scan #1272
… redundant filesystem scan

When `includeEmptyDirectories` is enabled, `buildOutputGeneratorContext` was calling `searchFiles` a second time solely to retrieve `emptyDirPaths` that had already been computed (and discarded) by the initial `searchFiles` call in `pack()`.

This change preserves `emptyDirPaths` from the initial search and threads it through the output generation pipeline (`pack` → `produceOutput` → `generateOutput` → `buildOutputGeneratorContext`), eliminating the redundant filesystem scan.

Benchmark results (5-run average on Repomix's own codebase, 963 files):
- produceOutput: 360ms → 52ms (-86%)
- Total pipeline: 3106ms → 2660ms (-14%)

A fallback path is retained for direct callers (e.g., `packSkill`) that do not provide the pre-computed `emptyDirPaths`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
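The reuse-with-fallback pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration, not Repomix's actual code: `SearchResult`, `SearchFn`, `resolveEmptyDirPaths`, and `packSketch` are hypothetical names, the search function is injected so scans can be counted, and everything is kept synchronous for brevity.

```typescript
// Hypothetical, simplified shapes; not Repomix's real types or signatures.
interface SearchResult {
  filePaths: string[];
  emptyDirPaths: string[];
}

type SearchFn = (rootDir: string) => SearchResult;

// Stand-in for the logic inside buildOutputGeneratorContext:
// reuse pre-computed paths when the caller supplies them,
// otherwise fall back to scanning the root directories again.
function resolveEmptyDirPaths(
  rootDirs: string[],
  search: SearchFn,
  emptyDirPaths?: string[],
): string[] {
  if (emptyDirPaths !== undefined) {
    return emptyDirPaths; // no additional filesystem scan
  }
  // Fallback for direct callers that did not pass emptyDirPaths.
  return rootDirs.flatMap((dir) => search(dir).emptyDirPaths);
}

// Stand-in for pack(): search each root once, keep emptyDirPaths,
// and hand them to the output stage instead of discarding them.
function packSketch(rootDirs: string[], search: SearchFn): string[] {
  const results = rootDirs.map((dir) => search(dir));
  const allEmptyDirPaths = results.flatMap((r) => r.emptyDirPaths);
  return resolveEmptyDirPaths(rootDirs, search, allEmptyDirPaths);
}
```

With this shape, a caller going through `packSketch` triggers one scan per root directory, while a caller that omits the third argument still gets correct results at the cost of a second scan.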
Replace the reduce-based merge that unnecessarily accumulated filePaths (immediately discarded) with a cleaner flatMap approach, consistent with the allEmptyDirPaths computation in packager.ts. Addresses Gemini review feedback on PR #1244. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
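To illustrate the difference, here is a rough sketch of the two merge styles. The "before" version is a reconstruction based only on this commit message (a reduce that also accumulates `filePaths`), not a copy of the removed code; `RootSearchResult`, `mergeWithReduce`, and `mergeWithFlatMap` are hypothetical names.

```typescript
// Hypothetical per-root search result shape, for illustration only.
interface RootSearchResult {
  filePaths: string[];
  emptyDirPaths: string[];
}

// Reduce-based merge: builds up filePaths as well,
// even though only emptyDirPaths is returned.
function mergeWithReduce(results: RootSearchResult[]): string[] {
  const merged = results.reduce<RootSearchResult>(
    (acc, r) => ({
      filePaths: [...acc.filePaths, ...r.filePaths],
      emptyDirPaths: [...acc.emptyDirPaths, ...r.emptyDirPaths],
    }),
    { filePaths: [], emptyDirPaths: [] },
  );
  return merged.emptyDirPaths;
}

// flatMap-based merge: collects only the field that is needed.
function mergeWithFlatMap(results: RootSearchResult[]): string[] {
  return results.flatMap((r) => r.emptyDirPaths);
}
```

Both functions return the same list in the same order; the `flatMap` version simply avoids building an array that is thrown away.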
📝 Walkthrough

This PR threads an optional …
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves performance by optimizing how empty directory paths are collected and utilized. Instead of performing a redundant filesystem scan during output generation, the system now reuses the …
Code Review
This pull request effectively improves performance by reusing emptyDirPaths from the initial searchFiles call, which successfully eliminates a redundant filesystem scan. The implementation correctly threads the emptyDirPaths parameter through various functions, from packager down to outputGenerate. Additionally, the fallback logic for generating these paths has been refactored to use flatMap, which is a nice simplification over the previous reduce implementation, improving code clarity. The associated tests have been updated to accommodate these signature changes. The changes are well-executed and represent a good optimization.
Code Review

Overall: Clean, well-scoped performance optimization. The approach of threading …
Observations

1. Growing positional parameter lists

Functions like … This PR isn't the right place to fix it, but a follow-up to consolidate these into an options object would improve maintainability. For example:

interface GenerateOutputOptions {
rootDirs: string[];
config: RepomixConfigMerged;
processedFiles: ProcessedFile[];
allFilePaths: string[];
gitDiffResult?: GitDiffResult;
gitLogResult?: GitLogResult;
filePathsByRoot?: FilesByRoot[];
emptyDirPaths?: string[];
}

2. packSkill doesn't benefit from the optimization
3. Minor: early computation of emptyDirPaths

In …
The cost is negligible (a …

Verdict

Looks good to merge. No bugs or correctness issues found. The change is well-contained and the fallback ensures backward compatibility for all callers.

🤖 Generated with Claude Code
Summary
- Reuse `emptyDirPaths` already collected during the initial `searchFiles` call in the packager, threading them through `produceOutput` → `generateOutput` → `buildOutputGeneratorContext`
- Remove the redundant `searchFiles` call in `buildOutputGeneratorContext` that was re-scanning the filesystem solely to obtain empty directory paths
- Use `flatMap` instead of manual `reduce` in the fallback path

This removes an unnecessary filesystem traversal that duplicated work already done during the file collection phase.
Checklist
- `npm run test`
- `npm run lint`