perf(core): Cherry-pick low-risk micro-optimizations from #1295#1299
perf(core): Cherry-pick low-risk micro-optimizations from #1295#1299
Conversation
Move regex patterns from inside function bodies to module-level constants to avoid repeated compilation on every file processed. For a repo with 1000 files, this eliminates 7000 regex compilations per run. - Hoist dataUriPattern, standaloneBase64Pattern to module scope - Hoist base64ValidCharsPattern, hasNumbers/UpperCase/LowerCase/SpecialChars - Add lastIndex reset for global-flag regexes before each use Cherry-picked optimization from PR #1295 (3/3 reviewer consensus). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace O(n²) string concatenation with O(n) array accumulation pattern in treeToString and treeToStringWithLineCounts. For repos with 1000+ files, the old code copied the entire accumulated string on each append, while the new code pushes fragments and joins once at the end. - Extract treeToStringInner/treeToStringWithLineCountsInner helpers - Move sortTreeNodes call into generateFileTree for single sort at build time - Retain sort guard in treeToString/_isRoot for direct callers Cherry-picked optimization from PR #1295 (3/3 reviewer consensus). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ntrustedFiles Replace O(n*m) Array.some() linear scan with Set.has() for O(n+m) filtering. Pre-builds a Set of suspicious file paths for constant-time lookups during the filter pass. Cherry-picked optimization from PR #1295 (3/3 reviewer consensus). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace three separate .filter() passes over security results with a single for-of loop using switch statement. Also skip filterOutUntrustedFiles entirely when no suspicious files are found (the common ~99% case). - Change let to const for result arrays (populated via push) - Short-circuit avoids Set construction + filter over all raw files Cherry-picked optimization from PR #1295 (2/3 reviewer consensus). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace (content.match(/\n/g) || []).length with an indexOf loop in calculateFileLineCounts. The regex approach allocates an array of all newline matches; the indexOf loop uses zero allocations. For 500 files with 10,000 lines each, this eliminates 5 million tiny string allocations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Avoid assignment in while condition per Biome noAssignInExpressions rule. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces several low-risk, high-consensus micro-optimizations aimed at improving the performance of core functionalities. These changes, independently evaluated by review agents, focus on reducing computational complexity in string operations, regex handling, and data filtering, leading to more efficient code execution. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
⚡ Performance Benchmark
Details
History
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1299 +/- ##
==========================================
- Coverage 87.18% 87.05% -0.13%
==========================================
Files 115 115
Lines 4331 4373 +42
Branches 1007 1013 +6
==========================================
+ Hits 3776 3807 +31
- Misses 555 566 +11 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Deploying repomix with
|
| Latest commit: |
0cc1f78
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://f4258b7d.repomix.pages.dev |
| Branch Preview URL: | https://perf-cherry-pick-optimizatio.repomix.pages.dev |
There was a problem hiding this comment.
Code Review
This pull request focuses on performance optimizations and code refactoring across several core utilities. Key changes include optimizing file tree generation by sorting once at build time and refactoring string conversion functions for efficiency. Regex patterns in truncateBase64.ts are now hoisted to module scope to prevent recompilation, and file filtering in filterOutUntrustedFiles.ts is improved using a Set for faster lookups. Additionally, validateFileSafety.ts now partitions security check results in a single pass and includes a short-circuit for filtering. A review comment highlights a potential redundancy in fileTreeGenerate.ts, noting that sortTreeNodes is called again within treeToString and treeToStringWithLineCounts despite the tree already being sorted during generation, suggesting this leads to unnecessary double sorting and makes the _isRoot parameter redundant.
| export const treeToString = (node: TreeNode, prefix = '', _isRoot = true): string => { | ||
| if (_isRoot) { | ||
| sortTreeNodes(node); | ||
| } |
There was a problem hiding this comment.
The sortTreeNodes(node) call within treeToString is now redundant. The pull request description states that sortTreeNodes was moved to generateFileTree to sort once at build time. Since generateFileTree already sorts the tree, calling sortTreeNodes again here leads to unnecessary double sorting. The _isRoot parameter also becomes unnecessary.
export const treeToString = (node: TreeNode, prefix = ''): string => {
const parts: string[] = [];
treeToStringInner(node, prefix, parts);
return parts.join('');
};Convert static imports of initAction, mcpAction, remoteAction, and versionAction to dynamic import() at their use sites. The default pack path (95%+ of invocations) now avoids loading MCP server, git clone, and init action module trees entirely. Also inline isExplicitRemoteUrl prefix check to avoid loading git-url-parse module for non-remote runs. PR #1295 reports -66% module import time (358ms → 123ms). Cherry-picked optimization (4/5 reviewer consensus). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove log-update dependency (and its wrap-ansi → string-width chain, ~49ms module load) in favor of direct process.stderr.write with ANSI \x1B[2K\r for single-line in-place updates. The spinner only ever writes single lines, so log-update's multi-line and terminal-width handling was unnecessary overhead. Cherry-picked optimization from PR #1295 (4/5 reviewer consensus). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…arse Add Promise-based Map cache to isGitRepository() keyed by directory. When getGitDiffs and getGitLogs run concurrently, both call isGitRepository on the same directory — the cache ensures only one git rev-parse process is spawned instead of multiple. Cache is bypassed when custom deps are provided (test mocks). Cherry-picked optimization from PR #1295 (4/5 reviewer consensus). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove file content from the worker→main process IPC response since the main process only uses processedFiles[].path for the token count tree reporter. For a typical repo with 1000 files averaging 4KB each, this avoids ~4MB of structured clone serialization. Cherry-picked optimization from PR #1295 (4/5 reviewer consensus). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wrap sequential getGitDiffs() and getGitLogs() calls in Promise.all() since both are independent git subprocess operations. Saves the duration of the shorter call (~5-20ms) by overlapping their I/O. Cherry-picked optimization from PR #1295 (3/5 reviewer consensus). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cherry-picks 5 low-risk, high-consensus micro-optimizations from #1295 (automated performance tuning PR).
Each optimization was independently evaluated by 3 review agents. Only changes where 2+ agents agreed were included. All dependency swaps, architectural changes, and accuracy trade-offs were excluded.
Changes
Hoist regex patterns to module scope (
truncateBase64.ts) — 3/3 consensusArray push+join instead of string += (
fileTreeGenerate.ts) — 3/3 consensustreeToString/treeToStringWithLineCountssortTreeNodestogenerateFileTreeto sort once at build timeSet-based lookup for suspicious files (
filterOutUntrustedFiles.ts) — 3/3 consensusArray.some()withSet.has()for O(n+m) filteringSingle-pass result partitioning (
validateFileSafety.ts) — 2/3 consensus.filter()passes with singlefor-of+switchfilterOutUntrustedFileswhen no suspicious files found (~99% of runs)indexOf newline counting (
outputGenerate.ts) — no agent objected(content.match(/\n/g) || []).lengthwith zero-allocationindexOfloopWhat was NOT included (and why)
localeCompare→</>in sortChecklist
npm run testnpm run lint