
perf(core): Cherry-pick low-risk micro-optimizations from #1295 #1299

Closed
yamadashy wants to merge 11 commits into main from perf/cherry-pick-optimizations

@yamadashy
Owner

Cherry-picks 5 low-risk, high-consensus micro-optimizations from #1295 (automated performance tuning PR).

Each optimization was independently evaluated by 3 review agents. Only changes where 2+ agents agreed were included. All dependency swaps, architectural changes, and accuracy trade-offs were excluded.

Changes

  1. Hoist regex patterns to module scope (truncateBase64.ts) — 3/3 consensus

    • Move 7 regex patterns from inside functions to module-level constants
    • Eliminates repeated regex compilation per file processed
  2. Array push+join instead of string += (fileTreeGenerate.ts) — 3/3 consensus

    • Replace O(n²) string concatenation with O(n) array accumulation in treeToString / treeToStringWithLineCounts
    • Move sortTreeNodes to generateFileTree to sort once at build time
  3. Set-based lookup for suspicious files (filterOutUntrustedFiles.ts) — 3/3 consensus

    • Replace O(n×m) Array.some() with Set.has() for O(n+m) filtering
  4. Single-pass result partitioning (validateFileSafety.ts) — 2/3 consensus

    • Replace 3 separate .filter() passes with single for-of + switch
    • Short-circuit filterOutUntrustedFiles when no suspicious files found (~99% of runs)
  5. indexOf newline counting (outputGenerate.ts) — no agent objected

    • Replace (content.match(/\n/g) || []).length with zero-allocation indexOf loop

What was NOT included (and why)

| Optimization | Reason for exclusion |
| --- | --- |
| tiktoken → gpt-tokenizer | Dependency swap (all 3 agents excluded) |
| Handlebars removal | Large architectural change (all 3 excluded) |
| child_process elimination | Major CLI restructuring (all 3 excluded) |
| Token count estimation | Accuracy trade-off (all 3 excluded) |
| localeCompare → < / > in sort | Subtle sort behavior change for non-ASCII (2/3 excluded) |
| Security pre-filter | Maintenance burden syncing with secretlint rules (2/3 excluded) |

Checklist

  • Run npm run test
  • Run npm run lint

yamadashy and others added 6 commits March 25, 2026 08:24
Move regex patterns from inside function bodies to module-level constants
to avoid repeated compilation on every file processed. For a repo with
1000 files, this eliminates 7000 regex compilations per run.

- Hoist dataUriPattern, standaloneBase64Pattern to module scope
- Hoist base64ValidCharsPattern, hasNumbers/UpperCase/LowerCase/SpecialChars
- Add lastIndex reset for global-flag regexes before each use

Cherry-picked optimization from PR #1295 (3/3 reviewer consensus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
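A minimal sketch of the hoisting pattern this commit describes. The pattern and function names here are hypothetical stand-ins, not the actual contents of truncateBase64.ts. The detail worth showing is the `lastIndex` reset: a shared regex with the `g` flag remembers where its last `test()` stopped, so reusing it without a reset can miss matches.

```typescript
// Hypothetical module-level pattern; the real patterns in truncateBase64.ts may differ.
// Created once when the module loads instead of once per call.
const LONG_BASE64_RUN = /[A-Za-z0-9+/]{40,}={0,2}/g;

// Returns true when the text contains at least one long base64-looking run.
function containsLongBase64(text: string): boolean {
  // A global regex keeps lastIndex between test() calls,
  // so reset it before every use of the shared instance.
  LONG_BASE64_RUN.lastIndex = 0;
  return LONG_BASE64_RUN.test(text);
}

// String.prototype.replace resets lastIndex itself for global regexes,
// so no manual reset is needed on this path.
function truncateLongBase64(text: string, keep = 16): string {
  return text.replace(LONG_BASE64_RUN, (match) => `${match.slice(0, keep)}...`);
}
```

Without the reset, a second `containsLongBase64` call on the same input would start scanning from the end of the previous match and could return false.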
Replace O(n²) string concatenation with O(n) array accumulation pattern
in treeToString and treeToStringWithLineCounts. For repos with 1000+
files, the old code copied the entire accumulated string on each append,
while the new code pushes fragments and joins once at the end.

- Extract treeToStringInner/treeToStringWithLineCountsInner helpers
- Move sortTreeNodes call into generateFileTree for single sort at build time
- Retain sort guard in treeToString/_isRoot for direct callers

Cherry-picked optimization from PR #1295 (3/3 reviewer consensus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
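The push-and-join shape described above can be sketched roughly as follows. The `TreeNode` shape and the two-space indentation are simplifying assumptions for illustration; the real types and output format in fileTreeGenerate.ts may differ, and sorting is left out entirely.

```typescript
// Simplified, hypothetical node shape; the real TreeNode may carry more fields.
interface TreeNode {
  name: string;
  isDirectory: boolean;
  children: TreeNode[];
}

// Recursive helper: appends the fragments for each child to the shared `parts` array.
function treeToStringInner(node: TreeNode, prefix: string, parts: string[]): void {
  for (const child of node.children) {
    parts.push(prefix, child.name, child.isDirectory ? '/' : '', '\n');
    if (child.isDirectory) {
      treeToStringInner(child, `${prefix}  `, parts);
    }
  }
}

// Entry point: allocate the array once, join once at the end.
function treeToString(root: TreeNode, prefix = ''): string {
  const parts: string[] = [];
  treeToStringInner(root, prefix, parts);
  return parts.join('');
}
```

Passing the accumulator down the recursion means no intermediate strings are built per directory level.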
…ntrustedFiles

Replace O(n*m) Array.some() linear scan with Set.has() for O(n+m)
filtering. Pre-builds a Set of suspicious file paths for constant-time
lookups during the filter pass.

Cherry-picked optimization from PR #1295 (3/3 reviewer consensus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
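A minimal sketch of the Set-based filter, assuming hypothetical `RawFile` and `SuspiciousFileResult` shapes with only the fields needed here; the real types in filterOutUntrustedFiles.ts may differ.

```typescript
// Hypothetical minimal shapes for illustration only.
interface RawFile {
  path: string;
  content: string;
}
interface SuspiciousFileResult {
  filePath: string;
}

function filterOutUntrustedFiles(rawFiles: RawFile[], suspicious: SuspiciousFileResult[]): RawFile[] {
  // Build the lookup once: O(m) for m suspicious results.
  const suspiciousPaths = new Set(suspicious.map((result) => result.filePath));
  // Each membership check is O(1) on average, so the filter pass is O(n).
  return rawFiles.filter((file) => !suspiciousPaths.has(file.path));
}
```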
Replace three separate .filter() passes over security results with a
single for-of loop using switch statement. Also skip filterOutUntrustedFiles
entirely when no suspicious files are found (the common ~99% case).

- Change let to const for result arrays (populated via push)
- Short-circuit avoids Set construction + filter over all raw files

Cherry-picked optimization from PR #1295 (2/3 reviewer consensus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
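The single-pass partition could look roughly like this. The `type` values and the result shape are assumptions made for the sketch, not necessarily what validateFileSafety.ts uses.

```typescript
// Hypothetical result shape with a discriminating `type` field.
type CheckType = 'file' | 'gitDiff' | 'gitLog';
interface SuspiciousResult {
  type: CheckType;
  filePath: string;
}

function partitionResults(results: SuspiciousResult[]) {
  // Declared const: the bindings never change, only the array contents do.
  const files: SuspiciousResult[] = [];
  const gitDiffs: SuspiciousResult[] = [];
  const gitLogs: SuspiciousResult[] = [];

  // One pass over the input instead of three separate .filter() calls.
  for (const result of results) {
    switch (result.type) {
      case 'file':
        files.push(result);
        break;
      case 'gitDiff':
        gitDiffs.push(result);
        break;
      case 'gitLog':
        gitLogs.push(result);
        break;
    }
  }
  return { files, gitDiffs, gitLogs };
}
```

Relative order within each bucket is preserved, matching what the separate `.filter()` calls would produce.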
Replace (content.match(/\n/g) || []).length with an indexOf loop in
calculateFileLineCounts. The regex approach allocates an array of all
newline matches; the indexOf loop uses zero allocations. For 500 files
with 10,000 lines each, this eliminates 5 million tiny string allocations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Avoid assignment in while condition per Biome noAssignInExpressions rule.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
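A sketch of the indexOf loop from the two commits above, written so that the loop condition contains no assignment. The function name `countNewlines` comes from the benchmark history on this page; the body is an illustration, not the committed code.

```typescript
// Counts '\n' characters without building an array of matches.
function countNewlines(content: string): number {
  let count = 0;
  // Find the first newline up front so the while condition stays a plain comparison.
  let index = content.indexOf('\n');
  while (index !== -1) {
    count++;
    // Continue searching just past the newline that was found.
    index = content.indexOf('\n', index + 1);
  }
  return count;
}
```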
@coderabbitai
Contributor

coderabbitai bot commented Mar 25, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9903957f-c6bc-440d-9003-3a6282c1085f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces several low-risk, high-consensus micro-optimizations aimed at improving the performance of core functionalities. These changes, independently evaluated by review agents, focus on reducing computational complexity in string operations, regex handling, and data filtering, leading to more efficient code execution.

Highlights

  • Regex Hoisting: Moved seven regex patterns in truncateBase64.ts from inside functions to module-level constants to prevent repeated compilation, improving performance.
  • Efficient String Building: Refactored fileTreeGenerate.ts to use array push and join for string construction in treeToString and treeToStringWithLineCounts, improving performance from O(n²) to O(n).
  • Set-based Filtering: Implemented a Set-based lookup in filterOutUntrustedFiles.ts for filtering suspicious files, changing the complexity from O(n×m) to O(n+m).
  • Single-Pass Partitioning: Replaced multiple .filter() calls with a single for-of loop and switch statement for partitioning security check results in validateFileSafety.ts.
  • Optimized Newline Counting: Switched from regex-based newline counting to a zero-allocation indexOf loop in outputGenerate.ts for improved efficiency.


@github-actions
Contributor

github-actions bot commented Mar 25, 2026

⚡ Performance Benchmark

Latest commit: 0cc1f78 perf(core): Parallelize git diffs and logs fetch with Promise.all
Status: ✅ Benchmark complete!
Ubuntu: 2.37s (±0.02s) → 2.12s (±0.02s) · -0.25s (-10.6%)
macOS: 1.63s (±0.24s) → 1.87s (±0.43s) · +0.24s (+14.6%)
Windows: 2.93s (±0.10s) → 2.58s (±0.11s) · -0.35s (-12.1%)
Details
  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded)
  • Measurement: 10 runs (20 on macOS), reported as median ± IQR
History

472d3a3 style(core): Fix Biome lint warning in countNewlines

Ubuntu: 2.36s (±0.03s) → 2.36s (±0.02s) · -0.00s (-0.0%)
macOS: 1.89s (±0.51s) → 1.62s (±0.18s) · -0.27s (-14.3%)
Windows: 2.85s (±0.03s) → 2.86s (±0.03s) · +0.01s (+0.5%)

@codecov

codecov bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 87.05882% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.05%. Comparing base (35afdad) to head (0cc1f78).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/core/security/validateFileSafety.ts | 60.00% | 6 Missing ⚠️ |
| src/core/git/gitRepositoryHandle.ts | 66.66% | 4 Missing ⚠️ |
| src/cli/cliRun.ts | 83.33% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1299      +/-   ##
==========================================
- Coverage   87.18%   87.05%   -0.13%     
==========================================
  Files         115      115              
  Lines        4331     4373      +42     
  Branches     1007     1013       +6     
==========================================
+ Hits         3776     3807      +31     
- Misses        555      566      +11     


@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Mar 25, 2026

Deploying repomix with Cloudflare Pages

Latest commit: 0cc1f78
Status: ✅  Deploy successful!
Preview URL: https://f4258b7d.repomix.pages.dev
Branch Preview URL: https://perf-cherry-pick-optimizatio.repomix.pages.dev


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request focuses on performance optimizations and code refactoring across several core utilities. Key changes include optimizing file tree generation by sorting once at build time and refactoring string conversion functions for efficiency. Regex patterns in truncateBase64.ts are now hoisted to module scope to prevent recompilation, and file filtering in filterOutUntrustedFiles.ts is improved using a Set for faster lookups. Additionally, validateFileSafety.ts now partitions security check results in a single pass and includes a short-circuit for filtering. A review comment highlights a potential redundancy in fileTreeGenerate.ts, noting that sortTreeNodes is called again within treeToString and treeToStringWithLineCounts despite the tree already being sorted during generation, suggesting this leads to unnecessary double sorting and makes the _isRoot parameter redundant.

Comment on lines 83 to 86
export const treeToString = (node: TreeNode, prefix = '', _isRoot = true): string => {
  if (_isRoot) {
    sortTreeNodes(node);
  }

medium

The sortTreeNodes(node) call within treeToString is now redundant. The pull request description states that sortTreeNodes was moved to generateFileTree to sort once at build time. Since generateFileTree already sorts the tree, calling sortTreeNodes again here leads to unnecessary double sorting. The _isRoot parameter also becomes unnecessary.

export const treeToString = (node: TreeNode, prefix = ''): string => {
  const parts: string[] = [];
  treeToStringInner(node, prefix, parts);
  return parts.join('');
};

yamadashy and others added 5 commits March 25, 2026 08:41
Convert static imports of initAction, mcpAction, remoteAction, and
versionAction to dynamic import() at their use sites. The default pack
path (95%+ of invocations) now avoids loading MCP server, git clone,
and init action module trees entirely.

Also inline isExplicitRemoteUrl prefix check to avoid loading
git-url-parse module for non-remote runs.

PR #1295 reports -66% module import time (358ms → 123ms).
Cherry-picked optimization (4/5 reviewer consensus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
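A small sketch of moving an import to its use site. Here `node:os` is only a stand-in for a heavy, rarely needed action module, and the command names are hypothetical; the actual cliRun.ts dispatch is not shown on this page.

```typescript
// `node:os` stands in for a heavy module that most invocations never need (assumption).
async function runCommand(command: 'pack' | 'sysinfo'): Promise<string> {
  if (command === 'sysinfo') {
    // The module is resolved and evaluated only when this branch runs,
    // so the default path never pays its load cost.
    const os = await import('node:os');
    return `platform: ${os.platform()}`;
  }
  // Default path: no extra module loading.
  return 'packed';
}
```

The trade-off is that the caller must be async, and a missing or broken module surfaces at call time rather than at startup.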
Remove log-update dependency (and its wrap-ansi → string-width chain,
~49ms module load) in favor of direct process.stderr.write with ANSI
\x1B[2K\r for single-line in-place updates.

The spinner only ever writes single lines, so log-update's multi-line
and terminal-width handling was unnecessary overhead.

Cherry-picked optimization from PR #1295 (4/5 reviewer consensus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
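The in-place update can be sketched as below. The `LineWriter` interface and `renderLine` name are hypothetical, introduced so the example runs without a terminal; the sketch assumes each message is a single line that fits the terminal width, which is the case the commit describes.

```typescript
// ANSI "erase entire line" (ESC [ 2 K) followed by a carriage return to column 0.
const CLEAR_LINE = '\x1B[2K\r';

// Minimal writable target; process.stderr satisfies this shape.
interface LineWriter {
  write(chunk: string): unknown;
}

// Redraws a single status line in place.
// Assumes `text` has no newlines and does not wrap.
function renderLine(out: LineWriter, text: string): void {
  out.write(`${CLEAR_LINE}${text}`);
}
```

In a CLI this would be called as `renderLine(process.stderr, 'Packing files...')` on each spinner tick.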
…arse

Add Promise-based Map cache to isGitRepository() keyed by directory.
When getGitDiffs and getGitLogs run concurrently, both call
isGitRepository on the same directory — the cache ensures only one
git rev-parse process is spawned instead of multiple.

Cache is bypassed when custom deps are provided (test mocks).

Cherry-picked optimization from PR #1295 (4/5 reviewer consensus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
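A rough sketch of a Promise-based cache keyed by directory. The `check` callback stands in for spawning `git rev-parse`, and the `useCache` flag is a simplification of the "bypass when custom deps are provided" rule; both are assumptions for illustration.

```typescript
type RepoCheck = (directory: string) => Promise<boolean>;

// Stores the in-flight Promise rather than the resolved value,
// so concurrent callers for the same directory share one check.
const repoCheckCache = new Map<string, Promise<boolean>>();

function isGitRepositoryCached(directory: string, check: RepoCheck, useCache = true): Promise<boolean> {
  // Bypass path, e.g. for tests that inject their own mock.
  if (!useCache) {
    return check(directory);
  }
  let pending = repoCheckCache.get(directory);
  if (pending === undefined) {
    pending = check(directory);
    repoCheckCache.set(directory, pending);
  }
  return pending;
}
```

Because the Promise is stored synchronously before any await, two callers in the same tick cannot both miss the cache. Note that in this sketch a rejected Promise would stay cached as well.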
Remove file content from the worker→main process IPC response since
the main process only uses processedFiles[].path for the token count
tree reporter. For a typical repo with 1000 files averaging 4KB each,
this avoids ~4MB of structured clone serialization.

Cherry-picked optimization from PR #1295 (4/5 reviewer consensus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Wrap sequential getGitDiffs() and getGitLogs() calls in Promise.all()
since both are independent git subprocess operations. Saves the
duration of the shorter call (~5-20ms) by overlapping their I/O.

Cherry-picked optimization from PR #1295 (3/5 reviewer consensus).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
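A minimal illustration of overlapping two independent async calls. `fakeGitCall` and `collectGitData` are hypothetical stand-ins that use timers instead of git subprocesses; the real getGitDiffs and getGitLogs signatures are not shown on this page.

```typescript
// Hypothetical stand-in for a git subprocess call: resolves with `value` after `delayMs`.
function fakeGitCall<T>(value: T, delayMs: number): Promise<T> {
  return new Promise((resolve) => setTimeout(() => resolve(value), delayMs));
}

async function collectGitData(): Promise<{ diffs: string; logs: string }> {
  // Both calls start before either is awaited, so the total wait is
  // roughly the longer of the two rather than their sum.
  const [diffs, logs] = await Promise.all([
    fakeGitCall('diff output', 20),
    fakeGitCall('log output', 10),
  ]);
  return { diffs, logs };
}
```

Promise.all rejects as soon as either call rejects, so error handling that previously wrapped each call separately may need a second look.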
@yamadashy yamadashy closed this Mar 26, 2026