Skip to content

perf(core): Prefetch git sort data to overlap with file search and collection#1467

Merged
yamadashy merged 2 commits intomainfrom
perf/prefetch-sort-data
Apr 14, 2026
Merged

perf(core): Prefetch git sort data to overlap with file search and collection#1467
yamadashy merged 2 commits intomainfrom
perf/prefetch-sort-data

Conversation

@yamadashy
Copy link
Copy Markdown
Owner

@yamadashy yamadashy commented Apr 14, 2026

Summary

  • Add prefetchSortData() to launch git --version + git log --name-only subprocesses early in the pipeline, so that sortOutputFiles hits the in-memory cache instead of spawning ~15ms of subprocess overhead on the critical path
  • The prefetch runs concurrently with searchFiles and collectFiles, hiding the git subprocess cost

Test plan

  • npm run lint passes
  • npm run test — 1115/1115 tests pass
  • node --run bench shows improvement compared to main

🤖 Generated with Claude Code


Open with Devin

…llection

Launch git --version + git log --name-only subprocesses early in the
pipeline so that sortOutputFiles hits the in-memory cache instead of
spawning ~15ms of subprocess overhead on the critical path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 14, 2026

⚡ Performance Benchmark

Latest commit:f4a318c fix(core): Add trace log for prefetch sort data failure
Status:✅ Benchmark complete!
Ubuntu:1.38s (±0.02s) → 1.37s (±0.03s) · -0.01s (-0.6%)
macOS:0.98s (±0.43s) → 1.00s (±0.30s) · +0.01s (+1.3%)
Windows:1.77s (±0.03s) → 1.77s (±0.05s) · -0.00s (-0.2%)
Details
  • Packing the repomix repository with node bin/repomix.cjs
  • Warmup: 2 runs (discarded), interleaved execution
  • Measurement: 20 runs / 30 on macOS (median ± IQR)
  • Workflow run
History

d6333ae perf(core): Prefetch git sort data to overlap with file search and collection

Ubuntu:1.34s (±0.03s) → 1.34s (±0.02s) · +0.00s (+0.2%)
macOS:0.78s (±0.07s) → 0.79s (±0.07s) · +0.01s (+1.3%)
Windows:1.71s (±0.06s) → 1.69s (±0.07s) · -0.02s (-1.3%)

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 14, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ac330a82-28fd-4031-ba3a-e07c7c7bf1ea

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR introduces a prefetch mechanism for file change count data used in sorting. A new prefetchSortData function conditionally pre-populates an in-memory cache when sorting by changes is enabled. The packager invokes this function before the sorting operation to warm up the cache, enabling the sorting logic to reuse cached data instead of fetching it during sort.

Changes

Cohort / File(s) Summary
Cache Prefetch Function
src/core/output/outputSort.ts
Added new exported async function prefetchSortData that conditionally calls getFileChangeCounts to populate cache ahead of sort operation based on config.output.git?.sortByChanges setting.
Packager Integration
src/core/packager.ts
Updated imports to include prefetchSortData, added promise-based prefetch call with error suppression before sortOutputFiles invocation to ensure cache is populated for sorting.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • PR #1013: Introduces the underlying file-change count caching and getFileChangeCounts logic that the new prefetchSortData function directly depends on.
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'perf(core): Prefetch git sort data to overlap with file search and collection' clearly summarizes the main change—prefetching git sort data to improve performance by overlapping subprocess execution.
Description check ✅ Passed The description includes a summary of changes, test plan with checkmarks indicating tests passed, and references both lint and test requirements from the template.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/prefetch-sort-data

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 87.50000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 87.18%. Comparing base (4c356f7) to head (f4a318c).
⚠️ Report is 3 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/core/packager.ts 66.66% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1467   +/-   ##
=======================================
  Coverage   87.18%   87.18%           
=======================================
  Files         117      117           
  Lines        4465     4473    +8     
  Branches     1032     1034    +2     
=======================================
+ Hits         3893     3900    +7     
- Misses        572      573    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 14, 2026

Code Review

Overall: Looks good ✅ — Clean, minimal, and well-reasoned performance optimization. The fire-and-forget prefetch pattern is correct, and the change is architecturally consistent with the existing codebase.

Feedback

1. Silent error swallowing — consider adding a trace log (minor)
const sortDataPromise = deps.prefetchSortData(config).catch(() => {});

Since getFileChangeCounts already returns null on failure internally, prefetchSortData shouldn't throw in practice. However, if it ever does, the failure is completely invisible. A logger.trace in the catch would help debugging without changing behavior:

const sortDataPromise = deps.prefetchSortData(config).catch((error) => {
  logger.trace('Prefetch sort data failed:', error);
});
2. No unit tests for `prefetchSortData` (non-blocking)

The new exported function has no test coverage in tests/core/output/outputSort.test.ts or tests/core/packager.test.ts. Since it's a thin wrapper over already-tested getFileChangeCounts, this isn't critical, but a couple of tests would be nice:

  • Returns early when sortByChanges is disabled
  • Calls getFileChangeCounts when enabled
  • The packager mock/injection of the new dep
3. `packager.ts` reaches ~263 lines, slightly over the 250-line guideline (pre-existing)

This PR adds ~6 lines to an already borderline file. Not a blocker for this PR, but worth noting for future cleanup.

No security concerns, no race conditions with the shared cache, and the concurrency approach correctly overlaps git subprocess I/O with file search/collection.

🤖 Generated with Claude Code

@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages bot commented Apr 14, 2026

Deploying repomix with  Cloudflare Pages  Cloudflare Pages

Latest commit: f4a318c
Status: ✅  Deploy successful!
Preview URL: https://caa85cf3.repomix.pages.dev
Branch Preview URL: https://perf-prefetch-sort-data.repomix.pages.dev

View logs

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

Open in Devin Review

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/core/packager.ts (1)

81-84: Don’t suppress prefetch failures silently.

Line 83 hides all failures; add a trace log so prefetch regressions remain observable without affecting runtime behavior.

Patch suggestion
 import { logMemoryUsage, withMemoryLogging } from '../shared/memoryUtils.js';
+import { logger } from '../shared/logger.js';
@@
-  const sortDataPromise = deps.prefetchSortData(config).catch(() => {});
+  const sortDataPromise = deps.prefetchSortData(config).catch((error) => {
+    logger.trace('Failed to prefetch sort data; falling back to on-demand git fetch.', error);
+  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/packager.ts` around lines 81 - 84, The current prefetch call assigns
sortDataPromise = deps.prefetchSortData(config).catch(() => {}) which swallows
all errors; change the catch to log the error at trace level while preserving
the swallow (e.g., .catch(err => { logger.trace?.('deps.prefetchSortData
failed', err); })) so regressions remain observable; locate the call to
deps.prefetchSortData and replace the empty catch with a trace-level log using
your module's logger (or console.trace if no logger is available) referencing
sortDataPromise and config.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/core/packager.ts`:
- Around line 81-84: The current prefetch call assigns sortDataPromise =
deps.prefetchSortData(config).catch(() => {}) which swallows all errors; change
the catch to log the error at trace level while preserving the swallow (e.g.,
.catch(err => { logger.trace?.('deps.prefetchSortData failed', err); })) so
regressions remain observable; locate the call to deps.prefetchSortData and
replace the empty catch with a trace-level log using your module's logger (or
console.trace if no logger is available) referencing sortDataPromise and config.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f8d7d9de-2f95-42ab-82f4-1cb0c0d98955

📥 Commits

Reviewing files that changed from the base of the PR and between 4c356f7 and d6333ae.

📒 Files selected for processing (2)
  • src/core/output/outputSort.ts
  • src/core/packager.ts

gemini-code-assist[bot]

This comment was marked as resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 14, 2026

Code Review (Update)

Overall: Looks good ✅ — Ready to merge. This is a clean, focused performance optimization with correct concurrency handling.

What changed since last review

The trace logging suggestion from the previous review has been addressed in commit f4a318c. The .catch() now logs via logger.trace, making prefetch failures observable.

Reviewer consensus

All six review dimensions checked out:

Area Verdict
Code Quality ✅ Clean — naming, DRY, and API design are consistent with the codebase
Security ✅ No concerns — no new attack surface; same safe execFile pattern as existing code
Performance ✅ Sound approach — overlap is correct; marginal benchmarks reflect small test repo, not a flaw
Test Coverage ⚠️ Minor gap (non-blocking) — see below
Conventions ✅ Fully compliant with CLAUDE.md guidelines
Holistic ✅ Ready to merge

Remaining feedback

1. No direct unit tests for prefetchSortData (non-blocking)

Codecov reports coverage via integration paths, but there are no explicit unit tests for prefetchSortData in outputSort.test.ts. Two simple tests would add confidence:

  • Early return when sortByChanges is disabled
  • Calls getFileChangeCounts with correct args when enabled

Not a merge blocker since the function is a thin wrapper over already-tested internals.

2. Potential concurrent call duplication (non-blocking, pre-existing)

The getFileChangeCounts cache stores resolved values, not promises. If two calls with the same cache key overlap before the first resolves, both will spawn git subprocesses. In this PR's single-pack() flow, prefetchSortData is always awaited before sortOutputFiles, so this race cannot occur. However, in a long-lived MCP server handling parallel requests for the same repo, it could. This is a pre-existing design concern in the caching layer, not introduced by this PR — worth a follow-up issue if MCP concurrency matters.

3. packager.ts at 266 lines (pre-existing)

Slightly over the 250-line guideline. Not introduced by this PR (+10 lines to an already borderline file). Worth noting for future cleanup.

No blocking issues. Nice optimization! 🤖

🤖 Generated with Claude Code

@yamadashy yamadashy merged commit 1c01d69 into main Apr 14, 2026
61 checks passed
@yamadashy yamadashy deleted the perf/prefetch-sort-data branch April 14, 2026 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant