chore(upstream): add find-reset-candidates helper #9902
Bulk-finds files that have drifted insignificantly from the last merged upstream and (optionally) resets them. Extracts shared clean/changed marker helpers into utils/markers.ts and shared reset logic into utils/reset.ts so both the single-file reset helper and the new finder use the same classification pipeline.
Replace the subprocess-based line diff in classifyDrift with an in-process multiset diff (approxDiff) so concurrent classifications don't deadlock on big text files like sprite.svg via Bun $.quiet() pipe-buffer stalls. Also filter non-code assets (SVG, PNG, fonts, archives, lock files, etc.) before classification so they don't bloat the report or stress git subprocesses. Switch kilo-only path excludes to glob pathspecs so every packages/kilo-*/ and **/kilocode/** dir is excluded without maintaining a hand list. Rename the 'whitespace-only' bucket to 'cosmetic-only' — with the multiset diff it also catches line reordering, so the old label was misleading.
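For readers unfamiliar with the idea, an in-process multiset line diff can be sketched as below. This is a minimal illustration, not the PR's actual `approxDiff`: the return shape and the plain `\n` split are assumptions. It counts lines per side and ignores order, which is why pure line reordering comes out as zero changes.

```typescript
// Hypothetical sketch of a multiset (order-insensitive) line diff.
// The name mirrors the commit message, but the signature is assumed.
function approxDiff(ours: string, theirs: string): { added: number; removed: number } {
  // Count how often each line occurs upstream.
  const counts = new Map<string, number>()
  for (const line of theirs.split("\n")) {
    counts.set(line, (counts.get(line) ?? 0) + 1)
  }

  // Consume one upstream occurrence per matching local line.
  // Local lines with no remaining upstream match count as added.
  let added = 0
  for (const line of ours.split("\n")) {
    const remaining = counts.get(line) ?? 0
    if (remaining > 0) counts.set(line, remaining - 1)
    else added++
  }

  // Whatever is left upstream was removed locally.
  let removed = 0
  for (const remaining of counts.values()) removed += remaining

  return { added, removed }
}
```

Because no subprocess is spawned, many files can be classified concurrently without any pipe involved. The trade-off is that the result is a count only, with no hunk positions.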
the upstream annotation script is actually wrong here
Fixes hang when running the finder from the repo root without a path scope.

Root cause: upstream-only packages (packages/console, packages/web, packages/desktop, etc. in config.skipFiles) contain multi-megabyte media assets that stall concurrent Bun `git show` subprocesses on the pipe buffer.

Changes:
- Batch upstream blob sizes in one `git cat-file --batch-check` subprocess before classifying anything. Files with missing upstream land directly in `upstream-missing`; files > 256 KB land in the new `too-large` bucket. Only sane-sized survivors get per-file `git show`.
- Filter out files matching the merge config's `keepOurs` and `skipFiles` globs (.github workflows, translated READMEs, upstream-only packages, etc.). These are intentionally preserved or removed in Kilo and would otherwise pollute the report and tempt incorrect resets.

Full-repo dry-run now completes in ~2s.
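The pre-bucketing step described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `bucketBySize`, `SizeBuckets`, and `SIZE_LIMIT` are hypothetical names, 256 KB is taken as 256 * 1024 bytes, and the function works on already-captured stdout so it stays subprocess-free. It relies on two documented properties of `git cat-file --batch-check`: output lines follow input order, and each line is either `<sha> <type> <size>` or `<object> missing`.

```typescript
// Hypothetical sketch: sort paths into buckets from `git cat-file --batch-check` output.
interface SizeBuckets {
  upstreamMissing: string[]
  tooLarge: string[]
  survivors: string[]
}

// Assumed threshold: "256 KB" read as 256 * 1024 bytes.
const SIZE_LIMIT = 256 * 1024

function bucketBySize(paths: string[], batchCheckOutput: string, limit: number = SIZE_LIMIT): SizeBuckets {
  const lines = batchCheckOutput.split("\n").filter((l) => l.length > 0)
  if (lines.length !== paths.length) {
    throw new Error(`expected ${paths.length} output lines, got ${lines.length}`)
  }

  const out: SizeBuckets = { upstreamMissing: [], tooLarge: [], survivors: [] }
  for (let i = 0; i < paths.length; i++) {
    const path = paths[i]!
    const line = lines[i]!

    // Objects that do not exist upstream are reported as "<object> missing".
    if (line.endsWith(" missing")) {
      out.upstreamMissing.push(path)
      continue
    }

    // Otherwise the last space-separated field is the blob size in bytes.
    const size = Number(line.slice(line.lastIndexOf(" ") + 1))
    if (!Number.isFinite(size)) throw new Error(`unparseable line: ${line}`)

    if (size > limit) out.tooLarge.push(path)
    else out.survivors.push(path)
  }
  return out
}
```

Only the `survivors` list would then go through per-file `git show`. Paths containing newlines would break the line-by-line pairing, so a real implementation may need extra care there.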
  })
  const entries = [...preBucketed, ...classified]

  if (!opts.dryRun) {
WARNING: Auto-apply can overwrite existing local edits
The candidate list comes from git diff <upstream>..HEAD, but classification and reset read/write the current working tree. If a user has unstaged edits in any candidate file, running without --dry-run can replace those edits with upstream content and lose work before they ever get a chance to inspect the resulting diff. Consider aborting when candidate files are dirty, or at least requiring a clean working tree before entering the auto-reset block.
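One possible shape for such a guard, sketched under assumptions: `dirtyCandidates` is a hypothetical helper, and it expects the text of `git status --porcelain` (v1 format, where each line is a two-character status, a space, then the path, and renames appear as `old -> new`). Paths that git quotes because of special characters are not handled here.

```typescript
// Hypothetical sketch: find candidate files that have uncommitted changes.
function dirtyCandidates(candidates: string[], porcelain: string): string[] {
  const dirty = new Set<string>()
  // Do not trim the whole output: a leading space is part of the status column.
  for (const line of porcelain.split("\n")) {
    if (line.length < 4) continue
    // Columns 0-1 are the XY status, column 2 is a space, the path starts at 3.
    const rest = line.slice(3)
    // Renames list both paths; treat either side as dirty.
    for (const p of rest.split(" -> ")) dirty.add(p)
  }
  return candidates.filter((c) => dirty.has(c))
}
```

The auto-reset block could then refuse to run when the returned list is non-empty and print the offending paths, leaving `--dry-run` unaffected.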
Code Review Summary
Status: 1 Issue Found | Recommendation: Address before merge
Files Reviewed (12 files)
Reviewed by gpt-5.5-20260423 · 398,896 tokens
Adds a bulk finder/resetter that walks files which have drifted from the last merged upstream, classifies each as identical / markers-only / whitespace-only / small-diff / large-diff / upstream-missing / etc., and auto-applies the safe buckets to the working tree.
Builds on #9889. Extracts clean/changed marker helpers into utils/markers.ts and the single-file reset logic into utils/reset.ts so fix-kilocode-markers.ts, reset-to-upstream.ts, and the new find-reset-candidates.ts share one classification pipeline. The --review-limit flag controls how many non-marker, non-whitespace diff lines still auto-reset (default 5); --dry-run reports only.