fix(host-service): count untracked file lines in getStatus (#3701)
Conversation
Untracked files were pushed into `unstaged` with `additions: 0`, so the sidebar LOC delta missed every newly-added file in the working tree (`useDiffStats` sums against-base + staged + unstaged).
📝 Walkthrough

Untracked-file handling and rename detection in Git status were reworked: staged numstat now runs with rename/copy collapse flags.

Changes
Sequence Diagram(s)
sequenceDiagram
participant Client as "Host Service API"
participant Router as "git.ts (router)"
participant Helper as "git-helpers"
participant FS as "Filesystem"
participant Git as "git (CLI)"
participant Temp as "Temp Index"
Client->>Router: getStatus request
Router->>Helper: run staged numstat (-M -C) & build staged entries
Router->>Helper: collect untracked paths
Helper->>FS: resolve worktree & validate untracked paths
Helper->>FS: countUntrackedFileLines(untrackedPaths)
FS-->>Helper: additions per untracked file
Helper->>Temp: copy real index to temp GIT_INDEX_FILE
Helper->>Temp: set GIT_INDEX_FILE env
Helper->>Git: git add --intent-to-add ...untrackedPaths (temp index)
Git-->>Helper: temp index updated
Helper->>Git: git diff --name-status -M -C and git diff --numstat -M -C (temp index)
Git-->>Helper: rename/copy detections + numstat info
Helper->>Helper: merge detected renames into unstaged list (filter consumed entries)
Helper->>Temp: cleanup temp index
Helper-->>Router: merged unstaged list
Router-->>Client: final status response (staged + merged unstaged)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Greptile Summary

This PR fixes a LOC delta undercount.

Confidence Score: 5/5 — Safe to merge. All remaining findings are P2 style/quality suggestions that do not affect correctness or security. The fix is targeted and correct: object references are shared between `untrackedFiles` and `unstaged` so in-place mutations propagate properly, the path-traversal guards are sound (lexical pre-check + double realpath check), and the line-counting formula handles empty files, LF, and CRLF correctly. Both open findings are P2 (unbounded concurrency and binary-file line noise); neither causes wrong data to be persisted or crashes under normal use.

packages/host-service/src/trpc/router/git/utils/git-helpers.ts — the two P2 concerns live here but are non-blocking.
| Filename | Overview |
|---|---|
| packages/host-service/src/trpc/router/git/utils/git-helpers.ts | Adds countUntrackedFileLines with path-traversal guards (double realpath check), 1 MB file-size cap, and correct CRLF-aware line-counting logic; two P2 concerns: unbounded Promise.all concurrency and silent binary-file line counting. |
| packages/host-service/src/trpc/router/git/git.ts | Correctly separates untracked entries into a dedicated untrackedFiles array (sharing object references with unstaged) and calls countUntrackedFileLines to mutate additions in-place before returning. |
Sequence Diagram
sequenceDiagram
participant Client
participant getStatus
participant git
participant countUntrackedFileLines
participant fs
Client->>getStatus: query(workspaceId)
getStatus->>git: status()
getStatus->>git: diff --numstat -z --cached (staged)
getStatus->>git: diff --numstat -z (unstaged)
git-->>getStatus: status.files, stagedNumstat, unstagedNumstat
loop for each status.files entry
alt index=="?" && working_dir=="?"
getStatus->>getStatus: push entry to untrackedFiles[] + unstaged[]
else tracked & modified
getStatus->>getStatus: push to staged[] or unstaged[] with numstat
end
end
getStatus->>countUntrackedFileLines: (worktreePath, untrackedFiles[])
loop for each untracked file (Promise.all)
countUntrackedFileLines->>fs: realpath(worktreePath)
countUntrackedFileLines->>fs: realpath(absolutePath)
countUntrackedFileLines->>fs: stat(fileReal)
alt size <= 1MB && isFile
countUntrackedFileLines->>fs: readFile(fileReal, utf-8)
fs-->>countUntrackedFileLines: content
countUntrackedFileLines->>countUntrackedFileLines: count lines → file.additions
end
end
countUntrackedFileLines-->>getStatus: mutations applied in-place
getStatus-->>Client: staged, unstaged (with additions), currentBranch, ...
Reviews (1): Last reviewed commit: "fix(host-service): count untracked file ..."
```ts
await Promise.all(
  files.map(async (file) => {
    try {
      const absolutePath = resolve(worktreePath, file.path);
      if (!isPathWithinWorktree(worktreePath, absolutePath)) return;

      const fileReal = await realpath(absolutePath);
      if (!isPathWithinWorktree(worktreeReal, fileReal)) return;

      const stats = await stat(fileReal);
      if (!stats.isFile() || stats.size > MAX_UNTRACKED_LINE_COUNT_SIZE) {
        return;
      }

      const content = await readFile(fileReal, "utf-8");
      file.additions =
        content === ""
          ? 0
          : content.endsWith("\n")
            ? content.split(/\r?\n/).length - 1
            : content.split(/\r?\n/).length;
    } catch {}
  }),
);
```
Unbounded concurrency may exhaust file descriptors
Promise.all launches one realpath + stat + readFile for every untracked file simultaneously. On a fresh checkout where build artifacts are not gitignored there could be thousands of untracked files, which can hit the process's open-file-descriptor limit (ulimit -n, typically 1024 on Linux). Individual file operations will throw and be swallowed by the empty catch {}, so additions silently stay at 0 rather than crashing, but the counts will be silently wrong. A concurrency limiter (e.g. p-limit with ~64 slots) would prevent this without meaningfully increasing wall-clock time.
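The suggested concurrency limiter can be sketched without pulling in the real `p-limit` package; the hand-rolled version below (hypothetical names, not this PR's code) shows the shape of the fix:

```typescript
// Minimal limiter in the spirit of p-limit: at most `limit` tasks run
// concurrently; the rest wait in a FIFO queue.
function pLimit(limit: number) {
  let running = 0;
  const queue: Array<() => void> = [];

  const release = () => {
    running--;
    queue.shift()?.();
  };

  return function run<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = () => {
        running++;
        task().then(resolve, reject).finally(release);
      };
      if (running < limit) start();
      else queue.push(start);
    });
  };
}

// Simulate 500 "file reads" and record the peak concurrency reached.
async function demo(): Promise<number> {
  const limit = pLimit(64);
  let active = 0;
  let peak = 0;
  await Promise.all(
    Array.from({ length: 500 }, () =>
      limit(async () => {
        active++;
        peak = Math.max(peak, active);
        await new Promise((r) => setTimeout(r, 1));
        active--;
      }),
    ),
  );
  return peak; // bounded by the limit even with 500 queued tasks
}
```

The per-file body from `countUntrackedFileLines` would go inside the `limit(async () => ...)` wrapper, keeping the existing `try/catch` behavior.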
```ts
const content = await readFile(fileReal, "utf-8");
file.additions =
  content === ""
    ? 0
    : content.endsWith("\n")
      ? content.split(/\r?\n/).length - 1
      : content.split(/\r?\n/).length;
```
Binary files under 1 MB will be read and "counted"
readFile(fileReal, "utf-8") succeeds for binary files; Node.js replaces invalid byte sequences with the Unicode replacement character (\uFFFD) rather than throwing. A .png, .lock, or compiled object file under the 1 MB cap will get a non-zero additions value that is meaningless. Checking stats.size === 0 (already covered) doesn't help here; you may want to also guard against files that look binary, e.g. by inspecting the first few bytes for null bytes before counting lines. This is a LOC quality concern, not a correctness or security issue.
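The NUL-byte guard this comment suggests can be sketched as a tiny predicate over the first few kilobytes of the file. Names here are illustrative, not the PR's actual helper:

```typescript
// Git-style binary sniff: treat content as binary if the first 8 KB
// contains a NUL byte. Text files essentially never contain NUL;
// most binary formats (PNG, zip, sqlite, object files) do.
const SNIFF_BYTES = 8 * 1024;

function looksBinary(buf: Buffer): boolean {
  return buf.subarray(0, SNIFF_BYTES).includes(0);
}
```

A caller would read the file as a `Buffer` first, bail if `looksBinary` returns true, and only then decode as UTF-8 and count lines.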
🧹 Nitpick comments (3)
packages/host-service/src/trpc/router/git/git.ts (1)
139-164: Shared object reference across `unstaged` and `untrackedFiles` is an implicit contract.

The same `entry` object is pushed into both arrays, so the subsequent `countUntrackedFileLines(worktreePath, untrackedFiles)` call mutates objects that are also referenced by `unstaged`. This is what makes the fix work, but a future reader could reasonably assume `untrackedFiles` is independent of `unstaged` and introduce a bug (e.g., by cloning, filtering, or reassigning). Consider either a short comment pinning the invariant, or building `unstaged` from the final `untrackedFiles` state after counting to make the data flow explicit.

♻️ Option: make the mutation explicit with a comment
```diff
   const unstaged: ChangedFile[] = [];
   const untrackedFiles: ChangedFile[] = [];

   for (const file of status.files) {
     const wd = file.working_dir;
     if (file.index === "?" && wd === "?") {
       const entry: ChangedFile = {
         path: file.path,
         status: "untracked",
         additions: 0,
         deletions: 0,
       };
+      // Same object reference — mutations by countUntrackedFileLines
+      // below propagate into `unstaged`.
       untrackedFiles.push(entry);
       unstaged.push(entry);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/host-service/src/trpc/router/git/git.ts` around lines 139-164: the code currently pushes the same object reference `entry` into both `untrackedFiles` and `unstaged`, relying on `countUntrackedFileLines(worktreePath, untrackedFiles)` to mutate those objects for `unstaged`. Make this invariant explicit or remove the implicit sharing: either add a short comment above the loop noting that `untrackedFiles` and `unstaged` intentionally share object references so `countUntrackedFileLines` can update line counts, or change the flow to keep `untrackedFiles` independent (push a new object to `untrackedFiles`, call `countUntrackedFileLines(worktreePath, untrackedFiles)`, then build `unstaged` from the updated `untrackedFiles`) and use `mapGitStatus(file.working_dir)` as before to set status. Reference symbols: untrackedFiles, unstaged, entry, countUntrackedFileLines, mapGitStatus, worktreePath.

packages/host-service/src/trpc/router/git/utils/git-helpers.ts (2)
236-247: Sub-1MB binary files still contribute misleading LOC.

`stats.isFile()` + size cap filters out very large non-code files, but any binary file under 1 MiB (small PNG, compiled artifact, pdf, etc.) is still read as utf-8 and split on newlines, producing an essentially random "additions" count in the sidebar. A cheap guard is to bail when the first read chunk contains a NUL byte, matching how `git` itself detects binary content.

♻️ Proposed guard
```diff
 const content = await readFile(fileReal, "utf-8");
+// Match git's binary heuristic: NUL byte in the content.
+if (content.includes("\0")) return;
 file.additions =
   content === ""
     ? 0
     : content.endsWith("\n")
       ? content.split(/\r?\n/).length - 1
       : content.split(/\r?\n/).length;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/host-service/src/trpc/router/git/utils/git-helpers.ts` around lines 236 - 247, The current logic uses stat(...) and readFile(fileReal, "utf-8") which still miscounts small binary files; modify the flow in the block that sets file.additions (using stat, fileReal, MAX_UNTRACKED_LINE_COUNT_SIZE, file.additions) to first read a small binary buffer of the file (e.g., first few KB) and check for a NUL (0x00) byte — if found, return/do not set additions; only if no NUL is detected then read the file as UTF-8 and compute additions as before. Ensure the NUL-check happens before any utf-8 decoding to avoid miscounting binary files.
227-250: Unbounded parallel filesystem reads on untracked files.

`Promise.all(files.map(...))` fans out one concurrent `realpath` + `stat` + `readFile` per untracked file with no concurrency cap. In pathological repos (e.g., a freshly-scaffolded workspace, build outputs that aren't gitignored yet, or a repo state after an unpacked archive), this can hit the per-process FD limit (EMFILE) or spike memory when many <1MB files are read simultaneously. Consider batching with a small concurrency limit (e.g., 8-16).

♻️ Simple batching example
```diff
-await Promise.all(
-  files.map(async (file) => {
-    /* per-file body */
-  }),
-);
+const CONCURRENCY = 16;
+for (let i = 0; i < files.length; i += CONCURRENCY) {
+  await Promise.all(
+    files.slice(i, i + CONCURRENCY).map(async (file) => {
+      /* same body */
+    }),
+  );
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/host-service/src/trpc/router/git/utils/git-helpers.ts` around lines 227 - 250, The current Promise.all(files.map(...)) in git-helpers.ts performs unbounded concurrent realpath/stat/readFile calls and can exhaust file descriptors; replace this flat Promise.all fan-out with a bounded concurrency approach (e.g., use a p-limit with concurrency 8–16 or implement an async worker queue) to process the files array, keeping the same per-file logic (resolve(worktreePath, file.path), isPathWithinWorktree checks, realpath, stat with MAX_UNTRACKED_LINE_COUNT_SIZE guard, readFile, and setting file.additions) and preserving the existing try/catch behavior for each file; reference the async block around realpath/stat/readFile and variables worktreePath, worktreeReal, isPathWithinWorktree, realpath, stat, readFile, MAX_UNTRACKED_LINE_COUNT_SIZE, and file.additions when making the change.
1 issue found across 2 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/host-service/src/trpc/router/git/utils/git-helpers.ts">
<violation number="1" location="packages/host-service/src/trpc/router/git/utils/git-helpers.ts:227">
P1: Unbounded concurrency on file I/O: `Promise.all` opens `realpath` + `stat` + `readFile` for every untracked file simultaneously. In workspaces with thousands of untracked files (the PR description explicitly mentions this scenario), this will hit the OS file-descriptor limit (`EMFILE`). Because errors are swallowed by the empty `catch {}`, additions silently stay at `0` — defeating the purpose of this change. Use a concurrency limiter (e.g. `p-limit(64)`) to cap parallel file operations.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Run git's real rename/copy detection over the working tree by copying .git/index to a temp file, marking untracked files intent-to-add against that copy, and running `git diff -M -C`. The real index is never mutated. Merges matched deleted+untracked pairs into single rename entries with correct additions/deletions, and renders `old → new` in the v2 sidebar file row.

Catches:
- mv tracked → untracked (rename)
- cp tracked → untracked (copy, sourced from the tracked blob)
- mv with edits (rename with non-zero stats)

Falls back silently to the unmerged deleted+untracked listing on any error (missing index, copy failure, intent-to-add reject).
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/$workspaceId/components/WorkspaceSidebar/hooks/useChangesTab/components/ChangesFileList/components/FileRow/FileRow.tsx (1)
46-75: ⚠️ Potential issue | 🟡 Minor

Cross-directory renames render a misleading path.

`dir` comes from the new path, but `oldBasename` is only the leaf of `oldPath`. If a file was renamed `src/foo.ts → lib/bar.ts`, this renders as `lib/ foo.ts → bar.ts`, which reads as if the old file lived in `lib/` too. For the case rename detection is most useful for (moves across directories), the old location is lost in the UI.

Either show the full old relative path, or split and render the old dir when it differs:
🛠️ One possible shape
```diff
-const { dir, basename } = splitPath(file.path);
-const oldBasename =
-  file.oldPath && (file.status === "renamed" || file.status === "copied")
-    ? splitPath(file.oldPath).basename
-    : null;
+const { dir, basename } = splitPath(file.path);
+const old =
+  file.oldPath && (file.status === "renamed" || file.status === "copied")
+    ? splitPath(file.oldPath)
+    : null;
```

```diff
-{oldBasename && (
-  <span className="truncate text-muted-foreground">
-    {oldBasename}
-    <span className="px-1">→</span>
-  </span>
-)}
+{old && (
+  <span className="truncate text-muted-foreground">
+    {old.dir !== dir && old.dir}
+    {old.basename}
+    <span className="px-1">→</span>
+  </span>
+)}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/desktop/src/renderer/routes/_authenticated/_dashboard/v2-workspace/`$workspaceId/components/WorkspaceSidebar/hooks/useChangesTab/components/ChangesFileList/components/FileRow/FileRow.tsx around lines 46 - 75, oldBasename only shows the old path's leaf which loses directory info for cross-directory renames; update the logic around oldBasename (and where splitPath(file.oldPath) is used) to compute oldDir and oldRelPath (e.g., splitPath(file.oldPath).dir and basename) and then render the old path's directory when it differs from the new dir (or render the full old relative path) inside the FileRow render path that currently uses dir and oldBasename; ensure on the UI you show the old directory segment (or full old relative path) before the arrow so renames like src/foo.ts → lib/bar.ts read correctly.
🧹 Nitpick comments (3)
packages/host-service/src/trpc/router/git/git.ts (1)
165-173: Independent work — can run in parallel.
countUntrackedFileLinesonly touchesuntrackedFiles[].additionson the filesystem anddetectUnstagedRenamesonly shells out to git in a throwaway index; neither reads the other's output. APromise.allwould knock the added getStatus latency down noticeably on workspaces with many new files.⚙️ Suggested tweak
```diff
-await countUntrackedFileLines(worktreePath, untrackedFiles);
-
-const hasDeletions = unstaged.some((f) => f.status === "deleted");
-const renames = await detectUnstagedRenames(
-  git,
-  worktreePath,
-  untrackedFiles.map((f) => f.path),
-  hasDeletions,
-);
+const hasDeletions = unstaged.some((f) => f.status === "deleted");
+const [, renames] = await Promise.all([
+  countUntrackedFileLines(worktreePath, untrackedFiles),
+  detectUnstagedRenames(
+    git,
+    worktreePath,
+    untrackedFiles.map((f) => f.path),
+    hasDeletions,
+  ),
+]);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/host-service/src/trpc/router/git/git.ts` around lines 165-173: countUntrackedFileLines and detectUnstagedRenames are independent and currently run sequentially, adding unnecessary latency; modify the code around countUntrackedFileLines(...) and detectUnstagedRenames(git, worktreePath, untrackedFiles.map(...), hasDeletions) to execute them in parallel (e.g., via Promise.all) while still preserving the computed hasDeletions and the untrackedFiles input, and await both results before proceeding.

packages/host-service/src/trpc/router/git/utils/git-helpers.ts (2)
244-255: Consider a binary heuristic to skip non-text files.

Files under 1 MiB that are binary (images, small pdfs, sqlite dbs, `.lock` blobs, etc.) still get `readFile(..., "utf-8")` and split on `\r?\n`, producing an arbitrary `additions` based on how many `\n` bytes happen to appear. That contradicts the comment's own rationale ("LOC signal isn't useful"), and the value still ends up in the sidebar delta. A cheap NUL-byte sniff on the first read chunk (or a filename/extension skip list) would match the stated intent without adding meaningful cost.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/host-service/src/trpc/router/git/utils/git-helpers.ts` around lines 244 - 255, The current logic in git-helpers.ts reads any file under MAX_UNTRACKED_LINE_COUNT_SIZE with readFile(..., "utf-8") and computes file.additions by splitting on newlines, which miscounts binary blobs; before calling readFile (after stat and before computing file.additions), perform a cheap binary sniff (e.g., read the first chunk or N bytes and check for a NUL byte) or match against a small binary-extension whitelist, and if the sniff indicates binary, skip reading/parsing and set file.additions to 0 or undefined; update the code paths around the stat -> readFile -> file.additions logic to use this binary check so binary files are not treated as text.
307-317: `git add --intent-to-add -- ...untrackedPaths` can hit ARG_MAX on bulk-scaffolded repos.

For workspaces with a very large untracked set (e.g., first commit of a scaffolded project, or a `node_modules`-adjacent folder that isn't yet ignored), spreading every path into a single argv crosses the OS argument length limit (~128 KB on Linux) and `git add` fails with `E2BIG`. The whole function then silently returns `[]`, so users lose unstaged rename detection in exactly the scenario where it matters most.

Two cheap mitigations:
- Batch the paths into chunks (e.g., 500-1000 per `git add` invocation), or
- Pipe paths via stdin with `git add --intent-to-add --pathspec-from-file=- --pathspec-file-nul -` and feed NUL-separated paths.
Verify each finding against the current code and only fix it if needed. In `@packages/host-service/src/trpc/router/git/utils/git-helpers.ts` around lines 307 - 317, The current call in git-helpers (tempGit.raw(["add","--intent-to-add","--", ...untrackedPaths])) can exceed ARG_MAX for large untrackedPaths; change the add invocation to feed NUL-separated paths via stdin instead: call tempGit.raw with args ["add","--intent-to-add","--pathspec-from-file=-","--pathspec-file-nul","--"] and pass a NUL-terminated Buffer of untrackedPaths (e.g., Buffer.from(untrackedPaths.join("\0") + "\0")) as stdin to avoid E2BIG; update the code around tempGit, worktreePath and tempIndex usages to use this stdin approach (or as a fallback implement chunked batches of paths if you prefer) so the function no longer silently fails on very large untracked sets.
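The batching mitigation can be sketched with a generic chunking helper. This is illustrative, not the PR's code; the commented usage assumes the `tempGit` instance described above:

```typescript
// Split `items` into consecutive chunks of at most `size` elements,
// so each `git add` invocation's argv stays well under ARG_MAX.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Hypothetical usage against the temp-index git instance:
// for (const batch of chunk(untrackedPaths, 500)) {
//   await tempGit.raw(["add", "--intent-to-add", "--", ...batch]);
// }
```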
1 issue found across 3 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/host-service/src/trpc/router/git/utils/git-helpers.ts">
<violation number="1" location="packages/host-service/src/trpc/router/git/utils/git-helpers.ts:340">
P3: Do not silently swallow temp-dir cleanup failures; log a warning with context so cleanup issues are observable.</violation>
</file>
- Cap parallel file I/O at 64 workers in countUntrackedFileLines so workspaces with thousands of untracked files don't EMFILE
- Sniff first 8KB for NUL bytes before counting lines so binary files under the 1MB cap (PNGs, lockfiles, compiled artifacts) don't get a meaningless utf-8 line count
- Log temp-dir cleanup failures instead of silently swallowing
- Add -M -C to staged numstat and propagate file.from as oldPath so staged renames collapse to a single 0/0 entry (matches the working-tree rename detection added in the previous commit)
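For context on what the staged numstat with rename detection produces: with `-z`, git documents plain entries as `<adds>\t<dels>\t<path>\0` and rename/copy entries (under `-M -C`) as `<adds>\t<dels>\t\0<oldPath>\0<newPath>\0`. A sketch parser under that format assumption (`parseNumstatZ` is a hypothetical helper name, not this PR's code):

```typescript
interface NumstatEntry {
  additions: number;
  deletions: number;
  path: string;
  oldPath?: string;
}

// Parses `git diff --numstat -z [-M -C]` output. Rename/copy entries
// have an empty path field, with old and new paths following as
// separate NUL-terminated tokens.
function parseNumstatZ(raw: string): NumstatEntry[] {
  const tokens = raw.split("\0");
  const entries: NumstatEntry[] = [];
  let i = 0;
  while (i < tokens.length && tokens[i] !== "") {
    const [adds, dels, path] = tokens[i].split("\t");
    const additions = adds === "-" ? 0 : Number(adds); // "-" marks binary
    const deletions = dels === "-" ? 0 : Number(dels);
    if (path === "") {
      // Rename/copy: preimage then postimage follow.
      entries.push({
        additions,
        deletions,
        oldPath: tokens[i + 1],
        path: tokens[i + 2],
      });
      i += 3;
    } else {
      entries.push({ additions, deletions, path });
      i += 1;
    }
  }
  return entries;
}
```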
Summary
- `getStatus` pushed untracked (new, working-tree) files into `unstaged` with `additions: 0`, so the sidebar LOC delta in `useDiffStats` missed every newly-added file. Committed/staged new files were already correct via `git diff --numstat`.
- Added `countUntrackedFileLines(worktreePath, files)` in `git-helpers.ts` and called it after the staged/unstaged loop. Reads each untracked file directly with realpath symlink-escape protection and a 1MB cap (skips multi-GB non-code files).
- The v1 handler (`apps/desktop/.../git-task-handlers.ts`) already does this via `applyUntrackedLineCount`; this brings v2 to parity.

Why not `git diff --no-index --numstat /dev/null <file>`: it would spawn a subprocess per untracked file. Reading files directly is significantly cheaper for workspaces with many untracked files (e.g. fresh checkout with un-gitignored build artifacts). `git add -N` would mutate the index inside a read-only `getStatus`, which we don't want.

Linear: SUPER-472
Test plan
- `git add` a new file. Sidebar LOC reflects it (already worked, regression check).

Summary by cubic
Fixes v2 sidebar LOC and file list accuracy by counting untracked file lines and detecting/collapsing renames and copies in `host-service` `getStatus`. Brings v2 to parity with v1 and addresses Linear SUPER-472; the sidebar now shows old → new for renames.

- Added `countUntrackedFileLines(worktreePath, files)`; reads untracked files with realpath checks, a 1 MB cap, 64-file I/O concurrency, and NUL-byte sniffing to skip binaries; sets `additions` so `useDiffStats` includes new files.
- Uses `git diff -M -C` to merge deleted+untracked into single `renamed`/`copied` entries with correct stats; also applies `-M -C` to staged diffs and propagates `oldPath` so staged renames collapse to one 0/0 row.

Written for commit b0fbff1. Summary will update on new commits.